
Web tables constitute valuable sources of information for various applications, ranging from Web search to Knowledge Base (KB) augmentation. A common underlying requirement of these applications is to annotate the rows of Web tables with semantically rich descriptions of entities published in Web KBs.

Many information needs revolve around entities and would be better served by summarizing results in a tabular format than by presenting them as a ranked list. Unlike previous work, which is limited to retrieving existing tables, we aim to answer a query by automatically compiling a table in response to it. We introduce and address the task of on-the-fly table generation: given a query, generate a relational table that contains relevant entities (as rows) along with their key properties (as columns). This problem is decomposed into three specific subtasks: (i) core column entity ranking, (ii) schema determination, and (iii) value lookup. We employ a feature-based approach for entity ranking and schema determination, combining deep semantic features with task-specific signals. We refer to the mining of unlinked cells in web tables as novel entity discovery and, to the best of our knowledge, ours is the first endeavor in this direction. Our method identifies not only out-of-KB (``novel'') information but also novel aliases for in-KB (``known'') entities. When evaluated on three purpose-built test collections, our proposed approaches obtain a marked improvement in precision over our baselines while keeping recall stable.
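To make the decomposition of on-the-fly table generation concrete, the following is a minimal sketch of how the three subtasks could be chained. It is an illustration under stated assumptions, not the method described here: the toy KB, the function names, and the simple term-overlap and property-frequency scoring are all hypothetical stand-ins for the feature-based ranking mentioned above.

```python
from collections import Counter

# Hypothetical toy KB, for illustration only: entity -> {property: value}.
TOY_KB = {
    "Apollo 11": {"type": "moon mission", "launch year": "1969", "crew size": "3"},
    "Apollo 13": {"type": "moon mission", "launch year": "1970", "crew size": "3"},
    "Voyager 1": {"type": "space probe", "launch year": "1977"},
}


def tokens(text):
    """Lowercased whitespace tokens of a string, as a set."""
    return set(text.lower().split())


def rank_core_entities(query, kb, k=2):
    """Subtask (i): rank candidate core-column entities by query term overlap.

    Term overlap is an assumed stand-in for a learned ranking model.
    """
    q = tokens(query)
    scored = []
    for name, props in kb.items():
        description = name + " " + " ".join(props.values())
        scored.append((len(q & tokens(description)), name))
    # Descending score, then name, so the order is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:k] if score > 0]


def determine_schema(entities, kb, k=2):
    """Subtask (ii): keep the k properties shared by the most ranked entities."""
    counts = Counter(prop for entity in entities for prop in kb[entity])
    ordered = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [prop for prop, _ in ordered[:k]]


def lookup_values(entities, schema, kb):
    """Subtask (iii): fill each cell from the KB; missing values stay None."""
    return [[entity] + [kb[entity].get(prop) for prop in schema] for entity in entities]


def generate_table(query, kb=TOY_KB):
    """Chain the three subtasks and return (header, rows)."""
    entities = rank_core_entities(query, kb)
    schema = determine_schema(entities, kb)
    return ["entity"] + schema, lookup_values(entities, schema, kb)
```

Calling `generate_table("moon mission")` on the toy KB returns a header row and one row per ranked entity. The sketch runs the subtasks once in sequence, which is only one possible way to combine them.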

Because web tables typically contain only raw textual content, we first need to determine which cells refer to which known entities, a task we dub table-to-KB matching. This first task aims to infer table semantics by linking table cells and column headings to elements of a KB. The second task builds upon these linked entities and properties, not only to identify novel ones in the same table but also to bootstrap their types and additional relationships.
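The relationship between the two tasks can be sketched as follows. This is a deliberately simplified illustration: exact matching on normalized labels stands in for table-to-KB matching, and the KB identifiers, labels, and function names are hypothetical. Deciding whether an unlinked cell is a truly novel entity or a novel alias of a known one requires further steps that are not shown.

```python
# Hypothetical KB index, for illustration only: entity id -> known surface forms.
KB_LABELS = {
    "kb:Oslo": ["Oslo"],
    "kb:New_York_City": ["New York City", "NYC"],
}


def normalize(text):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def match_cells_to_kb(cells, kb_labels):
    """First task (simplified): link each cell to a KB entity, or to None."""
    index = {
        normalize(label): entity
        for entity, labels in kb_labels.items()
        for label in labels
    }
    return {cell: index.get(normalize(cell)) for cell in cells}


def unlinked_cells(links):
    """Input to the second task: cells left without a KB link."""
    return [cell for cell, entity in links.items() if entity is None]


links = match_cells_to_kb(["oslo", "NYC", "Stavanger"], KB_LABELS)
candidates = unlinked_cells(links)
```

Here `candidates` holds the cells that the second task would examine, while the linked cells provide the context it builds on.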

When working with any sort of knowledge base (KB), one has to make sure it is as complete and as up-to-date as possible. Both tasks are non-trivial, as they require recall-oriented efforts to determine which entities and relationships are missing from the KB, and as such they demand a significant amount of labor. Tables on the Web, on the other hand, are abundant and have the distinct potential to assist with these tasks. In particular, we can leverage the content of such tables to discover new entities, properties, and relationships.
