
Advanced information extraction by example

Abstract : Searching for information on the Web is generally achieved by constructing a query from a set of keywords and submitting it to a search engine. This traditional method requires the user to have relatively good knowledge of the domain of the targeted information in order to come up with the correct keywords. The search results, in the form of Web pages, are ranked based on the relevance of each Web page to the given keywords. For the same set of keywords, the Web pages returned by the search engine may be ranked differently depending on the user. Moreover, finding specific information, such as a country and its capital city, requires the user to browse through all the documents and read their content manually. This is not only time-consuming but also requires a great deal of effort. In this thesis, we address an alternative method of searching for information: giving examples of the information in question. First, we try to improve the accuracy of search-by-example systems by expanding the given examples syntactically. Next, we use the truth discovery paradigm to rank the returned query results. Finally, we investigate the possibility of expanding the examples semantically by labelling each group of elements in the examples.
Complete list of metadata
Contributor : ABES STAR
Submitted on : Friday, April 9, 2021 - 4:13:07 PM
Last modification on : Wednesday, November 3, 2021 - 6:18:17 AM
Long-term archiving on : Monday, July 12, 2021 - 9:22:58 AM


Version validated by the jury (STAR)


  • HAL Id : tel-03194624, version 1



Ngurah Agus Sanjaya Er. Advanced information extraction by example. Information Retrieval [cs.IR]. Télécom ParisTech, 2018. English. ⟨NNT : 2018ENST0060⟩. ⟨tel-03194624⟩


