Gestion de l'incertitude dans le processus d'extraction de connaissances à partir de textes (Managing uncertainty in the process of knowledge extraction from texts)

Abstract : The growth of textual sources on the Web offers an opportunity for knowledge extraction and knowledge base creation. Recently, several research works on this topic have appeared or intensified. They generally highlight that extracting relevant and precise information from text requires a collaboration between linguistic approaches (e.g., to extract certain concepts regarding named entities, temporal and spatial aspects) and methods originating from the field of semantic processing. Moreover, successful approaches also need to qualify and quantify the uncertainty present in the text. Finally, to be relevant in the context of the Web, the linguistic processing needs to consider several sources in different languages. This PhD thesis tackles this problem in its entirety, since our contributions cover the extraction and representation of uncertain knowledge as well as the visualization and querying of the generated graphs. This research work was conducted under CIFRE funding involving the Laboratoire d'Informatique Gaspard Monge (LIGM) of the Université Paris-Est Marne la Vallée and the GEOLSemantics start-up. It builds on years of accumulated experience in natural language processing (GEOLSemantics) and semantic processing (LIGM). In this context, our contributions are the following:
  • the integration of a qualification of different forms of uncertainty, based on ontology processing, within the knowledge extraction process,
  • the quantification of uncertainties based on a set of heuristics,
  • a representation, using RDF graphs, of the extracted knowledge and its uncertainties,
  • an evaluation and an analysis of the results obtained using our approach.
Document type :
Theses

Cited literature [135 references]

https://pastel.archives-ouvertes.fr/tel-01306866
Contributor : Abes Star
Submitted on : Monday, April 25, 2016 - 4:45:07 PM
Last modification on : Thursday, July 5, 2018 - 2:45:56 PM
Long-term archiving on : Tuesday, July 26, 2016 - 1:10:14 PM

File

TH2015PESC1160.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01306866, version 1

Citation

Fadhela Kerdjoudj. Gestion de l'incertitude dans le processus d'extraction de connaissances à partir de textes. Informatique et langage [cs.CL]. Université Paris-Est, 2015. Français. ⟨NNT : 2015PESC1160⟩. ⟨tel-01306866⟩

Metrics

Record views

613

File downloads

815