Annotation sémantique floue de tableaux guidée par une ontologie (Fuzzy semantic annotation of tables guided by an ontology)

Abstract : This thesis presents a new method for annotating data tables using the knowledge of an application domain described in an ontology. We first present our application context and a survey of related work on semantic annotation and information extraction. We then present the steps of our annotation process, which annotates the cells, the columns and the relations of a given data table. Data are annotated differently depending on whether they are symbolic or numeric, so our first step is to distinguish columns containing numeric data from columns containing symbolic data. Symbolic data are annotated with the terms of the ontology, using a word-by-word comparison between the terms used in the data table and the terms defined in the ontology. Numeric data are extracted along with the units in which they are expressed; they are compared with the units and ranges defined in the ontology for numeric data types. The data type of each column is then identified using both the column contents (handled differently depending on whether the column is symbolic or numeric) and the column title. Once the data type of each column has been recognized, the semantic relations represented by the table are found using both the table title and the table signature, which is compared with the signatures of the relations defined in the ontology. The relations recognized in the table are then instantiated for each row of the table. Our annotation is fuzzy: instead of linking a part of the table directly to its counterpart in the ontology, we give several candidate values for the annotation, each with a confidence degree. The steps of our annotation method have been evaluated in an experiment on the food microbiology domain.
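The first two steps described in the abstract (separating numeric from symbolic columns, then attaching ontology terms to symbolic cells with a confidence degree) can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the thesis: the ontology terms are hypothetical, the numeric test is a simple "contains a digit" heuristic with a hypothetical threshold, and the confidence degree is a Jaccard word overlap, which stands in for whatever similarity measure the thesis actually uses.

```python
import re

# Hypothetical ontology terms, for illustration only.
ONTOLOGY_TERMS = ["fresh meat", "minced beef", "pasteurized milk"]


def words(text):
    """Lowercase a string and return its alphabetic words as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def is_numeric_column(cells, threshold=0.5):
    """Treat a column as numeric when more than `threshold` of its cells contain a digit.

    Assumption: this heuristic is a stand-in, not the thesis's criterion.
    """
    if not cells:
        return False
    numeric = sum(1 for cell in cells if re.search(r"\d", cell))
    return numeric / len(cells) > threshold


def annotate_cell(cell, terms=ONTOLOGY_TERMS):
    """Return (term, degree) pairs with non-zero word overlap, best first.

    The degree is the Jaccard overlap between the words of the cell and
    the words of the ontology term (an assumed similarity measure).
    """
    cell_words = words(cell)
    scored = []
    for term in terms:
        term_words = words(term)
        union = cell_words | term_words
        if not union:
            continue
        degree = len(cell_words & term_words) / len(union)
        if degree > 0:
            scored.append((term, degree))
    # Keep every candidate rather than only the best one: the annotation is fuzzy.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With these assumptions, a cell such as "minced meat" is not forced onto a single ontology term: it keeps both "fresh meat" and "minced beef" as candidates, each with its own degree, which is the sense in which the annotation is fuzzy.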
Document type : Theses

Cited literature [63 references]

https://pastel.archives-ouvertes.fr/pastel-00003799
Contributor : Ecole Agroparistech
Submitted on : Thursday, June 5, 2008 - 8:00:00 AM
Last modification on : Monday, October 19, 2020 - 11:07:53 AM
Long-term archiving on : Monday, July 26, 2010 - 8:22:33 PM

Identifiers

  • HAL Id : pastel-00003799, version 1

Citation

Gaëlle Hignette. Annotation sémantique floue de tableaux guidée par une ontologie. AgroParisTech, 2007. English. ⟨NNT : 2007AGPT0052⟩. ⟨pastel-00003799⟩

Metrics

Record views : 304
File downloads : 843