Extension et interrogation de résumés de flux de données (Extending and querying data stream summaries)

Abstract : In the last few years, a new environment has emerged in which data must be collected and processed as soon as they arrive. To handle the large volume of data associated with this environment, new data processing models and techniques have to be set up; they are referred to as data stream management. Data streams are usually continuous and voluminous, and cannot be stored in their entirety as persistent data. Many research works have addressed this issue, and new systems called DSMS (Data Stream Management Systems) have appeared as a result. A DSMS evaluates continuous queries on a stream or on a window (a finite subset of a stream). These queries have to be specified before the stream arrives. Nevertheless, in some applications, data may be required after they have expired from the DSMS memory. In this case, the system cannot process the queries, because such data are permanently lost. To handle this issue, it is essential to keep a summary of the data stream. Many summarization algorithms have been developed. The choice of a summarization method depends on the kind of data and on the problem at hand. In this thesis, we are first interested in building a generic summary structure that strikes a compromise between the time needed to build the summary and its quality. We introduce a new summary approach which is more efficient for querying very old data. Then, we focus on the querying methods for these summaries. Our objective is to integrate the generic summary structure into the architecture of existing DSMS. In this way, we extend the range of possible queries: queries on old (expired) stream data become possible, as well as queries on new stream data. To this end, we introduce two approaches. The difference between them is the role played by the summary module when the query is evaluated.
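
The idea of answering queries on expired data from a summary can be illustrated with a minimal sketch. It is not the summary structure proposed in the thesis: it assumes a count-based sliding window and uses reservoir sampling (Algorithm R) as a stand-in summarization method. The class name `WindowWithSummary` and all of its methods are hypothetical names chosen for this illustration.

```python
import random
from collections import deque


class WindowWithSummary:
    """Hypothetical sketch: a sliding window plus a sample of expired items.

    Assumption: reservoir sampling stands in for the summary; the thesis
    abstract does not say which summarization method it uses.
    """

    def __init__(self, window_size, summary_size, seed=0):
        self.window = deque()        # recent items, still held in memory
        self.window_size = window_size
        self.summary = []            # uniform sample of expired items
        self.summary_size = summary_size
        self.expired_count = 0       # number of items that left the window
        self.rng = random.Random(seed)

    def push(self, item):
        """Add a new stream item; summarize the oldest one if it expires."""
        self.window.append(item)
        if len(self.window) > self.window_size:
            self._summarize(self.window.popleft())

    def _summarize(self, item):
        """Reservoir sampling: each expired item is kept with equal probability."""
        self.expired_count += 1
        if len(self.summary) < self.summary_size:
            self.summary.append(item)
        else:
            j = self.rng.randrange(self.expired_count)
            if j < self.summary_size:
                self.summary[j] = item

    def mean_recent(self):
        """Exact mean over the current window, or None if it is empty."""
        if not self.window:
            return None
        return sum(self.window) / len(self.window)

    def estimate_mean_expired(self):
        """Approximate mean over expired items, computed from the sample."""
        if not self.summary:
            return None
        return sum(self.summary) / len(self.summary)


stream = WindowWithSummary(window_size=10, summary_size=5)
for value in range(100):
    stream.push(value)
```

In this sketch, a query on recent data reads the window exactly, while a query on expired data can only be approximated from the sample. This is one way to picture the trade-off between summary cost and summary quality that the abstract mentions.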
Document type :
Theses

Cited literature : 168 references

https://pastel.archives-ouvertes.fr/pastel-00613122
Contributor : Nesrine Gabsi
Submitted on : Tuesday, August 2, 2011 - 8:05:43 PM
Last modification on : Tuesday, October 20, 2020 - 10:23:21 AM

Identifiers

  • HAL Id : pastel-00613122, version 1

Citation

Nesrine Gabsi. Extension et interrogation de résumés de flux de données. Base de données [cs.DB]. Télécom ParisTech, 2011. Français. ⟨pastel-00613122⟩
