Structured machine learning methods for microbiology : mass spectrometry and high-throughput sequencing

Abstract : High-throughput technologies are changing scientific practice and the research landscape in microbiology. On the one hand, mass spectrometry is already used in clinical microbiology laboratories. On the other hand, dramatic progress in sequencing technologies over the last ten years allows cheap and fast characterization of microbial diversity in complex clinical samples. Consequently, both technologies are being considered for future diagnostic solutions. This thesis aims to contribute to new in vitro diagnostics (IVD) systems based on high-throughput technologies, such as mass spectrometry or next-generation sequencing, and to their applications in microbiology. Because of the volume of data generated by these technologies and the complexity of the measured parameters, we develop innovative and versatile statistical learning methods for applications in IVD and microbiology. Statistical learning is well suited to tasks that rely on high-dimensional raw data that medical experts can hardly use directly, such as classifying a mass spectrum or assigning a sequencing read to the right organism. Here, we propose to exploit additional known structure in order to improve the quality of the predictions. For instance, we convert a sequencing read (raw data) into a vector in a nucleotide composition space and use it as a structured input for machine learning approaches. We also add prior information about the hierarchical structure that organizes the micro-organisms that can be identified (structured output).
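
As an illustration of the nucleotide composition space mentioned in the abstract, a read can be mapped to a vector of k-mer frequencies. The sketch below is only one plausible reading of that idea, not the thesis's implementation: the function name `kmer_profile`, the default `k=2`, the normalization to relative frequencies, and the choice to skip windows containing ambiguous bases such as `N` are all assumptions made for this example.

```python
from collections import Counter
from itertools import product


def kmer_profile(read, k=2):
    """Map a DNA read to its vector of relative k-mer frequencies.

    Hypothetical helper for illustration: the vector has 4**k entries,
    ordered lexicographically over the alphabet ACGT.
    """
    alphabet = "ACGT"
    # All possible k-mers, in a fixed order that defines the vector axes.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    read = read.upper()
    # Count every window of length k along the read.
    counts = Counter(read[i:i + k] for i in range(len(read) - k + 1))
    # Only windows made of A, C, G, T contribute (assumption: others are skipped).
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


profile = kmer_profile("ACGTACGTTGCA", k=2)
print(len(profile))  # one coordinate per possible 2-mer
```

A fixed-length vector of this kind can be passed to any standard classifier, whatever the length of the original read.
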
Document type :
Theses
Cited literature [183 references]

https://pastel.archives-ouvertes.fr/tel-01336560
Contributor : Abes Star
Submitted on : Thursday, June 23, 2016 - 11:59:47 AM
Last modification on : Tuesday, November 13, 2018 - 10:10:09 AM
Long-term archiving on: Saturday, September 24, 2016 - 12:15:41 PM

File

2015ENMP0081_archivage.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01336560, version 1

Citation

Kevin Vervier. Structured machine learning methods for microbiology : mass spectrometry and high-throughput sequencing. Bioinformatics [q-bio.QM]. Ecole Nationale Supérieure des Mines de Paris, 2015. English. ⟨NNT : 2015ENMP0081⟩. ⟨tel-01336560⟩
