Learning from multiple genomic information in cancer for diagnosis and prognosis

Abstract : Several initiatives have recently been launched to characterise large cohorts of human cancers at the molecular level with various high-throughput technologies, in order to understand the major biological alterations related to tumorigenesis. The information measured includes gene expression, mutations, copy-number variations, as well as epigenetic signals such as DNA methylation. Large consortia such as “The Cancer Genome Atlas” (TCGA) have already gathered thousands of cancerous and non-cancerous samples and made them publicly available. In this thesis, we contribute to the statistical analysis of the relationships between these different biological data sources, and to the validation and/or large-scale generalisation of biological phenomena through an integrative analysis of genetic and epigenetic data. Firstly, we show the role of DNA methylation as a surrogate biomarker of clonality between cells, which could provide a powerful clinical tool for designing appropriate treatments for specific patients with breast cancer relapses. In addition, we develop systematic statistical analyses to assess the significance of DNA methylation variations for gene expression regulation. We highlight the importance of adding prior knowledge to cope with the small number of samples relative to the number of variables. In return, we show the potential of bioinformatics to suggest new and interesting biological hypotheses. Finally, we address the existence of a universal biological phenomenon related to the hypermethylator phenotype. Here, we adapt regression techniques that exploit the similarity between the different prediction tasks to obtain robust genetic predictive signatures that are common to all cancers and that yield better prediction accuracy. In conclusion, we highlight the importance of collaboration between biology and computation in order to establish methods suited to current issues in bioinformatics, which will in turn provide new biological insights.
Document type :
Theses

https://pastel.archives-ouvertes.fr/tel-01449202
Contributor : Abes Star
Submitted on : Monday, January 30, 2017 - 11:06:07 AM
Last modification on : Tuesday, November 13, 2018 - 10:10:40 AM

File

2015ENMP0086_archivage.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01449202, version 1

Citation

Matahi Moarii. Learning from multiple genomic information in cancer for diagnosis and prognosis. Quantitative Methods [q-bio.QM]. Ecole Nationale Supérieure des Mines de Paris, 2015. English. ⟨NNT : 2015ENMP0086⟩. ⟨tel-01449202⟩
