Deciphering splicing with sparse regression techniques in the era of high-throughput RNA sequencing.

Abstract : The numbers of protein-coding genes in a human, a nematode and a fruit fly are roughly equal. The paradoxical lack of correlation between the number of genes in an organism's genome and its phenotypic complexity finds an explanation in the alternative nature of splicing in higher organisms. Alternative splicing greatly increases the functional diversity of proteins encoded by a limited number of genes. It is known to be involved in cell fate decision and embryonic development, but also appears to be dysregulated in inherited and acquired human genetic disorders, in particular in cancers. High-throughput RNA sequencing technologies allow us to measure and question splicing at an unprecedented resolution. However, while the cost of sequencing RNA decreases and throughput increases, many computational challenges arise from the discrete and local nature of the data. In particular, the task of inferring alternative transcripts requires a non-trivial deconvolution procedure. In this thesis, we contribute to deciphering alternative transcript expression and alternative splicing events from high-throughput RNA sequencing data. We propose new methods to accurately and efficiently detect and quantify alternative transcripts. Our methodological contributions largely rely on sparse regression techniques and take advantage of network flow optimization techniques. In addition, we investigate means to query splicing abnormalities for clinical diagnosis purposes. We suggest an experimental protocol that can be easily implemented in routine clinical practice, and present new statistical models and algorithms to quantify splicing events and measure how abnormal these events might be in patient data compared to wild-type situations.

Cited literature : 200 references
Contributor : ABES STAR
Submitted on : Friday, January 12, 2018 - 10:36:08 AM
Last modification on : Wednesday, November 17, 2021 - 12:31:04 PM
Long-term archiving on : Wednesday, May 23, 2018 - 7:42:29 PM


Version validated by the jury (STAR)


  • HAL Id : tel-01681314, version 2


Elsa Bernard. Deciphering splicing with sparse regression techniques in the era of high-throughput RNA sequencing. Bioinformatics [q-bio.QM]. Université Paris sciences et lettres, 2016. English. ⟨NNT : 2016PSLEM063⟩. ⟨tel-01681314v2⟩


