Applications of machine learning in computational biology

Abstract : Biotechnologies have reached an era in which the amount of available information makes it possible to think of biological objects as complex systems. In this context, the phenomena emerging from those systems are tightly linked to their organizational properties. This raises computational and statistical challenges, which are precisely the focus of study of the machine learning community. This thesis is about applications of machine learning methods to the study of biological phenomena from a complex systems viewpoint. We apply machine learning methods in the context of protein-ligand interaction and side effect analysis, cell population phenotyping, and experimental design for partially observed nonlinear dynamical systems.
A large amount of data is available about marketed molecules, such as protein target interaction profiles and side effect profiles. This raises the issue of making sense of this data and finding the structure and patterns that underlie these observations at a large scale. We apply recent unsupervised learning methods to the analysis of large datasets of marketed drugs. Examples show the relevance of the extracted information, which is further validated in a prediction context.
The variability of the response to a treatment between different individuals poses the challenge of defining the effect of this stimulus at the level of a population of individuals. For example, in the context of High Content Screening, a population of cells is exposed to different stimuli. Between-cell variability within a population makes the comparison of different treatments difficult. A generative model is proposed to overcome this issue, and properties of the model are investigated based on experimental data.
At the molecular scale, complex behaviour emerges from cascades of nonlinear interactions between molecular species. These nonlinearities lead to system identifiability issues. These issues can be overcome by specific experimental plans, the design of which is one of the fields of research in systems biology. A Bayesian iterative experimental design strategy is proposed, and numerical results based on in silico biological network simulations are presented.

Cited literature : 127 references
Contributor : ABES STAR
Submitted on : Wednesday, March 12, 2014 - 2:52:16 PM
Last modification on : Wednesday, November 17, 2021 - 12:30:57 PM
Long-term archiving on : Thursday, June 12, 2014 - 11:41:48 AM


Version validated by the jury (STAR)


  • HAL Id : pastel-00958432, version 1


Edouard Pauwels. Applications of machine learning in computational biology. Agricultural sciences. Ecole Nationale Supérieure des Mines de Paris, 2013. English. ⟨NNT : 2013ENMP0052⟩. ⟨pastel-00958432⟩


