
Contributions to variable selection, clustering and statistical estimation in high dimension

Abstract : This PhD thesis deals with the following statistical problems: variable selection in high-dimensional linear regression, clustering in the Gaussian Mixture Model, some effects of adaptivity under sparsity, and simulation of Gaussian processes.

Under the sparsity assumption, variable selection corresponds to recovering the "small" set of significant variables. We study non-asymptotic properties of this problem in high-dimensional linear regression. Moreover, we recover optimal necessary and sufficient conditions for variable selection in this model. We also study some effects of adaptation under sparsity. Namely, in the sparse vector model, we investigate how the estimation rates of some of the model parameters change when the noise level or its nominal law is unknown.

Clustering is an unsupervised machine learning task that aims to group observations that are close to each other in some sense. We study the problem of community detection in the Gaussian Mixture Model with two components, and precisely characterize the sharp separation between clusters required to recover the clusters exactly. We also provide a fast polynomial-time procedure achieving optimal recovery.

Gaussian processes are extremely useful in practice, for instance when modeling price fluctuations. Nevertheless, their simulation is not easy in general. We propose and study a new rate-optimal series expansion to simulate a large class of Gaussian processes.

Cited literature [171 references]
Contributor : ABES STAR
Submitted on : Wednesday, August 14, 2019 - 9:03:06 AM
Last modification on : Friday, August 5, 2022 - 2:49:41 PM
Long-term archiving on : Thursday, January 9, 2020 - 11:33:54 PM


Version validated by the jury (STAR)


  • HAL Id : tel-02266365, version 1


Mohamed Ndaoud. Contributions to variable selection, clustering and statistical estimation in high dimension. Statistics [math.ST]. Université Paris-Saclay, 2019. English. ⟨NNT : 2019SACLG005⟩. ⟨tel-02266365⟩


