
Sampling methods for scaling up empirical risk minimization

Abstract: In this manuscript, we present and study sampling strategies applied to problems in statistical learning. The goal is to handle the difficulties that typically arise in a large-data setting, when the number of observations and their dimensionality constrain the learning process. We propose to address these difficulties with two sampling strategies: accelerating the learning process by sampling the most helpful observations, and simplifying the problem by discarding some observations to reduce its size and complexity. We first consider binary classification when the observations used to train a classifier come from a sampling/survey scheme and exhibit a complex dependency structure, for which we establish generalization bounds. We then study the implementation of stochastic gradient descent when observations are drawn non-uniformly. We conclude the thesis by studying the graph reconstruction problem, for which we establish new theoretical results.
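As a rough illustration of the non-uniform sampling idea mentioned in the abstract, the sketch below runs SGD where example i is drawn with a user-chosen probability p_i and the stochastic gradient is reweighted by 1/(n * p_i) so it remains an unbiased estimate of the full-batch gradient. This is a generic importance-sampling sketch, not the specific schemes analyzed in the thesis; the probabilities proportional to feature norms are an illustrative assumption.

```python
import numpy as np

def sgd_nonuniform(grad, w0, probs, data, lr=0.05, n_steps=5000, seed=0):
    """SGD drawing example i with probability probs[i].

    The 1 / (n * probs[i]) importance weight keeps the stochastic
    gradient unbiased for the average (full-batch) gradient.
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float).copy()
    n = len(data)
    for _ in range(n_steps):
        i = rng.choice(n, p=probs)
        w -= lr * grad(w, data[i]) / (n * probs[i])
    return w

# Toy least-squares problem: minimize 0.5 * mean((x_i . w - y_i)^2).
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = np.array([1.0, 2.0, 3.0, 5.0])
data = list(zip(X, y))

def grad(w, xy):
    x, t = xy
    return (x @ w - t) * x  # gradient of 0.5 * (x.w - t)^2

# Illustrative non-uniform distribution: probability proportional
# to the feature-vector norm of each example.
norms = np.linalg.norm(X, axis=1)
probs = norms / norms.sum()

w = sgd_nonuniform(grad, np.zeros(2), probs, data)
```

With a constant step size the iterates hover in a noise ball around the least-squares solution; in practice a decaying step size (as studied in the thesis chapter on SGD implementation) is needed for exact convergence.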
Submitted on : Tuesday, April 27, 2021 - 3:54:07 PM
Last modification on : Wednesday, June 15, 2022 - 8:46:31 PM


Version validated by the jury (STAR)


  • HAL Id : tel-03209978, version 1


Guillaume Papa. Sampling methods for scaling up empirical risk minimization. Machine Learning [stat.ML]. Télécom ParisTech, 2018. English. ⟨NNT : 2018ENST0005⟩. ⟨tel-03209978⟩


