Safe optimization algorithms for variable selection and hyperparameter tuning

Abstract : Massive and automatic data processing requires the development of techniques able to filter the most important information. Among these methods, those with sparse structures have been shown to improve the statistical and computational efficiency of estimators in a context of large dimension. They can often be expressed as a solution of regularized empirical risk minimization and generally lead to non differentiable optimization problems in the form of a sum of a smooth term, measuring the quality of the fit, and a non-smooth term, penalizing complex solutions. Although it has considerable advantages, such a way of including prior information, unfortunately introduces many numerical difficulties both for solving the underlying optimization problem and to calibrate the level of regularization. Solving these issues has been at the heart of this thesis. A recently introduced technique, called "Screening Rules", proposes to ignore some variables during the optimization process by benefiting from the expected sparsity of the solutions. These elimination rules are said to be safe when the procedure guarantees to not reject any variable wrongly. In this work, we propose a unified framework for identifying important structures in these convex optimization problems and we introduce the "Gap Safe Screening Rules". They allows to obtain significant gains in computational time thanks to the dimensionality reduction induced by this method. In addition, they can be easily inserted into iterative algorithms and apply to a large number of problems.To find a good compromise between minimizing risk and introducing a learning bias, (exact) homotopy continuation algorithms offer the possibility of tracking the curve of the solutions as a function of the regularization parameters. However, they exhibit numerical instabilities due to several matrix inversions and are often expensive in large dimension. Another weakness is that a worst-case analysis shows that they have exact complexities that are exponential in the dimension of the model parameter. Allowing approximated solutions makes possible to circumvent the aforementioned drawbacks by approximating the curve of the solutions. In this thesis, we revisit the approximation techniques of the regularization paths given a predefined tolerance and we propose an in-depth analysis of their complexity w.r.t. the regularity of the loss functions involved. Hence, we propose optimal algorithms as well as various strategies for exploring the parameters space. We also provide calibration method (for the regularization parameter) that enjoys globalconvergence guarantees for the minimization of the empirical risk on the validation data.Among sparse regularization methods, the Lasso is one of the most celebrated and studied. Its statistical theory suggests choosing the level of regularization according to the amount of variance in the observations, which is difficult to use in practice because the variance of the model is oftenan unknown quantity. In such case, it is possible to jointly optimize the regression parameter as well as the level of noise. These concomitant estimates, appeared in the literature under the names of Scaled Lasso or Square-Root Lasso, and provide theoretical results as sharp as that of theLasso while being independent of the actual noise level of the observations. Although presenting important advances, these methods are numerically unstable and the currently available algorithms are expensive in computation time. We illustrate these difficulties and we propose modifications based on smoothing techniques to increase stability of these estimators as well as to introduce a faster algorithm.
Document type :
Theses
Complete list of metadatas

Cited literature [182 references]  Display  Hide  Download

https://pastel.archives-ouvertes.fr/tel-01962450
Contributor : Abes Star <>
Submitted on : Thursday, December 20, 2018 - 3:58:06 PM
Last modification on : Friday, May 17, 2019 - 12:48:45 PM
Long-term archiving on : Friday, March 22, 2019 - 10:05:05 AM

File

75453_NDIAYE_2018_archivage.pd...
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01962450, version 1

Citation

Eugene Ndiaye. Safe optimization algorithms for variable selection and hyperparameter tuning. Optimization and Control [math.OC]. Université Paris-Saclay, 2018. English. ⟨NNT : 2018SACLT004⟩. ⟨tel-01962450⟩

Share

Metrics

Record views

436

Files downloads

360