
Statistical learning methods for scoring (Méthodes d'apprentissage statistique pour le scoring)

Abstract: Bipartite ranking is the statistical problem of ordering objects that lie in a multidimensional feature space and carry randomly assigned binary labels, so that positive instances appear at the top of the list with the highest probability. This work develops a tree-induction ranking method based on a top-down recursive partitioning strategy; it yields a scoring function summarized by a rooted, binary, left-right oriented tree. To make this learning method more flexible, we introduce a partition-based procedure involving complex, adaptive splitting rules. We then address the classical issue of model selection and propose two penalization-based procedures for selecting the best ranking tree for prediction. Finally, to reduce the instability of ranking trees and increase their accuracy, we adapt two resampling and aggregation procedures introduced by Breiman for classification and regression: bagging (1996) and random forests (2001). We provide an empirical comparison between several versions of this ranking algorithm and state-of-the-art scoring methods, and present results obtained on industrial objectivization data. Lastly, we introduce a two-stage testing procedure for the two-sample problem in a multidimensional setting, based on the proposed ranking algorithm and on one-dimensional rank tests.
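As an illustration of the bipartite ranking goal described in the abstract, the sketch below measures how well a scoring function places positive instances above negative ones. It assumes the empirical area under the ROC curve (AUC) as the quality measure, which the abstract does not name explicitly; the function name `empirical_auc` and the toy data are hypothetical and are not taken from the thesis.

```python
from typing import Sequence


def empirical_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of (positive, negative) pairs ranked in the right order.

    A pair counts 1 when the positive instance has the strictly higher
    score, 0.5 when the two scores are tied, and 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y != 1]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative instance")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up data): one positive is ranked below one negative.
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.1]
toy_labels = [1, 0, 1, 0, 0]
toy_auc = empirical_auc(toy_scores, toy_labels)
```

Up to normalization, this pairwise count is the Mann-Whitney U statistic, a standard one-dimensional rank statistic; a value far from 0.5 on held-out data suggests that the two samples differ in distribution.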
Contributor: Marine Depecker
Submitted on: Tuesday, March 1, 2011 - 2:48:31 PM
Last modification on: Friday, October 23, 2020 - 4:37:49 PM
Long-term archiving on: Monday, May 30, 2011 - 2:57:27 AM


  • HAL Id: pastel-00572421, version 1


Marine Depecker. Méthodes d'apprentissage statistique pour le scoring. Machine Learning [cs.LG]. Télécom ParisTech, 2010. French. ⟨pastel-00572421⟩


