
Clustering prédictif : Décrire et prédire simultanément (Predictive clustering: describing and predicting simultaneously)

Abstract : Predictive clustering is a new supervised learning framework derived from traditional clustering that makes it possible to describe and predict simultaneously. Compared with classical supervised learning, predictive clustering algorithms seek to discover the internal structure of the target class and use it to predict the class of new instances. The purpose of this thesis is to find an interpretable predictive clustering model. To achieve this objective, we modify the traditional K-means algorithm. The modified version, called predictive K-means, contains 7 different steps, each of which can be supervised separately from the others. In this thesis, we address only four of these steps: 1) data preprocessing, 2) initialization of centers, 3) selection of the best partition, and 4) importance of features. Our experimental results show that using just two supervised steps (data preprocessing and initialization of centers) allows the K-means algorithm to achieve performance competitive with some other predictive clustering algorithms. These results also show that our preprocessing methods can help the predictive K-means algorithm provide results that users can easily understand. We also show that our new measure for evaluating predictive clustering quality helps the predictive K-means algorithm find the optimal partition, that is, the one that establishes the best trade-off between description and prediction. It thus allows users to find the different reasons behind the same prediction: two different instances could have the same predicted label.
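To make the idea concrete, here is a minimal Python sketch of a K-means variant in which only the initialization of centers uses the class labels. It is an illustration under stated assumptions, not the method of the thesis: the abstract does not specify how centers are initialized, so the class-wise farthest-first seeding in `init_centers`, the majority-vote labelling of clusters, and all function and parameter names (`fit`, `predict`, `per_class`) are hypothetical choices. Data preprocessing, partition selection, and feature importance are left out.

```python
from collections import Counter
from math import dist


def init_centers(X, y, per_class):
    """Class-aware seeding (assumed scheme): farthest-first traversal inside each class."""
    centers = []
    for label in sorted(set(y)):
        members = [tuple(x) for x, lab in zip(X, y) if lab == label]
        chosen = [members[0]]
        while len(chosen) < min(per_class, len(members)):
            # Add the member farthest from the seeds already chosen for this class.
            chosen.append(max(members, key=lambda m: min(dist(m, c) for c in chosen)))
        centers.extend(chosen)
    return centers


def nearest(x, centers):
    """Index of the center closest to x (Euclidean distance)."""
    return min(range(len(centers)), key=lambda j: dist(x, centers[j]))


def fit(X, y, per_class=1, n_iter=20):
    """Standard Lloyd iterations started from class-aware seeds."""
    centers = init_centers(X, y, per_class)
    for _ in range(n_iter):
        assign = [nearest(x, centers) for x in X]
        new_centers = []
        for j, old in enumerate(centers):
            members = [x for x, a in zip(X, assign) if a == j]
            if members:
                # Mean of the members, coordinate by coordinate.
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's center unchanged
        if new_centers == centers:
            break
        centers = new_centers
    # Label each cluster with the majority class of its training members.
    assign = [nearest(x, centers) for x in X]
    labels = []
    for j in range(len(centers)):
        votes = Counter(lab for lab, a in zip(y, assign) if a == j)
        labels.append(votes.most_common(1)[0][0] if votes else None)
    return centers, labels


def predict(x, centers, labels):
    """Return (cluster index, predicted label) for a new instance."""
    j = nearest(x, centers)
    return j, labels[j]


# Toy data: class "a" is made of two well-separated groups, class "b" of one.
X = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5, 5), (5, 6), (6, 5)]
y = ["a", "a", "a", "a", "a", "a", "b", "b", "b"]
centers, labels = fit(X, y, per_class=2)
cluster_1, label_1 = predict((1, 1), centers, labels)
cluster_2, label_2 = predict((9, 9), centers, labels)
```

On this toy data, the two query points receive the same predicted label `"a"` but fall in different clusters, which is the kind of "different reasons behind the same prediction" the abstract refers to: the cluster index serves as the description and the label as the prediction.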

Cited literature [171 references]
Contributor : ABES STAR
Submitted on : Friday, August 28, 2020 - 5:09:16 PM
Last modification on : Wednesday, September 28, 2022 - 3:07:20 PM
Long-term archiving on : Sunday, November 29, 2020 - 12:52:40 PM


Version validated by the jury (STAR)


  • HAL Id : tel-02925127, version 1


Oumaima Alaoui Ismaili. Clustering prédictif Décrire et prédire simultanément. Informatique et langage [cs.CL]. Université Paris Saclay (COmUE), 2016. Français. ⟨NNT : 2016SACLA010⟩. ⟨tel-02925127⟩


