Conference Papers Year : 2020

Compressed k-Nearest Neighbors Ensembles for Evolving Data Streams

Maroua Bahri, Silviu Maniu, Albert Bifet, Rodrigo Fernandes de Mello, Nikolaos Tziortziotis

Abstract

The unbounded and multidimensional nature of data streams, the evolution of data distributions over time, and the requirement of single-pass algorithms are the main challenges of data stream classification; together they make it impossible to infer learning models in the same manner as in batch scenarios. Dimensionality reduction is therefore key: transforming the stream and selecting only its most relevant features reduces an algorithm's space and time demands. In that context, Compressed Sensing (CS) encodes an input signal into a lower-dimensional space while guaranteeing its reconstruction up to some distortion factor. This paper employs CS on data streams as a pre-processing step to support k-Nearest Neighbors (kNN) classification, one of the most widely used algorithms in data stream mining, while ensuring that the key properties of CS hold. Based on topological properties, we show that our classification algorithm also preserves the kNN neighborhood (within an ε factor) after reducing the stream dimensionality with CS. As a consequence, end-users can set an acceptable error margin when performing such projections for kNN. For further improvement, we incorporate this method into an ensemble classifier, Leveraging Bagging, by combining a set of different CS matrices, which increases the diversity inside the ensemble. An extensive set of experiments is performed on various datasets, and the results are compared against those of current state-of-the-art approaches, confirming the good performance of our approaches.
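
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a Gaussian random matrix as the CS sensing matrix (one common choice; the paper may use a different construction), a fixed-size sliding window as the kNN memory, and plain majority voting across ensemble members in place of Leveraging Bagging's resampling and change detection. All class, function, and parameter names are hypothetical.

```python
from collections import Counter, deque

import numpy as np


def gaussian_sensing_matrix(n_features, n_components, seed):
    """Random Gaussian matrix, scaled so projected norms are preserved in expectation."""
    rng = np.random.default_rng(seed)
    scale = 1.0 / np.sqrt(n_components)
    return rng.normal(0.0, scale, size=(n_components, n_features))


class CompressedWindowKNN:
    """kNN over a sliding window of instances stored in the compressed space."""

    def __init__(self, n_features, n_components, k=3, window_size=200, seed=0):
        self.matrix = gaussian_sensing_matrix(n_features, n_components, seed)
        self.k = k
        self.window = deque(maxlen=window_size)  # oldest instances drop out

    def _project(self, x):
        return self.matrix @ np.asarray(x, dtype=float)

    def learn_one(self, x, y):
        self.window.append((self._project(x), y))

    def predict_one(self, x):
        if not self.window:
            return None  # nothing seen yet
        z = self._project(x)
        points = np.stack([p for p, _ in self.window])
        labels = [y for _, y in self.window]
        dists = np.linalg.norm(points - z, axis=1)
        nearest = np.argsort(dists)[: self.k]
        votes = Counter(labels[i] for i in nearest)
        return votes.most_common(1)[0][0]


class CompressedKNNEnsemble:
    """Members differ only in their sensing matrix (assumption: majority vote)."""

    def __init__(self, n_features, n_components, n_members=5, k=3,
                 window_size=200, seed=0):
        self.members = [
            CompressedWindowKNN(n_features, n_components, k, window_size, seed + i)
            for i in range(n_members)
        ]

    def learn_one(self, x, y):
        for member in self.members:
            member.learn_one(x, y)

    def predict_one(self, x):
        votes = Counter(member.predict_one(x) for member in self.members)
        return votes.most_common(1)[0][0]


def demo_accuracy(n_instances=500, n_features=50, n_components=10, seed=42):
    """Test-then-train evaluation on a synthetic two-class Gaussian stream."""
    rng = np.random.default_rng(seed)
    model = CompressedKNNEnsemble(n_features, n_components)
    correct = 0
    for _ in range(n_instances):
        y = int(rng.integers(0, 2))
        x = rng.normal(loc=4.0 * y, scale=1.0, size=n_features)
        correct += int(model.predict_one(x) == y)  # predict first
        model.learn_one(x, y)                      # then train
    return correct / n_instances


if __name__ == "__main__":
    print(f"prequential accuracy: {demo_accuracy():.3f}")
```

Each member stores and compares 10-dimensional vectors instead of 50-dimensional ones, which is where the space and time savings come from. Giving each member its own seed is a simple stand-in for the diversity the abstract attributes to combining different CS matrices.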
Main file
bahri2020knn.pdf (497.44 KB)
Origin : Files produced by the author(s)

Dates and versions

hal-03189997 , version 1 (05-04-2021)

Identifiers

  • HAL Id : hal-03189997 , version 1

Cite

Maroua Bahri, Silviu Maniu, Albert Bifet, Rodrigo Fernandes de Mello, Nikolaos Tziortziotis. Compressed k-Nearest Neighbors Ensembles for Evolving Data Streams. ECAI 2020 - 24th European Conference on Artificial Intelligence, Aug 2020, Santiago de Compostela / Virtual, Spain. ⟨hal-03189997⟩