Conference paper, Year: 2020

Efficient Batch-Incremental Classification Using UMAP for Evolving Data Streams

Abstract

Learning from potentially infinite, high-dimensional data streams poses significant challenges for classification. For instance, k-Nearest Neighbors (kNN), one of the most commonly used algorithms in data stream mining, proves very resource-intensive in high-dimensional spaces. Uniform Manifold Approximation and Projection (UMAP) is a novel manifold technique and one of the most promising dimension reduction and visualization techniques in the non-streaming setting because of its high performance compared with competitors. However, no version of UMAP copes with the challenging context of streams. To overcome these restrictions, we propose a batch-incremental approach that pre-processes data streams using UMAP, producing successive embeddings on a stream of disjoint batches to support incremental kNN classification. Experiments on publicly available synthetic and real-world datasets demonstrate the substantial gains that can be achieved with our proposal compared to state-of-the-art techniques.
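
The batch-incremental idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify batch size, window size, or how successive embeddings are combined, so `batch_size`, `window_size`, `k`, and all function names below are assumptions. To stay self-contained, a fixed Gaussian random projection stands in for UMAP; any callable that maps a batch of points to lower-dimensional points (for example, a UMAP model fitted on each batch) could be passed as `reduce_batch` instead.

```python
import math
import random
from collections import Counter, deque


def make_random_projection(in_dim, out_dim, seed=0):
    """Stand-in reducer (NOT UMAP): a fixed Gaussian random projection."""
    rng = random.Random(seed)
    matrix = [[rng.gauss(0.0, 1.0) / math.sqrt(out_dim) for _ in range(in_dim)]
              for _ in range(out_dim)]

    def reduce_batch(batch):
        # Project every point of the batch onto the out_dim random directions.
        return [[sum(w * x for w, x in zip(row, point)) for row in matrix]
                for point in batch]

    return reduce_batch


def knn_predict(window, query, k):
    """Majority vote among the k nearest embedded points in the window."""
    nearest = sorted(window, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def batch_incremental_knn(stream, reduce_batch, batch_size=50, k=5,
                          window_size=200):
    """Embed disjoint batches, then test-then-train a sliding-window kNN.

    Returns the fraction of correct predictions. A trailing incomplete
    batch is dropped (a simplification of this sketch).
    """
    window = deque(maxlen=window_size)  # recent (embedding, label) pairs
    correct = total = 0
    batch = []
    for point, label in stream:
        batch.append((point, label))
        if len(batch) < batch_size:
            continue
        embedded = reduce_batch([p for p, _ in batch])
        for emb, (_, y) in zip(embedded, batch):
            if window:  # predict first, then learn from the instance
                correct += int(knn_predict(window, emb, k) == y)
                total += 1
            window.append((emb, y))
        batch = []
    return correct / total if total else 0.0


def synthetic_stream(n, dim=20, seed=1):
    """Hypothetical two-class stream: Gaussian blobs centred at -3 and +3."""
    rng = random.Random(seed)
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 3.0 if label else -3.0
        yield [rng.gauss(center, 1.0) for _ in range(dim)], label


accuracy = batch_incremental_knn(synthetic_stream(500),
                                 make_random_projection(20, 3))
print(f"prequential accuracy: {accuracy:.3f}")
```

The kNN step only ever sees the low-dimensional embeddings, which is where the resource savings described in the abstract would come from; the bounded window keeps memory constant on an unbounded stream.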
Main file: bahri2020efficient.pdf (1.54 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03190032, version 1 (05-04-2021)

Identifiers

HAL Id: hal-03190032
DOI: 10.1007/978-3-030-44584-3_4

Cite

Maroua Bahri, Bernhard Pfahringer, Albert Bifet, Silviu Maniu. Efficient Batch-Incremental Classification Using UMAP for Evolving Data Streams. IDA 2020 - 18th International Symposium on Intelligent Data Analysis, Apr 2020, Konstanz / Virtual, Germany. pp. 40-53, ⟨10.1007/978-3-030-44584-3_4⟩. ⟨hal-03190032⟩