Big data analysis in the field of transportation

Abstract : The aim of this thesis is to apply new methodologies to public transportation data. We are increasingly surrounded by sensors and computers that generate huge amounts of data. In the field of public transportation, smart cards generate data about our purchases and our travels every time we use them. In this thesis, we used these data for two purposes. First, we wanted to detect groups of passengers with similar temporal habits. To that end, we began by using Non-negative Matrix Factorization (NMF) as a pre-processing tool for clustering. We then introduced the NMF-EM algorithm, which performs dimension reduction and clustering simultaneously on a multinomial mixture model. The second purpose of this thesis is to apply regression methods to these data in order to forecast the number of check-ins on a network and to give a range of likely check-ins. We also used this methodology to detect anomalies on the network.
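
To make the first step of the abstract concrete, the following is a minimal sketch of NMF used as a pre-processing step before grouping passengers. It is not the thesis's implementation and it is not the NMF-EM algorithm. The toy matrix `checkins` (rows are passengers, columns are time slots), the helper names, the rank and the iteration count are all assumptions made for illustration. The factorization uses the standard Lee-Seung multiplicative updates, and the grouping step is a simple stand-in: each passenger is assigned to the NMF component that contributes most to their profile, rather than being passed to a separate clustering algorithm.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, n_iter=500, seed=0, eps=1e-9):
    """Approximate V by W H using Lee-Seung multiplicative updates.

    Entries of W and H start positive and are only multiplied by
    non-negative ratios, so they remain non-negative throughout.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n)]
    h = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # Update H: H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # Update W: W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h

def assign_clusters(w, h):
    """Assign each row to the component contributing most of its mass.

    Weighting by the row sums of H makes the comparison independent of
    how the scale is split between W and H.
    """
    weights = [sum(row) for row in h]
    return [max(range(len(weights)), key=lambda k: row[k] * weights[k])
            for row in w]

# Hypothetical check-in counts: 4 passengers x 4 time slots.
# The first two passengers travel early, the last two travel late.
checkins = [
    [5, 4, 0, 0],
    [6, 5, 0, 1],
    [0, 0, 5, 6],
    [0, 1, 4, 5],
]

w, h = nmf(checkins, rank=2)
labels = assign_clusters(w, h)
print(labels)
```

Each row of `h` can be read as a temporal profile and each row of `w` as a passenger's weights on those profiles. In practice the rows of `w` would more likely be fed to a dedicated clustering method, and a library implementation of NMF would replace the hand-written updates.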

Cited literature [144 references]

https://pastel.archives-ouvertes.fr/tel-02052118
Contributor : Abes Star
Submitted on : Thursday, February 28, 2019 - 12:07:07 PM
Last modification on : Thursday, September 12, 2019 - 3:01:51 AM
Long-term archiving on: Wednesday, May 29, 2019 - 6:50:39 PM

File

70378_CAREL_2019_archivage.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-02052118, version 1

Citation

Léna Carel. Big data analysis in the field of transportation. Statistics [math.ST]. Université Paris-Saclay, 2019. English. ⟨NNT : 2019SACLG001⟩. ⟨tel-02052118⟩

Metrics

Record views : 983

File downloads : 664