
Big data analysis in the field of transportation

Abstract : The aim of this thesis is to apply new methodologies to public transportation data. We are increasingly surrounded by sensors and computers that generate huge amounts of data. In public transportation, smart cards record data about our purchases and trips every time we use them. In this thesis, we use these data for two purposes. First, we aim to detect groups of passengers with similar temporal habits. To that end, we begin by using Non-negative Matrix Factorization (NMF) as a pre-processing tool for clustering. We then introduce the NMF-EM algorithm, which performs dimension reduction and clustering simultaneously on a multinomial mixture model. Second, we apply regression methods to these data to forecast the number of check-ins on a network and to give a range of likely check-ins. We also use this methodology to detect anomalies on the network.
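
As a rough illustration of the first step mentioned in the abstract (NMF as a pre-processing tool for clustering), the sketch below factorizes a passenger-by-hour count matrix and clusters the resulting weights. It is not the thesis's NMF-EM algorithm: the data are synthetic, the function names `nmf` and `kmeans` are hypothetical, and plain k-means on the NMF weights is used here only as a simple stand-in for the clustering step.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (n x m) by W @ H using
    multiplicative updates for the squared Frobenius error."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def kmeans(X, k, n_iter=50, seed=0):
    """Basic Lloyd's k-means; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Synthetic data (an assumption, not from the thesis): 40 passengers,
# 24 hourly slots, two made-up temporal profiles.
rng = np.random.default_rng(42)
commuter = np.ones(24)
commuter[[8, 18]] = 20.0          # morning and evening peaks
midday = np.ones(24)
midday[13] = 30.0                 # single midday peak
rates = np.vstack([np.tile(commuter, (20, 1)), np.tile(midday, (20, 1))])
V = rng.poisson(rates).astype(float)   # check-in counts per passenger and hour

# Dimension reduction, then clustering on the row-normalized weights.
W, H = nmf(V, rank=2)
W_norm = W / (W.sum(axis=1, keepdims=True) + 1e-9)
labels = kmeans(W_norm, k=2)
```

Each row of `H` can be read as a typical hourly profile and each row of `W` as a passenger's weights on those profiles, which is why clustering is run on `W` rather than on the raw counts.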

Cited literature [164 references]
Contributor : ABES STAR
Submitted on : Thursday, February 28, 2019 - 12:07:07 PM
Last modification on : Friday, August 5, 2022 - 2:49:41 PM
Long-term archiving on : Wednesday, May 29, 2019 - 6:50:39 PM


Version validated by the jury (STAR)


  • HAL Id : tel-02052118, version 1


Léna Carel. Big data analysis in the field of transportation. Statistics [math.ST]. Université Paris-Saclay, 2019. English. ⟨NNT : 2019SACLG001⟩. ⟨tel-02052118⟩


