Learning from Sequences with Point Processes

Abstract : The guiding principle of this thesis is to show how recent optimization methods can help solve challenging new estimation problems on event models. While the classical framework of supervised learning treats the observations as a collection of independent pairs of features and labels, event models focus on arrival timestamps to extract information from the source of data. These timestamped events are chronologically ordered and cannot be regarded as independent. This observation motivates the use of a particular mathematical object, the point process, to learn patterns from events. Two examples of point processes are treated in this thesis. The first is the point process behind the Cox proportional hazards model: its conditional intensity function defines the hazard ratio, a fundamental quantity in the survival analysis literature. The Cox regression model relates the duration before an event, called failure, to some covariates. This model can be reformulated in the framework of point processes. The second is the Hawkes process, which models how past events increase the probability of future events. Its multivariate version encodes a notion of causality between the different nodes. The thesis is divided into three parts. The first focuses on a new optimization algorithm we developed to estimate the parameter vector of the Cox regression in the large-scale setting. Our algorithm is based on stochastic variance reduced gradient descent (SVRG) and uses Markov chain Monte Carlo to estimate one costly term in the descent direction. We proved convergence rates and showed the algorithm's numerical performance on both simulated and real-world datasets. The second part shows how the Hawkes causality can be retrieved in a nonparametric fashion from the integrated cumulants of the multivariate point process. We designed two methods to estimate the integrals of the Hawkes kernels without any assumption on the shape of the kernel functions. Our methods are faster and more robust to the shape of the kernels than state-of-the-art methods. We proved the statistical consistency of the first method, and turned the second into a convex optimization problem. The last part provides new insights from order book data using the first nonparametric method developed in the second part. We used data from the EUREX exchange, designed a new order book model (based on the previous works of Bacry et al.) and ran the estimation method on these point processes. The results are insightful and consistent with an econometric analysis. This work is a proof of concept that our estimation method can be used on complex data such as high-frequency financial data.
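
As an illustration of the statement that, in a Hawkes process, past events increase the probability of future events, the following minimal sketch evaluates the conditional intensity of a univariate Hawkes process. It is not taken from the thesis: the exponential kernel, the function name `hawkes_intensity`, and the parameter values `mu`, `alpha` and `beta` are all assumptions chosen for illustration. The thesis itself estimates kernel integrals without assuming any kernel shape.

```python
import math


def hawkes_intensity(t, events, mu=0.5, alpha=0.8, beta=1.0):
    """Conditional intensity of a univariate Hawkes process (illustrative sketch).

    Assumes an exponential kernel phi(u) = alpha * beta * exp(-beta * u),
    whose integral over [0, inf) equals alpha. The parameter values are
    hypothetical and not taken from the thesis.
    """
    # Baseline rate plus one decaying contribution per strictly earlier event.
    excitation = sum(
        alpha * beta * math.exp(-beta * (t - t_k)) for t_k in events if t_k < t
    )
    return mu + excitation


# Hypothetical event timestamps.
events = [1.0, 1.5, 4.0]

before = hawkes_intensity(0.5, events)         # no past events: baseline only
just_after = hawkes_intensity(1.5001, events)  # two recent events raise the rate
long_after = hawkes_intensity(50.0, events)    # excitation has decayed away
```

In this sketch the intensity equals the baseline `mu` before any event, jumps after each event, and relaxes back toward `mu` as the events recede into the past. With this kernel, `alpha` plays the role of the integrated kernel, the kind of quantity the abstract describes estimating nonparametrically.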
https://pastel.archives-ouvertes.fr/tel-01775239
Contributor : Abes Star
Submitted on : Tuesday, April 24, 2018 - 2:30:13 PM
Last modification on : Thursday, May 2, 2019 - 10:01:56 AM
Long-term archiving on : Wednesday, September 19, 2018 - 9:16:10 AM

File

65238_ACHAB_2017_archivage.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01775239, version 1

Citation

Massil Achab. Learning from Sequences with Point Processes. Computational Finance [q-fin.CP]. Université Paris-Saclay, 2017. English. ⟨NNT : 2017SACLX068⟩. ⟨tel-01775239⟩
