H. Akaike, Fitting autoregressive models for prediction, Annals of the Institute of Statistical Mathematics, vol.28, issue.1, pp.243-247, 1969.
DOI : 10.1007/BF02532251

H. Akaike, Information theory and an extension of the maximum likelihood principle, Second International Symposium on Information Theory (Tsahkadsor), pp.267-281, 1971.

H. Akaike, Time series analysis and control through parametric models, in Applied Time Series Analysis, p.42, 1978.

P. Alquier and X. Li, Prediction of Quantiles by Statistical Learning and Application to GDP Forecasting, Discovery Science, pp.22-36, 2012.
DOI : 10.1007/978-3-642-33492-4_5

URL : https://hal.archives-ouvertes.fr/hal-00671982

P. Alquier and O. Wintenberger, Model selection for weakly dependent time series forecasting, Bernoulli, vol.18, issue.3, pp.883-913, 2012.
DOI : 10.3150/11-BEJ359

URL : https://hal.archives-ouvertes.fr/inria-00386733

Y. Amit and D. Geman, Shape Quantization and Recognition with Randomized Trees, Neural Computation, vol.1, issue.1, pp.1545-1588, 1997.
DOI : 10.1016/0031-3203(90)90098-6

O. Anava, E. Hazan, S. Mannor, and O. Shamir, Online learning for time series prediction, J. Mach. Learn. Res, vol.30, issue.84, pp.172-184, 2013.

C. Andrieu and A. Doucet, An improved method for uniform simulation of stable minimum phase real ARMA (p,q) processes, IEEE Signal Processing Letters, vol.6, issue.6, pp.142-144, 1999.
DOI : 10.1109/97.763147

O. Arkoun, Sequential Adaptive Estimators in Nonparametric Autoregressive Models, Sequential Analysis, vol.9, issue.2, pp.229-247, 2011.
DOI : 10.1137/1135065

URL : https://hal.archives-ouvertes.fr/hal-00465587

Y. F. Atchadé, An Adaptive Version for the Metropolis Adjusted Langevin Algorithm with a Truncated Drift, Methodology and Computing in Applied Probability, vol.22, issue.2, pp.235-254, 2006.
DOI : 10.1007/s11009-006-8550-0

J. Audibert, PAC-Bayesian Statistical Learning Theory, p.63, 2004.

J. Audibert, Fast learning rates in statistical inference through aggregation, The Annals of Statistics, vol.37, issue.4, pp.1591-1646, 2009.
DOI : 10.1214/08-AOS623

URL : https://hal.archives-ouvertes.fr/hal-00139030

J. Audibert and O. Catoni, Robust linear regression through PAC-Bayesian truncation, p.40, 2010.

J. Audibert and O. Catoni, Robust linear least squares regression, The Annals of Statistics, vol.39, issue.5, pp.2766-2794, 2011.
DOI : 10.1214/11-AOS918SUPP

URL : https://hal.archives-ouvertes.fr/hal-00522534

P. Auer, N. Cesa-Bianchi, and C. Gentile, Adaptive and Self-Confident On-Line Learning Algorithms, Journal of Computer and System Sciences, vol.64, issue.1, pp.48-75, 2000.
DOI : 10.1006/jcss.2001.1795

J. Baranger and C. Brezinski, Analyse numérique, p.135, 1991.

A. R. Barron, Are Bayes Rules Consistent in Information?, Open Problems in Communication and Computation, pp.85-91, 1987.
DOI : 10.1007/978-1-4612-4808-8_22

A. R. Barron and T. M. Cover, Minimum complexity density estimation, IEEE Transactions on Information Theory, vol.37, issue.4, pp.1034-1054, 1991.
DOI : 10.1109/18.86996

P. H. Baxendale, Renewal theory and computable convergence rates for geometrically ergodic Markov chains, The Annals of Applied Probability, vol.15, issue.1B, pp.700-738, 2005.
DOI : 10.1214/105051604000000710

E. R. Beadle and P. M. Djurić, Uniform random parameter generation of stable minimum-phase real ARMA (p,q) processes, IEEE Signal Processing Letters, vol.4, issue.9, pp.259-261, 1999.
DOI : 10.1109/97.623043

K. N. Berk, Consistent Autoregressive Spectral Estimates, The Annals of Statistics, vol.2, issue.3, pp.489-502, 1974.
DOI : 10.1214/aos/1176342709

URL : http://projecteuclid.org/download/pdf_1/euclid.aos/1176342709

R. J. Bhansali, Linear Prediction by Autoregressive Model Fitting in the Time Domain, The Annals of Statistics, vol.6, issue.1, pp.224-231, 1978.
DOI : 10.1214/aos/1176344081

L. Birgé, Approximation dans les espaces métriques et théorie de l'estimation, Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, vol.3, issue.2, pp.181-237, 1983.
DOI : 10.1007/BF00532480

L. Birgé and P. Massart, An Adaptive Compression Algorithm in Besov Spaces, Constructive Approximation, vol.16, issue.1, pp.1-36, 2000.
DOI : 10.1007/s003659910001

L. Bottou, Online learning and stochastic approximations, in On-line learning in neural networks, pp.142-186, 1998.

L. Bottou, Large-scale machine learning with stochastic gradient descent, Statistical learning and data science, pp.17-25, 2012.

L. Breiman, Bagging predictors, Machine Learning, vol.24, issue.2, pp.123-140, 1996.

P. J. Brockwell and R. A. Davis, Introduction to time series and forecasting, Springer Texts in Statistics, with 1 CD-ROM (Windows), 2002.

P. J. Brockwell and R. A. Davis, Time Series, p.141, 1991.
DOI : 10.1007/978-3-642-04898-2_595

O. Cappé, E. Moulines, and T. Rydén, Inference in hidden Markov models, Springer Series in Statistics. With Randal Douc's contributions to Chapter 9 and Christian P. Robert's to Chapters 6, 7 and 13, Chapter 14 by Gersende Fort, Philippe Soulier and Moulines, and Chapter 15 by Stéphane Boucheron and Elisabeth Gassiat, p.48, 2005.

O. Catoni, A mixture approach to universal model selection, pp.49-133, 1997.

O. Catoni, Statistical learning theory and stochastic optimization, Lecture notes from the 31st Summer School on Probability Theory held in Saint-Flour, Lecture Notes in Mathematics, vol.1851, issue.104, pp.64-76, 2001.

N. N. Cencov, A bound for an unknown distribution density in terms of the observations, Dokl. Akad. Nauk SSSR, vol.147, issue.16, pp.45-48, 1962.

N. Cesa-Bianchi, Analysis of two gradient-based algorithms for on-line regression, Proceedings of the Tenth Annual Conference on Computational Learning Theory, COLT '97, pp.392-411, 1999.
DOI : 10.1145/267460.267492

N. Cesa-Bianchi and G. Lugosi, Prediction, learning, and games, pp.88-89, 2006.
DOI : 10.1017/CBO9780511546921

N. Cesa-Bianchi, Y. Mansour, and G. Stoltz, Improved Second-Order Bounds for Prediction with Expert Advice, Learning theory, pp.217-232, 2005.
DOI : 10.1007/11503415_15

URL : https://hal.archives-ouvertes.fr/hal-00019799

J. B. Conway, Functions of one complex variable, Graduate Texts in Mathematics, vol.11, issue.146, 1973.

C. Coulon-Prieur and P. Doukhan, A triangular central limit theorem under a new weak dependence condition, Statistics & Probability Letters, vol.47, issue.1, pp.61-68, 2000.
DOI : 10.1016/S0167-7152(99)00138-8

T. M. Cover, Universal Portfolios, Mathematical Finance, vol.9, issue.1, pp.1-29, 1991.
DOI : 10.1016/0378-4266(79)90023-2

R. Dahlhaus, Local inference for locally stationary time series based on the empirical spectral measure, Journal of Econometrics, vol.151, issue.2, pp.101-112, 2009.
DOI : 10.1016/j.jeconom.2009.03.002

URL : https://hal.archives-ouvertes.fr/hal-00577962

R. Dahlhaus, Locally Stationary Processes, in Time Series Analysis: Methods and Applications (T. Subba Rao, S. Subba Rao, and C. R. Rao, eds.), pp.351-413, 2012.
DOI : 10.1016/B978-0-444-53858-1.00013-2

R. Dahlhaus and L. Giraitis, On the Optimal Segment Length for Parameter Estimates for Locally Stationary Time Series, Journal of Time Series Analysis, vol.19, issue.6, pp.629-655, 1998.
DOI : 10.1111/1467-9892.00114

R. Dahlhaus and W. Polonik, Nonparametric quasi-maximum likelihood estimation for Gaussian locally stationary processes, The Annals of Statistics, vol.34, issue.6, pp.2790-2824, 2006.
DOI : 10.1214/009053606000000867

R. Dahlhaus and W. Polonik, Empirical spectral processes for locally stationary time series, Bernoulli, vol.15, issue.1, pp.1-39, 2009.
DOI : 10.3150/08-BEJ137

R. Dahlhaus and S. Subba Rao, Statistical inference for time-varying ARCH processes, The Annals of Statistics, vol.34, issue.3, pp.1075-1114, 2006.
DOI : 10.1214/009053606000000227

A. S. Dalalyan and A. B. Tsybakov, Aggregation by exponential weighting, sharp PAC-Bayesian bounds and sparsity, Machine Learning, pp.39-61, 2008.
DOI : 10.1007/s10994-008-5051-0

URL : https://hal.archives-ouvertes.fr/hal-00291504

A. S. Dalalyan and A. B. Tsybakov, Sparse regression learning by aggregation and Langevin Monte-Carlo, Journal of Computer and System Sciences, vol.78, issue.5, pp.1423-1443, 2012.
DOI : 10.1016/j.jcss.2011.12.023

URL : https://hal.archives-ouvertes.fr/hal-00362471

J. Dedecker, P. Doukhan, G. Lang, J. R. León, S. Louhichi, and C. Prieur, Weak dependence, Lecture Notes in Statistics, vol.190, issue.35, pp.34-64, 2007.
DOI : 10.1007/978-0-387-69952-3_2

URL : https://hal.archives-ouvertes.fr/hal-00686031

J. Dedecker and C. Prieur, New dependence coefficients. Examples and applications to statistics. Probab. Theory Related Fields, pp.203-236, 2005.

D. L. Donoho and I. M. Johnstone, Minimax estimation via wavelet shrinkage, The Annals of Statistics, vol.26, issue.3, pp.879-921, 1998.
DOI : 10.1214/aos/1024691081

P. Doukhan, Models, inequalities, and limit theorems for stationary sequences, Theory and applications of long-range dependence, pp.43-100, 2003.

P. Doukhan and S. Louhichi, A new weak dependence condition and applications to moment inequalities. Stochastic Process, Appl, vol.84, issue.33, pp.313-342, 1999.

P. Doukhan and O. Wintenberger, Weakly dependent chains with infinite memory, Stochastic Processes and their Applications, vol.118, issue.11, pp.1997-2013, 2008.
DOI : 10.1016/j.spa.2007.12.004

M. Duflo, Random iterative models, Translated from the 1990 French original by Stephen S. Wilson and revised by the author, 1997.
DOI : 10.1007/978-3-662-12880-0

S. Y. Efroimovich and M. Pinsker, A self-educating nonparametric filtration algorithm. Automation and Remote Control, pp.58-65, 1984.

R. H. Farrell, On the Best Obtainable Asymptotic Rates of Convergence in Estimation of a Density Function at a Point, The Annals of Mathematical Statistics, vol.43, issue.1, pp.170-180, 1972.
DOI : 10.1214/aoms/1177692711

J. M. Flegal and G. L. Jones, Batch means and spectral variance estimators in Markov chain Monte Carlo, The Annals of Statistics, vol.38, issue.2, pp.1034-1070, 2010.
DOI : 10.1214/09-AOS735

D. P. Foster, Prediction in the Worst Case, The Annals of Statistics, vol.19, issue.2, pp.1084-1090, 1991.
DOI : 10.1214/aos/1176348140

Y. Freund, Boosting a Weak Learning Algorithm by Majority, Information and Computation, vol.121, issue.2, pp.256-285, 1995.
DOI : 10.1006/inco.1995.1136

T. W. Gamelin, Complex analysis. Undergraduate Texts in Mathematics, p.146, 2001.

S. Gerchinovitz, Prediction of individual sequences and prediction in the statistical framework: some links around sparse regression and aggregation techniques, p.88, 2011.
URL : https://hal.archives-ouvertes.fr/tel-00653550

S. Gerchinovitz, Sparsity regret bounds for individual sequences in online linear regression, J. Mach. Learn. Res, vol.14, issue.20, pp.729-769, 2013.
URL : https://hal.archives-ouvertes.fr/inria-00552267

C. J. Geyer, Practical Markov Chain Monte Carlo, Statistical Science, vol.7, issue.4, p.49, 1992.
DOI : 10.1214/ss/1177011137

S. Ghosal and A. W. van der Vaart, Entropies and rates of convergence for maximum likelihood and Bayes estimation for mixtures of normal densities, Ann. Statist., vol.29, issue.5, pp.1233-1263, 2001.

R. D. Gill and B. Y. Levit, Applications of the van Trees Inequality: A Bayesian Cramér-Rao Bound, Bernoulli, vol.1, issue.1/2, pp.59-79, 1995.
DOI : 10.2307/3318681

C. Giraud, Introduction to high-dimensional statistics, volume 139 of Monographs on Statistics and Applied Probability, p.49, 2015.

C. W. J. Granger, in association with M. Hatanaka, Spectral analysis of economic time series, Princeton Studies in Mathematical Economics, No. 1, Princeton, N.J., 1964.

U. Grenander and G. Szegő, Toeplitz forms and their applications, vol.5, issue.36, p.147, 1958.

Y. Grenier, Time-dependent ARMA modeling of nonstationary signals, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.31, issue.4, pp.899-911, 1983.
DOI : 10.1109/TASSP.1983.1164152

D. Haussler, J. Kivinen, and M. K. Warmuth, Sequential prediction of individual sequences under general loss functions, IEEE Transactions on Information Theory, vol.44, issue.5, pp.1906-1925, 1998.
DOI : 10.1109/18.705569

D. Hsu, S. M. Kakade, and T. Zhang, An analysis of random design linear regression, Proc. COLT. Citeseer, p.40, 2011.

C. M. Hurvich and C. Tsai, Regression and time series model selection in small samples, Biometrika, vol.76, issue.2, pp.297-307, 1989.
DOI : 10.1093/biomet/76.2.297

G. L. Jones, On the Markov chain central limit theorem, Probability Surveys, vol.1, issue.0, pp.299-320, 2004.
DOI : 10.1214/154957804100000051

G. L. Jones and J. P. Hobert, Honest Exploration of Intractable Probability Distributions via Markov Chain Monte Carlo, Statistical Science, vol.16, issue.4, pp.312-334, 2001.
DOI : 10.1214/ss/1015346317

A. Juditsky and A. Nemirovski, Functional aggregation for nonparametric regression, Ann. Statist., vol.28, issue.49, pp.681-712, 2000.

A. Kalai and S. Vempala, Efficient algorithms for universal portfolios, Proceedings 41st Annual Symposium on Foundations of Computer Science, pp.423-440, 2002.
DOI : 10.1109/SFCS.2000.892136

H. R. Künsch, A note on causal solutions for locally stationary AR-processes, p.98, 1995.

K. Łatuszyński, B. Miasojedow, and W. Niemiro, Nonasymptotic bounds on the estimation error of MCMC algorithms, Bernoulli, vol.19, issue.5A, pp.2033-2066, 2013.
DOI : 10.3150/12-BEJ442

K. Łatuszyński and W. Niemiro, Rigorous confidence bounds for MCMC under a geometric drift condition, Journal of Complexity, vol.27, issue.1, pp.23-38, 2011.
DOI : 10.1016/j.jco.2010.07.003

O. V. Lepskiĭ, On a Problem of Adaptive Estimation in Gaussian White Noise, Theory of Probability & Its Applications, vol.35, issue.3, pp.459-470, 1990.
DOI : 10.1137/1135065

O. V. Lepskiĭ, Asymptotically Minimax Adaptive Estimation. I: Upper Bounds. Optimally Adaptive Estimates, Theory of Probability & Its Applications, vol.36, issue.4, pp.645-659, 1991.
DOI : 10.1137/1136085

G. Leung and A. R. Barron, Information Theory and Mixing Least-Squares Regressions, IEEE Transactions on Information Theory, vol.52, issue.8, pp.3396-3410, 2006.
DOI : 10.1109/TIT.2006.878172

R. Lewis and G. C. Reinsel, Prediction of multivariate time series by autoregressive model fitting, Journal of Multivariate Analysis, vol.16, issue.3, pp.393-411, 1985.
DOI : 10.1016/0047-259X(85)90027-2

N. Littlestone and M. K. Warmuth, The Weighted Majority Algorithm, Information and Computation, vol.108, issue.2, pp.212-261, 1994.
DOI : 10.1006/inco.1994.1009

J. Makhoul, Linear prediction: A tutorial review, Proceedings of the IEEE, pp.561-580, 1975.
DOI : 10.1109/PROC.1975.9792

P. Massart, Concentration inequalities and model selection, volume 1896 of Lecture Notes in Mathematics. Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, p.120, 2003.

D. A. McAllester, PAC-Bayesian model averaging, Proceedings of the Twelfth Annual Conference on Computational Learning Theory, COLT '99, pp.164-170, 1999.
DOI : 10.1145/307400.307435

K. L. Mengersen and R. L. Tweedie, Rates of convergence of the Hastings and Metropolis algorithms, The Annals of Statistics, vol.24, issue.1, pp.101-121, 1996.
DOI : 10.1214/aos/1033066201

S. Meyn and R. L. Tweedie, Markov chains and stochastic stability, p.49, 2009.

E. Moulines, P. Priouret, and F. Roueff, On recursive estimation for time varying autoregressive processes, The Annals of Statistics, vol.33, issue.6, pp.2610-2654, 2005.
DOI : 10.1214/009053605000000624

URL : https://hal.archives-ouvertes.fr/hal-00022067

A. S. Nemirovskiĭ, Necessary conditions for efficient estimation of functionals of a nonparametric signal observed in white noise, Teor. Veroyatnost. i Primenen., vol.35, issue.16, pp.83-91, 1990.

M. B. Priestley, Evolutionary spectra and non-stationary processes (with discussion), J. Roy. Statist. Soc. Ser. B, vol.27, issue.36, pp.204-237, 1965.

P. Rigollet and A. B. Tsybakov, Sparse Estimation by Exponential Weighting, Statistical Science, vol.27, issue.4, pp.558-575, 2012.
DOI : 10.1214/12-STS393

E. Rio, Inégalités de Hoeffding pour les fonctions lipschitziennes de suites dépendantes, Comptes Rendus de l'Académie des Sciences - Series I - Mathematics, vol.330, issue.10, pp.905-908, 2000.
DOI : 10.1016/S0764-4442(00)00290-1

G. O. Roberts and J. S. Rosenthal, General state space Markov chains and MCMC algorithms, Probability Surveys, vol.1, issue.0, pp.20-71, 2004.
DOI : 10.1214/154957804100000024

M. Rosenblatt, A central limit theorem and a strong mixing condition, Proc. Nat. Acad. Sci. U.S.A., vol.42, pp.43-47, 1956.

M. Rosenblatt, Linear processes and bispectra, J. Appl. Probab., vol.17, issue.1, pp.265-270, 1980.

A. Sancetta, Recursive forecast combination for dependent heterogeneous data, Econometric Theory, vol.1, issue.02, pp.598-631, 2010.
DOI : 10.1111/j.1467-9965.1991.tb00002.x

G. Schwarz, Estimating the Dimension of a Model, The Annals of Statistics, vol.6, issue.2, pp.461-464, 1978.
DOI : 10.1214/aos/1176344136

R. H. Shumway and D. S. Stoffer, Time series analysis and its applications, p.146, 2011.

G. Stoltz, Contributions to the sequential prediction of arbitrary sequences: applications to the theory of repeated games and empirical studies of the performance of the aggregation of experts. Habilitation à diriger des recherches, pp.51-57, 2011.

S. Subba Rao, On some nonstationary, nonlinear random processes and their stationary approximations, Adv. in Appl. Probab., vol.38, issue.6, pp.1155-1172, 2006.

H. Tong and K. Lim, Threshold Autoregression, Limit Cycles and Cyclical Data, Journal of the Royal Statistical Society, Series B, vol.42, issue.39, pp.245-292, 1980.
DOI : 10.1142/9789812836281_0002

A. B. Tsybakov, Optimal Rates of Aggregation, Learning Theory and Kernel Machines, pp.303-313, 2003.
DOI : 10.1007/978-3-540-45167-9_23

URL : https://hal.archives-ouvertes.fr/hal-00104867

A. B. Tsybakov, Introduction to nonparametric estimation, Springer Series in Statistics. Revised and extended from the 2004 French original, translated by Vladimir Zaiats, pp.46-47, 2009.

L. G. Valiant, A theory of the learnable, Communications of the ACM, vol.27, issue.11, pp.1134-1142, 1984.
DOI : 10.1145/1968.1972

V. A. Volkonskiĭ and Y. A. Rozanov, Some Limit Theorems for Random Functions. I, Theory of Probability & Its Applications, vol.4, issue.2, pp.178-197, 1959.
DOI : 10.1137/1104015

V. Vovk, A game of prediction with expert advice, J. Comput. System Sci., vol.56, issue.2, pp.153-173, 1995 (Eighth Annual Workshop on Computational Learning Theory, COLT).

V. Vovk, On-Line Regression Competitive with Reproducing Kernel Hilbert Spaces, Lecture Notes in Comput. Sci, vol.3959, issue.20, pp.452-463, 2006.
DOI : 10.1007/11750321_43

V. G. Vovk, Aggregating strategies, Proc. Third Workshop on Computational Learning Theory, pp.371-383, 1990.
DOI : 10.1016/B978-1-55860-146-8.50032-1

P. Whittle, On the fitting of multivariate autoregressions, and the approximate canonical factorization of a spectral density matrix, Biometrika, vol.50, issue.1-2, pp.129-134, 1963.
DOI : 10.1093/biomet/50.1-2.129

Y. Yang, Combining Different Procedures for Adaptive Regression, Journal of Multivariate Analysis, vol.74, issue.1, pp.135-161, 2000.
DOI : 10.1006/jmva.1999.1884

Y. Yang, Mixing strategies for density estimation, The Annals of Statistics, vol.28, issue.1, pp.75-87, 2000.
DOI : 10.1214/aos/1016120365

Y. Yang, Combining forecasting procedures: some theoretical results, Econometric Theory, vol.137, issue.01, 2004.
DOI : 10.1017/S0266466604201086

Y. Yang and A. Barron, Information-theoretic determination of minimax rates of convergence, Ann. Statist., vol.27, issue.16, pp.1564-1599, 1999.