R. Messina and D. Jouvet, Sequential clustering algorithm for Gaussian mixture initialisation, in Proceedings ICASSP, 2004.
DOI : 10.1109/icassp.2004.1326115

C. Mokbel, Online adaptation of HMMs to real-life conditions: a unified framework, IEEE Transactions on Speech and Audio Processing, 2001.
DOI : 10.1109/89.917680

S. Galliano, E. Geoffrois, D. Mostefa, K. Choukri, J. Bonastre, and G. Gravier, The ESTER Phase II campaign for the rich transcription of French broadcast news, in Proceedings Eurospeech/Interspeech, 2005.

V. Digalakis, P. Monaco, and H. Murveit, Genones: generalized mixture tying in continuous hidden Markov model-based speech recognizers, IEEE Transactions on Speech and Audio Processing, pp.281-289, 1996.
DOI : 10.1109/89.506931

G. Linares, C. Fredouille, D. Matrouf, and P. Nocera, Segmentation en Macro-classes Acoustiques d'Emissions Radiophoniques dans le cadre d'ESTER, 2004.
URL : https://hal.archives-ouvertes.fr/hal-00477753

G. D. Guo and S. Z. Li, Content-based Audio Classification and Retrieval by Support Vector Machines, IEEE Transactions on Neural Networks, 2003.

J. Rouas, J. Pinquier, and R. A. Obrecht, A Fusion Study in Speech/Music Classification, proceedings ICASSP, 2003.

J. Razik, D. Fohr, O. Mella, and P. Valles, Segmentation Parole/Musique pour la Transcription Rapide, 2004.

B. Logan, Mel Frequency Cepstral Coefficients for Music Modelling, proceedings of the International Symposium on Music Information Retrieval, 2000.

Bibliography

A. Aiyer, M. Gales, and M. Picheny, Rapid Likelihood Calculation of Subspace Clustered Gaussian Components, International Conference on Acoustics, Speech, and Signal Processing ICASSP, pp.1519-1522, 2000.

J. Ajmera, I. McCowan, and H. Bourlard, Robust HMM-based Speech/Music Segmentation, International Conference on Acoustics, Speech, and Signal Processing ICASSP, pp.297-300, 2002.

F. Alleva, X. Huang, and M. Hwang, Improvements on the pronunciation prefix tree search organization, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings, pp.134-136, 1996.
DOI : 10.1109/ICASSP.1996.540308

T. Anastasakos, J. McDonough, R. Schwartz, and J. Makhoul, A compact model for speaker-adaptive training, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96, pp.1137-1140, 1996.
DOI : 10.1109/ICSLP.1996.607807

E. Bocchieri, Vector quantization for the efficient computation of continuous density likelihoods, IEEE International Conference on Acoustics Speech and Signal Processing, pp.692-695, 1993.
DOI : 10.1109/ICASSP.1993.319405

E. Bocchieri and B. Mak, Subspace distribution clustering hidden Markov model, IEEE Transactions on Speech and Audio Processing, pp.264-275, 2001.
DOI : 10.1109/89.906000

M. Carey, E. Parris, and H. Lloyd-Thomas, A comparison of features for speech, music discrimination, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258), pp.149-152, 1999.
DOI : 10.1109/ICASSP.1999.758084

A. Chan, M. Ravishankar, and A. Rudnicky, On Improvements to CI based GMM Selection

A. Chan, J. Sherwani, R. Mosur, and A. Rudnicky, Four Layer Categorization Scheme of Fast GMM Computation Techniques in Large Vocabulary Continuous Speech Recognition Systems, proceedings ICSLP, 2004.

G. Chollet, Evaluation of ASR Systems, Algorithms and Databases, NATO-ASI : Speech Recognition and Coding. New Advances and Trends, 1995.
DOI : 10.1007/978-3-642-57745-1_3

G. Cook, J. Christie, P. Clarkson, and M. M. Hochberg, Real-Time Recognition of Broadcast News, International Conference on Acoustics, Speech, and Signal Processing ICASSP, pp.141-144, 1996.

P. Delacourt, La Segmentation et le Regroupement par Locuteurs pour l'Indexation de Documents Audio, 2000.

L. Deng and X. Huang, Challenges in adopting speech recognition, Communications of the ACM, pp.69-75, 2004.
DOI : 10.1145/962081.962108

V. Digalakis, P. Monaco, and H. Murveit, Genones: generalized mixture tying in continuous hidden Markov model-based speech recognizers, IEEE Transactions on Speech and Audio Processing, pp.281-289, 1996.
DOI : 10.1109/89.506931

V. Digalakis, S. Tsakalidis, C. Harizakis, and L. Neumeyer, Efficient speech recognition using subvector quantization and discrete-mixture HMMs, Computer Speech and Language, pp.33-46, 2000.
DOI : 10.1006/csla.1999.0134

J. Dolmazon, F. Bimbot, G. Adda, M. El-Bèze, J. Caërou et al., Organisation de la Première Campagne AUPELF pour l'Evaluation des Systèmes de Dictée Vocale, Journées Scientifiques et Techniques Francil, pp.13-18, 1997.

K. El-Maleh, M. Klein, G. Petrucci, and P. Kabal, Speech/Music Discrimination for Multimedia Applications, International Conference on Acoustics, Speech, and Signal Processing ICASSP, pp.2445-2448, 2000.

J. Fiscus, A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER), 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings, pp.347-354, 1997.
DOI : 10.1109/ASRU.1997.659110

C. Fredouille, D. Matrouf, G. Linares, and P. Nocera, Segmentation en Macro-classes Acoustiques d'Emissions Radiophoniques dans le cadre d'ESTER, Journées d'Etude sur la Parole JEP, pp.225-228, 2004.
URL : https://hal.archives-ouvertes.fr/hal-00477753

M. Gales, D. Kim, P. Woodland, H. Chan, D. Mrva et al., Progress in the CU-HTK broadcast news transcription system, IEEE Transactions on Audio, Speech, and Language Processing, pp.1513-1525, 2006.
DOI : 10.1109/TASL.2006.878264

M. Gales, K. Knill, and S. Young, Use of Gaussian Selection in Large Vocabulary Continuous Speech Recognition using HMMs, Proceedings ICSLP, pp.470-473, 1996.

S. Galliano, E. Geoffrois, D. Mostefa, K. Choukri, J. Bonastre et al., The Ester Phase II Campaign for the Rich Transcription of French Broadcast News, European Conference on Speech, Communication and Technology Eurospeech, 2005.

J. Gauvain, G. Adda, L. Lamel, F. Lefèvre, and H. Schwenk, Transcription de la Parole Conversationnelle, Journées d'Etude sur la Parole JEP, Fès, 2004.
URL : https://hal.archives-ouvertes.fr/hal-01434260

J. Gauvain and L. Lamel, Large-vocabulary continuous speech recognition: advances and applications, Proceedings of the IEEE, pp.1181-1200, 2000.
DOI : 10.1109/5.880079

G. Gravier, J. Bonastre, S. Galliano, E. Geoffrois, K. McTait et al., ESTER, une Campagne d'Evaluation des Systèmes d'Indexation d'Emissions Radiophoniques, Journées d'Etude sur la Parole JEP, pp.253-256, 2004.

G. Gravier, F. Yvon, B. Jacob, and F. Bimbot, Sirocco : un Système Ouvert de Reconnaissance de la Parole, XXIVème Journées d'Etude sur la Parole, pp.273-276, 2002.

G. Guo and S. Li, Content-Based Audio Classification and Retrieval by Support Vector Machines, IEEE Transactions on Neural Networks, pp.209-215, 2003.

J. Haton, C. Cerisara, D. Fohr, Y. Laprie, and K. Smaïli, Reconnaissance Automatique de la Parole : du Signal à son Interprétation, 2006.

M. Y. Hwang, Sub-phonetic Acoustic Modeling for Speaker Independent Continuous Speech Recognition, 1993.

F. Jelinek, Continuous speech recognition by statistical methods, Proceedings of the IEEE, pp.532-556, 1976.
DOI : 10.1109/PROC.1976.10159
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.473.9761

J. Fritsch and I. Rogina, The Bucket Box Intersection (BBI) Algorithm for Fast Approximative Evaluation of Diagonal Mixture Gaussians, Proceedings International Conference on Acoustics, Speech, and Signal Processing ICASSP, pp.837-840, 1996.

J. Fritsch, I. Rogina, and T. Sloboda, Speeding up the Score Computation of HMM Speech Recognizers with the Bucket Voronoi Intersection Algorithm, European Conference on Speech, Communication and Technology Eurospeech, pp.1091-1094, 1995.

K. Knill, M. Gales, and S. Young, State based Gaussian Selection in Large Vocabulary Continuous Speech Recognition using HMMs, IEEE Transactions on Speech and Audio Processing, pp.152-161, 1999.

A. Lee, T. Kawahara, and K. Shikano, Gaussian mixture selection using context-independent HMM, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221), 2001.
DOI : 10.1109/ICASSP.2001.940769
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.4.7213

A. Lee, T. Kawahara, K. Takeda, and K. Shikano, A new phonetic tied-mixture model for efficient decoding, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100), pp.1269-1272, 2000.
DOI : 10.1109/ICASSP.2000.861808

J. Leppänen and I. Kiss, Gaussian Selection with Non-Overlapping Clusters for ASR in Embedded Devices, 2006 IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings, 2006.
DOI : 10.1109/ICASSP.2006.1659987

X. Li and J. Bilmes, Feature Pruning in Likelihood Evaluation of HMM-Based Speech Recognition, IEEE Workshop on Automatic Speech Recognition and Understanding ASRU, pp.303-308, 2003.

X. Li and J. Bilmes, Feature Pruning for Low-Power ASR Systems in Clean and Noisy Environments, IEEE Signal Processing Letters, 2005.

G. Linares, P. Nocera, and D. Matrouf, Partitionnement Dynamique des Distributions pour le Calcul des Emissions dans un Décodeur Acoustico-Phonétique Markovien, 2000.

B. Logan, Mel Frequency Cepstral Coefficients for Music Modelling, International Symposium on Music Information Retrieval, 2000.

L. Lu, H. Zhang, and H. Jiang, Content analysis for audio classification and segmentation, IEEE Transactions on Speech and Audio Processing, pp.504-516, 2002.
DOI : 10.1109/TSA.2002.804546

B. Mak, Towards A Compact Speech Recognizer : Subspace Distribution Clustering Hidden Markov Model, 1998.

B. Mak, E. Bocchieri, and E. Barnard, Stream derivation and clustering scheme for subspace distribution clustering hidden Markov model, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings, pp.339-346, 1997.
DOI : 10.1109/ASRU.1997.659109
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.37.8516

L. Mangu, E. Brill, and A. Stolcke, Finding Consensus Among Words : Lattice-Based Word Error Minimization, Proceedings Eurospeech, pp.495-498, 1999.
DOI : 10.1006/csla.2000.0152
URL : http://arxiv.org/abs/cs/0010012

R. Messina and D. Jouvet, Sequential Clustering Algorithm for Gaussian Mixture Initialisation, International Conference on Acoustics, Speech, and Signal Processing ICASSP, pp.264-275, 2004.
DOI : 10.1109/icassp.2004.1326115

C. Mokbel, Online adaptation of HMMs to real-life conditions: a unified framework, IEEE Transactions on Speech and Audio Processing, pp.342-357, 2001.
DOI : 10.1109/89.917680

N. Morgan, E. Luissier, A. Janin, and B. Kingsbury, Reducing errors by increasing the error rate : MLP Acoustic Modeling for Broadcast News Transcription, DARPA Broadcast News Workshop, 1999.

H. Murveit, P. Monaco, V. Digalakis, and J. Butzberger, Techniques to achieve an accurate real-time large-vocabulary speech recognition system, Proceedings of the workshop on Human Language Technology , HLT '94, pp.393-398, 1994.
DOI : 10.3115/1075812.1075903

H. Ney and S. Ortmanns, Progress in Dynamic Programming Search for LVCSR, Proceedings of the IEEE, pp.1224-1240, 2000.

H. Ney, R. Haeb-Umbach, B. Tran, and M. Oerder, Improvements in beam search for 10000-word continuous speech recognition, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.9-12, 1992.
DOI : 10.1109/ICASSP.1992.225985

NIST, National Institute of Standards and Technology, www.nist.gov/speech, 2003.

J. J. Odell, The Use of Context in Large Vocabulary Speech Recognition, 1995.

J. Olsen, Gaussian Selection using Multiple Quantisation Indexes, IEEE Nordic Signal Processing Symposium, 2000.

S. Ortmanns, A. Eiden, H. Ney, and N. Coenen, Look-Ahead Techniques for Fast Beam Search, International Conference on Acoustics, Speech, and Signal Processing ICASSP, pp.1783-1789, 1997.
DOI : 10.1109/icassp.1997.598876
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.18.8232

S. Ortmanns, A. Eiden, H. Ney, and N. Coenen, Look-Ahead Techniques for Improved Beam Search, CRIM-FORWISS Workshop, pp.10-22, 1997.
DOI : 10.1109/icassp.1997.598876
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.18.8232

S. Ortmanns, H. Ney, and A. Eiden, Language-model look-ahead for large vocabulary speech recognition, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96, pp.2091-2094, 1996.
DOI : 10.1109/ICSLP.1996.607215

S. Ortmanns, H. Ney, and T. Firzlaff, Fast Likelihood Computation Methods for Continuous Mixture Densities in Large Vocabulary Speech Recognition, European Conference on Speech Communication and Technology, pp.139-142, 1998.

M. Padmanabhan and M. Picheny, Large Vocabulary Speech Recognition Algorithms, Computer Magazine, 2002.
DOI : 10.1109/mc.2002.993770

M. Padmanabhan, L. Bahl, and D. Nahamoo, Partitioning the Feature Space of a Classifier with Linear Hyperplanes, IEEE Transactions on Speech and Audio Processing, pp.282-288, 1999.

M. Padmanabhan, E. Jan, L. Bahl, and M. Picheny, Decision-Tree based Feature Space Quantization for Fast Gaussian Computation, IEEE Workshop on Automatic Speech Recognition and Understanding, pp.325-330, 1997.

D. Pallett, A look at NIST'S benchmark ASR tests: past, present, and future, 2003 IEEE Workshop on Automatic Speech Recognition and Understanding (IEEE Cat. No.03EX721), pp.483-488, 2003.
DOI : 10.1109/ASRU.2003.1318488

C. Panagiotakis and G. Tziritas, A Speech/Music Discriminator based on RMS and Zero-Crossings, IEEE Transactions on Multimedia, 2005.

D. Paul, An Investigation of Gaussian Shortlists, IEEE workshop on Automatic Speech Recognition and Understanding ASRU, 1999.

J. Pinquier, Indexation Sonore : Recherche de Composantes Primaires pour une Structuration Audiovisuelle, 2004.
URL : https://hal.archives-ouvertes.fr/tel-00008755

J. Pinquier, J. Rouas, and R. Obrecht, A fusion study in speech / music classification, 2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698), pp.17-20, 2003.
DOI : 10.1109/ICME.2003.1220941

M. Ravishankar, Sphinx-3 s3.x decoder (x=5) Sphinx Speech Group School of, 2004.

M. Ravishankar, R. Bisiani, and E. Thayer, Sub-vector Clustering to Improve Memory and Speed Performance of Acoustic Likelihood Computation, European Conference on Speech, Communication and Technology Eurospeech, 1997.

M. Ravishankar, R. Singh, B. Raj, and R. Stern, The 1999 CMU 10X Real Time Broadcast News Transcription System, NIST Speech Transcription Workshop, 2000.

J. Razik, D. Fohr, O. Mella, and P. Valles, Segmentation Parole/Musique pour la Transcription Rapide, Journées d'Etude sur la Parole JEP, Fès, 2004.

S. Renals, N. Morgan, H. Bourlard, M. Cohen, and H. Franco, Connectionist probability estimators in HMM speech recognition, IEEE Transactions on Speech and Audio Processing, pp.161-174, 1994.
DOI : 10.1109/89.260359

A. Sankar and V. Ramana, Parameter Tying and Gaussian Clustering for Faster, Better, and Smaller Speech Recognition, European Conference on Speech, Communication and Technology Eurospeech, Greece, 1999.

A. Sankar, V. Ramana, A. Stolcke, and F. Weng, Improved modeling and efficiency for automatic transcription of Broadcast News, Speech Communication, pp.133-158, 2002.
DOI : 10.1016/S0167-6393(01)00063-2

J. Saunders, Real-time discrimination of broadcast speech/music, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings, pp.993-996, 1996.
DOI : 10.1109/ICASSP.1996.543290

E. Scheirer and M. Slaney, Construction and Evaluation of a Robust Multifeature Speech/Music Discriminator, IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.1331-1334, 1997.

M. Seck, Détection de Ruptures et Suivi de Classes de Sons pour l'Indexation Sonore, 2001.

S. Srivastava, Fast Gaussian Evaluation in Large Vocabulary Continuous Speech Recognition, 2002.

J. Suontausta, J. Häkkinen, and O. Viikki, Fast Decoding Techniques for Practical Realtime Speech Recognition Systems, IEEE Workshop on Automatic Speech Recognition and Understanding ASRU, 1999.

S. Takahashi and S. Sagayama, Four-level tied-structure for efficient representation of acoustic modeling, 1995 International Conference on Acoustics, Speech, and Signal Processing, pp.520-523, 1995.
DOI : 10.1109/ICASSP.1995.479643

M. Woszczyna, Fast Speaker Independent Large Vocabulary Speech Recognition, 1998.

M. Woszczyna and M. Finke, Minimizing search errors due to delayed bigrams in real-time speech recognition systems, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings, pp.137-140, 1996.
DOI : 10.1109/ICASSP.1996.540309

S. Young, Statistical Modelling in Continuous Speech Recognition (CSR), Proceedings of the 17th International Conference on Uncertainty in Artificial Intelligence, 2001.

S. Young, G. Evermann, D. Kershaw, G. Moore, J. Odell et al., The HTK Book version 3.2, 2002.

S. Young, J. Odell, and P. Woodland, Tree-based State Tying for High Accuracy Acoustic Modeling, proceedings ARPA Workshop on Human Language Technology, pp.307-312, 1994.

Q. Zhu, A. Stolcke, B. Chen, and N. Morgan, Using MLP Features in SRI's Conversational Speech Recognition System, European Conference on Speech Communication and Technology, pp.2141-2144, 2005.

L. Zouari and G. Chollet, Efficient Mixture for Speech Recognition, International Conference on Pattern Recognition, ICPR, pp.294-297, 2006.

L. Zouari and G. Chollet, Sélection des Paramètres pour la Discrimination Parole/non Parole d'Émissions Radio Diffusées, in Cinquième édition des Ateliers de Travail sur le Traitement et l'Analyse de l'Information TAIMA, 2007.