Techniques de conversion de voix appliquées à l'imposture, Traitement et Analyse de l'Information, 2009. ,
Reconnaissance de la parole en temps réel pour le dialogue oral, Traitement et Analyse de l'Information: Méthodes et Applications (TAIMA), 2009. ,
Spoken Dialogue in Virtual Worlds. Chap. Development of Multimodal Interfaces: Active Listening and Synchrony, LNCS, vol.5967, pp.423-443, 2010. ,
Automatic Detection of Known Advertisements in Radio Broadcast with Data-driven ALISP Transcriptions, International Workshop on Content-Based Multimedia Indexing (CBMI), pp.223-228, 2011. ,
Une empreinte audio à base d'ALISP appliquée à l'identification audio dans un flux radiophonique, Colloque en COmpression et REprésentation des Signaux Audiovisuels (CORESA), 2012. ,
Software Radio FM Broadcast Receiver for Audio Indexing Applications, IEEE International Conference on Industrial Technology (ICIT), pp.585-590, 2012. ,
Prototype of a radio-on-demand broadcast receiver with real time musical genre classification, Conference on Design and Architectures for Signal and Image Processing (DASIP), pp.1-2, 2012. ,
A Generic Audio Identification System for Radio Broadcast Monitoring Based on Data-driven Segmentation, IEEE International Symposium on Multimedia (ISM), pp.427-432, 2012. ,
Automatic Detection of Known Advertisements in Radio Broadcast with Data-driven ALISP Transcriptions, Multimedia Tools and Applications (MTAP), pp.35-49, 2013. ,
Unknown-Multiple Speaker Clustering Using HMM, International Conference on Spoken Language Processing, pp.573-576, 2002. ,
Basic local alignment search tool, Journal of Molecular Biology, vol.215, issue.3, pp.403-410, 1990. ,
DOI : 10.1016/S0022-2836(05)80360-2
Hybrid Speech/nonspeech detector applied to Speaker Diarization of Meetings, Speaker and Language Recognition Workshop, Odyssey, pp.1-6, 2006. ,
Speaker Diarization: A Review of Recent Research, IEEE Transactions on Audio, Speech, and Language Processing, vol.20, issue.2, pp.356-370, 2012. ,
DOI : 10.1109/TASL.2011.2125954
URL : https://hal.archives-ouvertes.fr/hal-00733397
Robust speaker segmentation for meetings: The ICSI-SRI spring 2005 diarization system, International Conference on Machine Learning for Multimodal Interaction, pp.26-38, 2005. ,
Bridging Lossy and Lossless Compression by Motif Pattern Discovery, General Theory of Information Transfer, pp.793-813, 2006. ,
The acoustic features of human laughter, The Journal of the Acoustical Society of America, vol.110, issue.3, pp.1581-1597, 2001. ,
DOI : 10.1121/1.1391244
Developments and directions in speech recognition and understanding, Part 1 [DSP Education], IEEE Signal Processing Magazine, vol.26, issue.3, pp.75-80, 2009. ,
DOI : 10.1109/MSP.2009.932166
Waveprint: Efficient wavelet-based audio fingerprinting, Pattern Recognition, vol.41, issue.11, pp.3467-3480, 2008. ,
DOI : 10.1016/j.patcog.2008.05.006
Multistage speaker diarization of broadcast news, IEEE Transactions on Audio, Speech, and Language Processing, pp.1505-1512, 2006. ,
To catch a chorus: using chroma-based representations for audio thumbnailing, Workshop on the Applications of Signal Processing to Audio and Acoustics, pp.15-18, 2001. ,
Décomposition harmonique des signaux audio appliquée à l'indexation audio, 2008. ,
Large vocabulary continuous speech recognition of Broadcast News - The Philips/RWTH approach, Speech Communication, vol.37, issue.1-2, pp.109-131, 2002. ,
DOI : 10.1016/S0167-6393(01)00062-0
Acoustic analysis of laughter, International Conference on Spoken Language Processing, pp.927-930, 1992. ,
An evaluation of temporal decomposition, EUROSPEECH, 1991. ,
Joint-sequence models for grapheme-to-phoneme conversion, Speech Communication, vol.50, issue.5, pp.434-451, 2008. ,
DOI : 10.1016/j.specom.2008.01.002
URL : https://hal.archives-ouvertes.fr/hal-00499203
The LIA-EURECOM RT'09 speaker diarization system: Enhancements in speaker modelling and cluster purification, IEEE International Conference on Acoustics Speech and Signal Processing, pp.4958-4961, 2010. ,
Using audio fingerprinting for duplicate detection and thumbnail generation, IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.9-12, 2005. ,
Distortion discriminant analysis for audio fingerprinting, IEEE Transactions on Speech and Audio Processing, vol.11, issue.3, pp.165-174, 2003. ,
DOI : 10.1109/TSA.2003.811538
No laughing matter, Interspeech, pp.465-468, 2005. ,
Robust Sound Modeling for Song Detection in Broadcast Audio, Audio Engineering Society, 2002. ,
A Review of Audio Fingerprinting, Journal of VLSI signal processing systems for signal, image and video technology, vol.33, issue.3, pp.271-284, 2005. ,
DOI : 10.1007/s11265-005-4151-3
Speech Processing Using Automatically Derived Segmental Units: Applications to Very Low Rate Coding and Speaker Verification, 1998. ,
Impact of Overlapping Speech Detection on Speaker Diarization for Broadcast News and Debates, IEEE International Conference on Acoustics, Speech and Signal Processing, 2013. ,
Petrovska-Delacrétaz. Data driven approaches to speech and language processing, Lecture Notes in Computer Science, pp.164-198, 2005. ,
A Secure, Robust Watermark for Multimedia, International Workshop on Information Hiding, pp.185-206, 1996. ,
AudioID: Towards Content-Based Identification of Audio Material, Audio Engineering Society Convention 110, 2001. ,
Listening to "Naima": An Automated Structural Analysis from Recorded Audio, International Computer Music Conference, pp.28-34, 2002. ,
Pattern discovery techniques for music audio, International Conference on Music Information Retrieval, pp.63-70, 2002. ,
Computational Auditory Scene Analysis, pp.65-70, 2006. ,
DOI : 10.1002/9780470611180.ch5
Detection of speaker changes in an audio document, EUROSPEECH, 1999. ,
DISTBIC: A speaker-based segmentation for audio data indexing, Speech Communication, vol.32, issue.1-2, pp.111-126, 2000. ,
DOI : 10.1016/S0167-6393(00)00027-3
Inference of variable-length linguistic and acoustic units by multigrams, Speech Communication, vol.23, issue.3, pp.223-241, 1997. ,
DOI : 10.1016/S0167-6393(97)00048-4
Printz. A robust high accuracy speech recognition system for mobile applications, IEEE Transactions on Speech and Audio Processing, vol.10, issue.8, pp.551-561, 2002. ,
Text-Independent Speaker Verification Based On High-Level Information Extracted With Data-Driven Methods, 2007. ,
Text independent Speaker Verification, Guide to Biometric Reference Systems and Performance Evaluation, 2009. ,
DOI : 10.1007/978-1-84800-292-0_7
Unsupervised Video Indexing based on Audiovisual Characterization of Persons, 2010. ,
Improved speaker diarization system for meetings, IEEE International Conference on Acoustics, Speech and Signal Processing, pp.4097-4100, 2009. ,
A framework for fingerprint-based detection of repeating objects in multimedia streams, EUSIPCO, pp.1464-1468, 2012. ,
A Scalable Audio Fingerprint Method with Robustness to Pitch-Shifting, International Symposium on Music Information Retrieval, pp.121-126, 2011. ,
Readings in computer vision: issues, problems, principles, and paradigms. Chapter Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, pp.726-740, 1987. ,
09 Speaker Diarization System, NIST Rich Transcription Workshop, 2009. ,
Prosodic and other Long-Term Features for Speaker Diarization, IEEE Transactions on Audio, Speech, and Language Processing, vol.17, issue.5, pp.985-993, 2009. ,
DOI : 10.1109/TASL.2009.2015089
The ESTER Phase II Evaluation Campaign for the Rich Transcription of French Broadcast News, EUROSPEECH, 2005. ,
The ESTER 2 Evaluation Campaign for the Rich Transcription of French Radio Broadcasts, Interspeech, pp.2583-2586, 2009. ,
Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains, Transactions on Speech and Audio Processing, pp.291-298, 1994. ,
Unsupervised Training of an HMM-based Speech Rec, 2009. ,
Segregation of speakers for speech recognition and speaker identification, IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.873-876, 1991. ,
Towards unsupervised speech processing, 2012 11th International Conference on Information Science, Signal Processing and their Applications (ISSPA), pp.1-4, 2012. ,
DOI : 10.1109/ISSPA.2012.6310546
An Improved Method for Unsupervised Training of LVCSR Systems, Interspeech, pp.2101-2104, 2007. ,
The ETAPE corpus for the evaluation of speech-based TV content processing in the French ,
Speaker diarization of French broadcast news, IEEE International Conference on Acoustics, Speech and Signal Processing, pp.4365-4368, 2008. ,
A Highly Robust Audio Fingerprinting System, International Society for Music Information Retrieval, pp.107-115, 2002. ,
Speed-change resistant audio fingerprinting using autocorrelation, IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.728-759, 2003. ,
Agglomerative hierarchical speaker clustering using incremental Gaussian mixture cluster modeling, Interspeech, pp.20-23, 2008. ,
Zero resource spoken audio corpus analysis, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, 2013. ,
DOI : 10.1109/ICASSP.2013.6639335
ARGOS: automatically extracting repeating objects from multimedia streams, IEEE Transactions on Multimedia, pp.115-129, 2006. ,
Perceptual linear predictive (PLP) analysis of speech, The Journal of the Acoustical Society of America, vol.87, issue.4, pp.1738-1752, 1990. ,
DOI : 10.1121/1.399423
Wooters. Speaker Diarization Error Analysis Using Oracle Components, IEEE Transactions on Audio, Speech, and Language Processing, pp.393-403, 2012. ,
Towards Unsupervised Training of Speaker Independent Acoustic Models, INTERSPEECH, pp.1693-1692, 2011. ,
Speaker segmentation and clustering in meetings, International Conference on Spoken Language Processing, 2004. ,
The Argos Campaign: Evaluation of Video Analysis Tools, International Workshop on Content-Based Multimedia Indexing, pp.130-137, 2007. ,
Multilingual Acoustic Modeling Using Graphemes, pp.1145-1148, 2003. ,
Computer vision for music identification, IEEE Conference on Computer Vision and Pattern Recognition, pp.597-604, 2005. ,
Laughter Detection in Meetings, NIST Meeting Recognition Workshop, pp.118-121, 2004. ,
Schultz. Grapheme Based Speech Recognition, EUROSPEECH, pp.3141-3144, 2003. ,
Automatic laughter detection using neural networks, Interspeech, pp.2973-2976, 2007. ,
On Information and Sufficiency, The Annals of Mathematical Statistics, vol.22, issue.1, pp.79-86, 1951. ,
DOI : 10.1214/aoms/1177729694
Detection of Laughter-in-Interaction in Multichannel Close-Talk Microphone Recordings of Meetings, Workshop on Machine Learning for Multimodal Interaction, pp.149-160, 2008. ,
Speaker diarization using normalized cross likelihood ratio, Interspeech, pp.1869-1872, 2007. ,
Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models, Computer Speech & Language, vol.9, issue.2, pp.171-185, 1995. ,
DOI : 10.1006/csla.1995.0010
Binary codes capable of correcting deletions, insertions, and reversals, Cybernetics and Control Theory, pp.707-710, 1966. ,
An Algorithm for Vector Quantizer Design, IEEE Transactions on Communications, vol.28, issue.1, pp.84-95, 1980. ,
DOI : 10.1109/TCOM.1980.1094577
DCT based multiple hashing technique for robust audio fingerprinting, ICASSP, pp.61-64, 2009. ,
Cross-language Bootstrapping for Unsupervised Acoustic Model Training: Rapid Development of a Polish Speech Recognition System, Interspeech, pp.88-91, 2009. ,
Using Acoustic Condition Clustering To Improve Acoustic Change Detection On Broadcast News, International Conference on Speech and Language Processing, pp.568-571, 2000. ,
Vector Quantization in Speech Coding, Proceedings of the IEEE, pp.1551-1588, 1985. ,
Making Sense of Sound: Unsupervised Topic Segmentation over Acoustic Input, Annual Meeting of the Association of Computational Linguistics, pp.504-511, 2007. ,
SONIC: Transcription of Polyphonic Piano Music with Neural Networks, Workshop on Current Research Directions in Computer Music, pp.217-224, 2001. ,
NIST Speech Processing Evaluations: LVCSR, Speaker Recognition, Language Recognition, IEEE Workshop on Signal Processing Applications for Public Security and Forensics, pp.1-7, 2007. ,
Bonastre, P. Tresadern, and T. Cootes. Bi-Modal Person Recognition on a Mobile Phone: Using Mobile Phone Data, IEEE International Conference on Multimedia and Expo Workshops, pp.635-640, 2012. ,
The SEMAINE Database: Annotated Multimodal Records of Emotionally Colored Conversations between a Person and a Limited Agent, IEEE Transactions on Affective Computing, vol.3, issue.1, pp.5-17, 2012. ,
DOI : 10.1109/T-AFFC.2011.20
E-HMM approach for learning and adapting sound models, Speaker and Language Recognition Workshop, pp.175-180, 2001. ,
Distance Measures for Speech Recognition - Psychological and Instrumental, Joint Workshop on Pattern Recognition and Artificial Intelligence, 1976. ,
Experiments on speaker tracking and segmentation in radio broadcast news, 2005. ,
An efficient method for the unsupervised discovery of signalling motifs in large audio streams, Content-Based Multimedia Indexing (CBMI), 2011 9th International Workshop on, pp.145-150, 2011. ,
Zero-resource audio-only spoken term detection based on a combination of template matching techniques, 2011. ,
Unsupervised Motif Acquisition in Speech via Seeded Discovery and Template Matching Combination, IEEE Transactions on Audio, Speech, and Language Processing, vol.20, issue.7, pp.2031-2044, 2012. ,
DOI : 10.1109/TASL.2012.2194283
URL : https://hal.archives-ouvertes.fr/hal-00740978
A coupled HMM for audio-visual speech recognition, IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.2013-2016, 2002. ,
Unsupervised acoustic and language model training with small amounts of labelled data, IEEE International Conference on Acoustics, Speech and Signal Processing, pp.4297-4300, 2009. ,
Fingerprinting to Identify Repeated Sound Events in Long-Duration Personal Audio Recordings, IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.233-236, 2007. ,
Very Low Bit Rate speech coding in Noisy Environments, Speech and Computer (SPECOM), 2005. ,
Unsupervised Pattern Discovery in Speech, IEEE Transactions on Audio, Speech, and Language Processing, pp.186-197, 2008. ,
Improved tools for Biological Sequence Comparison, Proceedings of the National Academy of Sciences, pp.2444-2448, 1988. ,
Voice Forgery Using ALISP: Indexation in a Client Memory, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005., pp.17-20, 2005. ,
DOI : 10.1109/ICASSP.2005.1415039
The MAHNOB Laughter database, Image and Vision Computing, vol.31, issue.2, pp.186-202, 2013. ,
DOI : 10.1016/j.imavis.2012.08.014
Fusion of audio and visual cues for laughter detection, International Conference on Image and Video Retrieval, pp.329-337, 2008. ,
Segmental Approaches for Automatic Speaker Verification, Digital Signal Processing, vol.10, issue.1-3, pp.198-212, 2000. ,
DOI : 10.1006/dspr.2000.0370
Guide to Biometric Reference Systems and Performance Evaluation, 2009. ,
Jingle detection and identification in audio documents, IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.329-322, 2004. ,
A fusion study in speech/music classification, IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.17-20, 2003. ,
A new adaptive long-term spectral estimation voice activity detector, EUROSPEECH, pp.3041-3044, 2003. ,
A Public Audio Identification Evaluation Framework for Broadcast Monitoring, Applied Artificial Intelligence, vol.26, issue.1-2, pp.119-136, 2011. ,
DOI : 10.1109/LSP.2005.863678
Audio identification based on spectral modeling of barkbands energy and synchronization through onset detection, IEEE International Conference on Acoustics, Speech and Signal Processing, pp.477-480, 2011. ,
Decision-Level Fusion for Audio-Visual Laughter Detection, International Workshop on Machine Learning for Multimodal Interaction, pp.137-148, 2008. ,
Speaker verification using Adapted Gaussian mixture models, Digital Signal Processing, pp.19-41, 2000. ,
Stochastic Complexity in Statistical Inquiry Theory, 1989. ,
A Global Optimization Framework For Speaker Diarization, Speaker and Language Recognition Workshop, 2012. ,
Learning words from sights and sounds: a computational model, Cognitive Science, vol.55, issue.3, pp.113-146, 2000. ,
DOI : 10.1207/s15516709cog2601_4
A survey of motif discovery methods in an integrated framework, Biology Direct, vol.1, issue.1, 2006. ,
Discrimination of speech and non-linguistic vocalizations by Non-Negative Matrix Factorization, International Conference on Acoustics Speech and Signal Processing, pp.5054-5057, 2010. ,
Automatic Segmentation, Classification and Clustering of Broadcast News Audio, DARPA Speech Recognition Workshop, pp.97-99, 1997. ,
Where Are The Challenges in Speaker Diarization? IEEE International Conference on Acoustics, Speech and Signal Processing, 2013. ,
Duplicate Song Detection using Audio Fingerprinting for Consumer Electronics Devices, IEEE International Symposium on Consumer Electronics, pp.1-6, 2006. ,
Unsupervised Audio Pattern Discovery using HMM-based Self-Organized Units, Interspeech, 2011. ,
DOI : 10.1016/j.csl.2013.05.002
Modèles Harmoniques plus Bruit combinés avec des Méthodes Statistiques, pour la Modification de la Parole et du Locuteur, 1996. ,
An overview of automatic speaker diarization systems, IEEE Transactions on Audio, Speech, and Language Processing, pp.1557-1565, 2006. ,
Improved speaker segmentation and segments clustering using the bayesian information criterion, EUROSPEECH, pp.679-682, 1999. ,
Segmenting phonetic units in laughter, International Conference of the Phonetic Sciences, pp.2793-2796, 2003. ,
Automatic detection of laughter, Interspeech, pp.485-488, 2005. ,
Automatic discrimination between laughter and speech, Speech Communication, vol.49, issue.2, pp.144-158, 2007. ,
DOI : 10.1016/j.specom.2007.01.001
URL : https://hal.archives-ouvertes.fr/hal-00499165
Evaluating automatic laughter segmentation in meetings using acoustic and acoustic-phonetic features, Interdisciplinary Workshop on the Phonetics of Laughter, pp.49-53, 2007. ,
The AVLaughterCycle Database, International Conference on Language Resources and Evaluation (LREC'10), pp.2996-3001, 2010. ,
A phonetic analysis of natural laughter, for use in automatic laughter processing systems, International Conference on Affective Computing and Intelligent Interaction, pp.397-406, 2011. ,
Rapid object detection using a boosted cascade of simple features, IEEE Conference on Computer Vision and Pattern Recognition, pp.511-518, 2001. ,
The Shazam music recognition service, Communications of the ACM, vol.49, issue.8, pp.44-48, 2006. ,
DOI : 10.1145/1145287.1145312
Online pattern learning for non-negative convolutive sparse coding, 2011. ,
Localization of non-linguistic events in spontaneous speech by Non-Negative Matrix Factorization and Long Short ,
Towards multi-speaker unsupervised speech pattern discovery, IEEE International Conference on Acoustics Speech and Signal Processing, pp.4366-4369, 2010. ,
Token Passing: a Conceptual Model for Connected Speech Recognition Systems, 1989. ,
A novel audio fingerprinting method robust to time scale modification and pitch shifting, Proceedings of the international conference on Multimedia, pp.987-990, 2010. ,
Speaker Diarization: From Broadcast News to Lectures, Machine Learning for Multimodal Interaction, pp.396-406, 2006. ,
DOI : 10.1007/11965152_35
Combining Speaker Identification and BIC for Speaker Diarization, 2005. ,