S. A. Abdallah and M. D. Plumbley, Polyphonic music transcription by non-negative sparse coding of power spectra, Proceedings of the International Conference on Music Information Retrieval, pp.318-325, 2004.

T. Abe and M. Honda, Sinusoidal model based on instantaneous frequency attractors, IEEE Transactions on Audio, Speech and Language Processing, vol.14, issue.4, pp.1292-1300, 2006.
DOI : 10.1109/TSA.2005.858545

M. Alonso, G. Richard, and B. David, Accurate tempo estimation based on harmonic + noise decomposition, EURASIP Journal on Advances in Signal Processing, vol.2007, issue.1, 2007.

S. Arberet, R. Gribonval, and F. Bimbot, A Robust Method to Count and Locate Audio Sources in a Multichannel Underdetermined Mixture, IEEE Transactions on Signal Processing, vol.58, issue.1, pp.121-133, 2010.
DOI : 10.1109/TSP.2009.2030854

URL : https://hal.archives-ouvertes.fr/inria-00489529

R. Badeau, N. Bertin, and E. Vincent, On the stability of multiplicative update algorithms. Application to non-negative matrix factorization, Institut TELECOM

J. P. Bello and J. Pickens, A robust mid-level representation for harmonic content in music signals, Proceedings of the International Conference on Music Information Retrieval, pp.311-322, 2005.

L. Benaroya, Séparation de plusieurs sources sonores avec un seul microphone, 2003.

L. Benaroya, L. Donagh, F. Bimbot, and R. Gribonval, Non negative sparse representation for Wiener based source separation with a single sensor, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), pp.613-616, 2003.
DOI : 10.1109/ICASSP.2003.1201756

URL : https://hal.archives-ouvertes.fr/inria-00574784

L. Benaroya, F. Bimbot, and R. Gribonval, Audio source separation with a single sensor, IEEE Transactions on Audio, Speech and Language Processing, vol.14, issue.1, pp.191-199, 2006.
DOI : 10.1109/TSA.2005.854110

URL : https://hal.archives-ouvertes.fr/inria-00544949

N. Bertin, R. Badeau, and E. Vincent, Enforcing Harmonicity and Smoothness in Bayesian Non-Negative Matrix Factorization Applied to Polyphonic Music Transcription, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.3, pp.538-549, 2010.
DOI : 10.1109/TASL.2010.2041381

URL : https://hal.archives-ouvertes.fr/inria-00557088

J. Brown, Calculation of a constant Q spectral transform, The Journal of the Acoustical Society of America, vol.89, issue.1, pp.425-434, 1991.
DOI : 10.1121/1.400476

P. Cancela, Tracking melody in polyphonic audio, Music Information Retrieval Evaluation eXchange, 2008.

C. Cao and M. Li, Multiple F0 estimation in polyphonic music (MIREX 2008). Extended abstract for the Music Information Retrieval Evaluation eXchange, 2008.

J. Cardoso, M. Martin, J. Delabrouille, M. Betoule, and G. Patanchon, Component separation with flexible models. Application to the separation of astrophysical emissions

A. T. Cemgil and H. J. Kappen, Monte Carlo methods for Tempo Tracking and Rhythm Quantization, Journal of Artificial Intelligence Research, vol.18, pp.45-81, 2003.

A. T. Cemgil, P. Desain, and H. J. Kappen, Rhythm Quantization for Transcription, Proceedings of the AISB'99 Symposium on Musical Creativity, pp.140-146, 1999.
DOI : 10.2307/3680894

Z. Chen, A. Cichocki, and T. M. Rutkowski, Constrained non-negative matrix factorisation method for EEG analysis in early detection of Alzheimer's disease, Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, pp.893-896, 2006.

M. G. Christensen and A. Jakobsson, Multi-Pitch Estimation, 2009.

A. Cichocki, Generalized component analysis and blind source separation methods for analyzing multichannel brain signals, 2004.

A. Cichocki, Generalized independent component analysis and its applications in processing of multisensory biomedical data, Proceedings of IVth International Workshop Computational Problems of Electrical Engineering, pp.13-24, 2002.

P. Comon, Independent component analysis, A new concept?, Signal Processing, vol.36, issue.3, pp.287-314, 1994.
DOI : 10.1016/0165-1684(94)90029-9

URL : https://hal.archives-ouvertes.fr/hal-00417283

A. Daniel, V. Emiya, and B. David, Perceptually-based evaluation of the errors usually made when automatically transcribing music, ISMIR, 2008.
URL : https://hal.archives-ouvertes.fr/inria-00452615

M. Davy and S. J. Godsill, Bayesian harmonic models for musical signal analysis (with discussion), Bayesian Statistics VII, 2003.

M. Davy, S. Godsill, and J. Idier, Bayesian analysis of polyphonic western tonal music, The Journal of the Acoustical Society of America, vol.119, issue.4, pp.2498-2517, 2006.
DOI : 10.1121/1.2168548

URL : https://hal.archives-ouvertes.fr/inria-00120240

A. de Cheveigné, Separation of concurrent harmonic sounds: Fundamental frequency estimation and a time-domain cancellation model of auditory processing, The Journal of the Acoustical Society of America, vol.93, issue.6, pp.3271-3290, 1993.
DOI : 10.1121/1.405712

A. de Cheveigné and H. Kawahara, YIN, a fundamental frequency estimator for speech and music, The Journal of the Acoustical Society of America, vol.111, issue.4, pp.1917-1930, 2002.
DOI : 10.1121/1.1458024

A. Dempster, N. Laird, and D. Rubin, Maximum Likelihood from Incomplete Data via the EM Algorithm, Journal of the Royal Statistical Society. Series B (Methodological), vol.39, issue.1, pp.1-38, 1977.

I. Dhillon and S. Sra, Generalized nonnegative matrix approximations with Bregman divergences, Proceedings of the Neural Information Processing Systems (NIPS) Conference, 2005.

K. Dressler, Extraction of the Melody Pitch Contour from Polyphonic Audio. Extended abstract for the Music Information Retrieval Evaluation eXchange, 2005.

K. Dressler, Audio melody extraction for MIREX 2009. Extended abstract for the Music Information Retrieval Evaluation eXchange, 2009.

Z. Y. Duan, J. Y. Han, and B. Pardo, Harmonically informed multi-pitch tracking, Proceedings of the International Society on Music Information Retrieval conference, pp.333-338, 2009.

J. Durrieu, G. Richard, and B. David, Singer melody extraction in polyphonic signals using source separation methods, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.169-172, 2008.
DOI : 10.1109/ICASSP.2008.4517573

J. Durrieu, G. Richard, and B. David, Single sensor singer/music separation using a source/filter model of the singer voice, ACOUSTICS, 2008.

J. Durrieu, G. Richard, and B. David, Main melody extraction from polyphonic music excerpts using a source/filter model of the main source. Extended abstract for the Music Information Retrieval Evaluation eXchange, 2008.

J. Durrieu, G. Richard, and B. David, An iterative approach to monaural musical mixture de-soloing, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.105-108, 2009.
DOI : 10.1109/ICASSP.2009.4959531

J. Durrieu, A. Ozerov, C. Févotte, G. Richard, and B. David, Main instrument separation from stereophonic audio signals using a source/filter model, European Signal Processing Conference (EUSIPCO), 2009.

J. Durrieu, G. Richard, and B. David, A source/filter approach to audio melody extraction. Extended abstract for the Music Information Retrieval Evaluation eXchange, 2009.

J. Durrieu, G. Richard, B. David, and C. Févotte, Source/Filter Model for Unsupervised Main Melody Extraction From Polyphonic Audio Signals, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.3, pp.564-575, 2010.
DOI : 10.1109/TASL.2010.2041114

D. Ellis, Beat Tracking by Dynamic Programming, Journal of New Music Research, vol.36, issue.1, pp.51-60, 2007.

D. Ellis and G. Poliner, Classification-based melody transcription, Machine Learning, pp.439-456, 2006.
DOI : 10.1007/s10994-006-8373-9

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.90.4819

D. Ellis and G. Poliner, Identifying 'cover songs' with chroma features and dynamic programming beat tracking, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, pp.1429-1432, 2007.

V. Emiya, R. Badeau, and B. David, Multipitch Estimation of Piano Sounds Using a New Probabilistic Spectral Smoothness Principle, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.6, 2009.
DOI : 10.1109/TASL.2009.2038819

URL : https://hal.archives-ouvertes.fr/inria-00510392

Y. Ephraim and D. Malah, Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.32, issue.6, pp.1109-1121, 1984.
DOI : 10.1109/TASSP.1984.1164453

S. Essid, G. Richard, and B. David, Instrument recognition in polyphonic music based on automatic taxonomies, IEEE Transactions on Audio, Speech and Language Processing, vol.14, issue.1, pp.68-80, 2006.
DOI : 10.1109/TSA.2005.860351

URL : https://hal.archives-ouvertes.fr/hal-00477670

S. Essid, G. Richard, and B. David, Musical instrument recognition by pairwise classification strategies, IEEE Transactions on Audio, Speech, and Language Processing, vol.14, issue.4, pp.1401-1412, 2006.
DOI : 10.1109/TSA.2005.860842

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.222.7986

G. Fant, Acoustic Theory of Speech Production, 1970.
DOI : 10.1515/9783110873429

C. Févotte, N. Bertin, and J. Durrieu, Nonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis, Neural Computation, vol.21, issue.3, pp.793-830, 2009.
DOI : 10.1162/neco.2008.04-08-771

D. Fitzgerald, M. Cranitch, and M. Cychowski, Towards an inverse constant Q transform, 120th Audio Engineering Society Convention, 2006.

D. Fitzgerald, M. Cranitch, and E. Coyle, Extended Nonnegative Tensor Factorisation Models for Musical Sound Source Separation, Computational Intelligence and Neuroscience, vol.2008, 2008.

R. Foucard, J. Durrieu, M. Lagrange, and G. Richard, Multimodal similarity between musical streams for cover version detection, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, 2010.
DOI : 10.1109/ICASSP.2010.5495217

URL : https://hal.archives-ouvertes.fr/hal-01132553

H. Fujihara, T. Kitahara, M. Goto, K. Komatani, T. Ogata et al., F0 Estimation Method for Singing Voice in Polyphonic Audio Signal Based on Statistical Vocal Model and Viterbi Search, 2006 IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings, pp.14-19, 2006.
DOI : 10.1109/ICASSP.2006.1661260

H. Fujihara, M. Goto, and H. G. Okuno, An F0 estimation method of vocal part in polyphonic music by using statistical modelling of singing voice and Viterbi search, pp.3682-3693, 2008.

O. Gillet and G. Richard, Transcription and Separation of Drum Signals From Polyphonic Music, IEEE Transactions on Audio, Speech, and Language Processing, pp.529-540, 2008.

E. Gómez, Melodic Description of Audio Signals for Music Content Processing, 2002.

E. Gómez, S. Streich, B. Ong, R. P. Paiva, S. Tappert et al., A quantitative comparison of different approaches for melody extraction from polyphonic audio recordings, 2006.

M. Goto, A robust predominant-F0 estimation method for real-time detection of melody and bass lines in CD recordings, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100), pp.757-760, 2000.
DOI : 10.1109/ICASSP.2000.859070

M. Goto, A real-time music-scene-description system: predominant-F0 estimation for detecting melody and bass lines in real-world audio signals, Speech Communication, vol.43, issue.4, pp.311-329, 2004.
DOI : 10.1016/j.specom.2004.07.001

M. Goto, PreFEst: A Predominant-F0 Estimation method for polyphonic musical audio signals, Proceedings of the 2nd Music Information Retrieval Evaluation eXchange, 2005.

M. Goto, H. Hashiguchi, T. Nishimura, and R. Oka, RWC music database: Popular, classical, and jazz music databases, Proceedings of the International Conference on Music Information Retrieval, pp.287-288, 2002.

D. W. Griffin and J. S. Lim, Signal estimation from modified short-time Fourier transform, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.32, pp.236-242, 1984.

Y. S. Han and C. Raphael, Desoloing monaural audio using mixture models, Proceedings of the International Conference on Music Information Retrieval, 2007.

T. Heittola, A. Klapuri, and T. Virtanen, Musical instrument recognition in polyphonic audio using source-filter model for sound separation, Proceedings of the International Society for Music Information Retrieval Conference, pp.327-332, 2009.

N. Henrich, Etude de la source glottique en voix parlée et chantée, 2001.

D. J. Hermes, Measurement of pitch by subharmonic summation, The Journal of the Acoustical Society of America, vol.83, issue.1, pp.257-264, 1988.
DOI : 10.1121/1.396427

C. Hsu, L. Chen, J. Jang, and H. Li, Singing pitch extraction from monaural polyphonic songs by contextual audio modeling and singing harmonic enhancement, Proceedings of the International Society for Music Information Retrieval conference, pp.26-30, 2009.

C. Joder, S. Essid, and G. Richard, Temporal integration for audio classification with application to musical instrument classification, IEEE Transactions on Audio, Speech and Language Processing, vol.17, issue.1, pp.174-186, 2009.

A. Jourjine, S. Rickard, and O. Yilmaz, Blind separation of disjoint orthogonal signals: demixing N sources from 2 mixtures, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100), pp.2985-2988, 2000.
DOI : 10.1109/ICASSP.2000.861162

C. Jutten and J. Herault, Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture, Signal Processing, vol.24, issue.1, pp.1-10, 1991.
DOI : 10.1016/0165-1684(91)90079-X

A. Klapuri, Multipitch estimation and sound separation by the spectral smoothness principle, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221), pp.3381-3384, 2001.
DOI : 10.1109/ICASSP.2001.940384

A. Klapuri, Multipitch Analysis of Polyphonic Music and Speech Signals Using an Auditory Model, IEEE Transactions on Audio, Speech, and Language Processing, vol.16, issue.2, pp.255-266, 2008.
DOI : 10.1109/TASL.2007.908129

D. Klatt and L. Klatt, Analysis, synthesis, and perception of voice quality variations among female and male talkers, The Journal of the Acoustical Society of America, vol.87, issue.2, pp.820-857, 1990.
DOI : 10.1121/1.398894

J. Kornycky, B. Gunel, and A. Kondoz, Comparison of subjective and objective evaluation methods for audio source separation, The Journal of the Acoustical Society of America, vol.123, issue.5, p.3569, 2008.
DOI : 10.1121/1.2934636

H. W. Kuhn and A. W. Tucker, Nonlinear programming, Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, pp.481-492, 1951.

M. Lagrange, L. G. Martins, J. Murdoch, and G. Tzanetakis, Normalized Cuts for Predominant Melodic Source Separation, IEEE Transactions on Audio, Speech, and Language Processing, vol.16, issue.2, pp.278-290, 2008.
DOI : 10.1109/TASL.2007.909260

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.184.9752

E. Large and J. Kolen, Resonance and the Perception of Musical Meter, Connection Science, vol.6, issue.2-3, pp.177-208, 1994.

J. Le Roux, H. Kameoka, N. Ono, A. de Cheveigné et al., Single channel speech and background segregation through harmonic-temporal clustering, Proceedings of the WASPAA 2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pp.279-282, 2007.

J. Le Roux, N. Ono, and S. Sagayama, Explicit consistency constraints for STFT spectrograms and their application to phase reconstruction, Proceedings of the SAPA 2008 ISCA Workshop on Statistical and Perceptual Audition, pp.23-28, 2008.

D. Lee and H. Seung, Algorithms for Non-negative Matrix Factorization, Advances in Neural Information Processing Systems, pp.556-562, 2001.

D. D. Lee and H. S. Seung, Learning the parts of objects by nonnegative matrix factorization, Nature, vol.401, pp.788-791, 1999.

P. Leveau, Décompositions parcimonieuses structurées : application à la représentation objet de la musique : modèles de signaux, algorithmes et applications, 2007.

Y. P. Li and D. L. Wang, Separation of Singing Voice From Music Accompaniment for Monaural Recordings, IEEE Transactions on Audio, Speech and Language Processing, vol.15, issue.4, p.1475, 2007.
DOI : 10.1109/TASL.2006.889789

S. Mallat, A Wavelet tour of signal processing, 2008.

S. Mallat and Z. Zhang, Matching pursuits with time-frequency dictionaries, IEEE Transactions on Signal Processing, vol.41, issue.12, pp.3397-3415, 1993.
DOI : 10.1109/78.258082

M. Marolt, A Connectionist Approach to Automatic Transcription of Polyphonic Piano Music, IEEE Transactions on Multimedia, vol.6, issue.3, pp.439-449, 2004.
DOI : 10.1109/TMM.2004.827507

M. Marolt, Audio Melody Extraction Based on Timbral Similarity of Melodic Fragments, EUROCON 2005, The International Conference on "Computer as a Tool", 2005.
DOI : 10.1109/EURCON.2005.1630193

R. McAulay and M. Malpass, Speech enhancement using a soft-decision noise suppression filter, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.28, issue.2, pp.137-145, 1980.

R. J. McAulay and T. F. Quatieri, Speech analysis/Synthesis based on a sinusoidal representation, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.34, issue.4, pp.744-754, 1986.
DOI : 10.1109/TASSP.1986.1164910

R. Meddis, Simulation of mechanical to neural transduction in the auditory receptor, The Journal of the Acoustical Society of America, vol.79, issue.3, pp.702-711, 1986.
DOI : 10.1121/1.393460

G. H. Mohimani, M. Babaie-zadeh, and C. Jutten, Complex-valued sparse representation based on smoothed l0 norm, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, pp.3881-3884, 2008.
URL : https://hal.archives-ouvertes.fr/hal-00271364

M. Mørup, L. K. Hansen, S. M. Arnfred, L. Lim, and K. H. Madsen, Shift invariant multilinear decomposition of neuroimaging data, NeuroImage, pp.1439-1450, 2008.

L. Oudre, Y. Grenier, and C. Févotte, Template-based chord recognition: influence of the chord types, Proceedings of the International Society for Music Information Retrieval conference, pp.153-158, 2009.

A. Ozerov, Adaptation de modèles statistiques pour la séparation de sources mono-capteur

A. Ozerov and C. Févotte, Multichannel Nonnegative Matrix Factorization in Convolutive Mixtures for Audio Source Separation, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.3, pp.550-563, 2010.
DOI : 10.1109/TASL.2009.2031510

A. Ozerov, P. Philippe, F. Bimbot, and R. Gribonval, Adaptation of Bayesian Models for Single-Channel Source Separation and its Application to Voice/Music Separation in Popular Songs, IEEE Transactions on Audio, Speech and Language Processing, vol.15, issue.5, pp.1564-1578, 2007.
DOI : 10.1109/TASL.2007.899291

URL : https://hal.archives-ouvertes.fr/inria-00544774

R. Paiva, Melody Detection in Polyphonic Audio, 2006.

R. P. Paiva, T. Mendes, and A. Cardoso, On the detection of melody notes in polyphonic audio, Proceedings of the International Conference on Music Information Retrieval, pp.11-15, 2005.

H. Papadopoulos and G. Peeters, Simultaneous estimation of chord progression and downbeats from an audio file, IEEE International Conference on Acoustics, Speech and Signal Processing, pp.121-124, 2008.

S. Pauws, Cubyhum: A fully operational query by humming system, ISMIR 2002 Conference Proceedings, pp.187-196, 2002.

G. Peeters, Template-Based Estimation of Time-Varying Tempo, EURASIP Journal on Advances in Signal Processing, vol.2007, issue.1, 2007.
DOI : 10.1155/2007/67215

G. Peeters, Beat-marker location using a probabilistic framework and linear discriminant analysis, Proceedings of the Digital Audio Effects (DAFX) conference, 2009.
URL : https://hal.archives-ouvertes.fr/hal-01106384

M. D. Plumbley, Algorithms for nonnegative independent component analysis, IEEE Transactions on Neural Networks, vol.14, issue.3, pp.534-543, 2003.
DOI : 10.1109/TNN.2003.810616

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.19.4128

G. Poliner, D. Ellis, A. Ehmann, E. Gómez, S. Streich et al., Melody Transcription From Music Audio: Approaches and Evaluation, IEEE Transactions on Audio, Speech and Language Processing, vol.15, issue.4, pp.1247-1256, 2007.
DOI : 10.1109/TASL.2006.889797

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.331.1287

P. Ponce-de-león, D. Rizo, R. Ramirez, and J. Iñesta, Melody characterization by a genetic fuzzy system, Proceedings of the 5th Sound and Music Computing Conference, pp.15-23, 2008.

L. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE, pp.257-286, 1989.

B. Raj, P. Smaragdis, M. Shashanka, and R. Singh, Separating a foreground singer from background music, International Symposium on Frontiers of Research on Speech and Music (FRSM), 2007.

V. Rao and P. Rao, Melody extraction using harmonic matching. Music Information Retrieval Evaluation eXchange, 2008.

D. Rizo, P. J. Ponce-de-león, C. Pérez-sancho, A. Pertusa, and J. M. Iñesta, A pattern recognition approach for melody track selection in MIDI files, Proceedings of the International Society for Music Information Retrieval conference, pp.8-12, 2006.

S. Roweis, One microphone source separation, Advances in Neural Information Processing Systems, pp.793-799, 2001.

M. Ryynänen and A. Klapuri, Query by humming of MIDI and audio using locality sensitive hashing, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.2249-2252, 2008.
DOI : 10.1109/ICASSP.2008.4518093

M. Ryynänen and A. Klapuri, Polyphonic music transcription using note event modeling, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2005.
DOI : 10.1109/ASPAA.2005.1540233

M. Ryynänen and A. Klapuri, Modelling of note events for singing transcription, Proceedings of ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing, 2004.

M. Ryynänen, T. Virtanen, J. Paulus, and A. Klapuri, Accompaniment separation and karaoke application based on automatic melody transcription, 2008 IEEE International Conference on Multimedia and Expo, pp.1417-1420, 2008.
DOI : 10.1109/ICME.2008.4607710

M. P. Ryynänen and A. P. Klapuri, Automatic Transcription of Melody, Bass Line, and Chords in Polyphonic Music, Computer Music Journal, vol.32, issue.3, pp.72-86, 2008.

M. P. Ryynänen and A. P. Klapuri, Transcription of the singing melody in polyphonic music, Proceedings of the International Conference on Music Information Retrieval, pp.222-227, 2006.

E. D. Scheirer, Tempo and beat analysis of acoustic musical signals, The Journal of the Acoustical Society of America, vol.103, issue.1, pp.588-601, 1998.
DOI : 10.1121/1.421129

J. Serrà, E. Gómez, P. Herrera, and X. Serra, Chroma binary similarity and local alignment applied to cover song identification, IEEE Transactions on Audio, Speech and Language Processing, vol.16, pp.1138-1151, 2008.

SiSEC, Professionally produced music recordings, Internet page, 2008.

M. Slaney, D. Naar, and R. Lyon, Auditory model inversion for sound separation, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing, 1994.
DOI : 10.1109/ICASSP.1994.389714

P. Smaragdis and J. C. Brown, Non-negative matrix factorization for polyphonic music transcription, 2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (IEEE Cat. No.03TH8684), pp.177-180, 2003.
DOI : 10.1109/ASPAA.2003.1285860

P. Smaragdis, B. Raj, and M. Shashanka, Sparse and shift-invariant feature extraction from non-negative data, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, 2008.
DOI : 10.1109/ICASSP.2008.4518048

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.219.3967

S. S. Stevens, A scale for the measurement of a psychological magnitude: loudness, Psychological Review, vol.43, issue.5, pp.405-416, 1936.
DOI : 10.1037/h0058773

C. Sutton, E. Vincent, M. Plumbley, and J. Bello, Transcription of vocal melodies using voice characteristics and algorithm fusion. Extended abstract for the Music Information Retrieval Evaluation eXchange, 2006.
URL : https://hal.archives-ouvertes.fr/inria-00544277

E. Vincent, Musical source separation using time-frequency source priors, IEEE Transactions on Audio, Speech and Language Processing, vol.14, issue.1, pp.91-98, 2006.
DOI : 10.1109/TSA.2005.860342

URL : https://hal.archives-ouvertes.fr/inria-00544269

E. Vincent, R. Gribonval, and C. Févotte, Performance measurement in blind audio source separation, IEEE Transactions on Audio, Speech and Language Processing, vol.14, issue.4, pp.1462-1469, 2006.
DOI : 10.1109/TSA.2005.858005

URL : https://hal.archives-ouvertes.fr/inria-00544230

E. Vincent, N. Bertin, and R. Badeau, Harmonic and inharmonic Nonnegative Matrix Factorization for Polyphonic Pitch transcription, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.109-112, 2008.
DOI : 10.1109/ICASSP.2008.4517558

URL : https://hal.archives-ouvertes.fr/inria-00544183

E. Vincent, S. Araki, and P. Bofill, The 2008 Signal Separation Evaluation Campaign: A Community-Based Approach to Large-Scale Evaluation, Proc. Int. Conf. on Independent Component Analysis and Blind Source Separation (ICA), pp.734-741, 2009.

URL : https://hal.archives-ouvertes.fr/inria-00544168

M. Vinyes, MTG MASS database, 2008.

T. Virtanen, Monaural Sound Source Separation by Nonnegative Matrix Factorization With Temporal Continuity and Sparseness Criteria, IEEE Transactions on Audio, Speech and Language Processing, vol.15, issue.3, pp.1066-1074, 2007.
DOI : 10.1109/TASL.2006.885253

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.330.1508

T. Virtanen, Sound Source Separation in Monaural Music Signals, 2006.

T. Virtanen and A. Klapuri, Analysis of polyphonic audio using source-filter model and non-negative matrix factorization, Advances in Models for Acoustic Processing, Neural Information Processing Systems Workshop, 2006.

A. Viterbi, Error bounds for convolutional codes and an asymptotically optimum decoding algorithm, IEEE Transactions on Information Theory, vol.13, issue.2, pp.260-269, 1967.

R. M. Warren, Elimination of Biases in Loudness Judgments for Tones, The Journal of the Acoustical Society of America, vol.48, issue.6B, pp.1397-1403, 1970.
DOI : 10.1121/1.1912298

J. Weil, J. Durrieu, G. Richard, and T. Sikora, Beat tracking using the delta-phase matrix, Groupe AAO : Audio, Acoustique et Ondes, Télécom ParisTech, 2009.

J. Weil, T. Sikora, J. Durrieu, and G. Richard, Automatic generation of lead sheets from polyphonic music signals, Proceedings of International Society for Music Information Retrieval Conference, pp.26-30, 2009.

R. Weiss and D. Ellis, A variational EM algorithm for learning eigenvoice parameters in mixed signals, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.113-116, 2009.
DOI : 10.1109/ICASSP.2009.4959533

R. Weiss and D. Ellis, Speech separation using speaker-adapted eigenvoice speech models, Computer Speech & Language, vol.24, issue.1, pp.16-29, 2010.
DOI : 10.1016/j.csl.2008.03.003

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.148.6572

M. Wendelboe, Using OQSTFT and a modified SHS to detect the melody in polyphonic music (MIREX 2009). Extended abstract for the Music Information Retrieval Evaluation eXchange, 2009.
