Unified data-driven approach for audio indexing, retrieval and recognition

Abstract : The amount of available audio data, such as broadcast news archives, radio recordings, music and song collections, podcasts and various internet media, is constantly increasing. Many audio indexing techniques have therefore been proposed to help users browse audio documents. Nevertheless, these methods are developed for a specific type of audio content, which makes them unsuitable for simultaneously processing audio streams in which different types of audio document coexist. In this thesis we report our recent efforts to extend the ALISP approach, developed for speech, into a generic method for audio indexing, retrieval and recognition. The particularity of ALISP tools is that no textual transcriptions are needed during the learning step. Any input speech data is transformed into a sequence of arbitrary symbols, which can be used for indexing purposes. The main contribution of this thesis is the exploitation of the ALISP approach as a generic method for audio indexing. The proposed system consists of three steps: unsupervised training to acquire the ALISP HMM models, ALISP segmentation of audio data using these models, and comparison of ALISP symbols using the BLAST algorithm and the Levenshtein distance. The proposed systems are evaluated on the YACAST and other publicly available corpora for several audio indexing tasks.
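
The comparison step named in the abstract can be illustrated with a minimal sketch of the Levenshtein distance applied to symbol sequences. This is not the thesis implementation: the symbol labels (`Ha`, `Hb`, ...), the reference names and the `best_match` helper are hypothetical, unit edit costs are assumed, and the BLAST search stage is not shown.

```python
from typing import Dict, List, Sequence, Tuple


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Edit distance between two symbol sequences, with unit costs (an assumption)."""
    # prev[j] is the distance between the symbols of `a` processed so far and b[:j].
    prev = list(range(len(b) + 1))
    for i, sym_a in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty sequence
        for j, sym_b in enumerate(b, start=1):
            cost = 0 if sym_a == sym_b else 1
            curr.append(min(
                prev[j] + 1,         # delete sym_a
                curr[j - 1] + 1,     # insert sym_b
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def best_match(query: Sequence[str],
               references: Dict[str, List[str]]) -> Tuple[str, int]:
    """Return the (hypothetical) reference item closest to the query, with its distance."""
    return min(
        ((name, levenshtein(query, seq)) for name, seq in references.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical symbol transcriptions of two reference items and one query.
references = {
    "jingle_A": ["Ha", "Hb", "Hc", "Hd"],
    "song_B": ["Hx", "Hy", "Hz", "Hx"],
}
query = ["Ha", "Hb", "Hx", "Hd"]
name, distance = best_match(query, references)
```

In this sketch the query differs from `jingle_A` by a single substituted symbol, so that item is returned as the closest reference. An exhaustive comparison like this scales poorly with database size, which is one reason a fast approximate search such as BLAST is typically paired with an exact distance.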

Cited literature : 143 references

https://pastel.archives-ouvertes.fr/tel-01179994
Contributor : Abes Star
Submitted on : Thursday, July 23, 2015 - 4:58:05 PM
Last modification on : Thursday, October 17, 2019 - 12:36:06 PM
Long-term archiving on : Wednesday, April 26, 2017 - 8:03:54 AM

File

ThesekhemiriV2.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01179994, version 1

Citation

Houssemeddine Khemiri. Unified data-driven approach for audio indexing, retrieval and recognition. Signal and Image processing. Télécom ParisTech, 2013. English. ⟨NNT : 2013ENST0055⟩. ⟨tel-01179994⟩

Metrics

Record views : 286
File downloads : 260