Représentations redondantes et hiérarchiques pour l'archivage et la compression de scènes sonores

Abstract : The main goal of this work is the automated processing of large volumes of audio data. More specifically, we are interested in archiving, a process that encompasses at least two distinct problems: data compression and data indexing. Addressing these problems jointly is difficult, since many of their objectives may compete. Building a consistent framework for audio archiving is therefore the subject of this thesis. Sparse representations of signals in redundant dictionaries have recently proved useful for many sub-problems of the archiving task, and sparsity is a desirable property for both compression and indexing. Methods and algorithms for building such representations are the first topic of this thesis. Given the dimensionality of the data considered, greedy algorithms are studied in particular. A first contribution of this thesis is a variant of the well-known Matching Pursuit algorithm that exploits randomness and sub-sampling of very large time-frequency dictionaries. We show that audio compression, especially at low bit rates, can be improved using this method. This new algorithm comes with an original model of asymptotic pursuit behavior, based on order statistics and tools from extreme value theory. Other contributions deal with the second part of the archiving problem: indexing. The same framework is applied to different layers of signal structure. First, the detection of redundancies and musical repetitions is addressed. At a larger scale, we investigate audio fingerprinting schemes and apply them to the online segmentation of radio broadcasts. Performance was evaluated during an international campaign within the QUAERO project. Finally, the same framework is used to perform source separation informed by redundancy. Together, these elements validate the proposed framework for the audio archiving task.
The layered structures of audio data are accessed hierarchically by greedy decomposition algorithms, which allows the different objectives of archiving to be processed at different steps and thus addressed within the same framework.
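
To make the idea of a greedy pursuit over a randomly sub-sampled dictionary more concrete, here is a minimal sketch in Python. It is a generic illustration under stated assumptions, not the algorithm developed in the thesis: the toy cosine dictionary, the function names (`make_cosine_dictionary`, `random_subset_matching_pursuit`, `energy`) and the parameters (`subset_size`, `seed`) are all hypothetical and chosen for readability. At each iteration, only a random subset of the atoms is searched, the atom most correlated with the residual is selected, and its contribution is subtracted.

```python
import math
import random


def make_cosine_dictionary(n, n_atoms):
    """Hypothetical toy dictionary: n_atoms unit-norm cosine atoms of length n.

    With n_atoms > n the dictionary is redundant (more atoms than samples).
    """
    atoms = []
    for k in range(n_atoms):
        atom = [math.cos(math.pi * (t + 0.5) * k / n_atoms) for t in range(n)]
        norm = math.sqrt(sum(a * a for a in atom))
        atoms.append([a / norm for a in atom])
    return atoms


def energy(x):
    """Squared Euclidean norm of a sequence."""
    return sum(v * v for v in x)


def random_subset_matching_pursuit(signal, atoms, n_iter, subset_size, seed=0):
    """Matching Pursuit that searches a fresh random subset of atoms per iteration.

    Returns the list of (atom index, coefficient) pairs and the final residual.
    Assumes every atom has unit norm.
    """
    rng = random.Random(seed)
    residual = list(signal)
    decomposition = []
    for _ in range(n_iter):
        # Draw the sub-dictionary for this iteration (assumption: uniform sampling).
        candidates = rng.sample(range(len(atoms)), subset_size)
        best_idx, best_coef = None, 0.0
        for idx in candidates:
            coef = sum(r * a for r, a in zip(residual, atoms[idx]))
            if best_idx is None or abs(coef) > abs(best_coef):
                best_idx, best_coef = idx, coef
        # Subtract the projection onto the selected atom from the residual.
        residual = [r - best_coef * a for r, a in zip(residual, atoms[best_idx])]
        decomposition.append((best_idx, best_coef))
    return decomposition, residual


if __name__ == "__main__":
    atoms = make_cosine_dictionary(32, 64)
    signal = [2.0 * a - 1.0 * b for a, b in zip(atoms[5], atoms[20])]
    decomposition, residual = random_subset_matching_pursuit(
        signal, atoms, n_iter=30, subset_size=16, seed=1
    )
    print("residual energy ratio:", energy(residual) / energy(signal))
```

Because the atoms have unit norm, each step removes the squared coefficient from the residual energy, so the residual energy never increases even though only a fraction of the dictionary is examined per iteration. The trade-off in this sketch is that each step costs `subset_size` inner products instead of one per atom of the full dictionary.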
Document type :
Theses

Cited literature : 208 references

https://pastel.archives-ouvertes.fr/pastel-00834272
Contributor : Abes Star
Submitted on : Friday, June 14, 2013 - 3:37:10 PM
Last modification on : Thursday, October 17, 2019 - 12:36:06 PM
Long-term archiving on : Sunday, September 15, 2013 - 4:11:43 AM

File

these_Moussalam.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : pastel-00834272, version 1

Citation

Manuel Moussallam. Représentations redondantes et hiérarchiques pour l'archivage et la compression de scènes sonores. Autre. Télécom ParisTech, 2012. Français. ⟨NNT : 2012ENST0079⟩. ⟨pastel-00834272⟩
