
Towards real-time large-vocabulary automatic speech transcription (original title: Vers le temps réel en transcription automatique de la parole grand vocabulaire)

Abstract : Large-vocabulary speech recognition systems based on hidden Markov models (HMMs) use many tens of thousands of Gaussian distributions to improve recognition accuracy. Hence, computing the state likelihoods is time-consuming. As both the accuracy and the speed of such recognition systems are closely related to the number of HMM Gaussians, reducing the number of Gaussians without degrading system performance is of major interest.
Assuming that only a few Gaussians dominate the state likelihood, Gaussian selection techniques have been developed to detect them. These techniques are based on classification and fall into two categories: state-based and model-based methods.
To improve state-based Gaussian selection, we propose an original clustering algorithm and a multi-level Gaussian selection.
The clustering algorithm uses a new Gaussian similarity distance.
In model-based methods, the classification is applied to the Gaussian distributions of all the models. Contextual information is lost because distributions from different contexts are merged. We therefore introduce a contextual Gaussian selection.
In recent years, as an alternative to Gaussian selection, sub-vector quantization has been used successfully to reduce the complexity of acoustic models. Unfortunately, these techniques also rely on a classification that merges different contexts. Hence, we investigate a contextual sub-vector quantization.
The proposed algorithms are evaluated within a large-vocabulary continuous speech recognition framework. The results outperform some existing methods.
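As an illustration of the general idea behind Gaussian selection, here is a minimal sketch and not the thesis's algorithm. It assumes diagonal covariances, a hand-set codebook of centroids, and a plain Euclidean radius for building shortlists; the names `build_shortlists`, `nearest_centroid`, and `state_log_likelihood`, and all numeric values, are hypothetical. The shortlists are built offline, so at decoding time only the Gaussians attached to the centroid nearest to the observation are evaluated.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def build_shortlists(centroids, means, radius):
    """Offline step (assumed criterion): attach to each centroid the
    Gaussians whose mean lies within `radius` of it."""
    return [
        [i for i, m in enumerate(means) if sq_dist(c, m) <= radius ** 2]
        for c in centroids
    ]

def nearest_centroid(x, centroids):
    """Index of the codebook centroid closest to the observation x."""
    return min(range(len(centroids)), key=lambda j: sq_dist(x, centroids[j]))

def state_log_likelihood(x, weights, means, variances, indices=None):
    """Log-likelihood of a mixture state; if `indices` is given,
    only those Gaussians are evaluated (the selected shortlist)."""
    if indices is None:
        indices = range(len(weights))
    return log_sum_exp([
        math.log(weights[i]) + log_gauss_diag(x, means[i], variances[i])
        for i in indices
    ])

# Toy state with four Gaussians forming two well-separated groups
# (illustrative values only).
weights = [0.25, 0.25, 0.25, 0.25]
means = [[0.0, 0.0], [0.5, 0.2], [5.0, 5.0], [5.5, 4.8]]
variances = [[1.0, 1.0]] * 4
centroids = [[0.25, 0.1], [5.25, 4.9]]

shortlists = build_shortlists(centroids, means, radius=2.0)
x = [0.1, 0.1]
selected = shortlists[nearest_centroid(x, centroids)]

full = state_log_likelihood(x, weights, means, variances)
approx = state_log_likelihood(x, weights, means, variances, selected)
print(selected, full, approx)
```

In this toy case the shortlist keeps half of the Gaussians, and the approximate log-likelihood is very close to the full one because the discarded Gaussians lie far from the observation.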

Cited literature : 96 references
Contributor : Leila Zouari
Submitted on : Thursday, March 20, 2008 - 11:24:25 AM
Last modification on : Friday, October 23, 2020 - 4:37:49 PM
Long-term archiving on : Friday, May 21, 2010 - 12:43:29 AM


  • HAL Id : tel-00265838, version 1



Leila Zouari. Vers le temps réel en transcription automatique de la parole grand vocabulaire. Human-Computer Interaction [cs.HC]. Télécom ParisTech, 2007. French. ⟨tel-00265838⟩


