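Modèle statistique de l'animation expressive de la parole et du rire pour un agent conversationnel animé (Statistical model of expressive speech and laughter animation for an embodied conversational agent)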

Abstract : Our aim is to render expressive multimodal behaviors for embodied conversational agents (ECAs). ECAs are human-like virtual entities endowed with communicative and emotional capabilities. When an ECA speaks or laughs, it can autonomously display behaviors that enrich and complement the uttered speech and convey qualitative information such as emotion. Our research follows a data-driven approach. It focuses on generating multimodal behaviors for a virtual character speaking with different emotions, and on simulating laughing behavior on an ECA. Our goal is to study and develop human-like animation generators for speaking and laughing ECAs. Building on the relationship between speech prosody and multimodal behaviors, our animation generator takes human-uttered audio signals as input and outputs multimodal behaviors. Our work uses a statistical framework to capture the relationship between the input and output signals, and then renders this relationship into synthesized animation. In the training step, the framework is trained on joint features composed of input and output features, so that the relationship between input and output signals is captured and characterized by the framework's parameters. In the synthesis step, the trained framework produces output signals (facial expressions, head and torso movements) from input signals (F0 and energy for speech, or pseudo-phonemes for laughter), rendering the relationship captured during training into the output. Our proposed module is based on a variant of the Hidden Markov Model (HMM) called the Contextual HMM. This model captures the relationship between human motion and speech (or laughter) and renders it into the synthesized animations.
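
The two-step scheme described in the abstract (train on joint input/output features, then synthesize output from input alone) can be illustrated with a deliberately simplified sketch. This is not the Contextual HMM of the thesis: it swaps in a single joint Gaussian over one input feature and one output feature, and synthesizes by taking the conditional mean. The function names, the toy data, and the choice of "normalized F0" and "head pitch" as the paired features are all assumptions made for illustration.

```python
from statistics import fmean


def train_joint_gaussian(inputs, outputs):
    """Training step: fit one joint Gaussian to paired (input, output) frames.

    The returned parameters characterize the input/output relationship.
    """
    if len(inputs) != len(outputs) or len(inputs) < 2:
        raise ValueError("need at least two paired frames")
    n = len(inputs)
    mu_x, mu_y = fmean(inputs), fmean(outputs)
    # Unbiased sample variance of the input and input/output covariance.
    var_x = sum((x - mu_x) ** 2 for x in inputs) / (n - 1)
    cov_xy = sum((x - mu_x) * (y - mu_y)
                 for x, y in zip(inputs, outputs)) / (n - 1)
    if var_x == 0:
        raise ValueError("input feature is constant; relationship undefined")
    return {"mu_x": mu_x, "mu_y": mu_y, "var_x": var_x, "cov_xy": cov_xy}


def synthesize(model, inputs):
    """Synthesis step: conditional mean of the output given each input frame."""
    gain = model["cov_xy"] / model["var_x"]
    return [model["mu_y"] + gain * (x - model["mu_x"]) for x in inputs]


# Hypothetical toy data: normalized F0 per frame and a head-pitch angle.
f0 = [0.0, 1.0, 2.0, 3.0]
head_pitch = [1.0, 3.0, 5.0, 7.0]

model = train_joint_gaussian(f0, head_pitch)
predicted = synthesize(model, [4.0])
print(predicted)
```

An HMM-based generator would replace the single Gaussian with state-dependent distributions and temporal dynamics; the sketch only shows how parameters learned from joint features can later be conditioned on the input alone.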
Cited literature [124 references]

https://pastel.archives-ouvertes.fr/tel-01354335
Contributor : Abes Star
Submitted on : Thursday, August 18, 2016 - 3:49:08 PM
Last modification on : Friday, May 17, 2019 - 12:36:47 PM
Long-term archiving on : Saturday, November 19, 2016 - 7:21:05 PM

File

TheseDing2.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01354335, version 1

Citation

Yu Ding. Modèle statistique de l'animation expressive de la parole et du rire pour un agent conversationnel animé. Artificial Intelligence [cs.AI]. Télécom ParisTech, 2014. French. ⟨NNT : 2014ENST0050⟩. ⟨tel-01354335⟩
