The aim of the book is to present a flexible and efficient algorithm and a novel system for the planning, generation, and realization of conversational (co-verbal) behavior. Such behavior is best described as a set of meaningful body-part movements that are prosodically synchronized with the accompanying speech. The movements and shapes generated as co-verbal behavior represent a contextual link between a repertoire of independent motor skills (shapes, movements, and poses that a conversational agent can reproduce and execute) and the intent/meaning of spoken sequences (context). The actual intent/meaning of spoken content is identified through language-dependent linguistic markers and prosody. The knowledge databases used to determine the intent/meaning of text are based on the linguistic analysis and classification of the text into semiotic classes and subclasses, achieved through the annotation of multimodal corpora with the proposed EVA annotation scheme. The scheme captures features at a functional (context-dependent) level as well as at a descriptive (context-independent) level. The functional level captures high-level features that describe the correlation between speech and co-verbal behavior, whereas the descriptive level captures and defines body poses and shapes independently of verbal content and at high resolution. The annotation scheme therefore not only interlinks speech and gesture at a semiotic level, but also serves as a basis for the creation of a context-independent repertoire of movements and shapes. In this book, the process of generating co-verbal behavior is divided into two phases. The first phase deals with the classification of intent and its synchronization with the verbal content and prosody. The second phase then transforms the planned and synchronized behavior into a co-verbal animation performed by an embodied conversational agent (ECA).
To extrapolate intent from arbitrary text sequences, the algorithm for the formulation of behavior deduces meaning/intent with respect to semiotic intent. Furthermore, the algorithm considers the linguistic features of arbitrary, un-annotated text and selects primitive gestures based on semiotic nuclei, as identified by semiotic classification and further modeled by the predicted prosodic features of the speech to be generated by a general text-to-speech (TTS) system. The output of the behavior-formulation phase is a hierarchical procedure encoded in XML format, together with a speech sequence generated by the TTS system. The procedural description is event-oriented and represents a well-defined structure of consecutive movements of body parts, as well as of body parts moving in parallel. The second phase of the novel architecture transforms the procedural descriptions into a series of coherent animations of individual parts of the articulated embodied conversational agent. In this regard, a novel ECA-based realization framework, named the EVA-framework, is also presented. It supports real-time realization of procedural animation descriptions and plans on multi-part mesh-based models, using skeletal animation, blend-shape animation, and the animation of predefined (pre-recorded) animated segments. This book therefore covers the complete design and implementation of an expressive model for the generation of co-verbal behavior, which is able to transform un-annotated text into a speech-synchronized series of animated sequences.
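As a rough illustration of what a hierarchical, event-oriented XML procedure with consecutive and parallel body-part movements might look like, the following Python sketch builds a small plan and computes its total duration. The element names (`behavior`, `sequence`, `parallel`, `move`) and the attributes (`part`, `shape`, `duration`) are hypothetical and chosen only for this example; they are not the schema used by the book or the EVA-framework.

```python
import xml.etree.ElementTree as ET


def build_plan():
    """Build a toy behavior plan (hypothetical element and attribute names)."""
    root = ET.Element("behavior")
    seq = ET.SubElement(root, "sequence")   # children run one after another
    par = ET.SubElement(seq, "parallel")    # children run at the same time
    ET.SubElement(par, "move", part="right_arm", shape="raise", duration="0.5")
    ET.SubElement(par, "move", part="head", shape="nod", duration="0.25")
    ET.SubElement(seq, "move", part="right_arm", shape="rest", duration="0.75")
    return root


def duration(node):
    """Total duration in seconds: max over parallel children, sum otherwise."""
    if node.tag == "move":
        return float(node.get("duration"))
    child_durations = [duration(child) for child in node]
    if not child_durations:
        return 0.0
    if node.tag == "parallel":
        return max(child_durations)
    return sum(child_durations)


if __name__ == "__main__":
    plan = build_plan()
    print(ET.tostring(plan, encoding="unicode"))
    print("total duration (s):", duration(plan))
```

In this sketch the nesting of `sequence` and `parallel` elements is what makes the description hierarchical: a realization engine could walk the tree in the same way `duration` does and schedule each body part accordingly.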
Izidor Mlakar & Matej Rojc
Expressive Conversational-Behavior Generation Model For Advanced Interaction within Multimodal User Interfaces [PDF ebook]
Format PDF ● Pages 248 ● ISBN 9781634840842 ● Editor Izidor Mlakar & Matej Rojc ● Publisher Nova Science Publishers ● Published 2016 ● Downloadable 3 times ● Currency EUR ● ID 7226324 ● Copy protection Adobe DRM
Requires a DRM-capable e-reader