This book describes the basic principles underlying the generation, coding, transmission and enhancement of speech and audio signals, including advanced statistical and machine learning techniques for speech and speaker recognition, and gives an overview of the key innovations in these areas. Key research in speech coding, speech enhancement, speech recognition, emotion recognition and speaker diarization is also presented, along with recent advances and new paradigms in these areas.
Contents
From ‘Harmonic Telegraph’ to Cellular Phones
Challenges in Speech Coding Research
Recent Speech Coding Technologies and Standards
Ensemble Learning Approaches in Speech Recognition
Dynamic and Deep Networks For Speech Modeling and Recognition
Speech Based Emotion Recognition
Speaker Diarization: Challenges and Emerging Research
Maximum a posteriori spectral estimation with source log-spectral priors for multichannel speech enhancement
Modulation Processing for Speech Enhancement
About the authors
Tokunbo Ogunfunmi is an Associate Professor of Electrical Engineering and an Associate Dean for Research and Faculty Development at Santa Clara University.
Roberto Togneri is a professor with the School of Electrical, Electronic and Computer Engineering at The University of Western Australia.
Madihally (Sim) Narasimha is a Senior Director of Technology at Qualcomm Inc.