This book includes peer-reviewed articles from the 12th International Workshop on Spoken Dialogue System Technology, IWSDS 2021, Singapore. Dialogue systems, or conversational agents, have become one of the most important mechanisms for human-computer and human-robot interaction, and they have been widely adopted as a new paradigm by many applications, companies, and end users. Recent advances in natural language processing, understanding, and generation, together with continuously increasing computational power and a large number of resources and data, have brought important and consistent improvements to the capabilities of dialogue systems, enabling users to have more productive and enjoyable interactions. However, on the threshold of a new decade, the current state of the art shows important areas where improvements are needed, such as the incorporation of ground-based knowledge, personality, emotions, and adaptability, as well as automatic mechanisms for objective, robust, and fast evaluation, especially in the context of developing social and e-health applications. In this 12th edition of the International Workshop on Spoken Dialogue Systems (IWSDS), “Conversational AI for natural human-centric interaction” compiles and presents a synopsis of current global research efforts to push forward the state of the art in dialogue technologies, including advances in the classical problems of dialogue management, language generation and understanding, personalisation and generation, spoken and multimodal interaction, dialogue evaluation, dialogue modelling and applications, as well as topics related to chatbots and conversational agent technologies.
Table of contents
Out-of-Scope Domain and Intent Classification through Hierarchical Joint Modeling.- Segmentation-Based Formulation of Slot Filling Task for Better Generative Modeling.- Can we predict how challenging Spoken Language Understanding corpora are across sources, languages and domains?.- Personalized Extractive Summarization with Discourse Structure Constraints Towards Efficient and Coherent Dialog-based News Delivery.- Empathetic Dialogue Generation with Pre-trained RoBERTa-GPT2 and External Knowledge.- Towards Handling Unconstrained User Preferences.- Jurassic is (almost) All You Need: Few-Shot Meaning-to-Text Generation for Open-Domain Dialogue.- Comparison of Automatic Speech Recognition Systems.- Multimodal Dialogue Response Timing Estimation Using Dialogue Context Encoder.- Eliciting Cooperative Persuasive Dialogue by Multimodal Emotional Robot.
About the authors
Svetlana Stoyanchev is a Senior Research Engineer and Dialogue Group Lead at the Speech Technology Group in Cambridge Research Lab, Toshiba, Europe. She received her Ph.D. in Computer Science from Stony Brook University. Her research interests include spoken, multimodal, and argumentative dialogue, error recovery in human-computer communication, information presentation, and dialogue analytics. Her recent work focuses on combining knowledge-driven and statistical approaches to modelling human-computer dialogue. Previously, she held research positions at The Open University, Columbia University, AT&T Labs Research, and Interactions Corporation. Svetlana served as an IEEE Speech and Language Processing Technical Committee Member in 2014–2016. She is a member of the programme committees for research conferences and workshops, including SIGDial, ACL, AAAI, COLING, and Interspeech. She has co-authored over 30 research publications in the field of dialogue and natural language processing and is a co-inventor of four US patents.
Stefan Ultes is Dialogue Research Lead at Mercedes Benz Research & Development, where he leads the Speech Technology research group. His research focusses on methods and technology that bring natural spoken human-machine interaction forward, thus contributing to the next generations of the Mercedes Benz User Experience. He co-supervises several doctoral students and is a lecturer in the “Dialogue Systems” course of the Dialogue Systems Group at Ulm University. Previously, he was a Research Associate at the Spoken Dialogue Systems Group at the University of Cambridge, working with Prof. Steve Young and Prof. Milica Gasic within the EPSRC project “Open Domain Statistical Spoken Dialogue Systems”. He received his Diploma (M.Sc.) in Computer Science from the Karlsruhe Institute of Technology (Germany) in 2010 and his doctorate in engineering (Ph.D.) from the Dialogue Systems Group at Ulm University (Germany) in 2015 on the topic “User-centred Adaptive Spoken Dialogue Modelling”.
Haizhou Li received the B.Sc., M.Sc., and Ph.D. degrees in electrical and electronic engineering from South China University of Technology, Guangzhou, China, in 1984, 1987, and 1990, respectively. He is now a Presidential Chair Professor and Associate Dean (Research) at the School of Data Science, The Chinese University of Hong Kong (Shenzhen). He is also with the Department of Electrical and Computer Engineering, National University of Singapore (NUS), Singapore. Dr. Li has worked on speech and language technology in academia and industry since 1988. He has taught at The University of Hong Kong (1988-1989), South China University of Technology in Guangzhou, China (1990-1994), Nanyang Technological University in Singapore (2006-2016), University of Eastern Finland (2009), and University of New South Wales (since 2011). He was a Visiting Professor at CRIN/INRIA in France (1994-1995). Prior to joining CUHKSZ and NUS, he was a Research Manager in Apple-ISS Research Centre (1996-1998), Research Director of Lernout & Hauspie Asia Pacific (1999-2001), Vice President of InfoTalk Corp. Ltd and General Manager of InfoTalk Technology (Singapore) Pte Ltd (2001-2003), the Principal Scientist and Department Head of Human Language Technology at the Institute for Infocomm Research (2003-2016), and the Research Director of the Institute for Infocomm Research (2014-2016), the Agency for Science, Technology and Research, Singapore. He co-founded Baidu-I2R Research Centre in Singapore (2012). Dr. Li is an IEEE Fellow and an ISCA Fellow. His research interests include automatic speech recognition, natural language processing, and information retrieval.