This book presents advances in speech and music within the domain of audio signal processing. It begins with introductory chapters on the basics of speech and music, then proceeds to their computational aspects, including music information retrieval and spoken language processing. The authors discuss the intersection of computer science, musicology, and speech analysis, and explain how the multifaceted nature of speech and music information processing calls for dedicated algorithms and for systems that combine sophisticated signal processing with machine learning techniques to better extract useful information. They also discuss how a deep understanding of both speech and music in terms of perception, emotion, mood, gesture, and cognition is essential for successful application. Also discussed is the overwhelming amount of data generated across the world, which requires efficient processing for better maintenance, retrieval, indexing, and querying, and why machine learning and artificial intelligence are well suited to these computational tasks. The book provides both technological knowledge and a comprehensive treatment of essential topics in speech and music processing.
Table of Contents
State-of-the-Art.- A Comprehensive Review on Speaker Recognition.- Music Composition with Deep Learning: A Review.- Music Recommendation Systems: Overview and Challenges.- Music Recommender Systems: A Review Centered on Biases.- Computational Approaches for Indian Classical Music: A Comprehensive Review.
Machine Learning.- A Study on Effectiveness of Deep Neural Networks for Speech Signal Enhancement in Comparison with Wiener Filtering Technique.- Video Soundtrack Evaluation with Machine Learning: Data Availability, Feature Extraction and Classification.- Deep Learning Approach to Joint Identification of Instrument, Shruthi and Raga for Indian Classical Music.- Comparison of Convolutional Neural Networks and K-Nearest Neighbours for Music Instrument Recognition.- Emotion Recognition in Music using Deep Neural Networks.
Perception, Health and Emotion.- Music to Ears in Hearing Impaired-Signal Processing Advancements in Hearing Amplification Devices.- Music Therapy – A Best Way to Solve Anxiety and Depression in Diabetes Mellitus Patients.- Music and Stress During Covid-19 Lockdown: Influence of Locus of Control and Coping Styles on Musical Preferences.- Biophysics of Brain Plasticity and Its Correlation to Music Learning.- Dealing with Emotional Speech and Text: A Special Focus on Bengali Language.
Case Studies.- Duplicate Detection for Digital Audio Archive Management: Two Case Studies.- Section Order, Refrain Perception, and the Interpretation of a Song’s Meaning.- Musical Influence on Visual Aesthetics: An Exploration on Intermediality using Audience Response, Feature and Fractal Analysis.- Influence of Musical Acoustics on Graphic Design: An Exploration with Indian Classical Music Album Cover Design.- A Fractal Approach to Characterize Emotions in Audio and Visual Domain: A Study on Cross-Modal Interaction.- Inharmonic Frequency Analysis of Tabla Strokes in North Indian Classical Music.
About the Authors
Dr. Anupam Biswas received his Ph.D. degree in computer science and engineering from Indian Institute of Technology (BHU), Varanasi, India, in 2017. He received his M.Tech. and B.E. degrees in computer science and engineering from Nehru National Institute of Technology Allahabad, Prayagraj, India, in 2013, and Jorhat Engineering College, Jorhat, Assam, in 2011, respectively. He is currently working as Assistant Professor in the Department of Computer Science & Engineering, National Institute of Technology Silchar, Assam, India. He has published several research papers in reputed international journals, conferences, and book chapters. His research interests include computational music, machine learning, fuzzy systems, information retrieval, and evolutionary computation. He served as Program Chair of the International Conference on Big Data, Machine Learning and Applications (BigDML 2019). He served as General Chair of the 25th International Symposium Frontiers of Research in Speech and Music (FRSM 2020) and co-edited the proceedings of FRSM 2020, published as a book volume in the Springer AISC Series. He has co-edited three books: “Health Informatics: A Computational Perspective in Healthcare” and “Principles of Social Networking: The New Horizon and Emerging Challenges” with Springer, and “Principles of Big Graph: In-depth Insight” with Elsevier.
Prof. Dr. Emile Wennekes is Chair Professor of Musicology at Utrecht University, The Netherlands. He was appointed full professor there in 2000. From 2006 to 2011, he was the first Head of the School of the Media and Culture Studies department. In 2017, his chair was modified from Musicology: Post-1800 Music History to Musicology: Music and Media, now also officially embracing his main field of research. Wennekes has written on a broad range of subjects, including a co-authored book on contemporary Dutch music available in six languages. His work has been published by, among others, Oxford University Press, Routledge, Michigan University Press, and Brepols. Most recently, he edited the volume Cinema Changes: Incorporation of Jazz in the Film Soundtrack (Brepols 2019) together with Dr. Emilio Audissino. Wennekes founded and chairs the Study Group Music and Media (MaM) under the auspices of the International Musicological Society, and he coordinates its annual conferences.
Dr. Alicja A. Wieczorkowska, Ph.D., D.Sc., is a computer scientist specializing in audio signal analysis. She holds a Ph.D. from the Gdansk University of Technology and is also an alumna of the State School of Music (Second Level). Her Ph.D. thesis examined the automatic recognition of musical instrument sounds, depending on the parameterization and classifiers applied. Dr. Wieczorkowska is presently Associate Professor and Head of the Multimedia Laboratory at the Polish-Japanese Academy of Information Technology, Warsaw, Poland. She is also an associate member of the Graduate Faculty at the University of North Carolina at Charlotte. Her scientific interests include audio information retrieval, music and speech processing, data mining, computer graphics, multimedia, and automated identification of emotions from various signals.
Dr. Rabul Husain Laskar received his Ph.D. degree in Electronics & Communication Engineering from National Institute of Technology Silchar, India, in 2013. He received his M.Tech. and B.E. degrees in Electronics and Communication Engineering and Electrical Engineering from Indian Institute of Technology, Guwahati, India, in 2007, and National Institute of Technology, Silchar, Assam, in 1998, respectively. He is currently working as Associate Professor in the Department of Electronics & Communication Engineering, National Institute of Technology Silchar, Assam, India. He has published several research papers in reputed international journals, conferences, and book chapters. His research interests include speech and audio processing, image and video processing, multimedia systems, biomedical signal processing, machine learning, and soft computing techniques. He has served as a technical program committee member and session chair at various national and international conferences.