Emerging biotechnologies have significantly advanced the study of biological mechanisms. However, biological data usually contain a large amount of missing information, e.g. missing features, missing labels or missing samples, which greatly limits how widely the data can be used. In this book, we introduce different missing-data scenarios in biological data and propose machine learning models to improve data analysis, including deep recurrent neural network recovery for missing features, robust information-theoretic learning for missing labels and structure-aware rebalancing for missing minority samples. The models in the book span imbalanced learning, deep learning, recurrent neural networks and statistical inference, providing a broad set of reference points for integrating artificial intelligence with biology. Using simulated and biological datasets, we apply these approaches to a variety of biological tasks, including single-cell characterization, genome-wide association studies and medical image segmentation, and quantify their performance with a number of metrics.
The outline of this book is as follows. In Chapter 2, we introduce the statistical recovery of missing data features; in Chapter 3, we introduce the statistical recovery of missing labels; in Chapter 4, we introduce the statistical recovery of missing sample information; finally, in Chapter 5, we summarize the book and discuss future directions. This book can be used as a reference by researchers in computational biology, bioinformatics and biostatistics. Readers are expected to have basic knowledge of statistics and machine learning.
Table of contents
Chapter 1 Introduction.- Chapter 2 Fast computational recovery of missing features for large-scale biological data.- Chapter 3 Computational recovery of information from low-quality and missing labels.- Chapter 4 Computational recovery of sample missings.- Chapter 5 Summary and outlook.
About the author
Feng Bao is currently a Postdoctoral Scholar at the University of California, San Francisco. He received his Ph.D. degree in control science and engineering from the Department of Automation, Tsinghua University, Beijing, China, in 2019. His research interests include machine learning, computational biology and cell profiling technologies.