The first book of its kind to review the current status and
future direction of the exciting new branch of machine
learning/data mining called imbalanced learning.
Imbalanced learning focuses on how an intelligent system can
learn when it is provided with imbalanced data. Solving imbalanced
learning problems is critical in numerous data-intensive networked
systems, including surveillance, security, Internet, finance,
biomedicine, and defense. Due to the inherently complex
characteristics of imbalanced datasets, learning from such data
requires new understandings, principles, algorithms, and tools to
transform vast amounts of raw data efficiently into information and
knowledge representation.
The first comprehensive look at this new branch of machine
learning, this book offers a critical review of the problem of
imbalanced learning, covering the state of the art in techniques,
principles, and real-world applications. Featuring contributions
from experts in both academia and industry, Imbalanced Learning:
Foundations, Algorithms, and Applications provides chapter
coverage on:
* Foundations of Imbalanced Learning
* Imbalanced Datasets: From Sampling to Classifiers
* Ensemble Methods for Class Imbalance Learning
* Class Imbalance Learning Methods for Support Vector
Machines
* Class Imbalance and Active Learning
* Nonstationary Stream Data Learning with Imbalanced Class
Distribution
* Assessment Metrics for Imbalanced Learning
Imbalanced Learning: Foundations, Algorithms, and
Applications will help scientists and engineers learn how to
tackle the problem of learning from imbalanced datasets, and gain
insight into current developments in the field as well as future
research directions.
Table of Contents
Preface
Contributors
1 Introduction
Haibo He
1.1 Problem Formulation
1.2 State-of-the-Art Research
1.3 Looking Ahead: Challenges and Opportunities
1.4 Acknowledgments
References
2 Foundations of Imbalanced Learning
Gary M. Weiss
2.1 Introduction
2.2 Background
2.3 Foundational Issues
2.4 Methods for Addressing Imbalanced Data
2.5 Mapping Foundational Issues to Solutions
2.6 Misconceptions About Sampling Methods
2.7 Recommendations and Guidelines
References
3 Imbalanced Datasets: From Sampling to Classifiers
T. Ryan Hoens and Nitesh V. Chawla
3.1 Introduction
3.2 Sampling Methods
3.3 Skew-Insensitive Classifiers for Class Imbalance
3.4 Evaluation Metrics
3.5 Discussion
References
4 Ensemble Methods for Class Imbalance Learning
Xu-Ying Liu and Zhi-Hua Zhou
4.1 Introduction
4.2 Ensemble Methods
4.3 Ensemble Methods for Class Imbalance Learning
4.4 Empirical Study
4.5 Concluding Remarks
References
5 Class Imbalance Learning Methods for Support Vector Machines
Rukshan Batuwita and Vasile Palade
5.1 Introduction
5.2 Introduction to Support Vector Machines
5.3 SVMs and Class Imbalance
5.4 External Imbalance Learning Methods for SVMs: Data Preprocessing Methods
5.5 Internal Imbalance Learning Methods for SVMs: Algorithmic Methods
5.6 Summary
References
6 Class Imbalance and Active Learning
Josh Attenberg and Seyda Ertekin
6.1 Introduction
6.2 Active Learning for Imbalanced Problems
6.3 Active Learning for Imbalanced Data Classification
6.4 Adaptive Resampling with Active Learning
6.5 Difficulties with Extreme Class Imbalance
6.6 Dealing with Disjunctive Classes
6.7 Starting Cold
6.8 Alternatives to Active Learning for Imbalanced Problems
6.9 Conclusion
References
7 Nonstationary Stream Data Learning with Imbalanced Class Distribution
Sheng Chen and Haibo He
7.1 Introduction
7.2 Preliminaries
7.3 Algorithms
7.4 Simulation
7.5 Conclusion
7.6 Acknowledgments
References
8 Assessment Metrics for Imbalanced Learning
Nathalie Japkowicz
8.1 Introduction
8.2 A Review of Evaluation Metric Families and their Applicability to the Class Imbalance Problem
8.3 Threshold Metrics: Multiple- Versus Single-Class Focus
8.4 Ranking Methods and Metrics: Taking Uncertainty into Consideration
8.5 Conclusion
8.6 Acknowledgments
References
Index
About the Authors
HAIBO HE, PhD, is an Associate Professor in the
Department of Electrical, Computer, and Biomedical Engineering at
the University of Rhode Island. He received the National Science
Foundation (NSF) CAREER Award and the Providence Business News (PBN)
Rising Star Innovator Award.
YUNQIAN MA, PhD, is a senior principal research scientist
at Honeywell Labs, Honeywell International, Inc. He received the
International Neural Network Society (INNS) Young Investigator
Award.