In 1901, Karl Pearson invented Principal Component Analysis (PCA). Since then, PCA has served as a prototype for many other tools of data analysis, visualization and dimension reduction: Independent Component Analysis (ICA), Multidimensional Scaling (MDS), Nonlinear PCA (NLPCA), Self-Organizing Maps (SOM), etc. The book opens with a quotation of Pearson's classical definition of PCA and includes reviews of various methods: NLPCA, ICA, MDS, embedding and clustering algorithms, principal manifolds and SOM. New approaches to NLPCA, principal manifolds, branching principal components and topology-preserving mappings are described as well. The presentation of algorithms is supplemented by case studies ranging from engineering to astronomy, but drawn mostly from biological data: the analysis of microarray and metabolite data. The volume ends with a tutorial, ‘PCA and K-means decipher genome’. The book is meant to be useful for practitioners of applied data analysis in the life sciences, engineering, physics and chemistry; it will also be valuable to PhD students and researchers in computer science, applied mathematics and statistics.
Table of Contents
Developments and Applications of Nonlinear Principal Component Analysis – a Review
Nonlinear Principal Component Analysis: Neural Network Models and Applications
Learning Nonlinear Principal Manifolds by Self-Organising Maps
Elastic Maps and Nets for Approximating Principal Manifolds and Their Application to Microarray Data Visualization
Topology-Preserving Mappings for Data Visualisation
The Iterative Extraction Approach to Clustering
Representing Complex Data Using Localized Principal Components with Application to Astronomical Data
Auto-Associative Models, Nonlinear Principal Component Analysis, Manifolds and Projection Pursuit
Beyond The Concept of Manifolds: Principal Trees, Metro Maps, and Elastic Cubic Complexes
Diffusion Maps – a Probabilistic Interpretation for Spectral Embedding and Clustering Algorithms
On Bounds for Diffusion, Discrepancy and Fill Distance Metrics
Geometric Optimization Methods for the Analysis of Gene Expression Data
Dimensionality Reduction and Microarray Data
PCA and K-Means Decipher Genome