This book offers a much-needed critical introduction to data mining and ‘big data’. Supported by multiple case studies and examples, the authors provide:
- Digestible overviews of key terms and concepts relevant to using social media data in quantitative research
- A critical review of data mining and ‘big data’ from a complexity science perspective, including their future potential and limitations
- A practical exploration of the challenges of putting together and managing a ‘big data’ database
- An evaluation of the core mathematical and conceptual frameworks, grounded in a case-based computational modeling perspective, which form the foundations of all data mining techniques
Table of Contents
Chapter 1: Introduction
Part 1: Thinking Complex and Critically
Chapter 2: The Failure of Quantitative Social Science
Chapter 3: What is Big Data?
Chapter 4: What is Data Mining?
Chapter 5: The Complexity Turn
Part 2: The Tools and Techniques of Data Mining
Chapter 6: Case-Based Complexity: A Data Mining Vocabulary
Chapter 7: Classification and Clustering
Chapter 8: Machine Learning
Chapter 9: Predictive Analytics and Data Forecasting
Chapter 10: Longitudinal Analysis
Chapter 11: Geospatial Modeling
Chapter 12: Complex Network Analysis
Chapter 13: Textual and Visual Data Mining
Chapter 14: Conclusion: Advancing A Complex Digital Social Science
About the Author
Rajeev Rajaram is a Professor of Mathematics at Kent State University. Rajeev’s primary training is in control theory of partial differential equations, and he is currently interested in applying differential equations and ideas from statistical mechanics and thermodynamics to model and measure complexity. He and Brian Castellani have worked together to create a new case-based method for modeling complex systems, called the SACS Toolkit, which has been used to study topics in health, health care, societal infrastructures, power-grid reliability, restaurant mobility, and depression trajectories. More recently, he has become interested in the mathematical properties of entropy-based diversity measures for probability distributions.