The fast and easy way to learn Python programming and statistics
Python is a general-purpose programming language created in the late 1980s and named after Monty Python. Thousands of people use it for everything from testing microchips at Intel, to powering Instagram, to building video games with the Pygame library.
Python For Data Science For Dummies is written for people who are new to data analysis, and discusses the basics of Python data analysis programming and statistics. The book also discusses Google Colab, which makes it possible to write Python code in the cloud.
* Get started with data science and Python
* Visualize information
* Wrangle data
* Learn from data
The book provides the statistical background needed to get started in data science programming, including probability, random distributions, hypothesis testing, confidence intervals, and building regression models for prediction.
Table of Contents
Introduction
Part 1: Getting Started with Data Science and Python
Chapter 1: Discovering the Match between Data Science and Python
Chapter 2: Introducing Python’s Capabilities and Wonders
Chapter 3: Setting Up Python for Data Science
Chapter 4: Working with Google Colab
Part 2: Getting Your Hands Dirty with Data
Chapter 5: Understanding the Tools
Chapter 6: Working with Real Data
Chapter 7: Conditioning Your Data
Chapter 8: Shaping Data
Chapter 9: Putting What You Know in Action
Part 3: Visualizing Information
Chapter 10: Getting a Crash Course in MatPlotLib
Chapter 11: Visualizing the Data
Part 4: Wrangling Data
Chapter 12: Stretching Python’s Capabilities
Chapter 13: Exploring Data Analysis
Chapter 14: Reducing Dimensionality
Chapter 15: Clustering
Chapter 16: Detecting Outliers in Data
Part 5: Learning from Data
Chapter 17: Exploring Four Simple and Effective Algorithms
Chapter 18: Performing Cross-Validation, Selection, and Optimization
Chapter 19: Increasing Complexity with Linear and Nonlinear Tricks
Chapter 20: Understanding the Power of the Many
Part 6: The Part of Tens
Chapter 21: Ten Essential Data Resources
Chapter 22: Ten Data Challenges You Should Take
Index
About the Authors
John Paul Mueller is a tech editor and the author of over 100 books on topics from networking and home security to database management and heads-down programming. Follow John’s blog at http://blog.johnmuellerbooks.com/. Luca Massaron is a data scientist who specializes in organizing and interpreting big data and transforming it into smart data. He is a Google Developer Expert (GDE) in machine learning.