Leverage machine learning and deep learning models to build applications on real-time data using PySpark. This book is perfect for those who want to learn to use PySpark for exploratory data analysis and to solve an array of business challenges.
You’ll start by reviewing PySpark fundamentals, such as Spark’s core architecture, and see how to use PySpark for big data processing tasks such as data ingestion, cleaning, and transformation. This is followed by building workflows for analyzing streaming data using PySpark and a comparison of various streaming platforms.
You’ll then see how to schedule different Spark jobs using Airflow with PySpark and examine how to tune machine learning and deep learning models for real-time predictions. The book concludes with a discussion of graph frames and performing network analysis using graph algorithms in PySpark. All the code presented in the book will be available as Python scripts on GitHub.
What You’ll Learn
- Develop pipelines for streaming data processing using PySpark
- Build machine learning and deep learning models using PySpark’s latest offerings
- Apply graph analytics using PySpark
- Create sequence embeddings from text data
Who This Book Is For
Data scientists and machine learning and deep learning engineers who want to learn and use PySpark for real-time analysis of streaming data.