Examine the latest technological advancements in building a scalable machine learning model with Big Data using R. This book shows you how to work with a machine learning algorithm and use it to build an ML model from raw data.
All practical demonstrations will be explored in R, a powerful programming language and software environment for statistical computing and graphics. The various packages and methods available in R will be used to explain the topics. For every machine learning algorithm covered in this book, a 3-D approach of theory, case study, and practice will be given. Where appropriate, the mathematics will be explained through visualization in R. All the images are available in color and high resolution as part of the code download.
This new paradigm of teaching machine learning will bring about a radical change in perception for many of those who think this subject is difficult to learn. Though the theory sometimes looks difficult, especially when heavy mathematics is involved, the seamless flow from theoretical aspects to example-driven learning in this book makes it easy to connect the dots.
What You’ll Learn
- Use the model building process flow
- Apply theoretical aspects of machine learning
- Review industry-based case studies
- Understand ML algorithms using R
- Build machine learning models using Apache Hadoop and Spark
Who This Book Is For
Data scientists, data science professionals and researchers in academia who want to understand the nuances of machine learning approaches/algorithms along with ways to see them in practice using R.
The book will also benefit readers who want to understand the technology behind implementing a scalable machine learning model using Apache Hadoop, Hive, Pig, and Spark.
Table of Contents
- Chapter 1: Introduction to Machine Learning and R
- Chapter 2: Data Preparation and Exploration
- Chapter 3: Sampling and Resampling Techniques
- Chapter 4: Visualization of Data
- Chapter 5: Feature Engineering
- Chapter 6: Machine Learning Models: Theory and Practice
- Chapter 7: Machine Learning Model Evaluation
- Chapter 8: Model Performance Improvement
- Chapter 9: Scalable Machine Learning and related technology
About the Authors
Karthik Ramasubramanian works for Hike Messenger, one of the largest and fastest-growing technology unicorns in India. In his 7 years of research and industry experience, he has worked on cross-industry data science problems in retail, e-commerce, and technology, developing and prototyping data-driven solutions.
In his previous role at Snapdeal, one of the largest e-commerce retailers in India, he led core statistical modelling initiatives for customer growth and pricing analytics. Prior to Snapdeal, he was part of the central database team, managing the data warehouses for global business applications of Reckitt Benckiser (RB).
He has a master's degree in Theoretical Computer Science from PSG College of Technology, Anna University, and is a certified big data professional. He is passionate about teaching and mentoring future data scientists through different online and public forums. He also enjoys writing poems in his spare time and is an avid traveler.
Abhishek Singh is based in Ireland as a Data Scientist in the Advanced Data Science team for Prudential Financial Inc. He has 5 years of professional and academic experience in the Data Science field. At Deloitte Advisory, he led Risk Analytics initiatives for top US banks in their regulatory risk, credit risk, and balance sheet modelling requirements. In his current role, he is working on scalable machine learning algorithms for the Individual Life Insurance branch of Prudential. He was also a trainer at Deloitte Professional University and in development initiatives for professionals in the areas of statistics, economics, financial risk, and data science tools (SAS and R).
Abhishek holds a B.Tech. in Mathematics and Computing from the Indian Institute of Technology, Guwahati, and an MBA from the Indian Institute of Management, Bangalore. He speaks at public events on Data Science and is working with leading universities towards bringing data science skills to graduates. He also holds a Post Graduate Diploma in Cyber Law from NALSAR University. He enjoys cooking and photography in his free time.