Provides an up-to-date analysis of big data and multi-agent systems
The term Big Data refers to cases where data sets are too large or too complex for traditional data-processing software. With the spread of new concepts such as Edge Computing and the Internet of Things, the production, processing, and consumption of this data become more and more distributed. As a result, applications increasingly require multiple agents that can work together. A multi-agent system (MAS) is a self-organized computer system comprising multiple intelligent agents that interact to solve problems beyond the capacities of individual agents. Modern Big Data Architectures examines modern concepts and architectures for Big Data processing and analytics.
This unique, up-to-date volume provides a joint analysis of big data and multi-agent systems, with emphasis on distributed, intelligent processing of very large data sets. Each chapter contains practical examples and detailed solutions suitable for a wide variety of applications. The author, an internationally recognized expert in Big Data and distributed Artificial Intelligence, demonstrates how base concepts such as agent, actor, and microservice have reached a point of convergence, enabling next-generation systems to be built by incorporating the best aspects of the field. This book:
* Illustrates how data sets are produced and how they can be utilized in various areas of industry and science
* Explains how to apply common computational models and state-of-the-art architectures to process Big Data tasks
* Discusses current and emerging Big Data applications of Artificial Intelligence
Modern Big Data Architectures: A Multi-Agent Systems Perspective is a timely and important resource for data science professionals and students involved in Big Data analytics, machine learning, and artificial intelligence.
Table of Contents
List of Figures ix
List of Tables xi
Preface xiii
Acknowledgments xv
Acronyms xvii
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Assumptions 3
1.3 For Whom is This Book? 4
1.4 Book Structure 4
Chapter 2 Evolution of IT Architectures and Paradigms 7
2.1 Evolution of IT Architectures 7
2.1.1 Monolith 7
2.1.2 Service Oriented Architecture 9
2.1.3 Microservices 12
2.2 Actors and Agents 15
2.2.1 Actors 15
2.2.2 Agents 17
2.3 From ACID to BASE, CAP, and NoSQL – The Database (R)evolution 22
2.4 The Cloud 24
2.5 From Distributed Sensor Networks to the Internet of Things and Cyber-Physical Systems 27
2.6 The Rise of Big Data 28
Chapter 3 Sources of Data 31
3.1 The Internet 32
3.1.1 The Semantic Web 32
3.1.2 Linked Data 35
3.1.3 Knowledge Graphs 36
3.1.4 Social Media 38
3.1.5 Web Mining 38
3.2 Scientific Data 40
3.2.1 Biomedical Data 40
3.2.2 Physics and Astrophysics Data 41
3.2.3 Environmental Sciences 44
3.3 Industrial Data 45
3.3.1 Smart Factories 45
3.3.2 Smart Grid 47
3.3.3 Aviation 47
3.4 Internet of Things 48
Chapter 4 Big Data Tasks 51
4.1 Recommender Systems 51
4.2 Search 52
4.3 Ad-tech and RTB Algorithms 55
4.4 Cross-Device Graph Generation 57
4.5 Forecasting and Prediction Systems 58
4.6 Social Media Big Data 59
4.7 Anomaly and Fraud Detection 61
4.8 New Drug Discovery 63
4.9 Smart Grid Control and Monitoring 64
4.10 IoT and Big Data Applications 65
Chapter 5 Cloud Computing 67
5.1 Cloud Enabled Architectures 67
5.1.1 Cloud Management Platforms 67
5.1.2 Efficient Cloud Computing 73
5.1.3 Distributed Storage Systems 75
5.2 Agents and the Cloud 82
5.2.1 Multi-agent Versus Cloud Paradigms 83
5.2.2 Agents in the Cloud 83
Chapter 6 Big Data Architectures 87
6.1 Big Data Computation Models 87
6.1.1 MapReduce 87
6.1.2 Directed Acyclic Graph Models 89
6.1.3 All-Pairs 92
6.1.4 Very Large Bitmap Operations 93
6.1.5 Message Passing Interface 94
6.1.6 Graphical Processing Unit Computing 95
6.2 Publish-Subscribe Systems 97
6.3 Stream Processing 99
6.3.1 Information Flow Processing Concepts 99
6.3.2 Stream Processing Systems 101
6.4 Higher Level Big Data Architectures 110
6.4.1 Spark 110
6.4.2 Lambda 112
6.4.3 Multi-Agent View of the Lambda Architecture 113
6.4.4 Questioning the Lambda 115
6.5 Industry and Other Approaches 116
6.6 Actor and Agent-Based Big Data Architectures 118
Chapter 7 Big Data Analytics, Mining, and Machine Learning 121
7.1 To SQL or Not to SQL 122
7.1.1 SQL Hadoop Interfaces 123
7.1.2 From Shark to Spark SQL 125
7.2 Big Data Mining and Machine Learning 128
7.2.1 Graph Mining 133
7.2.2 Agent Based Machine Learning and Data Mining 134
Chapter 8 Physically Distributed Systems – Mobile Cloud, Internet of Things, Edge Computing 137
8.1 Mobile Cloud 138
8.2 Edge and Fog Computing 145
8.2.1 Business Case: Mobile Context Aware Recommender System 147
8.3 Internet of Things 148
8.3.1 IoT Fundamentals 148
8.3.2 IoT and the Cloud 151
8.3.3 MAS in IoT 156
Chapter 9 Summary 159
Bibliography 161
Index 179
About the Author
DOMINIK RYZKO is an Assistant Professor at the Institute of Computer Science at Warsaw University of Technology. His research interests include Big Data and Distributed Artificial Intelligence. He is widely published, serves on program committees at international conferences, and is Vice President of Artificial Intelligence and Analytics at Adform, a global ad-tech platform provider. He also spent three years at Allegro Group as Chief Data Scientist, where he oversaw data science activities, the design and methodology of experiments, and model building.