This book provides a comprehensive overview of core concepts and technological foundations for continuous engineering of Web streams. It presents various systems and applications and includes real-world examples. Last but not least, it introduces readers to RSP4J, a novel open-source project that aims to gather community efforts in software engineering and empirical research.
The book starts with an introductory chapter that positions the work by explaining what motivates the design of specific techniques for processing data streams using Web technologies. Chapter 2 briefly summarizes the background concepts and models needed to understand the rest of the book. Chapter 3 then focuses on processing RDF streams, taming data velocity in an open environment characterized by high data variety. It introduces query answering algorithms with RSP-QL and analytics functions over streaming data. Chapter 4 presents the life cycle of streaming linked data. It focuses on publishing streams on the Web as a prerequisite for making data findable and accessible to applications. Chapter 5 touches on the problems of benchmarks and systems that analyze Web streams to foster technological progress. It surveys existing benchmarks and introduces guidelines that may support new practitioners in approaching the issue of continuous analytics. Finally, Chapter 6 presents a list of examples and exercises that will help the reader approach the area, get used to its practices, and become confident in its technological possibilities.
Overall, this book is mainly written for graduate students and researchers in Web and stream data management. It collects research results and will guide the next generation of researchers and practitioners.
Table of Contents
1. General Introduction
2. Preliminaries
3. Taming Variety and Velocity
4. Streaming Linked Data Life Cycle
5. Web Stream Processing Systems and Benchmarks
6. Exercise Book
About the Authors
Riccardo Tommasini is a maître de conférences (Associate Professor) at INSA Lyon. Prior to joining the LIRIS lab, he was an assistant professor at the University of Tartu, Estonia. Riccardo holds a Ph.D. from the Department of Electronics and Information of the Politecnico di Milano. His thesis, titled *Velocity on the Web*, investigates the velocity aspects that concern the Web environment. His research interests span Stream Processing, Knowledge Graphs, Logics, and Programming Languages.
Pieter Bonte is a post-doctoral researcher at the University of Ghent, IMEC, IDLab. He holds a Ph.D. in Computer Science from the University of Ghent. His research focuses on the use of Semantic Web technologies in the IoT, with a specific focus on scalable and distributed reasoning, stream reasoning, and RDF Stream Processing. He is particularly interested in increasing the expressivity of reasoning over highly volatile streams. He has been active in several interdisciplinary projects, in which he was able to apply his research in an industrial setting. He has also presented his research at many international conferences.
Fabiano Spiga is an Italian independent researcher. He received his MSc in Computer Science from the University of Jyväskylä (Finland) and is currently an external Ph.D. student at the University of Tartu (Estonia). His interests range from Big Data to Distributed Systems, the Semantic Web, Process Mining, and Network Science, with a keen interest in the scalability and optimization of dataflow processes.
Emanuele Della Valle is an Associate Professor at the Department of Electronics and Information of the Politecnico di Milano. His research interests cover Big Data, Stream Processing, Semantic Technologies, Data Science, Web Information Retrieval, and Service Oriented Architectures. His work on Stream Reasoning has been applied to analyzing Social Media, Mobile Telecom, and IoT data streams in collaboration with Telecom Italia, IBM, Siemens, Oracle, Indra, and Statoil.