This book provides insights into smart ways of analyzing computer log data, with the goal of spotting adversarial actions. It is organized into three major parts with a total of eight chapters that include a detailed view of existing solutions, as well as novel techniques that go far beyond the state of the art. The first part motivates the entire topic and highlights major challenges, trends and design criteria for log data analysis approaches, and further surveys and compares the state of the art. The second part introduces concepts that apply character-based, rather than token-based, approaches and thus work on a more fine-grained level. Furthermore, these solutions were designed for “online use”: they are suited not only for forensic analysis, but also process new log lines as they arrive in an efficient single-pass manner. An advanced method for time series analysis aims at detecting changes in the overall behavior profile of an observed system and at spotting trends and periodicities through log analysis. The third part introduces the design of the AMiner, an advanced open source component for log data anomaly mining. The AMiner comes with several detectors to spot new events, new parameters, new correlations, new values and unknown value combinations, and can run as a stand-alone solution or as a sensor connected to a SIEM solution. More advanced detectors help to determine the characteristics of variable parts of log lines, specifically the properties of numerical and categorical fields.
Detailed examples throughout this book allow the reader to better understand and apply the introduced techniques with open source software. Step-by-step instructions help the reader get familiar with the concepts and better comprehend their inner mechanisms. A log test data set is available as a free download and enables the reader to get the system up and running in no time.
This book is designed for researchers working in the field of cyber security, and specifically system monitoring, anomaly detection and intrusion detection. The content of this book will be particularly useful for advanced-level students studying computer science, computer technology, and information systems. Forward-thinking practitioners, who would benefit from becoming familiar with the advanced anomaly detection methods, will also be interested in this book.
Table of contents
1 Introduction.- 1.1 State of the art in security monitoring and anomaly detection.- 1.2 Current trends.- 1.3 Future challenges.- 1.4 Log data analysis: today and tomorrow.- 1.5 Smart log data analytics: Structure of the book.- 1.6 Try it out: Hands-on examples throughout the book.
2 Survey on log clustering approaches.- 2.1 Introduction.- 2.2 Survey background.- 2.2.1 The nature of log data.- 2.2.2 Static clustering.- 2.2.3 Dynamic clustering.- 2.2.4 Applications in the security domain.- 2.3 Survey method.- 2.3.1 Set of criteria.- 2.3.2 Literature search.- 2.4 Survey results.- 2.4.1 Purpose and applicability (P).- 2.4.2 Clustering techniques (C).- 2.4.3 Anomaly detection (AD).- 2.4.4 Evaluation (E).- 2.4.5 Discussion.- 2.5 Conclusion.
3 Incremental log data clustering for processing large amounts of data online.- 3.1 Introduction.- 3.2 Concept for incremental clustering.- 3.2.1 Incremental clustering.- 3.2.2 Description of model.- 3.2.3 String metrics.- 3.2.4 Description of model M 1.- 3.2.5 Time series analysis.- 3.3 Outlook and further development.- 3.4 Try it out.- 3.4.1 Exim Mainlog.- 3.4.2 Messages log file.
4 Generating character-based templates for log data.- 4.1 Introduction.- 4.2 Concept for generating character-based templates.- 4.3 Cluster template generator algorithms.- 4.3.1 Initial matching.- 4.3.2 Merge algorithm.- 4.3.3 Length algorithm.- 4.3.4 Equalmerge algorithm.- 4.3.5 Token_char algorithm.- 4.3.6 Comparison.- 4.4 Outlook and further development.- 4.5 Try it out.- 4.5.1 Exim Mainlog.
5 Time series analysis for temporal anomaly detection.- 5.1 Introduction.- 5.2 Concept for dynamic clustering and AD.- 5.3 Cluster evolution.- 5.3.1 Clustering model.- 5.3.2 Tracking.- 5.3.3 Transitions.- 5.3.4 Evolution metrics.- 5.4 Time series analysis.- 5.4.1 Model.- 5.4.2 Forecast.- 5.4.3 Correlation.- 5.4.4 Detection.- 5.5 Example.- 5.5.1 Long-term analysis of Suricata logs.- 5.5.2 Short-term analysis of Audit logs.
6 AECID: A light-weight log analysis approach for online anomaly detection.- 6.1 Introduction.- 6.2 The AECID approach.- 6.2.1 AMiner.- 6.2.2 AECID central.- 6.2.3 Detecting anomalies.- 6.2.4 Rule generator.- 6.2.5 Correlation engine.- 6.2.6 Detectable anomalies.- 6.3 System deployment and operation.- 6.4 Application scenarios.- 6.5 Try it out.- 6.5.1 Configuration of the AMiner for AIT-LDSv1.- 6.5.2 Apache Access logs.- 6.5.3 Exim Mainlog file.- 6.5.4 Audit logs.
7 A concept for a tree-based log parser generator.- 7.1 Introduction.- 7.2 Tree-based parser concept.- 7.3 AECID-PG: tree-based log parser generator.- 7.3.1 Challenges when generating tree-like parsers.- 7.3.2 AECID-PG concept.- 7.3.3 AECID-PG rules.- 7.3.4 Features.- 7.4 Outlook and further application.- 7.5 Try it out.- 7.5.1 Exim Mainlog.- 7.5.2 Audit logs.
8 Variable type detector for statistical analysis of log tokens.- 8.1 Introduction.- 8.2 Variable type detector concept.- 8.3 Variable type detector algorithm.- 8.3.1 Sanitize log data.- 8.3.2 Initialize types.- 8.3.3 Update types.- 8.3.4 Compute indicators.- 8.3.5 Select tokens.- 8.3.6 Compute indicator weights.- 8.3.7 Report anomalies.- 8.4 Try it out.- 8.4.1 Apache Access log.
9 Final remarks.
About the authors
Florian Skopik is Head of the Cyber Security Research Program at the Austrian Institute of Technology (AIT) with a team comprising around 30 people. He has spent more than 10 years in cyber security research and, before that and partly in parallel, another 15 years in software development. Nowadays, he coordinates national and large-scale international research projects, as well as the overall research direction of the team. His main interests are centered on critical infrastructure protection, smart grid security and national cyber security and defense. Since 2018, Florian has also worked as an ISO 27001 Lead Auditor. Before joining AIT in 2011, Florian was with the Distributed Systems Group at the Vienna University of Technology as a research assistant and post-doctoral research scientist from 2007 to 2011, where he was involved in a number of international research projects dealing with cross-organizational collaboration over the Web. In the context of these projects, he also completed his PhD studies. Florian further spent a sabbatical of several months at IBM Research India in Bangalore. He has published more than 125 scientific conference papers and journal articles and holds more than 50 industry-recognized security certifications, including CISSP, CISM, CISA, CRISC, and CCNP Security. In 2017 he completed a professional degree in Advanced Computer Security at Stanford University, USA. Florian is a member of various conference program committees, editorial boards and standardization groups, such as ETSI TC Cyber and OASIS CTI. He frequently serves as a reviewer for numerous high-profile journals, including Elsevier’s Computers & Security. He is a registered subject matter expert of ENISA in the areas of new ICTs and emerging application areas as well as Critical Information Infrastructure Protection (CIIP) and CSIRTs cooperation.
In his career, he has given several keynote speeches, organized scientific panel discussions at flagship conferences, such as a smart grid security panel at the IEEE Innovative Smart Grid Technologies (ISGT) conference in Washington D.C., and acted as co-moderator of the National Austrian Cyber Security Challenge 2017 and as a jury member of the United Nations Cyber Security Challenge 2019. Florian is an IEEE Senior Member, a Senior Member of the Association for Computing Machinery (ACM), and a Member of (ISC)2, ISACA and the International Society of Automation (ISA).
Markus Wurzenberger is a scientist and project manager at the Austrian Institute of Technology (AIT), located in Vienna, Austria. His main research interests are log data analysis with a focus on anomaly detection and cyber threat intelligence (CTI). This includes the development of (i) novel machine learning methods that allow online processing of large amounts of log data to enable attack detection in real time, and (ii) artificial intelligence (AI) methods and concepts for extracting threat information from anomalies to automatically generate actionable and shareable CTI. Besides his involvement in several national and international research projects, Markus is one of the key researchers working on AIT’s anomaly detection project AECID (Automatic Event Correlation for Incident Detection). Among the most prominent solutions developed within this project is AMiner, created by Markus and his team: a software component for log analysis that implements several anomaly detection algorithms and is included as a package in the official Debian distribution. In 2016, Markus enrolled in PhD studies in computer science at the Vienna University of Technology, with a focus on anomaly detection in computer log data. The subject of his PhD aligns with several national and international research projects AIT is involved in. In 2015 Markus obtained his Master’s Degree in Technical Mathematics at the Vienna University of Technology. Since 2014 he has been a full-time researcher at AIT in the area of cyber security.
Max Landauer finished his Bachelor’s Degree in Business Informatics at the Vienna University of Technology in 2016. In 2017, he joined the Austrian Institute of Technology (AIT), where he carried out his Master’s Thesis on clustering and time-series analysis of system log data. He started his PhD studies as a cooperative project between the Vienna University of Technology and the Austrian Institute of Technology in 2018. For his dissertation, Max is working on an automatic threat intelligence mining approach that extracts actionable CTI from raw log data. The goal of this research is to transform threat information shared by different organizations into abstract alert patterns that allow detection and classification of similar attacks. Moreover, Max is a maintainer of the logdata-anomaly-miner (AMiner), an open-source agent for parsing and analyzing all kinds of system logs that is developed at AIT and available in the Debian distribution. He also contributes to multiple other tools that are part of AECID (Automatic Event Correlation for Incident Detection), a framework for efficient and scalable log data analysis techniques such as parser generation and log clustering. Max has multiple years of experience with nationally and internationally funded projects in numerous areas, including machine learning, artificial intelligence, cyber-physical systems, and digital service chains. He is currently employed as a Junior Scientist in the Center for Digital Safety and Security at the Austrian Institute of Technology. His main research interests are log data analysis, anomaly detection, and cyber threat intelligence.