This book on statistical disclosure control presents the theory, applications and software implementation of the traditional approach to (micro)data anonymization, including data perturbation methods, disclosure risk, data utility, information loss and methods for simulating synthetic data. Introducing readers to the R packages sdcMicro and simPop, the book also features numerous examples and exercises with solutions, as well as case studies with real-world data, accompanied by the underlying R code to allow readers to reproduce all results.
The demand for and volume of data from surveys, registers or other sources containing sensitive information on persons or enterprises have increased significantly over the last several years. At the same time, privacy protection principles and regulations have imposed restrictions on the access and use of individual data. Proper and secure microdata dissemination calls for the application of statistical disclosure control methods to the data before release.
This book is intended for practitioners at statistical agencies and other national and international organizations that deal with confidential data. It will also be of interest to researchers working in statistical disclosure control and the health sciences.
Table of contents
Preface.- Software.- Basic Concepts.- Disclosure Risk.- Methods for Data Perturbation.- Data Utility and Information Loss.- Synthetic Data.- Practical Guidelines.- Case Studies.- Solutions.- Index.
About the author
Matthias Templ works as a university lecturer and researcher at the Zurich University of Applied Sciences in Winterthur, Switzerland, and teaches at the Vienna University of Technology in Austria. He is one of the founders of data-analysis OG and holds a PhD in Technical Mathematics and the Venia Legendi (Habilitation) in Statistics. His main research interests include computational statistics and survey methodology. He has been actively involved in many international projects on statistical disclosure control with international organizations (the World Bank, UNIDO, OECD, Eurostat) and has published more than 44 papers in indexed scientific journals in the past few years. He maintains the CRAN Task View on Official Statistics and Survey Methodology and is the author of several R packages for official statistics, such as sdcMicro for statistical disclosure control, VIM for the visualization and imputation of missing values, and robCompositions for the robust analysis of compositional data.