This book provides practical information about web archives, offers inspiring examples for web archivists, raises new challenges, and shares recent research results about access methods to explore information from the past preserved by web archives.
The book is structured in six parts. Part 1 advocates for the importance of web archives to preserve our collective memory in the digital era, demonstrates the problem of web ephemera and shows how web archiving activities have been trying to address this challenge. Part 2 then focuses on different strategies for selecting web content to be preserved and on the media types that different web archives host. It provides an overview of efforts to address the preservation of web content as well as smaller-scale but high-quality collections of social media or audiovisual content. Next, Part 3 presents examples of initiatives to improve access to archived web information and provides an overview of access mechanisms for web archives designed to be used by humans or automatically accessed by machines. Part 4 presents research use cases for web archives. It also discusses how to engage more researchers in exploiting web archives and provides inspiring research studies performed using the exploration of web archives. Subsequently, Part 5 demonstrates that web archives should become crucial infrastructures for modern connected societies. It makes the case for developing web archives as research infrastructures and presents several inspiring examples of added-value services built on web archives. Lastly, Part 6 reflects on the evolution of the web and the sustainability of web archiving activities. It debates the requirements and challenges for web archives if they are to assume the responsibility of being societal infrastructures that enable the preservation of memory.
This book targets academics and advanced professionals in a broad range of research areas such as digital humanities, social sciences, history, media studies and information or computer science. It also aims to fill the need for a scholarly overview to support lecturers who would like to introduce web archiving into their courses by offering an initial reference for students.
Table of Contents
Part I: The Era of Information Abundance and Memory Scarcity.- The Problem of Web Ephemera.- Web Archives Preserve Our Digital Collective Memory.- Part II: Collecting Before It Vanishes.- National Web Archiving in Australia: Representing the Comprehensive.- Web Archiving in Singapore: The Realities of National Web Archiving.- Archiving Social Media: The Case of Twitter.- Part III: Access Methods to Analyse the Past Web.- Full-Text and URL Search Over Web Archives.- A Holistic View on Web Archives.- Interoperability for Accessing Versions of Web Resources with the Memento Protocol.- Linking Twitter Archives with Television Archives.- Image Analytics in Web Archives.- Part IV: Researching the Past Web.- Digital Archaeology in the Web of Links: Reconstructing a Late-1990s Web Sphere.- Quantitative Approaches to the Danish Web Archive.- Critical Web Archive Research.- Exploring Online Diasporas: London’s French and Latin American Communities in the UK Web Archive.- Platform and App Histories: Assessing Source Availability in Web Archives and App Repositories.- Part V: Web Archives as Infrastructures to Develop Innovative Services.- The Need for Research Infrastructures for the Study of Web Archives.- Automatic Generation of Timelines for Past-Web Events.- Political Opinions on the Past Web.- Oldweb.today: Browsing the Past Web with Browsers from the Past.- Big Data Science Over the Past Web.- Part VI: A Look into the Future.- The Past Web: A Look into the Future.
About the Authors
Daniel Gomes is the leader of Arquivo.pt – the Portuguese web-archive at the Foundation for Science and Technology (a Portuguese Government Institution). He started Arquivo.pt as an academic project during his PhD, completed in 2007, and led it to become the research infrastructure he currently manages. During this process, he led several dissemination and communication activities complementary to the technological development and operation of the research infrastructure. From 2012 to 2015, he managed in parallel the web development team of the Foundation for National Scientific Computing. His research interests include user experience, digital preservation, information retrieval and web archiving.
Elena Demidova is the leader of the “Data Science and Intelligent Systems” group at the University of Bonn and a member of the L3S Research Center at the Leibniz University of Hannover, Germany. In the past, she worked as a research group leader at the L3S Research Center and as a Senior Research Fellow at the Web and Internet Science Group at the University of Southampton, UK. Elena had leading roles in several large-scale EU-funded and national projects, most recently including coordination of Cleopatra – a Marie Skłodowska-Curie ITN. Her main research interests are in data analytics, mobility, multilingual data, Open Data, the Web and Semantic Web.
Jane Winters is Chair of Digital Humanities at the School of Advanced Study, University of London. She is a Fellow of the Royal Historical Society, and a member of RESAW (Research Infrastructure for the Study of the Archived Web) and the Advisory Boards of the European Holocaust Research Infrastructure and the Living with Machines project. Her research interests include digital history, web archives, big data for humanities research, peer review in the digital environment, text editing, and open access publishing, and she has led or co-directed a range of digital projects in these areas.
Thomas Risse is head of Electronic Services at the University Library J. C. Senckenberg of the Goethe University Frankfurt. Before joining the Library he was deputy managing director and research group leader at the L3S Research Center, Hannover, Germany. He was coordinator or technical director of several European projects in the area of digital libraries and web archives. Thomas is a member of the Steering Committee of the IEEE International Conference on Data Engineering and of the International Conference on Theory and Practice of Digital Libraries.