Programme - IIPC
IIPC GENERAL ASSEMBLY
Tuesday, 8 April 2025
*all times in CEST
09:40am
OPENING REMARKS:
Olga Holownia, IIPC & Jon Carlstedt Tønnessen, National Library of Norway
09:50am
CHAIR ADDRESS:
Jeffrey van der Hoeven, National Library of the Netherlands (KB)
10:00am
IIPC STRATEGIC PLAN 2026 - 2030
Jeffrey van der Hoeven, National Library of the Netherlands (KB) - IIPC Chair
Bjarne Andersen, Royal Danish Library - IIPC Treasurer
Olga Holownia, IIPC - Senior Program Officer
10:45am
BREAK
11:15am
FRAMEWORK FOR TOOLS SUSTAINABILITY
This session focuses on the new framework defining IIPC’s role in sustaining web archiving tools for its membership, developed as part of the Strategic Action Plan 2026 and the renewal of the Consortium Agreement.
Discussion leaders:
Ben O'Brien, National Library of New Zealand
Gil Hoggarth, British Library
Youssef Eldakar, Bibliotheca Alexandrina
Kristinn Sigurðsson, National and University Library of Iceland
Jeffrey van der Hoeven, National Library of the Netherlands
Bjarne Andersen, Royal Danish Library
CONTENT DEVELOPMENT WORKING GROUP MEETING
Chaired by:
Former CDG Co-Chair Nicola Bingham, British Library and current CDG Co-Chair Shereen Tay, National Library Board Singapore
AGENDA:
General updates
Collection presentations:
Paris Olympics - Helena Byrne, British Library
Street Art - Ricardo Basílio, Arquivo.pt and Miranda Siler, Columbia University Libraries
War in Ukraine - Anaïs Crinière-Boizet and Vladimir Tybin, National Library of France
Discussions:
Survey introduction and results
Future collections policies
Potential World War II anniversary and Southeast Asia collections
Introduction and handover to new co-chairs Anaïs Crinière-Boizet, National Library of France and Melissa Wertheimer, Library of Congress
12:45pm
LUNCH
2:00pm
RESEARCH WORKING GROUP MEETING
Chaired by RWG Co-Chairs
Ben O’Brien, National Library of New Zealand, Jon Carlstedt Tønnessen, National Library of Norway, and Olga Holownia, IIPC
AGENDA:
Introduction to RWG
Presentations:
The WebData Project
- Magnus Birkenes, National Library of Norway
There and Back Again: the WOD (Whole-of-Domain)’s Tale
- Ben O’Brien, National Library of New Zealand
The Common Crawl’s Datasets
- Sebastian Nagel, Common Crawl Foundation
Wrap-up
TRAINING WORKING GROUP MEETING
Chaired by:
TWG Co-Chair Claire Newing, The National Archives, UK
This 60-minute session focuses on introducing the work and goals of the IIPC Training Working Group to interested newcomers.
3:30pm
BREAK
4:00pm
CRAWLING NATIONAL DOMAIN: TOWARDS BEST PRACTICES
Chaired by:
Sara Aubry, National Library of France
This session explores members' expertise and experiences in performing national domain crawls with Heritrix. Members are asked to 1) share their challenges with domain crawls, 2) discuss possible solutions, and 3) establish best practices with specific tips and tricks for others to try.
Presentations:
Fighting 404s
- Sara Aubry, National Library of France
Annual national domain crawl using AWS
- Gil Hoggarth, British Library
Seedlist: approach emphasis and scope
- Tom Smyth, Library and Archives Canada
Effective handling of byte limits
- Thomas Smedebøl, Royal Danish Library
Use of sitemaps in domain crawls
- Kristinn Sigurðsson, National and University Library of Iceland
Browser-assisted Heritrix
- Alex Dempsey, Internet Archive
Assessing crawl completeness
- Thom Vaughan, Common Crawl Foundation
TWG WORKSHOP:
Case Studies ‘Write-a-thon’ - Documenting Best Practices
Claire Newing, Lauren Baker, Kody Willis
1: The National Archives (UK), United Kingdom; 2: Library of Congress, United States of America; 3: Internet Archive, United States of America
The session will be aimed at any conference attendees who wish to submit a case study on any topic relevant to web archiving. Suggested topics include: a process/workflow which works well; a decision rubric for selection; a method developed for capturing a specific type of content; building specialist search queries; a successful tool used for training. To support the creation of case studies during the session and beyond, participants will be provided a case study template and a sample completed case study for reference.
The overall outcome will be a case study collection open to all members to be launched shortly after the conference with additional case studies being added on an ongoing basis.
7:00pm
WELCOME RECEPTION
For IIPC members only
Watch WAC2025 recordings
Wednesday, 9 April 2025
view abstracts
view slides at UNT Digital Library
09:40am
OPENING REMARKS:
Olga Holownia, IIPC & Jon Carlstedt Tønnessen, National Library of Norway
09:50am
OPENING KEYNOTE: Libraries, Copyright, and Language Models
Javier de la Rosa, National Library of Norway
Chaired by Andrew Jackson, Digital Preservation Coalition
RECORDING
10:45am
BREAK
10:55am
LIGHTNING SESSION #1
Chair: Ben Els, National Library of Luxembourg
Strategies and Challenges in the Preservation of Mexico’s Web Heritage: First Steps
Carolina Silva Bretón
National Library of Mexico, Mexico
RECORDING
SLIDES
Arquivo.pt Toolkit for Web Archiving
Daniel Gomes
Arquivo.pt, Portugal
RECORDING
SLIDES
Tracking the Political Representations of Life: Methodological Challenges of Exploring the BnF Web Archives
Guillaume Levrier (1,2), Dorothée Benhamou-Suesser
1: Centre de recherches politiques de Sciences Po (CEVIPOF, CNRS), France; 2: Bibliothèque nationale de France, France
RECORDING (MOA) | SLIDES
Collaborative Curatorial Approaches of the Czech Web Archive Using the Example of Thematic Literary Collections
Marie Haškovcová
National Library of the Czech Republic, Czech Republic
RECORDING
SLIDES
LIGHTNING SESSION #2
Chair: Sawood Alam, Internet Archive
Modelling Archived Web Objects as Semantic Entities to Manage Contextual and Versioning Issues
Tom Storrar, Manuela Pallotto Strickland
1: The National Archives (UK), United Kingdom; 2: King's College London, United Kingdom
RECORDING
SLIDES
Modernizing Web Archives: The Bumpy Road Towards a General ARC2WARC Conversion Tool
Pedro Ortiz Suarez, Sebastian Nagel, Thom Vaughan
Common Crawl Foundation, United States of America
RECORDING
SLIDES
Poking Around in Podcast Preservation
Jasper Snoeren
Netherlands Institute for Sound and Vision, Netherlands
RECORDING (MOA) | SLIDES
Automatic Clustering of Domains by Industry for Effective Curation
Thomas Smedebøl
Royal Danish Library, Denmark
RECORDING (MOA) | SLIDES
Best Practice of Preserving Posts from Social Media Feeds
Magdalena Sjödahl
Arkiwera wcrify AB, Sweden
RECORDING
SLIDES
11:25am
BREAK
11:55am
PANEL #1:
Engaging Audiences
Chair: Eveline Vlassenroot, University of Ghent
“Beyond Preservation: Engaging Audiences and Researchers with Web Archives”
Eveline Vlassenroot, Peter Mechant, Friedel Geeraert, Christina Vandendyck, Cui Cui (3,4), Beatrice Cannelli, Anders Klindt Myrvoll, Andrea Kocsis
1: University of Ghent, Belgium; 2: KBR - Royal Library of Belgium, Belgium; 3: University of Sheffield, United Kingdom; 4: Bodleian Libraries, United Kingdom; 5: Royal Danish Library, Denmark; 6: National Library of Scotland, United Kingdom
RECORDING (MOA) | SLIDES
SESSION #01:
Tools Under Construction: Lessons Learned
Chair: Katherine Boss, National Library of Norway
Embedding the Web Archive in an Overall Preservation System
Hansueli Locher
Swiss National Library, Switzerland
SLIDES
UKWA Rebuild
Gil Hoggarth
British Library, United Kingdom
RECORDING
SLIDES
Under Construction: Web Archive of the German National Library
Natanael Arndt
German National Library, Germany
RECORDING
SLIDES
WORKSHOP #01:
Exploring Dilemmas in the Archiving of Legacy Webportals: An Exercise in Reflective Questioning
Daniel Steinmeier, Sophie Ham
National Library of the Netherlands, Netherlands
1:00pm
LUNCH
2:05pm
SESSION #02:
Crawling Tools
Chair: László Tóth, National Library of Luxembourg
Lessons Learned Building a Crawler From Scratch: The Development and Implementation of Veidemann
Marius André Elsfjordstrand Beck
National Library of Norway, Norway
RECORDING (MOA) | SLIDES
Experiences of Using in-House Developed Collecting Tool ELK
Lauri Ojanen
National Library of Finland, Finland
RECORDING
SLIDES
Better Together: Building a Scalable Multi-Crawler Web Harvesting Toolkit
Alex Dempsey, Adam Miller, Kyrie Whitsett
Internet Archive, United States of America
RECORDING
SLIDES
Lowering Barriers to Use, Crawling, and Curation: Recent Browsertrix Developments
Tessa Walsh, Ilya Kreymer
Webrecorder, United States of America
RECORDING
SLIDES
SESSION #03:
Advocacy & User Engagement
Chair: Mark Phillips, University of North Texas Libraries
Insufficiency of Human-Centric Ethical Guidelines in the Age of AI: Considering Implications of Making Legacy Web Content Openly Accessible
Gaja Zornada, Boštjan Špetič
Computer History Museum Slovenia (Računališki muzej), Slovenia
RECORDING
SLIDES
Web Archives for Music Research
Andreas Lenander Ægidius
Royal Danish Library, Denmark
RECORDING
SLIDES
IXP History Collection: Recording the Early Development of the Core of the Public Internet
Sharon Healy, Gerard Best, Lara Díaz Martínez
1: Independent Researcher, Ireland; 2: University of Barcelona, Spain
SLIDES
Lost, but Preserved - A Web Archiving Perspective on the Ephemeral Web
Sawood Alam, Mark Graham
Internet Archive, United States of America
RECORDING
SLIDES
WORKSHOP #02:
Web Archive Collections As Data
Gustavo Candela, Abbie Grotke, Olga Holownia, Jon Carlstedt Tønnessen, Chase Dooley, Rachel Trent, Helena Byrne, Emily Maemura
1: University of Alicante, Spain; 2: Library of Congress, United States of America; 3: IIPC, United States of America; 4: National Library of Norway, Norway; 5: British Library, UK; 6: University of Illinois Urbana-Champaign, United States of America
3:40pm
BREAK
4:40pm
POSTER SLAM
Chair: Olga Holownia, IIPC
‘We Are Now Entering the Pre-election Period’: Experimental Twitter Capture at The National Archives
Jake Bickford
The National Archives (UK), United Kingdom
The BnF DataLab Services and Tools for Researchers Working on Web Archives
Sara Aubry, Dorothée Benhamou-Suesser
Bibliothèque nationale de France, France
POSTER
Designing Art Student Web Archives
Katherine Martinez
The New School, United States of America
POSTER
Next Steps Towards A Formal Registry Of Web Archives For Persistent And Sustainable Identification
Eld Zierau
Royal Danish Library, Denmark
POSTER
Using Web Archives to Construct the History of an Academic Field
Tegan Pyke
University of Bergen, Norway
POSTER
Consortium on Electronic Literature (CELL)
Hannah Ackermans
University of Bergen, Norway
POSTER
Arquivo.pt Annual Awards: A Glimpse
Daniel Gomes
Arquivo.pt, Portugal
Arquivo.pt API/Bulk Access and Its Usage
Vasco Rato, Daniel Gomes
Arquivo.pt, Portugal
Failed Capture or Playback Woes? A Case Study in Highly Interactive Web Based Experiences
Mari Allison
Smithsonian Libraries and Archives, United States of America
HAWathon: Participants Experience
Ingeborg Rudomino, Anamarija Ljubek
National and University Library in Zagreb, Croatia
POSTER
Supporting Best Practices for Archiving Social Media by Heritage Institutions in Flanders (and Beyond)
Ellen Van Keer, Katrien Weyns
1: meemoo, Flemish Institute for Archives, Belgium; 2: KADOC at Catholic University of Leuven, Belgium
Planning Web Archiving Within a Four-Year Scope: Making the New Collection Plan for the Years 2025-2028 in the National Library of Finland
Sanna Haukkala
National Library of Finland, Finland
POSTER
Redirects Unraveled: From Lost Links to Rickrolls
Kritika Garg, Sawood Alam, Michele Weigle, Michael Nelson, Mark Graham, Dietrich Ayala
1: Old Dominion University, United States of America; 2: Internet Archive, United States of America; 3: Filecoin Foundation, Netherlands
Use of Screenshots as a Harvesting Tool for Dynamic Content and Use of AI for Later Data Analysis
Gaja Zornada, Boštjan Špetič
Computer History Museum Slovenia (Računališki muzej), Slovenia
POSTER
Asynchronous and Modular Pipelines for Fast WARC Annotation
Pedro Ortiz Suarez, Thom Vaughan
Common Crawl Foundation, United States of America
POSTER
Politely Downloading Millions of WARC Files Without Burning the Servers Down
Pedro Ortiz Suarez, Thom Vaughan, Greg Lindahl
Common Crawl Foundation, United States of America
POSTER
Robots.txt and Crawler Politeness in the Age of Generative AI
Sebastian Nagel, Thom Vaughan
Common Crawl Foundation, United States of America
POSTER
Experiences Switching an Archiving Web Crawler to Support HTTP/2
Sebastian Nagel
Common Crawl Foundation, United States of America
4:40pm
POSTER SESSION
7:00pm
DINNER
Pre-registration required for this event.
Thursday, 10 April 2025
view abstracts
view slides at UNT Digital Library
09:00am
MORNING COFFEE
09:20am
LIGHTNING SESSION #3
Chair: Helena Byrne, British Library
The Practice of Web Archiving Statistics and Quality Evaluation Based on the Localization of ISO/TR 14873:2013(E): A Case Study of the NSL-WebArchive Platform
Zhenxin Wu, Jiali Zhu (2,3), Jiying Hu
1: National Science Library, Chinese Academy of Sciences, China; 2: Zhejiang Economic & Information Center, China; 3: Zhejiang Economic & Information Development Co., Ltd, China
RECORDING
SLIDES
Modifying ePADD for Entity Extraction in Non-English Languages
Pierre Beauguitte, Tita Enstad
National Library of Norway, Norway
RECORDING
SLIDES
Arquivo.pt Query Logs
Pedro Gomes, Daniel Gomes
Arquivo.pt, Portugal
RECORDING
SLIDES
What You See No One Saw
Mat Kelly, Alex H. Poole, Michele Weigle, Michael Nelson, Travis Reid, Christopher B. Rauch, Hyung Wook Choi
1: Drexel University, United States of America; 2: Old Dominion University, United States of America
RECORDING
SLIDES
LIGHTNING SESSION #4
Chair: Dorothée Benhamou-Suesser, National Library of France
Collaborative Collections at Arquivo.pt: Four Years of Recordings from the City of Sines (Portugal)
Ricardo Basílio
Arquivo.pt, Portugal
RECORDING
SLIDES
Participatory Web Archiving: The Tensions Between the Instrumental Benefits and Democratic Value
Cui Cui (1,3), Stephen Pinfield, Andrew Cox, Frank Hopfgartner
1: University of Sheffield, United Kingdom; 2: Institute for Web Science and Technologies (WeST), Germany; 3: Bodleian Libraries, United Kingdom
RECORDING (MOA) | SLIDES
A Minimal Computing Approach for Web Archive Research
Alan Colin-Arce, Rosario Rogel-Salazar
1: University of Victoria, Canada; 2: Universidad Autónoma del Estado de México, Mexico
RECORDING
SLIDES
Where Fashion Meets Science: Collecting and Curating a Creative Web Archive
Elisabeth Thurlow
University of the Arts London, United Kingdom
RECORDING
SLIDES
09:55am
BREAK
10:05am
SESSION #04:
Discovery & Access (News/Newspapers)
Chair: Tita Enstad, National Library of Norway
Unlocking the Archive: Open Access to News Content as Corpora
Jon Carlstedt Tønnessen, Magnus Breder Birkenes
National Library of Norway, Norway
RECORDING | SLIDES
Recently Orphaned Newspapers: From Archived Webpages to Reusable Datasets and Research Outlooks
Tyng-Ruey Chuang, Chia-Hsun Wang, Hung-Yen Wu (1,2)
1: Academia Sinica, Taiwan; 2: National Yang Ming Chiao Tung University, Taiwan
RECORDING
SLIDES
NewsWARC: Analyzing News Over Time in the Web Archive
Amr Emara, Khaled Ezz, Shaden Hazem, Youssef Eldakar
1: Bibliotheca Alexandrina, Egypt; 2: Alamein International University, Egypt
RECORDING
SLIDES
Zombie E-Journals and the National Library of Spain
José Carlos Cerdán Medina
Biblioteca Nacional de España, Spain
RECORDING
SLIDES
SESSION #05:
Sustainability
Chair: Bjarne Andersen, Royal Danish Library
42 Tips to Diminish the CO2 Impact of Websites
Tamara van Zwol, Lotte Wijsman, Jasper Snoeren, Tineke van Heijst
1: National Archives of the Netherlands, Netherlands; 2: Dutch Digital Heritage Network, Netherlands; 3: Netherlands Institute for Sound and Vision, Netherlands; 4: Van Heijst Information Consulting, Netherlands
RECORDING (MOA) | SLIDES
Building Towards Environmentally Sustainable Web Archiving: The UK Government Web Archive and Beyond
Jane Winters, Eirini Goudarouli, Jake Bickford
1: University of London, United Kingdom; 2: The National Archives (UK), United Kingdom
RECORDING
SLIDES
Preservation of Historical Data: Using Warchaeology to Process 20 Years of Harvesting
Andreas Børsheim, Marius André Elsfjordstrand Beck
National Library of Norway, Norway
RECORDING (MOA) | SLIDES
Analysing the Publications Office of the European Union Web Archive for the Rationalisation of Digital Content Generation
Alexandre Angers
Publications Office of the European Union, Luxembourg
RECORDING
SLIDES
WORKSHOP #03:
Introduction to Web Graphs
Sebastian Nagel, Pedro Ortiz Suarez, Thom Vaughan, Greg Lindahl
Common Crawl Foundation, United States of America
11:15am
BREAK
11:45am
PANEL #2:
Cross-Institutional Collaborations
Chair: Abbie Grotke, Library of Congress
“Past, Present & Future of Cross-Institutional Collaboration in Web Archiving: Insights from the Norwegian and Danish Web Archive, the NetArchiveSuite Community, & Beyond”
Anders Klindt Myrvoll, Thomas Langvann, Sara Aubry, José Carlos Cerdán Medina, Niels Ørbæk Chemnitz, Colin Samuel Rosenthal, Abbie Grotke
1: Royal Danish Library, Denmark; 2: National Library of Norway, Norway; 3: Bibliothèque nationale de France, France; 4: Biblioteca Nacional de España, Spain; 5: Analysis & Numbers, Denmark; 6: Library of Congress, United States of America
RECORDING (MOA) | SLIDES
SESSION #06:
Curating Social Media
Chair: Tom Smyth, Library and Archives Canada
Developing Social Media Archiving Guidelines at the National Archives of the Netherlands
Lotte Wijsman, Geert Leloup, Susanne Van den Eijkel, Sander Wellens
National Archives of the Netherlands, Netherlands
RECORDING
SLIDES
Archiving the Social Media Profiles of Members of Government
Ben Els
National Library of Luxembourg, Luxembourg
RECORDING
SLIDES
From Posts to Archives: The National Library of Singapore’s Journey in Collecting Social Media
Shereen Tay, Meiyu Lee
National Library Board Singapore, Singapore
RECORDING
SLIDES
Innovative Web Archiving Amid Crisis: Leveraging Browsertrix and Hybrid Working Models to Capture the UK General Election 2024
Nicola Bingham, Jennie Grimshaw
British Library, United Kingdom
RECORDING
SLIDES
WORKSHOP #04:
How to Develop a New Browsertrix Behavior
Ilya Kreymer, Tessa Walsh
Webrecorder, United States of America
1:15pm
LUNCH
2:15pm
SESSION #07:
Research & Access
Chair: Marie Roald, National Library of Norway
From Pages to People: Tailoring Web Archives for Different Use Cases
Andrea Kocsis, Leontien Talboom
1: Cambridge University Libraries, United Kingdom; 2: National Library of Scotland, United Kingdom
RECORDING
SLIDES
Making Research Data Published to the Web FAIR
Bryony Hooper, Ric Campbell
University of Sheffield, United Kingdom
RECORDING
SLIDES
Enhancing Accessibility to Belgian Born-Digital Heritage: The BelgicaWeb Project
Christina Vandendyck
Royal Library of Belgium (KBR), Belgium
RECORDING
SLIDES
Using Generative AI to Interrogate the UK Government Web Archive
Chris Royds, Tom Storrar
The National Archives (UK), United Kingdom
RECORDING (MOA) | SLIDES
SESSION #08:
Handling What You Captured
Chair: Meghan Lyon, Library of Congress
So You’ve Got a WACZ: How Archives Become Verifiable Evidence
Basile Simon, Lindsay Walker
Starling Lab for Data Integrity, Stanford-USC, United States of America
RECORDING
SLIDES
Warc-Safe: An Open-Source WARC Virus Checker and NSFW (Not-Safe-For-Work) Content Detection Tool
László Tóth
National Library of Luxembourg, Luxembourg
RECORDING (MOA) | SLIDES
Detecting and Diagnosing Errors in Replaying Archived Web Pages
Jingyuan Zhu, Huanchen Sun, Harsha Madhyastha
1: University of Michigan, United States of America; 2: University of Southern California, United States of America
RECORDING
SLIDES
Building a Toolchain for Screen Recording-Based Web Archiving of SVOD Platforms
Alexis Di Lisi
Institut national de l'audiovisuel (INA), France
RECORDING
SLIDES
PANEL #3:
Cross-Institutional Collaboration: the End of Term Archive
Chair: Jeffrey van der Hoeven, National Library of the Netherlands (KB)
“Coordinating, Capturing, and Curating the 2024 United States End of Term Web Archive”
Mark Phillips, Sawood Alam, James Jacobs, Ilya Kreymer
1: University of North Texas, United States of America; 2: Internet Archive, United States of America; 3: Stanford University, United States of America; 4: Webrecorder, United States of America
SLIDES
3:40pm
BREAK
4:10pm
CLOSING KEYNOTE: Quantifying Complexity: Using Web Data to Decode Online Public Debate
Håvard Lundberg and Ida Haugen-Poljac, Analysis & Numbers
Chaired by Jon Carlstedt Tønnessen, National Library of Norway
RECORDING
5:05pm
CLOSING REMARKS