Web archiving partners meet in Cambridge

September 8th, 2022

by the Archive-It team

Archive-It partners and Internet Archive staff met in person for their first annual review of web archiving progress and plans since 2019, immediately before the Society of American Archivists’ (SAA) annual conference in Boston.

Approximately 40 participants congregated at the Massachusetts Institute of Technology’s Hayden Library in Cambridge for a series of technology and community program updates, followed by in-depth conversation and feedback about the topics that will shape Archive-It software development in the year to come. Small focus groups homed in on upcoming developments for Archive-It’s public access site, models for collection management, and efficiencies for quality assurance.

Senior Web Archivist Sylvie Rollason-Cass introduces a focus group exercise to attendees at the annual Archive-It Partner Meeting in Cambridge.

Save the date: In case you missed us in Cambridge, mark your calendars for a virtual partner meeting to cover these and new topics on Wednesday, November 2, 2022. Read the Call for Proposals here for more details.

Jefferson Bailey, Director of Archiving & Data Services, opened the program with an overview of recent and planned developments in the Internet Archive’s partner services, including Archive-It. Highlighted upgrades to Archive-It included the launch of live chat support, new features for moving and sharing web archive seeds, and a beta testing phase for sharing across Archive-It accounts, which supports more collaborative collecting among partnering institutions.

A slide about Archive-It from Jefferson Bailey’s presentation about Internet Archive service and community developments.

Watch the recording and download the slides anytime for these updates and others on the Internet Archive’s web data analysis platform, digital preservation service, scholarly research corpus, and web archiving community programs.

Archive-It partners then shared their expectations and desires for a revitalized public access site in small focus groups. While a great deal of web archiving technology has changed and improved in the last ten years, this site has remained largely the same.

Timeline of select Archive-It feature developments, including public website designs.

Rather than reinvent the wheel, participants focused on targeted improvements. In particular, they identified the presentation of descriptive metadata and full-text search results as key factors in opening web archives to audiences beyond the current and prospective collecting partners whom the site is understood to serve most today.

Likewise, partners were interested in expanding the presentation and use of administrative metadata to express the relationships among different documents, seeds, and seed groups within their collections. Internet Archive staff brainstormed how these various and often hierarchical relationships could look different in the Archive-It web application in order to better align with partners’ ideal workflows and with their web archives’ records in other systems or finding aids.

A visualization of Archive-It entities and their relationships, as they are represented in curation, replay, and storage.

When it comes to workflows, few activities demand more time and labor than quality assurance (QA). Archive-It partners and staff compared their processes in an effort to find the most meaningful places to intervene with new automation. Recurring suggestions across focus groups included improvements to existing tools, like crawl reports and summary emails, as well as ambitious new ideas for visualizing the volume and character of collection contents as they change over time.

A digital pinboard documenting the examples and suggestions shared in one partner focus group on quality assurance.

We were thrilled to have these important discussions live and in person again, and we are especially thankful to Archive-It partner Joe Carrano and the team at MIT for hosting us. Thanks to all who attended and those who contributed from afar. We look forward to much more of both very soon, and to reporting back on our progress from this meeting. Visit our events page to find an upcoming opportunity to join us.