Archiving the Web @EBRPL: Creating and following a web collecting policy in a public library

June 19th, 2018

By Emily Ward, Digital Archivist, East Baton Rouge Parish Library

During the summer of 2016, Baton Rouge witnessed the shooting of Alton Sterling, the mass shooting of Baton Rouge law enforcement, and the Great Flood of 2016. While watching these events unfold from our smartphones and computers, we in Special Collections at the East Baton Rouge Parish Library realized this information might be in jeopardy of never being acquired and preserved due to a shift in the way information, ephemeral or not, is being created and disseminated. We simply had not begun to deal with born-digital content, but we knew we needed to start, and quickly; this is when our relationship with Archive-It began. I learned a lot from our work on these three collections. Because of the nature of these events, seed selection and capture happened quickly, and very little attention was given to rights and scoping parameters. I knew, however, that this approach was neither sustainable nor compatible with the type of collecting we hoped to do in the future. It was time to create a collection development policy that addressed what we intended to capture and preserve, how we would handle copyright, and guidelines for appraisal.

Screenshot of YouTube capture from South Louisiana Flood of 2016 Collection

Amateur drone footage collected for the South Louisiana Flood of 2016 Collection.

It was important that we create our policy in a way that maintained the integrity and objective of our mission. This meant defining a scope that reaches beyond our institution, yet remains within the borders of the parish, and which addresses issues with ownership of content created by individuals and organizations outside of the library system. During my research, I wasn’t able to find a single public library that had created a collection development policy for its web archiving program, probably because so few public libraries were archiving the web at the time (enter Community Webs). However, we eventually found an example of a collection development policy that had a mission similar to our own: we decided to adapt portions of the Bentley Historical Library Collection Development Policy for Web Archives as our model, in conjunction with the East Baton Rouge Parish Library Collection Development Policy. In particular, we learned from the section listing the topical areas around which the Bentley bases its collecting practices.

I liked the idea of collecting topically for a few reasons. First, Special Collections already maintains an extensive and topically arranged vertical file. These topics can be repurposed for our web archives fairly easily. By basing our collecting around a topical model that is already well established, our web collections can potentially act as a digital companion to our physical vertical file, opening up an entirely new avenue of research for our patrons. Second, the purpose of our vertical file is to collect ephemera and other loose materials that generally come to us as single items, unrelated and unattached to any other items in our collection. When I was selecting seeds for the first collections, I noticed that many of the sites were essentially the digital surrogate, or rather, the natural successor to the physical materials you would find in our vertical file. This might be a single news article (i.e. newspaper clipping), a website that advertises a type of service or business (i.e. brochures and flyers), or even a video clip you can find on YouTube (i.e. CD or DVD). I would argue that we should expect to see less of the physical materials and more of the digital in the coming years. By practicing this type of collecting now, we will be prepared to meet the challenge of curating our digital vertical file as this shift takes place.

The first iteration of the EBRPL Collections Development Policy for Web Archives was modeled after the Intellectual Property Rights section of the Bentley Historical Library policy. However, this proved to be impractical for the time and staff we could allot to these activities. Special Collections’ policy stated that EBRPL would gain permission from content creators before crawling their sites. Waiting for permission increases the risk that sites will be updated or taken down, and it was impossible to expect the one archivist conducting all web archiving activities to also contact and interact with potentially hundreds of creators. The Section 108 Study Group recommends collectors give content creators the right to opt out of having their content collected or made publicly available. Special Collections concluded it was not feasible to work with each creator individually; however, it was important to work one-on-one with those creators who might establish sizable collections. These collections could be only web-based or a hybrid of digital and non-digital materials.

We had established in our first in-person Community Webs cohort meeting at the Internet Archive last November that in order for a community archive to be truly deserving of the designation, the community had to be consulted in the process of creating the archive. My attempts at working directly with the community have had mixed results but nonetheless might inform the next public library performing this kind of outreach. So far, I’ve worked with two local artists and two creative small businesses that rely heavily on the web to distribute their products. The first individual, Dylan Krieger, is a local poet who will have published three books in the last two years, one of which was reviewed by, and received high praise from, the New York Times Book Review. The second, Osa Atoe, is a local potter, musician, and blogger who is heavily involved in our local maker movement and whose ceramic style is well recognized in Southern Louisiana. The process was simple, albeit time-consuming: I contacted the individuals and organizations, told them my intentions, asked them to send along links to sites they thought should be included, and performed the crawls.

Photo of decorative wall hanging by Osa Atoe

Decorative wall hanging by Osa Atoe on display at Baton Rouge’s Firehouse Gallery in May 2018. Photo by Raegan O. Labat.

Dylan sent a link to her website, where she had maintained a list of all the places she had published her poems and short stories. What I thought was going to be a few links to a business website and some social media pages has turned into roughly 30 crawls and counting of e-publications that have published individual poems, as well as YouTube videos of her reading her work at poetry readings or book signings. Both artists sent along links to press that reviewed their work or interviewed them. These seeds were crawled because there was no way to predict how long the e-publications, many of which would be considered small, paperless, and perhaps financially volatile, would stay in business, and no guarantee that they were properly archiving the content they publish even if they do. Special Collections weighed the risks of not capturing these sites and decided it was safer to preserve the sites, as they highlight an important part of Dylan’s poetry career.

Through email communications with Osa, Special Collections learned that along with pottery, Osa was also heavily involved in the punk music scene in New Orleans when she lived there and maintained a blog that focused on historical female punk musicians of color. Like Dylan, Osa had informed Special Collections that her business site maintained links to sites that she had contributed to. She also sent info for her blog and a site called Bandcamp where she and her bands had posted their original music. I had no previous knowledge of Osa’s artistic endeavors beyond her ceramics work, for which she is so well known in Baton Rouge. I had to ask myself if it was pertinent to her web collection to collect content that fell beyond the borders of Baton Rouge. If I was only interested in her ceramic art, I might have crawled her website and social media and called it a day. But I realized that if Osa didn’t think those other sites were relevant, she probably wouldn’t have sent them. Special Collections came to the conclusion that the additional content was important to understanding Osa’s current art endeavors and lent context to who Osa is as a person and artist in the city of Baton Rouge.

Two takeaways from these outreach experiences:

The individuals I worked with tended to be more excited about and responsive to my request. I’m not sure why, but I’d argue this was much more personal for them than for the business owners; therefore, they were much more invested in seeing it to completion.

Working on a one-on-one basis is a great way to advertise your web archive. Once Special Collections has completed all the crawls and described them to our satisfaction, we plan to send the creators bibliographies with links to their archived sites, with the suggestion that they use them for their own personal digital archives.

So far, my overarching takeaways for new web archivists are:

Create a collection development policy! It makes you think critically about what should be collected and sets boundaries that will assist you in the decision-making process.

Subscribe to a local news resource (and keep up with it). This advice was offered to me on my first day as an archivist, and I’ve found it to be most helpful in directing my collecting practices for our web archive.

As librarians, we need to take the time to educate the public on how to use archived websites for their research. Once they understand what archived websites are and have used them for their research, they’ll be more likely to participate in the collection process.

And finally, web archiving is a daunting task, but by taking the time to create policies and guidelines, consulting with others who have experience, and simply practicing yourself, you’ll get the hang of it.