URLs Aren’t Archives ¯\_(ツ)_/¯, and Other Stories

I have not spent much time in recent months following the travails of media organizations such as Gawker and the Gothamist other than to casually peruse tweets on my timeline. A retweet caught my eye the other day, and here we are. Today’s post is mainly in response to “Digital Media and the Case of the Missing Archives,” written by Danielle Tcholakian, who in turn seems to have been inspired by an article in the Columbia Journalism Review.

Tcholakian’s article provoked a strong reaction from me – sharp, keen frustration. I found the assumptions made by the author to be frustrating. The lack of input in the piece by any institutional archivist, records manager or content management administrator was frustrating. The absence of details, such as the ownership of material posted on sites such as Gawker, was frustrating. The expectation of action on the part of institutions such as the Library of Congress was frustrating. Frustration all around.

I do not in any way wish to devalue the anxiety that journalists or their readership must feel when the URLs to their articles are moved or deleted. Those of us in academic and legal environments have been dealing with link and citation rot for ages. Artists, too, are experiencing the fragility of their online portfolios.

Journalists are not alone.

A Question of Vocabulary

So let us start a mutual conversation with me first asking journalists, what do you mean by “archives”? What are your archives? How are you employing the term? To describe the platform on which your articles are published and disseminated? A collection of PDFs saved on a networked server? Printouts of the articles neatly bound in a Trapper Keeper? Are you including the records of the organization in your definition of archives? The records in which the history of hiring practices, revenue sources, internal policy and decision-making is documented?

“Archives,” the word, is a challenging concept. Within the context of archives and records management, archives can refer to:

  • verb, “to transfer records from the individual or office of creation to a repository authorized to appraise, preserve, and provide access to those records”.
  • noun, “an archives”.

Information technologists, data librarians, and information governance professionals may broaden those definitions to include data backups, but generally, archivists tend to shy away from “Big Data” and instead focus on that small bit of material that is deemed archival.

Institutional archives do not have unlimited financial resources. Archivists and librarians are often overworked, underpaid, underresourced, and frankly, undercited. The provision of access and long-term sustained preservation go hand-in-hand. Services such as Archive-It require institutions to make a financial commitment towards server space and the employment of technical archivists to manage institutional collections.

Importantly, modern archivists do not make a practice of taking things, or blindly capturing online records, without first attempting to identify and secure the right to do so. Violating this principle is wrong, legally and ethically.

I think it would also behoove us to discuss “vital records” for a moment. The Electronic Code of Federal Regulations defines vital records as the essential agency records that are needed to meet operational responsibilities under national security emergencies or other emergency conditions (emergency operating records) or to protect the legal and financial rights of the Government and those affected by Government activities (legal and financial rights records). While important, newspapers are not vital records. Janice Okubo of the Hawaii Health Department was most likely talking about records such as birth certificates, and was taken far out of context.

Media Archives

Since newspapers and media publications serve a variety of business functions, extant newspapers do not exist purely by chance. In the past, publishers recognized the business value of their print and retained copies for their own identified business needs. Perhaps they wanted to have a reference resource, as shown in the Oscar-winning film Spotlight. Maybe their intention was more mercantile.

Circulation and subscription models expanded to include the sale, or rental, of microfilmed versions of these publications. Publishers retained the original long-lasting microfilm masters to make even more copies from, or to add to their business archives, rendering the retention and management of paper versions moot. Computers made it possible to digitize that microfilm, secure it in a database, and distribute publications even more widely.

Unlike print newspapers, digital-only news has no physical form. A subscription to digital content usually provides an institution or reader with rented, limited access to files that are managed by the newspaper producer via a digital asset management system, subject to the legal terms associated with that access. There is a critical difference between this short-term access model and long-term ownership. Under this model, archives and libraries usually do not take custody of the digital objects that comprise the “news,” including images, websites, social media, text, apps, and other content forms.

This is not to say that there are no media archives. Many media outlets maintain internal corporate archives or employ records managers to manage the CMS. There is a degree of archiving required of these folx in their work, but much of their work is curation – making sure that assets are discoverable and maintained.

Examples of media archives that have made this transition include:

WNYC is a smashing exemplar of how institutional archives can partner with the communities they serve. While Gawker is under siege by political and economic forces outside the scope of this post, the Gothamist will continue to exist. WNYC received funding from anonymous sources to purchase the intellectual property rights along with the published material. It is crucial to note that the WNYC archives did not take, or “capture,” the Gothamist website. WNYC worked with the Gothamist to obtain the legal right to retain and disseminate the archives for the future.

The Freedom of the Press Foundation is also doing impressive work. They have recently set out to capture Gawker.com with the understanding that the articles disseminated for public consumption are not intellectual assets of Gawker. In other words, Gawker.com is no more protected property than copies of old newspapers found in your grandparents’ attic.

What can journalists do?

Brush up on your information literacy, for one. If your work is changing the world, then you need to carve out some time. Look into services such as Perma.cc and the Wayback Machine. Practice good hygiene in the management of your records. Ask questions at work: does the organization have an archivist or records manager? Who maintains the content management system? What would happen to your work in the event of bankruptcy or a change in ownership? Is your website even technically archivable? Look for opportunities like Personal Digital Archiving for Journalists to expand your knowledge about managing your media for the long haul. Most importantly, please always feel free to reach out to archivists and librarians! The Society of American Archivists is one resource. ARMA is another. Explore Open Scholarship with a renewed commitment to maintaining your body of work.
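
As a small, practical starting point, here is a minimal sketch of how one might check whether the Wayback Machine already holds a snapshot of a given article. It assumes the Internet Archive's public availability endpoint at archive.org/wayback/available and the JSON shape it returns as I understand it; the function names are my own and purely illustrative. A snapshot existing says nothing about who holds the rights to the material, so treat this as a hygiene check and not as an archive.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Internet Archive's Wayback availability service.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(article_url):
    """Build the availability lookup URL for one article URL."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": article_url})


def closest_snapshot(response_body):
    """Return the URL of the closest archived snapshot, or None if absent.

    Expects the JSON text returned by the availability service.
    """
    data = json.loads(response_body)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def check(article_url, timeout=10):
    """Query the service over the network and report the closest snapshot."""
    with urlopen(build_query(article_url), timeout=timeout) as resp:
        return closest_snapshot(resp.read().decode("utf-8"))
```

Calling check() with the address of one of your own articles returns either a snapshot URL or None. A None result is a prompt to ask the questions above at work, not proof that the piece is gone.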
