National Agenda for Digital Stewardship

The National Digital Stewardship Alliance (NDSA) just released its 2014 National Agenda for Digital Stewardship. The NDSA is a voluntary membership organization of leading government, academic, and private sector organizations with digital stewardship responsibilities that collaborate to “establish, maintain, and advance the capacity to preserve our nation’s digital resources for the benefit of present and future generations.”

In the Agenda, they recognize that “it has become increasingly difficult to adequately preserve valuable digital content because of a complex set of interrelated societal, technological, financial, and organizational pressures.”  Among the identified “pressures” are the usual suspects: lack of time, funding, staff, priorities, etc.  Here at the University of Maryland Libraries, we are just now formalizing our digital preservation policies and procedures, despite the fact that we have been creating and managing digital collections for close to a decade. We are not unusual.  Groups like the NDSA, who are actively communicating with each other, developing standards, and encouraging collaboration, are helping to demystify the complicated world of digital preservation and to make it seem an attainable goal.

The Agenda identifies four areas of digital content that its authors feel need special attention this year: electronic records, research data, web and social media, and moving image and recorded sound. All of these content areas are at the forefront of our minds at the University of Maryland Libraries. In the past year, we have joined forces with our colleagues in the Maryland Institute for Technology in the Humanities (MITH) to form a Born-Digital Working Group to develop policies and procedures for working with born-digital content, including electronic records. In 2012, we purchased a FRED (Forensic Recovery of Evidence Device) workstation. We do not have everything figured out, not by a long shot, but the fact that we are taking incremental steps towards tackling this issue is important. In 2012 we also hired a Research Data Librarian, who is working with a project team to develop a business case for research data services at the University of Maryland. We have been archiving web content for several years using the Internet Archive’s Archive-It tool. And in the past year, we have greatly increased our digitization of audio recordings, including creating a digitization lab for in-house work.

So we can pat ourselves on the back.  It is sometimes difficult to recognize and appreciate the work that we do when it seems like there is still so much left to be done. We need to develop better strategies for providing access to our digital content, for maintaining and preserving that content, and for planning into the future.  We are working on it.

Crowdsourcing Transcription

I really, really want to do this. Crowdsource transcription, that is.  I have been following Ben W. Brumfield’s Collaborative Manuscript Transcription blog for a while, and it has just about all of the information we need in order to figure out a path forward.  Beginning to explore how transcription might tie into our Digital Collections is on our project agenda for the fall.

In a recent presentation about crowdsourcing, Brumfield emphasizes that while many view crowdsourcing as “free” labor, it is really “free like a puppy”: crowdsourcing transcription carries many, many ongoing costs, and the primary motivation should be engaging the community with collections.

Currently, the University of Maryland Libraries has several transcriptions of parts of collections available online.

Stepter Family Papers: a researcher/genealogist transcribed these letters, and provided them to us in PDF format for sharing on our website. We uploaded the PDFs to the directory where we store supplementary finding aid materials and linked to them through the finding aid.  While searchable within each page, these letters are difficult to discover outside of the collection and it is not possible to cross-search.

Diary of Susan Mathiot Gale: I transcribed this years ago for a research paper. It is sitting on my computer.

Katherine Anne Porter correspondence: A group of students transcribed selections for a course project, but these are not available online at present. In the next year, OCR'd versions of select Porter correspondence will be available online.

Sterling Family Papers: A group of students transcribed and encoded these letters in TEI for a course project in 2006.

And there are more; of this I am certain. My dream is to develop an interface that will allow us to, at minimum, ingest and index basic textual transcriptions and display them in conjunction with digital images. A second phase would involve developing crowdsourcing capabilities to allow users to add, edit, and annotate. A third might involve allowing for more specific transcriptions (TEI instead of full text, for example). There are so many interesting projects out there to explore.
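
A minimal sketch of what that first phase could look like, assuming plain-text transcriptions and a simple in-memory index, is below. It is only an illustration of the ingest, index, and cross-search idea, not a description of any system we run. The names Transcription, TranscriptionIndex, item_id, and image_url are hypothetical, and the sample letters and URLs are invented.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Transcription:
    """One transcribed page, paired with the digital image it describes."""
    item_id: str    # hypothetical identifier for the page or letter
    image_url: str  # hypothetical location of the matching page image
    text: str       # plain full-text transcription


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TranscriptionIndex:
    """A tiny inverted index: each word maps to the ids of items containing it."""

    def __init__(self):
        self.items = {}
        self.postings = defaultdict(set)

    def ingest(self, transcription):
        """Store the transcription and index every word in its text."""
        self.items[transcription.item_id] = transcription
        for word in tokenize(transcription.text):
            self.postings[word].add(transcription.item_id)

    def search(self, query):
        """Return transcriptions containing every word in the query."""
        words = tokenize(query)
        if not words:
            return []
        matches = set.intersection(*(self.postings.get(w, set()) for w in words))
        return [self.items[item_id] for item_id in sorted(matches)]


# Invented sample data, for illustration only.
index = TranscriptionIndex()
index.ingest(Transcription("letter-1", "https://example.org/letter-1.jpg",
                           "Dear Mother, the harvest was good this year."))
index.ingest(Transcription("letter-2", "https://example.org/letter-2.jpg",
                           "The harvest failed and we are moving west."))

for hit in index.search("harvest"):
    # Each hit carries both the text and the image to display beside it.
    print(hit.item_id, hit.image_url)
```

Because every transcription goes into one shared index regardless of which collection it came from, a single query can search across collections, which is exactly what the PDF approach cannot do. A real implementation would more likely rely on an existing search platform than on a hand-rolled index like this one.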

Jennie Levine Knies is the Manager, Digital Programs and Initiatives, at the University of Maryland Libraries.