Cool Tools: High Performance Sound Technologies for Access and Scholarship (HiPSTAS!)

I was delighted and intrigued to read an article in the March 26, 2014 web edition of the Chronicle of Higher Education: Scholars Collaborate to Make Sound Recordings More Accessible.  It described a project spearheaded by Tanya Clement, former University of Maryland employee, creator of In Transition: Selected Poems by the Baroness Elsa von Freytag-Loringhoven, and now assistant professor at the University of Texas at Austin.

I am always on the lookout for “cool tools” that we may consider using some day for our own work, and there are a lot out there. The HiPSTAS Research and Development with Repositories (HRDR) project is funded by an NEH Institute for Advanced Topics in the Digital Humanities grant to develop and evaluate a computational system for librarians and archivists for discovering and cataloging sound collections. From the HiPSTAS blog:

The HRDR project will include three primary products: (1) a release of ARLO (Automated Recognition with Layered Optimization) that leverages machine learning and visualizations to augment the creation of descriptive metadata for use with a variety of repositories (such as a MySQL database, Fedora, or CONTENTdm); (2) a Drupal ARLO module for Mukurtu, an open source content management system, specifically designed for use by indigenous communities worldwide; (3) a white paper that details best practices for automatically generating descriptive metadata for spoken word digital audio collections in the humanities.


I, for one, am looking forward to the output of this project, and to the prospect of a faster way to increase access to our fragile sound recordings.

Stew of the Month: February 2014

Welcome to a new issue of Stew of the Month, a monthly blog from Digital Systems and Stewardship (DSS) at the University of Maryland Libraries. This blog provides news and updates from the DSS Division. We welcome comments, feedback and ideas for improving our products and services.

General Announcements

DSS is very pleased to announce that Ann Levin has joined the Division as a project manager. Ann has already started to work with the Prange Project and has taken on a number of new responsibilities. In addition to managing projects, Ann will devise policies, procedures and best practices for managing DSS projects, in collaboration with other divisions. We are hoping that these policies and procedures will evolve to a form that will be suitable and acceptable for adoption by all Library divisions, so we can all operate under the same guidelines to coordinate our efforts for delivering better products and services.

We are sorry to see that Jill Fosse has decided to retire from the Libraries. Her invaluable service is appreciated across the Libraries and the campus. In addition to her many daily responsibilities, Jill was very active in various committees in the Libraries and on campus. Her retirement party will be on Tuesday, March 25. We hope you all will stop by to wish her well.

Jie Chen too decided to leave her post as Director of Consortial Library Applications Support (CLAS). During her short tenure at UMD Libraries, Jie put in place several procedures for better managing our ILS resources and for better communication with the Council of Library Directors (CLD) of USMAI.

In response to the security breach at the UMD campus, DSS immediately began working with the campus Division of Information Technology (DivIT) to make sure our security software is up to date and to perform internal audits to safeguard access to our data and to our infrastructure. We will continue to work with DivIT to ensure compatibility and compliance with general campus security directives.


The article co-written by Jennie Knies and Robin Pike, Catching Up: Creating a Digital Preservation Policy After the Fact, has been accepted to the journal Archival Practice for publication in 2014.

Karl Nilsen reviewed Research Data Management: Practical Strategies for Information Professionals (Purdue UP) for The Journal of Librarianship and Scholarly Communication.

Department Updates

Consortial Library Application Services (CLAS)

The team completed implementation of Paging for Towson University and it is now live. We are in the process of setting up Single Sign On for Salisbury University.   We worked with SSDR on implementing the new database finder tool for College Park by extracting subscription database information from Metalib via the x-server.  This new tool allows users to view database options without having to log into Research Port.  Additionally we did the semi-annual extract and upload of USMAI’s serial holdings data for OCLC Local Holdings Record (LHR) batch processing.  About 61,000 LHRs were generated.

Digital Conversion and Media Reformatting (DCMR)

In fall 2013, the Hornbake Digitization Center started a concentrated effort to digitize fragile, early university publications from the University Archives. Regulations of the Maryland Agricultural College from 1901 was digitized as part of this effort, as well as for an exhibit. This document and many others in Digital Collections are currently being used for a class taught by university archivists Anne Turkos and Jason Speck–“MAC to Millennium: History of the University of Maryland,” a history course about the University of Maryland, which uses primary sources in the classroom.

Throughout February, DCMR worked with DPI, Special Collections, and Metadata Services to modify an Internet Archive in-house batch upload script and solidify metadata creation and digitization workflows for university publications and other special collections materials. While DPI had previously uploaded these publications one-by-one, DCMR will now be able to make more publications accessible via the Internet Archive more efficiently. Future uploads to the Internet Archive will include: university publications, French pamphlets, and early issues from the Carpenter Magazine.

DCMR also concluded the digitization phase of several vendor-based projects during February including digitizing over 300 wire recordings from the Arthur Godfrey Collection, over 4,300 pages of correspondence from the Katherine Anne Porter papers, and 43 open reel recordings from the Paul Traver papers. In the coming months, staff will collaborate with collection managers and DPI to prepare the metadata for batch-ingest into Digital Collections.

Digital Programs and Initiatives (DPI)

The Historic Maryland Newspapers Project is currently planning a summer project to examine the extent to which Chronicling America, the Library of Congress database that is home to all newspapers digitized by our project and others from around the country, is represented in Wikipedia. To this end, we are advertising a position for a temporary, part-time Wikipedian-in-Residence. The Wikipedian will also propose scenarios and/or tools for increasing the representation of Chronicling America in Wikipedia. For more information, see the DigiStew post about the position or the job posting. The posting closes on April 4, 2014.

Karl Nilsen gave a presentation about NSF data management plans and related topics to grant administrators and contract managers at the Office of Research Administration. Strengthening our ties with research administrators is an important part of building campus-scale networks to support research data management and curation.

More than 200 theses and dissertations from the fall 2013 semester have been deposited in DRUM, bringing the total number to 9,091. Requests for embargoes have increased over the past year. For the fall semester, 110 students requested that access to their research be restricted for either 1 year or 6 years, with a majority requesting a 1-year embargo. Are more and more students publishing their research, which we do see evidence of, or is it a knee-jerk reaction to making their research widely available?

Thanks to Nedelina Tchangalova, 22 undergraduate students deposited their research in DRUM as part of the Library Award for Undergraduate Research, the highest number of applicants since the award’s inception in 2011. Congratulations to Nedelina and other members of the award committee on their success.

Josh Westgard spent much of February focused on data-loading and data-migration for various projects. First, the development phase of the Libraries’ 2014 Serials Review tool was brought to a close with the launch of a customized version of a database system originally developed at NC State. Josh customized the look and feel of the tool for UMD Libraries, modified some of its functions to work with the campus LDAP authentication set up by SSDR, added a subject filter on the splash page to prevent the loading of the entire dataset upon launch and thereby improve the tool’s performance, and finally loaded all the data into the MySQL database on the backend. Josh also researched and tested a script for the batch loading of items to the Internet Archive, and worked with staff from technical services, special collections, and DCMR to develop a workflow for the tool and to train them on its use. Finally, Josh assisted the team of the Katherine Anne Porter Correspondence project with metadata validation and troubleshooting of more than 6000 lines of metadata, and image inventorying and archiving of some 4300 images.

Alice Prael, a student in the iSchool’s Digital Curation program, began work in DPI as a student assistant in February. She will be assisting with gathering of documentation related to the implementation of the Libraries’ Digital Preservation Policy, and other related duties as assigned.

Fun statistics! According to Google Analytics, Digital Collections had 6,482 visitors in February, 76% of whom were first-time visitors. These visitors viewed 20,658 pages. Close to 1,200 of these visitors were on campus, and the remainder came to us from around the world. The single most popular item in February was the film, Mujeres de América latina, and thanks to Google Analytics, we are willing to bet that there was a class assignment involving this film the week of February 17. Coach Jerry Claiborne with four players, University of Maryland football, 1981 was the most popular image for February – specifically on February 5. The most popular manuscript award goes to the John Jacob Omenhausser, Civil War sketchbook, Point Lookout, Maryland, 1864-1865, a beautifully-illustrated document by a Confederate prisoner during his time in the Union prison camp at Point Lookout, Maryland.

DRUM had a total of 18,444 visitors in February, and those adventurous souls viewed over 44,000 pages. DRUM’s most popular document? The dissertation Influence of Subject Matter Discipline and Science Content Knowledge on National Board Certified Science Teachers’ Conceptions, Enactment, and Goals for Inquiry, with close to 1200 views!

Software Systems Development and Research (SSDR)

Shian Chang (lead) and Cindy Zhao continued to enhance the Libraries’ Website.  They added minor fixes and enhancements to the recently released Database Finder and  Subject Specialist pages.  They also completed implementation of a new website banner alert system.  Hippo managed content can be used to schedule critical messages which appear in the banner of every page on the Libraries’ Website and Libraries’ Website (Mobile). They also completed initial coding of a new Legacy Bookplate feature to highlight contributions to the Libraries, which has now moved into a testing and refinement phase.

Irina Belyaeva (lead) and Paul Hammer completed development of a new loader program for select OCRd correspondence from the  Katherine Anne Porter collection.  This loader also serves to prototype a potential new generic program for batch loading digital objects in the Digital Collections Fedora repository.

DSS has been trialling a new software service for the University of Maryland community. We provide software development support to researchers who have an idea for an application or perhaps some code they would like to distribute. The support can take the form of coding or simply guidance on software development practices. Irina Belyaeva worked with the Bill Fagan Lab in the Department of Biology to turn their existing code into an R package prototype for analysis of population-level animal movement patterns. The prototype will be used to apply for grant funding to complete development of the package for release to the ecology community.

User Systems and Support (USS)

Stephanie Karunwi and Neha Rao participated in the “Love our Gadgets” event on March 13, 2014 at the McKeldin Terrapin Learning Commons (TLC). Stephanie and Neha had the opportunity to demonstrate Google Glass to several students and explain its many functions and abilities.

During the “Love our Gadgets” event last week, Preston Tobery had the opportunity to demonstrate the functions of the MakerBot 3D printer. He had a bunch of completed prints available for students to get a closer look at while watching a live print on the 3D printer. Even though the Libraries have had the printer for almost a year, it is still great to see the amount of interest and awe that this technology inspires. Preston reported that during the demo, several students talked about printing tools or replacement pieces for broken items around the house, and others about printing miniature models to put on the shelf as conversation pieces. It was really a great experience for everyone.

USS is currently prototyping a new system for students that would allow simple and easy recording of presentations. The new system, called a One Button Studio, would allow students to record themselves giving a presentation by just pressing a single button. The current system is a manual process that students follow to set up, record and save their presentations. The new system would allow students to simply plug in their thumb drive, hit a button to start the recording, hit the same button to stop the recording, and then take their thumb drive with their presentation already saved to it. We hope to have the system in place before the end of the semester.

Now Hiring: Wikipedian-in-Residence

The Historic Maryland Newspapers Project is hiring a Wikipedian-in-Residence for the summer months. Our overall goal in bringing a seasoned Wikipedian on board is to improve the quality of Wikipedia articles by increasing the number of relevant citations and links to the rich newspaper content of Chronicling America.

This position will be a little different from the typical Wikipedian-in-Residence gig. Most Wikipedians are brought into an organization in order to teach the staff how to edit Wikipedia, to edit and upload content to Wikipedia or Wikimedia, or to hold edit-a-thons–at least this is what I’ve gleaned while perusing other Wikipedian job listings. Our Wikipedian may do a little of this, but their work will mostly be research-based and will result in a written report of recommendations for our project and other National Digital Newspaper Program (NDNP) awardees to implement.

First, our Wikipedian will complete an analysis of how Chronicling America is currently being represented in Wikipedia. Linkypedia is one tool that could be used during the analysis. It will be important that our Wikipedian can utilize this and other tools–perhaps even tweak these tools–in order to gather relevant statistics.

The next step will be analyzing these statistics. This step is crucial because the conclusions drawn will guide the Wikipedian’s most significant responsibility–to explore different scenarios, tools, or methods for how we might effectively increase Chronicling America’s presence on Wikipedia. For example, these could be as simple and low tech as authoring a comprehensive guide for NDNP awardees to start editing Wikipedia; or they could require developers to add some code to the open source application behind Chronicling America in order to automatically generate wiki markup needed to cite a newspaper page in Wikipedia. (The National Library of Australia has built this functionality into their digital repository, Trove.)

Screencap of a newspaper page from Trove, showing the site's ability to generate wiki markup to cite the newspaper page.

The Wikipedian will also have to investigate the cost and resources needed to realize their proposed solutions. The Wikipedian will prioritize and make recommendations for which tools should be implemented in upcoming months based on their feasibility and estimated effectiveness.

In order to accomplish all this in four short months, the Wikipedian will have to have experience conducting research and analyzing data; knowledge of existing tools and APIs for Wikipedia; and a firm understanding of the written–and more importantly, the unwritten–rules of editing Wikipedia. This is a part-time, paid position and cannot be performed remotely.

To apply, see the complete job posting. We hope to hear from you soon!

Project Planning: Rate of Digitization

Staff in the Hornbake Digitization Center have been carefully tracking digitization statistics for several years. Over the last two years, I have used and expanded the collection of statistics so we can calculate an average rate and cost of digitization, including preparation, digitization, metadata creation, and ingest for our two major format types: “image” (including still images and text) and “audio.” My overall goal is to use these statistics to plan an annual queue of projects that can be completed with the level of staffing we can provide.

In January 2014, we digitized 896 images. Knowing the number of assets digitized in one month is helpful, but that alone cannot determine a rate of digitization, which is needed to plan projects. Dividing the total number of assets created by the hours worked during that period gives the rate of digitization (assets per hour), and dividing the average hourly student wage by that rate gives the cost of digitization per asset. Knowing that 8 students digitized and created metadata for 896 images in 190 hours at $10/hour (about 4.7 images per hour, or roughly $2.12 per image) is far more helpful when planning digitization projects and requesting staffing for the next fiscal year. Using established rates, we can:

  • Calculate the hours needed to complete specific digitization projects
  • Create an in-house project timeline for all proposed projects in one year
  • Request enough student hours (and money) to complete the list of proposed projects
  • Provide guidance to donors who wish to provide money for in-house digitization of specific materials
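The rate and cost arithmetic behind these uses can be sketched in a few lines of Python. This is only an illustration built on the January 2014 figures quoted above; the function names are hypothetical and do not refer to any tool used in the Hornbake Digitization Center.

```python
def digitization_rate(assets, hours):
    """Assets (e.g. images) produced per hour of student work."""
    return assets / hours


def cost_per_asset(hourly_wage, rate):
    """Wage cost of one asset, given an hourly wage and a rate in assets/hour."""
    return hourly_wage / rate


# January 2014: 896 images in 190 student hours at $10/hour.
rate = digitization_rate(896, 190)
cost = cost_per_asset(10.00, rate)
print(f"{rate:.2f} images/hour, ${cost:.2f} per image")
# prints "4.72 images/hour, $2.12 per image"
```

Keeping the rate and the cost as separate steps makes it easy to rerun the numbers for a different month, a different format type such as audio, or a different student wage.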

For example, suppose a librarian has proposed that we digitize 50 pamphlets at 15 pages each, or 750 pages in total. We can estimate that it would take our students about 160 hours to complete the project at our current rate of digitization. From there, we can estimate that if we assign the project to one student who works 20 hours a week, the project will be completed in approximately 8 weeks. If the project is a greater priority, we can assign it to more than one student for more than 20 hours per week.
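The pamphlet estimate can be reproduced with a short Python sketch. It assumes that each page corresponds to one image and that the January 2014 rate (896 images in 190 hours) still applies; the function `estimate_project` and its parameters are hypothetical names chosen for illustration.

```python
import math


def estimate_project(items, pages_per_item, images_per_hour,
                     hours_per_week, students=1):
    """Return (total pages, hours needed, whole weeks needed).

    Assumes one page equals one image and that every assigned
    student works the same number of hours per week.
    """
    total_pages = items * pages_per_item
    hours = total_pages / images_per_hour
    weeks = math.ceil(hours / (hours_per_week * students))
    return total_pages, hours, weeks


rate = 896 / 190  # images per hour, from the January 2014 statistics

pages, hours, weeks = estimate_project(50, 15, rate, hours_per_week=20)
print(f"{pages} pages, about {hours:.0f} hours, about {weeks} weeks")
# prints "750 pages, about 159 hours, about 8 weeks"

# A higher-priority project could be split between two students.
_, _, weeks_two = estimate_project(50, 15, rate, hours_per_week=20, students=2)
print(f"with two students: about {weeks_two} weeks")
# prints "with two students: about 4 weeks"
```

The result of roughly 159 hours matches the estimate of about 160 hours given above, and rounding the weeks up reflects that a partly used week still has to be scheduled.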

Now that we have well-established rates, over the next several months, as I solidify fiscal year 2015 projects, I will be able to compile a projected production timeline for all in-house projects in a queue, considering projects not solely in the number of pages or hours of audio, but in the hours it will take to digitize them.