FRED Update: Facing Challenges with Born-Digital Media

Last September, two Graduate Assistants, Amy Wickner from SCUA and I from DSS, began working with the Forensic Recovery of Evidence Device (FRED), with the goal of creating a workflow for processing born-digital media. Since born-digital material is still fairly new to libraries, there is no widely accepted ‘way of doing things’ and there are few case studies on how libraries and archives have handled the issue. Ours was a process of trial and error, informed by the literature and online forums. We kept a daily log, which featured the word ‘failed’ regularly, but we have continued to make progress despite these challenges.

Our first question: which disk imaging program is more effective, BitCurator or Forensic Toolkit (FTK) Imager? BitCurator is an open source Linux-based environment for creating and analyzing forensic disk images, which are bit-by-bit copies of digital media. FTK Imager creates both forensic images and logical images (containing only the logical volume, excluding deleted and orphan files) and is the free imaging component of the larger proprietary FTK.

We began with a small flash drive containing 6 files of several types (jpg, xls, doc, pdf). Since the item was so small there was little difference in imaging time, but BitCurator won by 14 seconds. Next we imaged a much larger item, a hard drive containing a terabyte of data. This is where we faced our first major hurdle, one we continue to struggle with: mapping the local area network (LAN). Since this hard drive was too large to image directly to FRED, we needed the LAN to store the disk image, and since FRED is partitioned to include Linux and Windows, we needed to connect both sides to the LAN.

The Windows partition connected fairly easily; the Linux partition, however, requires the user to connect via the terminal. Since neither of us is a Linux expert, we relied on a script provided by the original Born-Digital Working Group. Although this script was unsuccessful, it was a good starting point, and after a self-led crash course in Linux and Bash scripting we were able to connect BitCurator to the LAN and begin disk imaging. Surprisingly, FTK was faster in imaging the larger volume: 2-4 hours faster, depending on whether the image was forensic or logical.

The next step was analyzing the images. Although FTK can provide some information about the image, BitCurator has a more robust reporting tool which scans for Personally Identifiable Information (PII), reports on file types, and shows paths for deleted files. The first report matched what we already knew about the original media. When we ran the reporting tool a second time on the same disk image, we discovered inconsistencies with the previous report. We are currently in communication with the BitCurator team, but this problem has not yet been resolved. Until we have consistent reports we cannot rely on this tool for analyzing disk images.

After the most recent update to BitCurator we had to re-connect the Linux partition of FRED to the LAN. Although we had thoroughly documented our previous work on the issue, this new attempt was more problematic and we ultimately failed to fix the problem ourselves. We are currently working with User and Systems Support to resolve the problem, but in the meantime we cannot access disk images stored on the LAN for analysis in BitCurator.

These challenges show how much the success of one project relies on a network of experts, both internal to UMD Libraries and external, such as the BitCurator team and other users who posted their experiences online. All of these groups have been vital to continuing our work. Although there were challenges, many of which remain unresolved, we continue to make progress and hope to have a revised workflow complete by the end of the summer.

Introduction to SSDR

I provided the following introduction at the recent Digital Systems and Stewardship (DSS) Town Hall meeting and thought I would redistribute it as a blog post.


Hello, my name is Ben Wallberg and I manage the Software Systems Development and Research department, or SSDR.  The department has 5 software developers, 1 web developer, and 3 Graduate Assistant software developers devoted to supporting College Park specific services.  We also have 2 systems analysts who support consortial services, with joint membership in the Consortial Library Application Support (CLAS) department.

The best way to understand what we do is to break down the department name into its four component words.

Software – SSDR operates and maintains many of the DSS server based applications.
We install and configure those applications, perform upgrades, and provide development support.  Each application has an owner with whom we coordinate on changes and future directions for the application.  For consortial applications like Aleph and Metalib, the owner is the CLAS department.  For Digital Collections and DRUM it is the Digital Programs and Initiatives (DPI) department and for the Libraries’ Website it is the Web Advisory Committee.  Note that Libi does not currently have an application owner, which we are working with Library Assembly to rectify.

Systems – Websites and other server based applications are generally not run in isolation but as collections of systems and services.  You are familiar with the end user facing applications but to operate, these require an unseen infrastructure.  Relational databases supporting multiple applications are an example of these backend systems.  Fedora Commons is a metadata and asset management tool behind Digital Collections.  Solr is a search platform used by DRUM, Digital Collections, and the new SCPA Scores database.  This web of interconnected services extends also to external systems such as Wufoo hosted forms which are embedded into the Libraries’ Website or the Shibboleth distributed identity management systems used for login to consortial applications.

Development – Also referred to as programming or coding.  We write some code from scratch but this is expensive work so we try to minimize the amount of original code we write by building our applications from existing applications, toolkits, and services and then using locally developed code to integrate them.  The majority of time taken on any development project is not in the actual coding; it is the project planning, requirements gathering, issue tracking, version control, testing, and documentation.

Research – In this context research primarily means investigation into new software and systems.  This could cover front- or back-end applications, cloud services, web service APIs, or development tools.  Adoption of new software is often constrained by the desire to get the highest return on our investment, which is based on factors such as how well we can support the underlying technology and how well we can reuse the tool for multiple purposes.  To learn how applications work we trial our own tools and applications as well as offer sandbox environments for Library staff to run applications they have an interest in evaluating.

Finally, these four functions: software, systems, development, and research are performed in the context of the Libraries’ strategic plans, needs of application owners, maintaining availability of critical systems, and multiple major and minor projects with overlapping timelines and dependencies.

Stew of the month: June 2015

Welcome to a new issue of Stew of the Month, a monthly blog from Digital Systems and Stewardship (DSS) at the University of Maryland Libraries. This blog provides news and updates from the DSS Division. We welcome comments, feedback and ideas for improving our products and services.

Digitization and Conversion Activities

The following outsourced projects were digitized during FY15 (not including smaller patron requests); these projects were funded through the DIC project proposal process:

Internet Archive (SCUA, SCPA, McKeldin): 820 volumes, 123,365 pages, $18,619

Includes the following serial titles or genres of works, plus additional works: Mason and Root tunebooks, French pamphlets, Swing, NBC Chimes, Radio Stars, Radio Digest, Radio Doings, The Keynoter, Mid-Atlantic Archivist, Mid-Atlantic Archivist Conference programs, Biennial Reports (Maryland Agriculture College), Maryland Agriculture Experiment Station Annual Reports, UMD Media Guides, AFL-CIO Proceedings, The Lather, miscellaneous university publications, Werk, Notizie degli scavi di antichità, US Department of the Treasury publications, Reliable Poultry Journal, The Union Signal, Izvi︠e︡stīi︠a︡ Imperatorskago russkago geograficheskago obshchestva.

AFL-CIO News (oversize, bound) (SCUA): 12,874 pages, $7,080.70

Schedule of Classes (oversize, bound) (SCUA): 3,251 pages, $1,788.05

Schedule of Classes (microfilm) (SCUA): 15,282 pages, $3,265.40

Hebraica (books and serials) (McKeldin): 75,859 pages, $14,379.39

WAMU (1/4″ open audio reels), Godfrey (wire recordings) (SCUA): 154 reels, 39 wires, $12,179.87

Liz Lerman Dance Exchange Archives (VHS tapes) (SCPA): 98 tapes, $8,279.29

Library Media Services deteriorating films (16mm film) (LMS): 40,070 feet, 42 films, $16,028

Diamondback Photo Morgue (SCUA): 7,532 photos, $5,646

Tentative FY15 Totals: 230,631 pages, 7,532 photos, 154 reels, 39 wires, 98 VHS tapes, 42 films = $87,265.70
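As a quick cross-check of the tentative totals, the page counts and costs listed above can be summed in a few lines of Python. This is only an illustrative sketch: the shortened project labels and the `PROJECTS` list are our own shorthand, and projects counted in reels, tapes, films, or photos are entered with zero pages.

```python
from decimal import Decimal

# Figures copied from the FY15 project list: (label, pages, cost in USD).
# Labels are shortened for readability; non-paginated projects use 0 pages.
PROJECTS = [
    ("Internet Archive", 123_365, Decimal("18619.00")),
    ("AFL-CIO News", 12_874, Decimal("7080.70")),
    ("Schedule of Classes (oversize, bound)", 3_251, Decimal("1788.05")),
    ("Schedule of Classes (microfilm)", 15_282, Decimal("3265.40")),
    ("Hebraica", 75_859, Decimal("14379.39")),
    ("WAMU reels and Godfrey wires", 0, Decimal("12179.87")),
    ("Liz Lerman Dance Exchange Archives", 0, Decimal("8279.29")),
    ("Library Media Services films", 0, Decimal("16028.00")),
    ("Diamondback Photo Morgue", 0, Decimal("5646.00")),
]

# Decimal avoids the rounding drift that floats can introduce with cents.
total_pages = sum(pages for _, pages, _ in PROJECTS)
total_cost = sum((cost for _, _, cost in PROJECTS), Decimal("0"))

print(f"{total_pages:,} pages, ${total_cost:,.2f}")
```

Both sums agree with the totals reported above.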

On June 11 Babak Hamidzadeh, Robin Pike, Liz Caringola, Judi Kidd, and Doug McElrath (SCUA) met with Linda Tompkins-Baldwin of Digital Maryland to share information about our respective projects and to discuss future collaboration. As a result of that meeting, Liz, Robin, and Doug will be speaking about the newspapers project at a series of regional meetings this summer to discuss cultural heritage issues, state initiatives, and opportunities for collaboration.

Judi Kidd and Eric Cartier arranged for the setup of sturdy new shelving in Hornbake 4210V to hold audiovisual equipment. Eric and digitization assistants Rachel Dook and Caroline Hayden arranged carts, boxes, and equipment. Work on this project will continue this summer, with the goal of creating an inventory of audiovisual equipment that may be used in digitization activities on campus.

Alice Prael is reviewing analytics data on our Digital Collections to determine the most popular holdings and how our patrons are finding them. This research will help inform future decisions on digital projects and how we can best promote them.

Software Development

In partnership with WAC and the Discovery group, the website search tabs have been modified to begin submitting searches to the WorldCat Discovery interface.

The Persian Digital Humanities website, implemented using our Hippo CMS based Exhibit template, is now available.  The UMD Libraries are hosting the website on behalf of the Roshan Institute for Persian Studies.  We are also exploring, along with MITH, additional collaborations with the Roshan Institute in the areas of digitization, text mining, and social media and web archiving.

Initial development of the new online student application submission form and supervisor database is completed and the application has been put into production. Student submissions are already being received and Human Resources and student supervisors are providing feedback for requested changes which we will review and install before the Fall semester.

As part of standing up a DSpace instance for the Maryland Shared Open Access Repository (MD-SOAR) we created a GitHub based code repository forked from the core DSpace code repository.  The production instance came online using DSpace version 5.1 with customizations to the XMLUI/Mirage2 interface for the MD-SOAR theme and for institutional branding based on each top-level community per participating institution.

DSS has entered into a partnership with the National Socio-Environmental Synthesis Center (SESYNC) with DSS providing software development services and open data expertise in support of their mission to accelerate scientific discovery at the interface of human and ecological systems.  Development is underway on a new Integrated Discovery Platform for automating the ingest and cataloging of socio-environmental data.

User and Systems Support

On May 29, 2015, User and Systems Support migrated the Windows infrastructure from the Libraries’ LIBLAN active directory to DivIT’s AD.UMD.EDU active directory. The migration was successful and had only a few problems, which we were expecting. When changing a complete infrastructure that has been in place for approximately 15 years, there will be hidden problems. User and Systems Support quickly resolved all of them.

The migration to DivIT’s AD.UMD.EDU active directory solved a few issues.  First, by migrating to the new active directory, staff no longer have to keep two passwords. The password used for email, timesheets, VPN, and other university resources is now the same password used for logging into the Windows workstations. Second, the LIBLAN domain was only created because the Libraries had a need that DivIT didn’t meet at that time. DivIT now has its own stable active directory, which contains accounts for every librarian, staff member, and student on campus. Its security policies have also been reviewed by IT auditors. There is no longer a need for the Libraries to create duplicate accounts or spend time copying the same security policies as DivIT.

Along with the change of passwords, the migration brought other changes.  All network staff printers were renamed, and all printers were installed on every computer, so staff now see every network staff printer in their printer list. The naming convention for the staff printers changed from department/unit name to BLDG_FL_PR# (Building_Floor number_Printer number). All the printers were labeled to match the names shown in the printer list on staff workstations. This change to the network printers allows for greater flexibility. Staff no longer need to contact the DSS Helpdesk to have a network printer installed. Printers do not need to be renamed if a department’s name changes. And staff who move between locations can print wherever they are.
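To illustrate why a location-based name is easier to work with than a department-based one, here is a minimal Python sketch that builds and parses names in the Building_Floor_Printer shape. The building code `MCK`, the absence of zero-padding, and both helper functions are assumptions made for the example; the actual codes and number widths used by USS are not given here.

```python
import re

# Assumed shape: BUILDING_FLOOR_PRINTER, e.g. "MCK_2_1".
# The building code and digit widths are hypothetical, not the real USS values.
PRINTER_NAME = re.compile(r"^(?P<building>[A-Z]+)_(?P<floor>\d+)_(?P<printer>\d+)$")


def make_printer_name(building: str, floor: int, printer: int) -> str:
    """Build a printer name from its physical location."""
    return f"{building.upper()}_{floor}_{printer}"


def parse_printer_name(name: str) -> dict:
    """Split a printer name back into building, floor, and printer number."""
    match = PRINTER_NAME.match(name)
    if match is None:
        raise ValueError(f"not a location-based printer name: {name!r}")
    return {
        "building": match["building"],
        "floor": int(match["floor"]),
        "printer": int(match["printer"]),
    }


name = make_printer_name("mck", 2, 1)
print(name, parse_printer_name(name))
```

Because the name encodes only where the printer sits, nothing in it has to change when a department is renamed or reorganized.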

Also, because of the new security policies, staff can no longer have admin rights to their workstations. Anything that requires admin rights in order to be installed must now be installed by User and Systems Support. Likewise, because of audited security policies, DivIT does not create generic accounts in its active directory, so departments can no longer have a generic account for their students to share. Each student must have their own username and password to be able to log into the workstations.

The entire domain migration took place in 3 major steps, each driven by a script: premigration, ADmigration, and postmigration.  These scripts made it possible for USS to automate the process as much as possible, eliminating the need to touch every staff Windows desktop computer. The scripts didn’t run automatically on Windows laptops because most would either be away from campus, using wireless (which has its own problems), or simply turned off.  In addition, removing Mac desktops and laptops from the old domain and adding them to the new one required an entirely different process.

The goal of the premigration script was to copy user data to try to make the migration as painless as possible. Since some of the user data could only be accessed when staff were logged in, this script had to be initiated by the staff before the migration. When initiated, the script copied over Firefox configurations, Internet Explorer Favorites, and Chrome configurations to a server. It also copied over the Microsoft Outlook configurations and data from a few specialized applications.

On Friday, May 29, 2015, DSS remotely pushed the ADmigration script to every staff Windows desktop that was powered on and had network connectivity. The script kicked off a 3 step process that rebooted the computers after each step was completed. These steps automatically removed the computers from the LIBLAN domain, renamed the computers to meet DivIT naming conventions, and added the computers to the AD.UMD.EDU domain, ready for the staff to log in on Monday. It took weeks of planning, testing, and work to do these 3 steps without any human intervention at all. Every USS staff member and student employee contributed in some way to this project.
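A process that reboots between steps has to remember where it left off. The following Python sketch shows one common way to do that, by recording completed steps in a small state file and running only the next pending step on each start. It is not the USS script: the state file, the step names, and the placeholder actions are all assumptions for illustration, and no real domain or rename operations are performed.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List, Optional

# Hypothetical step names mirroring the three stages described above.
STEPS = ["leave_old_domain", "rename_computer", "join_new_domain"]


def run_next_step(state_file: Path, actions: Dict[str, Callable[[], None]]) -> Optional[str]:
    """Run the single next pending step and record it; return None when all are done."""
    done = json.loads(state_file.read_text()) if state_file.exists() else []
    pending = [step for step in STEPS if step not in done]
    if not pending:
        return None
    step = pending[0]
    actions[step]()  # placeholder for the real work of this step
    done.append(step)
    state_file.write_text(json.dumps(done))  # progress survives the reboot
    return step


def simulate() -> List[str]:
    """Simulate three boots in a row, with stand-in actions that only log."""
    log: List[str] = []
    actions = {name: (lambda n=name: log.append(n)) for name in STEPS}
    with tempfile.TemporaryDirectory() as tmp:
        state_file = Path(tmp) / "migration_state.json"
        while run_next_step(state_file, actions) is not None:
            pass  # a real script would trigger a reboot here
    return log


print(simulate())
```

Keeping progress on disk rather than in memory is what lets each reboot pick up at the correct step without anyone sitting at the machine.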

The postmigration script ran automatically when staff logged into the computer for the first time. This script took all the user data saved by the premigration script and placed it in the correct locations on the computer. However, as with any process that involves so many computers, not all the scripts or steps ran successfully on every computer. Some staff didn’t have the opportunity to run the premigration script, and some desktops were unable to run all 3 steps of the ADmigration script. These machines had to be migrated individually by USS staff. The week following the migration was used to individually migrate all Windows laptops and Mac laptops/desktops.

The Library staff cooperated wonderfully throughout the migration.  Because they completed the premigration steps, the number of potential problems was significantly reduced. The staff who did have problems were patient and gave User and Systems Support staff the necessary time to resolve each issue. Without the cooperation of the staff, the migration could have been unsuccessful and left many staff frustrated.

USMAI (University System of Maryland and Affiliated Institutions) Library Consortium

Support to USMAI

The CLAS team responded to 75 Aleph Rx submissions and 28 e-resource requests from across the consortium’s libraries in June. This included projects like configuring EZproxy for UM HS/HSL’s investigation of Callisto and modifying loan rules for SU’s iPad checkout service.

Fiscal Year End Closeout

June brings the end of one fiscal year and the beginning of the next. The CLAS team assisted USMAI campuses with the fiscal year transition, closing out FY2015 budgets, creating FY2016 budgets, and rolling encumbered orders from the old budget to the new budget.

New User Request Forms

In an effort to make the submission of Aleph user creation/deletion requests as simple and accurate as possible, the CLAS team has streamlined the input of information on the four user request forms (Circulation, Cataloging, Acquisitions, and Cross-Functional) and migrated the forms to a new survey platform (Wufoo). The new forms offer a number of hints when the requester hovers over corresponding fields. These hints along with the display of condition-specific fields should help guide requests.

Kuali OLE

CLAS continued work on OLE, meeting with USMAI testers to facilitate the consortium’s evaluation of OLE. The team also implemented an authentication method on their development server in order to test loading real data (e.g., patron and financial data) in OLE.



The Maryland Shared Open Access Repository moved into production status on June 15th. The repository is available and ready for participating campuses to begin loading repository items and collections.


Student assistant turned C1 Jordan Lee’s last day was June 30. Jordan accepted a full-time position with the UMD College of Behavioral and Social Sciences (BSOS) Undergraduate Advising Office, where she worked as graduate administrative coordinator while earning her MLS. Congrats, Jordan!

Conferences, workshops and professional development

Heidi Hanson attended ALA Annual Conference in San Francisco as Chair of LITA’s Christian Larew Scholarship Committee.

Liz Caringola was accepted into the 2015-2016 cohort for the Advancing Professional Track Faculty Program sponsored by the ADVANCE Program for Inclusive Excellence and the Office of Faculty Affairs. This program provides access to knowledge regarding policies governing professional track faculty; offers knowledge through concrete examples and models; and expands participants’ on-campus peer networks.

On Thursday, June 4, Liz Caringola moderated the session “Wikipedia: Helping Us Reach Users and Build Partnerships” at the UMD Libraries Research and Innovative Practice Forum. The panelists included Laura Cleary, Felicity Brown, Jen Eidson, Steve Henry, and Jessica Abbazio. At the same forum, Robin Pike presented “Managing Audiovisual Digitization,” Josh Westgard presented “CSV Validation for Metadata Wrangling,” Karl Nilsen presented “Comparing Data Production in Social Sciences, Area Studies, and Humanities Fields Using Terminology in Literature,” and Eric Cartier presented a poster titled “Establishing the In-House Internet Archive Digitization Workflow.” Eric’s poster was also accepted for the Society of American Archivists Research Forum, part of the Annual Meeting in Cleveland, OH.

Liz Caringola and Josh Westgard from DSS and Amanda Hawk from SCUA judged projects for the National History Day competition held on campus June 15-16.

Eric and Dr. Laura Schnitker attended the Cultural Heritage Information Management Forum at The Catholic University of America on Friday, June 5, and delivered the presentation, “Saving College Radio,” during the morning session.


Robin Pike, Eric Cartier, and student assistants Audrey Lengel, Caroline Hayden, and Cecilia Franck visited the National Public Radio headquarters in Washington, DC on Thursday, June 11 to deliver files digitized over the past two years from the NPR Archives. Hannah Sommers, Director of the Research, Archives, and Data Strategy group provided a tour of the facility and operations.

Eric met with Jaime Mears, a recent iSchool graduate currently working as a National Digital Stewardship Residency resident for the DC Public Libraries, on Wednesday, June 24. Jaime’s task is to develop a personal digital archiving workstation for the public, so she and Eric spoke about all of the procedures, processes, and workflows that are the foundation of Hornbake Digitization Center (HDC) digitization operations and how they may translate to a public library setting.