Stew of the Month: March/April 2016

Welcome to a new issue of Stew of the Month, a monthly blog from Digital Systems and Stewardship (DSS) at the University of Maryland Libraries. This blog provides news and updates from the DSS Division. We welcome comments, feedback and ideas for improving our products and services.

Digitization Activities

Historic Maryland Newspapers Project

In March the following Maryland newspapers were uploaded to Chronicling America:

We’re also excited to announce that we’ll be co-hosting a Wikipedia edit-a-thon at the Maryland State Archives on May 2, 2016. For more details and registration information, please visit the event page: https://en.wikipedia.org/wiki/Wikipedia:Meetup/MD/UMD_MSA_Newspapers

Other Digitization Activities

DCMR staff continued to review files for the William Kapell collection, Football films, Library Media Services films, and Djuna Barnes microfilm. They began the review of the Jackson Bryer videos.

The Digitization Initiatives Committee, chaired by Robin Pike, presented its FY17 budget to the Resources Group for approval on March 21. Pike contacted project managers whose projects were approved, modified, or rejected (due to the number of proposals received), and will present this information to the Libraries in the coming months. She will also work with collection managers on project planning meetings.

Graduate Assistant David Durden completed his analysis of UMD Digital Collections usage statistics from 2013-2015 and has compiled annual reports of his findings. David has also begun analyzing targeted LAN locations for SCUA and SCPA to determine which files saved on the LAN should be described and moved into UMD Digital Collections for access and preservation.

Digitization Assistant Brin digitally transferred a specially curated box set of compact discs. The University of Maryland Symphonic Wind Ensemble’s “Live Performance Project, Wakefield Years 1983-2005” was compiled by Professor John E. Wakefield with the assistance of University Archivist Anne Turkos and Curator of Special Collections in Performing Arts Vin Novara. Metadata Librarian Bria Parker described the music at track level. The streaming files will soon be available in Digital Collections.

Robin, Eric, Digitization Assistants David Durden, Caroline Hayden, Brin Winterbottom and iSchool students Amanda Brent, Monique Libby, and Maya Riser-Kositsky digitized Filipino family documents and photographs at the 2016 Maryland Day community archives digitization event, “Preserving Your Family Treasures & D.C. Filipino Americans Before the Beltway.” Digitized items will be added to UMD Digital Collections through the end of May and will be a part of the Filipino American Community Archives collection in SCUA.

Digital Programs and Initiatives

UMD Libraries Supports Open Library of Humanities

We are pleased to announce that the UMD Libraries has recently joined the Open Library of Humanities (OLH) as a supporting institution.  OLH is dedicated to supporting and extending open access to humanities scholarship and provides an alternative for humanities researchers who are interested in making their research widely available.  UMD authors can submit an unlimited number of articles for publication each year without any article processing charges.  Submissions are accepted for a wide range of humanities subject areas and undergo a double-blind peer-review process. OLH’s editorial policies are available online if you are interested in learning more.  Also read the complete UMD press release.

DRUM Upgrade

DRUM was recently upgraded to DSpace version 5.4, bringing it in line with the version running on MD-SOAR.  No major differences are visible to users, but we were able to consolidate the DRUM statistics with this upgrade.  Prior to the upgrade we were gathering two sets of DRUM statistics, and we decided it would be more efficient to use one system moving forward.  With the upgrade, which was completed on March 28, we moved over to a newer statistics system that has been running on DRUM since June 2014.  This means that you might have noticed a drastic drop in the number of downloads currently displayed for records.  The number currently displayed only reflects downloads from June 2014 onward.  There is no need to panic, though: we plan to add the number of downloads from the older system, so no numbers will be lost.  Thanks to SSDR, we hope to have this completed by the end of April. The upgrade also sets the stage to explore new features like ORCID integration, which is timely given the new University of Maryland ORCID premium membership brokered through the Committee on Institutional Cooperation.
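
The plan to add the older download numbers to the newer ones can be pictured with a minimal sketch. It assumes the older totals cover only downloads recorded before June 2014, so the two sets can be summed without counting anything twice. The record identifiers and counts are made up for illustration, and this is not the actual DRUM or DSpace code.

```python
from collections import Counter


def combine_download_counts(legacy, current):
    """Return per-record totals from two download-count mappings.

    Assumes `legacy` holds only downloads recorded before the newer
    statistics system started, so no download is counted twice.
    """
    totals = Counter(legacy)
    totals.update(current)  # Counter.update adds counts instead of replacing them
    return dict(totals)


# Hypothetical record identifiers and counts, for illustration only.
legacy_counts = {"record-1001": 250, "record-1002": 40}
current_counts = {"record-1001": 75, "record-1003": 12}

combined = combine_download_counts(legacy_counts, current_counts)
for record, downloads in sorted(combined.items()):
    print(record, downloads)
```

A record that appears in only one of the two systems keeps its single count, which is why a mapping that sums by key is a convenient fit here.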

Digital Collections

DPI’s work on a new digital collections repository based on Fedora 4 continues, with various components of the system slated to go into production service later in 2016.  Toward that end, at the 2016 Code4Lib conference in Philadelphia, Josh Westgard took part in two pre-conference Hydra training workshops, and also helped to organize a post-conference “birds of a feather” session on Fedora. He also represented the University of Maryland Libraries at the DuraSpace Summit in Washington, DC.  He is a regular participant in the community effort to develop an API extension architecture for Fedora 4 (API-X, see https://wiki.duraspace.org/display/FF/Design+-+API+Extension+Architecture).

OA Publishing Fund Update

The UMD Libraries Open Access Publishing Fund closed out the fiscal year on April 12.  With additional funds from the Office of the Provost and many of the deans, 32 applications were processed for a total of $48,000, an average of $1,500 per article.  Here’s a breakdown of the number of applications from each college/school:
3 – A. James Clark School of Engineering
3 – College of Agriculture & Natural Resources
1 – College of Arts & Humanities
7 – College of Behavioral & Social Sciences
10 – College of Computer, Mathematical & Natural Sciences
2 – College of Education
1 – College of Information Studies
5 – School of Public Health
We anticipate applications will open up again in early fall for the next fiscal year.  Contact Terry Owen (towen@umd.edu) if you have any questions.

e-Publishing

DPI staff have been working behind the scenes on several forthcoming e-Publishing projects. One such project is The Early Americas Digital Archive, a collection of open-access primary materials written in or about the Americas between 1492 and 1820, which has been relaunched as a project of the Libraries’ e-Publishing Initiative. Originally developed at MITH in the early 2000s, the EADA site has been entirely redesigned and updated by the application owners for this relaunch. One often-mentioned critique of open digital scholarly publications is that they lack the durability and longevity of traditional print publications.  One goal of our e-Publishing initiative is to combat technical obsolescence and neglect and to ensure the continued viability and availability of legacy digital projects as they mature.

Backfile Theses and Dissertations

April also saw the release of a backfile of electronic versions of some 600 dissertations from the early to mid-20th century through DRUM.  A custom metadata extraction and batch-loading workflow was created to handle the records supplied by the digitization vendor.

Software Development

We have selected Ruby on Rails as a new core tool for use in creating web applications. Ruby on Rails is gaining widespread adoption in the academic libraries community and there are a number of existing applications and toolkits we are interested in supporting (e.g., Hydra, ArchivesSpace, Avalon).  Ruby on Rails will also make it much easier to build certain types of web applications from scratch and rapidly prototype new applications.  Three of our developers have completed four weeks of training and are beginning work on migrating some of our existing applications.  We will also be hiring a Contingent-I Ruby on Rails developer to assist in this effort.

SSDR developers have been learning about Apache Camel, a tool for implementing enterprise integration patterns that we first learned of from the Fedora Commons Repository development community, for use in passing messages from the repository to various indexing tools. Implementation of the replacement for our Wufoo to SysAid connector is underway, as is investigation into integrations between our selected document store (Box.com or Google Drive) and Apache Solr for use in the new Hippo CMS-based staff intranet.

The Web Advisory Committee has worked with SSDR to complete the wireframes and mockups for improvements to the Staff Directory and Subject Specialists pages on the Libraries’ Website.  The pages will receive visual improvements and more consistent presentation of contact information.  We will also add the capability for photographs and profiles for each staff member, to be initially populated for Subject Specialists.  Development work will begin in May.

User and System Support

Last year, the University of Maryland took notice of the 13 separate email and calendar systems in use across the campus. Maintaining all 13 systems was costing the university a lot of money, so to cut costs the university decided to consolidate to one email, calendar and collaboration platform. Last fall, a committee composed of IT leaders from across campus was formed to evaluate and recommend a common email solution. The committee worked diligently for five months and recommended to the IT Council that the university move forward with Google Apps for Education (GAFE). The core services in the GAFE suite are Gmail, Calendar, Classroom, Contacts, Drive, Docs, Forms, Groups, Sheets, Sites, Slides, Talk/Hangouts and Vault. The IT Council reviewed this recommendation and decided to move forward. Departments using the Division of IT-supported Exchange email and calendar system would be the first to be moved to GAFE.

In preparation for the Libraries’ move to GAFE, the User and System Support (USS) team became early adopters in January of this year. As early adopters, USS staff were able to experience the migration process and use GAFE so that they could provide local help when the rest of the Libraries migrated. USS planned and provided two “Google Migration Show and Tell” sessions and sent multiple emails to staff with as much information as possible to make the migration painless.

Division of IT planned to migrate the Libraries between April 1 and April 4, 2016. On Monday, April 4, USS staff visited departments and library branches across campus to provide any assistance staff needed. The migration proved mostly successful, with most staff being migrated without incident. As with any large and complex technical change, however, small problems cropped up that needed to be addressed.

Library staff can use any of the 13 core services in GAFE. Although the Libraries currently uses Microsoft Lync for chat services, library staff can also use Google Talk/Hangouts. A determination will be made in the future regarding the decommissioning of Lync in favor of Google Talk/Hangouts.

USMAI (University System of Maryland and Affiliated Institutions) Library Consortium

Support to USMAI

The CLAS team responded to 120 Aleph Rx submissions and 16 e-resource requests from across the consortium’s libraries in March.

SFX Database Upgrade

Ex Libris ended SFX support for MySQL, requiring a migration to MariaDB. Hans Breitenlohner performed the migration on March 18th.

Aleph Upgrade

As noted in the February 2016 Digistew post, CLAS is upgrading Aleph from version 20 to version 22. Version 22 has been installed in a development environment and is currently being tested by team members. Once initial testing is complete, Aleph TEST will be upgraded and made available for testing by USMAI constituencies. The upgrade is planned for completion prior to the Fall semester.

Kuali OLE

David Dahl participated in weekly meetings of the OLE Technical Council. The group held its last meeting on March 24th and has been disbanded as the project transitions to a new governance model. A regularly occurring “community forum” is currently being developed as a mechanism to gather input from project partners. David will continue to monitor the project for USMAI as it starts its new phase of development.

MD-SOAR

Joseph Koivisto explored Google Tag Manager (GTM) as a mechanism to deploy Google Analytics to the Maryland Shared Open Access Repository (MD-SOAR) and collect more detailed usage data from the repository. GTM will be added to MD-SOAR in the May upgrade release. An option for users to add a Creative Commons license to their repository submissions will also be included in the upgrade.

Staffing

Kate Dohe started as the Digital Programs & Initiatives Manager on March 21st. She comes to UMD Libraries from Georgetown University, where she was the Digital Services Librarian in the main campus library for nearly three years. Prior to working at Georgetown, she was the digital librarian for an academic publishing company in California. She earned her MLISc. from the University of Hawai’i, and also holds a BSEd. in Speech and Theater from Missouri State University, and still considers herself a debate coach at heart.

Conferences, workshops and professional development

In late April, Josh Westgard attended the DC area Fedora Users Group meeting at the National Library of Medicine, where he, together with Ben Wallberg and Peter Eichman, presented on the Libraries’ progress in implementing a Fedora-4-based repository system.  http://umd-lib.github.io/dcfug2016/

Liz Caringola, Eric Cartier, and Robin Pike attended the Mid-Atlantic Regional Archives Conference from April 14-16. On April 15, Eric moderated a debate in a panel session with the topic “Should Archivists Be Required to Take Continuing Education Courses?” Also on April 15, Robin presented in the session “Archival Impact: Increasing Connections to Collections through Digitization,” discussing how UMD Libraries prioritizes digitization projects.

On April 28th, Kate Dohe presented with Laura Leichum, Georgetown University, at the Digital Initiatives Symposium in San Diego, CA on library support models for student publishing initiatives.

On May 6th, Kate Dohe co-presented a workshop on using improv techniques in library collaborations at LOEX in Pittsburgh, PA with Erin Pappas, Georgetown University.

Stew of the Month: January 2015

Digitization Activities

With Jennie Knies’s departure, the Historic Maryland Newspapers Project has moved from Digital Programs and Initiatives to Digital Conversion and Media Reformatting.

Our film digitization vendor delivered the files for 47 deteriorated films from the Library Media Services collection, which will be ingested into the Films @UM collection next month. This project was funded through the work of the Digitization Initiatives Committee.

In December, Neil Frau-Cortes and Robin packed 300 books from the brittle Hebraica collection, to be shipped to a digitization vendor. This is the first digitization shipment of a multi-year project to digitize the unique books and serials in this collection.

Eric trained student digitization assistants Rachel Dook, Massimo Petrozzi, and Brin Winterbottom how to digitize open reel audio tapes by transferring episodes of National Public Radio’s “All Things Considered” from the 1970s. Their new capabilities will provide us more flexibility when scheduling in-house audio digitization.

Digital Programs and Initiatives

The new year started out on a difficult note for DPI, with the official announcement that our fearless leader, Jennie Knies, would depart before the end of the month to take up the position of head librarian at the library of Penn State University, Wilkes-Barre campus. While we wish Jennie all the best in her new endeavors, and look forward to continued contact with her on the conference circuit out there in library land, her departure — after nearly 15 years of service to the UMD Libraries and more than two years as founding manager of our department — leaves a big hole in the division that will not be easily filled.  After we spent much of the month repeatedly attempting to execute the following command to no avail:

    mysqldump -f jennies_brain.db > filingcabinet/knowledge.sql

Jennie mercifully agreed to write it all down. 😉  Thanks, Jennie, for everything!

January also saw the adoption of our first official (non-journal) publication associated with our nascent e-publishing program. A Colony in Crisis: The Saint-Domingue Grain Shortage of 1789 (http://colonyincrisis.lib.umd.edu) is a series of “primary sources from an episode in the history of Saint-Domingue,” translated and curated by Abby Broughton, Kelsey Corlett-Rivera, and Nathan Dize. We say ‘adoption’ rather than ‘release’ or ‘publication’ because the Colony in Crisis website has been around for a while already, associated with a larger effort to make freely available a series of French Pamphlets from the Libraries’ Rare Books collection, the fruits of which can be browsed in the UMD French Pamphlets collection on the Internet Archive. By adopting the site into the e-publishing program, the Libraries are taking responsibility to support and ensure the continued accessibility of the authors’ work into the future.  Additional e-publications are in the works.

Software Development

The upgrade to Hippo CMS 7.8 has been completed.  We have taken advantage of the new Solr integration feature in 7.8 to migrate Jim Henson Works to Solr and, working with Special Collections in Performing Arts (SCPA) staff, to add the new Score Collections Database.  Hippo CMS 7.8 also provides new application architecture features, which we will be implementing to increase performance and reliability.

Libi, the staff intranet, was upgraded to Drupal 6.33 and a number of minor bugs were fixed.  This completes the first round of changes in response to coordination with the Libi subcommittee of the Library Assembly Advisory Council (LAAC).  The subcommittee is currently soliciting Libraries staff feedback on the future of Libi, which will inform our plans; these may include an upgrade from the outdated Drupal 6 to Drupal 7/8 or possibly a new implementation.

We have begun a project, along with Libraries’ HR, to develop an online student application submission form and supervisor workflow integrated with Libi.  This system will replace the current paper workflow, in which students submit their applications and supervisors review the entries for matching skills and available hours: a manual, time-intensive process of reading through stacks of paper entries.

Database Finder was released roughly one year ago as an easier-to-use alternative to databases in Research Port, providing unauthenticated access to database information and improved search and discovery.  Working with Nevenka Zdravkovska, the Web Advisory Committee, and Subject Specialists, we have specified a second round of improvements for Database Finder: 1) Research Port categories and sub-categories will be added to the database information along with a faceted browse, and 2) Categories will be linked to Subjects for contextual help and links from Database Finder to Subject Specialists.  The implementation project is still to be scheduled.

User and System Support

Mobile carts for conference rooms

As User and System Support (USS) spoke to users of the Libraries conference rooms, it became apparent that these spaces are hosting both static and dynamic events. In some cases people, chairs and tables remain in place for the whole event whilst other meetings involve movement of attendees and rearrangement of furniture to facilitate discussion. When the time came for updating the technology in The Dean’s Conference Room and rooms 7113 & 7121, USS decided to provide portable and easily used equipment.

Each of the three carts has:

  • 70 inch TV
  • HDMI laptop connection with adapters for multiple types of connections: Mac, Windows, Android phone…
  • DVD/VCR player
  • high definition web camera and a high quality microphone
  • wireless keyboard and mouse
  • USB hub to transfer documents to a mini-PC (for example, take your PPT on a thumb-drive and upload it – remember to delete it from the mini-PC afterwards)

Photo: Clear instructions for users.
Photo: Back of unit (the small red box is the mini PC).

3D printing was utilized on the carts; Preston created and printed custom brackets to hold the laptop and microphone cables neatly.

The main benefit of these carts is mobility; they can be quickly unplugged and moved to another space in the library as needed. [Carts must be returned to their original room at the end of meetings.]  A battery backup keeps things powered on for up to 10 minutes while the cart is being moved or if we lose power. Other benefits include web meetings and recording using Adobe Connect. The “all-in-one” design makes them visually appealing and easy to use.

We have had positive feedback on their use: Tim Hackman (Director, User Services & Resource Sharing), says “they’re easy to use and self-explanatory”. Eric Bartheld (Director of Communications) reports that “It’s very easy for a presenter to plug in his or her Apple laptop – which wasn’t the case before.  Great improvement”.

USMAI (University System of Maryland and Affiliated Institutions) Library Consortium

The Consortial Library Applications Support (CLAS) team continues to support the consortium’s existing shared systems while preparing for the next generation of systems. In January, the team responded to 113 Aleph Rx submissions and 25 e-resource requests. Timely responses to these requests have helped libraries in the consortium perform their daily operations and pursue new initiatives to make their work more effective and efficient.

David Steelman researched some workflow issues with Aleph Rx submissions and, with the team’s review, has modified the interface to make submitting requests more intuitive. No issues have arisen with request submissions since the modifications were implemented.

The team continues its investigation of Kuali OLE, participating in regular meetings with other partners and testing system functionality in all functional areas. Hans Breitenlohner and David S. developed and shared with the OLE community a Perl script for copying bibliographic import profiles, making the process of creating new bib loaders much less error-prone. In anticipation of expanded testing by College Park staff and members of the USMAI Next-Gen ILS Working Group, a Zoho Projects site has been created to facilitate communication and documentation during the testing. Plans to provide a stable testing environment have also been made.

Metalib (a.k.a. ResearchPort) is in the middle of a migration from aging hardware to its new virtual machine environment. Under Hans’s leadership, the team is currently testing and fixing bugs in order to prepare for a production cutover tentatively scheduled for late March.

Staffing

David Dahl joined us in early January as Director of the CLAS team. He has had a busy first month as can be seen from his USMAI/CLAS blog entry above.

The Historic Maryland Newspapers Project hired two new student assistants to assist in metadata collation, quality review, and outreach. Melissa Foge and Kerry Huller are both candidates for Master of Library Science in the College of Information Studies.

The Hornbake Digitization Center welcomed three new digitization assistants, Caroline Hayden, Ryan Jester, and Marlin Olivier, all first-year students in the College of Information Studies.

Welcome to DSS, David, Melissa, Kerry, Caroline, Ryan, and Marlin!

Aaron Ginoza has joined SSDR as a participant in the job enrichment program.  Aaron will be working roughly four hours per week for six months, with these enrichment objectives:  1) participating in the responsive design project for the Libraries’ website by increasing web technology skills (HTML/CSS/JavaScript) and researching responsive design options, 2) implementing social media improvements for the website, and 3) learning the software engineering workflows for web development.

Conferences, workshops and professional development

“‘Is This Enough?’ Digitizing Liz Lerman Dance Exchange Archives Media,” a session that Bria Parker, Vin Novara, and Robin Pike proposed to the ICA-SUV conference, was accepted. They will co-present in July 2015.

Francis Kayiwa attended OCLC’s Developer House in Dublin, OH from December 1-5, 2014. Developer House is a place where library technologists can gather together for five days to share their perspectives and expertise as they hack on OCLC web services. Working with technologists from other organizations, Francis helped create a “Today in History” beta application. See for more information on Developer House Project.

Francis Kayiwa also attended the Code4Lib Conference in Portland, Oregon from February 9th – 12th. Francis co-taught a half-day pre-conference on the use of Docker technology. Docker is a relatively new software containerization tool that is used to provide software as a service.

Appointments

Babak Hamidzadeh has accepted an invitation to serve as Senior Advisor of Information Science for SESYNC, the National Socio-Environmental Synthesis Center based in Annapolis. The center is “dedicated to accelerating scientific discovery at the interface of human and ecological systems” with projects as diverse as storm water management and North American beavers. With this appointment, Babak hopes to directly align and link technical development in the two organizations, UMD Libraries and SESYNC.

Stew of the Month: August 2014

New Technologies

This past summer, User and System Support (USS) initiated a project with the Public Services Division (PSD) to convert the former Reference desk space in the front of McKeldin Library into a “Laptop Bar” to provide seating and power for students using their personal laptops in the library.  USS acquired power surge protectors in the shape of pyramids to be placed on the tables for student use. PSD acquired bar-style chairs for the area. The Laptop Bar was completed by the beginning of the Fall 2014 semester and has been a major success. Students started using the space immediately. Below are before and after photos:

Before: [photos]

After: [photos]

Collection-building

Statistics

In the process of gathering our ARL statistics for FY2014, we can note the following increases in our Digital Collections and DRUM holdings since June 30, 2013 (2013 numbers in brackets):

  • Images/Manuscript records in Digital Collections: 17,376  [13,990]
  • Film Titles in Digital Collections: 2,673 [2,232]
  • Audio Titles in Digital Collections: 356 [200]
  • Internet Archive titles: 4,382 [3,906]
  • Prange Digital Children’s Book Collection: 7,936 [4,450]
  • DRUM (e-theses and dissertations): 9,511
  • DRUM (technical reports & other): 5,581
  • DRUM TOTAL: 15,092
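
The increases can be worked out directly from the figures listed above. The following is a small sketch that uses only the numbers in the list; the dictionary layout and function name are chosen for this example. The list gives no 2013 figures for DRUM, so the sketch only re-adds the DRUM total.

```python
# (FY2014 count, FY2013 count) for each collection, taken from the list above.
holdings = {
    "Images/Manuscript records": (17376, 13990),
    "Film titles": (2673, 2232),
    "Audio titles": (356, 200),
    "Internet Archive titles": (4382, 3906),
    "Prange Digital Children's Book Collection": (7936, 4450),
}


def growth(current, previous):
    """Return the absolute increase and the percentage increase."""
    increase = current - previous
    return increase, 100 * increase / previous


increases = {name: growth(cur, prev) for name, (cur, prev) in holdings.items()}

for name, (increase, percent) in increases.items():
    print(f"{name}: +{increase:,} ({percent:.1f}%)")

# The DRUM total is the sum of the two DRUM lines in the list.
drum_total = 9511 + 5581
print(f"DRUM total: {drum_total:,}")
```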

Those numbers are the result of hard work from staff throughout DSS, as well as content selectors and creators from throughout the Libraries.

ArchivesSpace

ArchivesSpace is an open source archives information management application for managing and providing web access to archives, manuscripts and digital objects. The UMD Libraries has been running a sandbox version of ArchivesSpace for use by Special Collections and University Archives for many months.  In August, DSS completed a Service Level Agreement for the production version of ArchivesSpace, and Paul Hammer (SSDR) converted the existing sandbox server to a production instance.

Prange Digital Children’s Book Collection

We are proud to announce that all of the Prange Digital Children’s Books (8,082 of them) have been loaded into our Fedora Digital Collections repository.  However, as is often the case, the final cleanup takes the most time.  Paul Hammer (SSDR) and Jennie Levine Knies (DPI) worked together with Amy Wasserstrom and Kana Jenkins in the Prange Collection to troubleshoot the final 200 books that have load issues. Graduate Assistant Alice Prael (DPI) also assisted in cleaning up duplicates and comparing data lists in order to help identify the problem records.

Aeon

On August 1, Special Collections and University Archives officially began using a hosted version of Atlas Systems’ Aeon software. Aeon is automated request and workflow management software specifically designed for special collections, libraries and archives. Jennie Knies and Paul Hammer worked with Special Collections staff to implement request buttons in both ArchivesUM and Digital Collections that pass metadata to Aeon forms, automating the patron request process.

Digitization Activities

Robin Pike worked with vendors and collection managers to solidify digitization contracts for materials that will be sent to digitization vendors during FY15. The formats represented in the digitization projects include books, serials, pamphlets, photographs, microfilm, open reel audio tape, wire recordings, VHS tape, and 16mm film. The collection areas represented in the projects include Special Collections and University Archives (labor collections, university archives, mass media and culture, rare books, Prange collection materials), Special Collections in Performing Arts, Library Media Services, and Hebrew language materials from the general collection.

Digitization assistants completed projects for the campus community. Audrey digitized Athletics media guide covers that will be used to produce posters, which will be gifts for an upcoming alumni event. Several assistants digitized photos of Terrapin football players, which will be used in the new Terrapins in the Pros interactive exhibit at the Gossett Team House.

Abby digitized Mid-Atlantic Regional Archives Conference programs. Additional MARAC publications will be digitized this year, both in-house and through the Internet Archive, making this regional resource more available to archivists everywhere.

Software Development

Working with the Web Advisory Committee, Shian Chang and Cindy Zhao completed a refresh of the Libraries’ Website interface.  The update includes the addition of the new UMD responsive wrapper, as required by a new campus brand integrity program (see http://brand.umd.edu/websitepresentation.cfm); a change of the main menus seen on every page to a new “mega menu” dropdown style, enabling users to view more options with integrated explanatory text; and a new social media image bar at the bottom of the homepage.  This refresh is part of a general plan for constant, iterative improvements to the website and a specific plan to ultimately convert the entire site to a responsive design.

SSDR has been planning on adding Solr client capabilities to Hippo CMS for some time, but discovered recently that Hippo CMS 7.8 comes with a Solr Integration feature out-of-the-box, supporting both index/search for internal Hippo documents and search for external documents.   Mohamed Abdul Rasheed reviewed the functionality and determined that the external search feature can handle our needs.  He started work migrating our existing Digital Collections interfaces (Digital Collections, Jim Henson Works, World’s Fair) to the new Solr-based search as well as adding new database searches for Special Collections in Performing Arts (SCPA) scores and recordings databases. The databases will continue to be maintained by SCPA staff in FileMaker Pro but exported to CSV, imported into Solr, and exposed through the Libraries’ Website for search and discovery.
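
The CSV-to-Solr step can be pictured with a minimal sketch. It assumes each CSV row becomes one Solr document and that empty cells are simply left out; the column names and sample rows are hypothetical, and the actual SCPA export and import process may differ. The sketch stops at building a JSON array of documents, which is the kind of payload Solr's JSON update handler accepts, and does not contact a Solr server.

```python
import csv
import io
import json


def csv_to_solr_docs(csv_text, id_field):
    """Turn CSV rows into a list of dicts shaped like Solr documents."""
    docs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Keep only non-empty text cells so blanks are not indexed.
        doc = {
            key: value.strip()
            for key, value in row.items()
            if isinstance(value, str) and value.strip()
        }
        if id_field not in doc:
            continue  # skip rows that lack a unique key
        docs.append(doc)
    return docs


# Hypothetical export with made-up column names and rows.
sample_export = """id,title,composer
score-001,Symphony No. 1,Example Composer
score-002,Wind Quintet,
,Untitled fragment,Unknown
"""

docs = csv_to_solr_docs(sample_export, id_field="id")
payload = json.dumps(docs, indent=2)  # JSON array of documents
print(payload)
```

Rows without an identifier are skipped rather than given a generated one, on the assumption that the unique key should come from the source database.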

Services

USMAI (University System of Maryland and Affiliated Institutions Consortium)

Kuali OLE (Open Library Environment) implementation: Consortial Library Applications Support (CLAS) team members have been participating in weekly teleconferences with University of Pennsylvania staff who are working on UPenn’s OLE implementation. Both groups are discovering that key implementation documentation necessary for bringing up a test instance is missing. At present, we have OLE software installed on a local server, but it is populated with demo data. We have not yet been able to load our own data for testing. We are hopeful that forthcoming teleconferences will provide the information and guidance we need to proceed.

USMAI Advisory Groups: As interim Chair of the Digital Services Advisory Group, Mark Hemhauser completed a first meeting with the Reporting and Analytics Subgroup and the Metadata Subgroup, where he shared the information from CLD about Advisory Group funds and reporting plans. Mark also shared information on membership terms and the group chairs with the USMAI Executive Director. The CLAS team also compiled a list of current email lists and reflectors supporting USMAI communications and sent it to the Executive Director. Linda Seguin revised the Groups page on the USMAI staff web site, added new group pages, and created and distributed editing logins to each advisory group/subgroup.

SFX support: Linda revised SFX parsers to get both Romanized and vernacular text into the Aeon request form for College Park’s Prange collection. Linda revised the Aleph Source Parser to get publication information from the new(ish) MARC 264 field for use in SFX linking. Linda and Ingrid Alie added the HathiTrust local target to Salisbury University’s and the UM Health Sciences and Human Services library’s SFX instances.

Circulation support for USMAI: David Wilt set up new Item Statuses in Aleph for the University of Baltimore and College Park; produced ad hoc reports for Frostburg, Bowie, Towson, University of Baltimore, College Park, Saint Mary’s, and UMBC; and completed a patron load for Eastern Shore. David also worked on setting up the booking function in Aleph for Shady Grove.

Acquisitions/serials support for USMAI: Mark exported data from the USMAI licensing database for College Park’s licensing evaluation project; produced a variety of subscription reports for College Park as part of a database clean-up project; produced a special claims report for Morgan State; and helped staff at the University of Baltimore identify a problem with dirty order data after fiscal rollover and provided training on order closing procedures and order clean-up. Mark also flipped the budget code to make corrections on 75 orders, saving UB staff a lot of manual effort.

Aleph database support for USMAI: Linda and Hans Breitenlohner ran a new extract of College Park holdings for their participation in HathiTrust. Linda sent a sample file of book records to RapidILL for UMBC. Linda also deleted withdrawn/purged items for UMBC, College Park and Health Sciences, and with assistance from Heidi Hanson, loaded bibliographic record sets for UMBC, the Center for Environmental Science, and Health Sciences.

Aleph system support: The CLAS team and DSS staff are monitoring a pattern of Aleph slowdowns that began this month. We are currently restarting the Aleph server manually when slowness is reported.

Staffing

Peter Eichman joined DSS as a Contingent-I Systems Analyst in SSDR, providing broad software development support for UMD and Consortial applications. Peter is a UMD alumnus (B.A.s in Linguistics and Philosophy), and has also worked for the ARHU Computing Services office and the National Foreign Language Center as a web application developer.   Peter started on August 19 and is currently working on improvements to Aleph Rx, the DSS issue tracking tool for Aleph.

On August 22, Josh Westgard, graduate assistant in DPI, graduated from the iSchool’s MLS program in Curation and Management of Digital Assets.

Ann Levin, the DSS Project Manager, left the UMD Libraries in August. Ann made a significant impact during her time with DSS, developing documentation procedures and working on several projects, most notably the Prange Digital Children’s Book Collection.

Amrita Kaur joined the DSS staff as the Coordinator. Amrita has worked for the University Libraries for many years, and was most recently in the Architecture Library. Welcome, Amrita!

Events

The Historic Maryland Newspapers Project hosted UMD Libraries’ first public Wikipedia edit-a-thon on August 18. Twenty-four people attended, either in person or virtually through an Adobe Connect meeting (recording available here: https://webmeeting.umd.edu/p37wtrvy3iw/). We invited speakers from Wikimedia DC and the Library of Congress, as well as our own Doug McElrath, Jennie Knies, and Donald Taylor, to share information about resources to be used during the editing portion of the event. Participants enhanced and added articles related to Maryland newspapers and Wikimedia DC’s Summer of Monuments project, and uploaded digitized images from our National Trust Library Historic Postcards Collection to Wikimedia Commons.

Conferences and Workshops

Trevor Muñoz, Karl Nilsen, Ben Wallberg, and Joshua Westgard attended the Code4Lib DC 2014 conference at George Washington University on August 11-12. Josh Westgard led a session on spreadsheets. He had suggested the topic at the start of the unconference planning, so under the unconference protocol he moderated the discussion. The participants in the session talked about strategies and tools for managing data stored in spreadsheets, or data that must pass through a spreadsheet while migrating from one storage location to another. One highlight of the discussion was the description of csvkit (https://csvkit.readthedocs.org), a Python-based suite of command-line tools for the cleanup and manipulation of data stored in CSV files. A breakout group split off later in the conference to begin learning csvkit.

Josh Westgard attended a one-day workshop on “Building Data Apps with Python” offered by District Data Labs (http://www.districtdatalabs.com).  The workshop covered application set up, best practices for application design and development, and the basics of building a matrix factorization application.

Jennie Knies, Liz Caringola, Robin Pike and Eric Cartier attended the Society of American Archivists annual conference in Washington, DC on August 11-16. Robin currently serves as the chair and Eric serves on the steering committee of the Recorded Sound Roundtable. Robin chaired and presented on the panel session Audiovisual Alacrity: Managing Timely Access to Audiovisual Collections. Eric contributed audiovisual clips from UMD’s collections for the first AV Archives Night, a networking event featuring content from attendees’ repositories, hosted by Audiovisual Preservation Solutions at the Black Cat. Liz Caringola was a panel speaker for the session “Taken for ‘Grant’ed: How Term Positions Affect New Professionals and the Repositories That Employ Them.” Karl Nilsen gave a talk on database curation and preservation as a part of a panel on stewarding complex objects. Download the slides from DRUM: http://hdl.handle.net/1903/15573. His talk was based on Research Data Services’ efforts to curate and preserve the Extragalactic Distance Database, an online data collection that was created by astronomers at UMD and other institutions.

Liz Caringola attended one of the weeklong Humanities Intensive Learning and Teaching (HILT) workshops offered by MITH, “Crowdsourcing Cultural Heritage.” Karl Nilsen completed the HILT digital forensics course.

Solr System

We are in the process of integrating Apache Solr to work with our Fedora-based digital repository on the back end. I am not going to pretend that I know and understand all of the technical details about Solr, as listed on their home page. My layperson interpretation of its features is as follows:

  1. Solr is a standalone enterprise search server with a REST-like API. I think this means that Solr runs on its own and can be accessed via a URL in a web browser.
  2. You put documents in it (called “indexing”) via XML, JSON, CSV or binary over HTTP. At UMD, our “documents” are the FOXML XML files where Fedora stores our metadata.
  3. You query it via HTTP GET and receive XML, JSON, or CSV results. We can use a web browser and a URL to construct queries.
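
To make the third point concrete, here is a minimal Python sketch that builds the kind of URL one could paste into a web browser to query Solr. The host, port, and core name are assumptions for illustration only and do not describe our actual server; `q`, `wt`, and `rows` are standard Solr query parameters.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical base URL: the host, port, and core name are assumptions.
SOLR_SELECT_URL = "http://localhost:8983/solr/example_core/select"

def build_query_url(query, response_format="json", rows=10):
    """Build a Solr select URL for an HTTP GET request.

    q    - the query string
    wt   - the response format ("xml", "json", or "csv")
    rows - the maximum number of results to return
    """
    params = {"q": query, "wt": response_format, "rows": rows}
    return SOLR_SELECT_URL + "?" + urlencode(params)

url = build_query_url("Title:letter")
print(url)
```

Changing `response_format` to "xml" or "csv" asks Solr for the same results in a different format.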

At UMD, we ingest content into our Fedora repository via two methods: a home-grown web-based administrative interface for adding images, and batch ingest (which currently requires developer assistance). We use the administrative interface to manage the metadata for our digital objects. However, the administrative interface has always lacked robust reporting capabilities. Solr includes a robust administrative interface of its own that allows for the construction of complex queries and reporting outputs. For me, as a user, this is Solr’s greatest benefit for us. Our Software Systems Development and Research team tries, whenever it can, to put as much knowledge as possible in the hands of the users. It is a win-win situation. For them, it eliminates having to answer and investigate really basic questions for me, and for me, it means I can achieve results and do my work without having to depend on others.

Solr first requires the development of a schema, which is essentially a file that explains what we want to index and how. Understanding how to read and interpret the schema is a first step to understanding how Solr works. First, you define fields, and these fields are related to our metadata. In a simple example, a “Title” field in Solr is an index on the <title> tag in our Fedora metadata. Within a field, we can define how the field acts. For example, we have defined a field type of “umd_default” that runs a series of filters on our data. These filters are the key to understanding how searching works in Solr. I’m going to use the following piece of correspondence as an example: Letter by Truman M. Hawley to his brother describing Civil War battle. Includes envelope, September 24, 1864. When Solr indexes this title it does a number of things. Many of these things are customizable, and this is what is important to understand.

  1. It separates and analyzes each word and assigns locations to them. “Letter” is in location 1 and takes up spaces 0-6 (the space at the end of the word is included in the word).
  2. It determines the type of word (is it alphanumeric? Or just a number? 1864 is just a number)
  3. It removes punctuation. Finally, a place where no one cares about commas.
  4. It removes stopwords. We apply a “StopFilterFactory” filter to remove stopwords. These can be customized. In our system, “by” and “to” are considered stopwords and we do not index them.
  5. It converts everything to lower case. Solr does not have to do this. We apply a “LowerCaseFilterFactory”  with the assumption that our users will not need to place emphasis or relevancy on case in searches.
  6. We apply an “ASCIIFoldingFilterFactory” that converts alphabetic, numeric, and symbolic Unicode characters which are not in the “Basic Latin” Unicode block into their ASCII equivalents, if one exists. So, for example, a search on “Munoz” will match on “Muñoz.”
  7. We apply the “PorterStemFilter” to the Title field.  This filter applies an algorithm that essentially truncates words based on assumptions about endings. In the example above, “describing” becomes “describ” and “battle” becomes “battl.”
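
The steps above can be approximated in a few lines of Python. This is only a sketch of the idea, not how Solr implements its analysis chain: the tokenizing rule is a simplification, the stopword list contains only the two words named above, and the stemming step is left out entirely because a faithful Porter stemmer is too long to show here.

```python
import re
import unicodedata

# Only the two stopwords named in the list above; a real list is longer.
STOPWORDS = {"by", "to"}

def fold_to_ascii(text):
    """Decompose accented characters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def analyze(title):
    """Approximate the indexing steps, minus stemming."""
    # Split into word tokens, which also discards punctuation.
    tokens = re.findall(r"\w+", title)
    # Remove stopwords (compared without regard to case).
    tokens = [t for t in tokens if t.lower() not in STOPWORDS]
    # Convert everything to lower case.
    tokens = [t.lower() for t in tokens]
    # Fold non-ASCII characters to ASCII equivalents where one exists.
    return [fold_to_ascii(t) for t in tokens]

print(analyze("Letter by Truman M. Hawley to his brother"))
print(analyze("Muñoz"))
```

A stemming step applied afterward would further shorten the surviving words, for example turning “describing” into “describ.”
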
What we are left with is indexing on the following terms:

letter truman m hawlei hi brother describ civil war battl includ envelop septemb 24 1864

This means that I could run the following query in Solr, q=Title:(truman AND civil AND describ AND battl), and receive this letter as a hit. Solr still allows for phrase queries (“Letter by Truman M. Hawley”) and for wildcard searches: (Truman AND Hawl*). Our implementation of Solr currently assumes a boolean “OR” as the default operator in a search string. So, if I were interested in content having to do with the Civil War in the month of September, I might type something like “civil september” into a search box. Based on our configuration, this translates to: search the Title field for anything containing the term “civil” OR “septemb.” Here are just a few examples out of my more than 300 results:

  • The Greek beginning,Classical civilization
  • The Classical age,Classical civilization
  • Ancient civilizations The Vikings
  • Ancient civilizations The Aztecs
  • Ancient civilizations The Mayans
  • The Civil War in Maryland Collection
  • The Celts,Ancient civilizations
  • Acts of faith,Jewish civilization in Spain
  • Brick by brick: a civil rights story

How is this possible? Well, if I investigate how our PorterStemFilter analyses “civilization,” I discover that it becomes “civil.”  Also, as a user, in my brain, I am thinking that I want results that have to do both with the Civil War AND September, and Solr is returning results that have to do with either.  If I manually adjust my search to be a boolean “AND” search – Title:(civil AND September), I only see three relevant results. This might lead me to believe that we should instantly change our default search to “AND” instead of “OR” since obviously, if I type a search into a box and it has two terms, I want to see records with both those terms.  Our current default in our public interface is “AND.” And also, we should turn off the PorterStemFilter because all of those “civilization” hits are annoying. If I want to search for “Civil*” I will search for “Civil*.”
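
The difference between the two default operators can be shown with a toy index in Python. This is a simplified sketch, not Solr itself: the term sets are written out by hand, contain only a few of the stemmed terms each title would produce, and the function name `search` is made up for the example.

```python
# Toy index: title -> a hand-written subset of its stemmed terms.
# The term sets are simplified assumptions, not actual Solr output.
INDEX = {
    "Letter by Truman M. Hawley to his brother describing Civil War battle":
        {"letter", "truman", "hawlei", "civil", "war", "battl", "septemb"},
    "Ancient civilizations The Vikings":
        {"ancient", "civil"},   # "civilizations" stems to "civil"
    "Brick by brick: a civil rights story":
        {"brick", "civil", "right", "stori"},
}

def search(index, terms, operator="OR"):
    """Return titles matching any term (OR) or every term (AND)."""
    wanted = set(terms)
    if operator == "AND":
        return [title for title, indexed in index.items() if wanted <= indexed]
    return [title for title, indexed in index.items() if wanted & indexed]

or_hits = search(INDEX, ["civil", "septemb"], operator="OR")
and_hits = search(INDEX, ["civil", "septemb"], operator="AND")
print(len(or_hits), "hits with OR")
print(len(and_hits), "hits with AND")
```

With “OR,” every title containing a word that stems to “civil” comes back, including the “civilizations” one; with “AND,” only the letter remains.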

But is it so simple? What is best for the user? What default settings will be most useful for our users? This is a different discussion and I will be working with my colleagues on the Collections side of things to try to answer some of these questions. Solr is so robust, and can be used to fit so many different situations, that truly configuring it in the most effective way is overwhelming, but also exciting.