National Digital Newspaper Program: 2016-2018 Selection

Introduction

The UMD Libraries were awarded a $250,000 National Endowment for the Humanities (NEH) grant for the third phase of the Historic Maryland Newspapers Project, beginning September 1, 2016. Between 2016 and 2018, the project will digitize approximately 100,000 pages of newspapers published in the State of Maryland, adding to the more than 200,000 Maryland pages already in Chronicling America, the Library of Congress digitized newspaper database. The state partners contributing content for the third grant are the Maryland State Archives, also a partner on the second grant, and Frostburg State University Library. UMD’s theme for the third award is greater diversity: the selection includes one Polish-language paper and several labor papers, as well as newspapers whose political viewpoints contrast with those digitized during the first two grant cycles.

Title Selection

Project staff consulted with the Advisory Board to compile the list of titles that may be digitized during the 2016-2018 phase:

  • The Baltimore County Union (1865-1909), Towsontown, MD
  • Catoctin Clarion (1923), Mechanicstown, MD
  • The Citizen (1895-1922), Frederick, MD
  • Czas Baltimorski (1940-1941), Baltimore, MD
  • Democratic Messenger (1881-1922), Snow Hill, MD
  • Evening Capital, Evening Capital and Maryland Gazette (1884-1922), Annapolis, MD
  • Frostburg Mining Journal (1871-1917), Frostburg, MD
  • The Frostburg Forum (1897-19??), Frostburg, MD
  • The Frostburg Gleaner (1899-19??), Frostburg, MD
  • The Frostburg Herald (1903-19??), Frostburg, MD
  • The Frostburg News (1897-18??), Frostburg, MD
  • The Frostburg Spirit (1913-1915), Frostburg, MD
  • Greenbelt Cooperator (1937-1943), Greenbelt, MD
  • Maryland Independent (1874-1934), Port Tobacco, MD
  • The Midland Journal (1885-1946), Rising Sun, MD
  • Voice of Labor (1938-1942), Cumberland, MD
  • Worcester Democrat and Ledger-Enterprise (1921-1953), Pocomoke City, MD

The list may be modified if the project’s student assistants, while collating the microfilm, find that the images are too poor in quality to digitize.

Mutilated pages from the Maryland Independent

Copyright Research

In July, NEH announced the expansion of the NDNP date range to 1690-1963. For newspapers published between 1923 and 1963, project staff need to perform copyright research to determine whether each newspaper issue was registered with the Copyright Office and, if it was registered, whether the copyright was renewed 28 years later, as the law required. Project staff decided to use the resources available through the Copyright Office to determine whether these titles are in the public domain:

  • Catoctin Clarion (1923), Mechanicstown, MD
  • Czas Baltimorski (1940-1941), Baltimore, MD
  • Greenbelt Cooperator (1937-1943), Greenbelt, MD
  • Maryland Independent (1874-1934), Port Tobacco, MD
  • The Midland Journal (1885-1946), Rising Sun, MD
  • Voice of Labor (1938-1942), Cumberland, MD
  • Worcester Democrat and Ledger-Enterprise (1921-1953), Pocomoke City, MD

With guidance from the Library of Congress on how to perform copyright research, Doug McElrath (SCUA) and Robin Pike developed instructions for Doug, Robin, Judi Kidd, and Amy Wickner (SCUA) to perform the research and track their results, providing evidence to the Library of Congress and NEH that the titles are in the public domain. The project staff will primarily search the pre-1978 Catalog of Copyright Entries, but may also have to search the Copyright Catalog (1978-Present) for renewed registrations. Unlike a book, which is a single entity, newspapers are copyrighted by the issue, so project staff will have to run title searches across the entire date range of publication to confirm that every issue is in the public domain.
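The research steps described above can be sketched in a few lines of Python. This is only an illustration of the decision logic as this post describes it, not the instructions the project staff actually wrote. The names IssueRecord, renewal_catalogs, and research_status are made up for the example, and the 27-to-28-year renewal window is a simplifying assumption about when a renewal would have been filed.

```python
from dataclasses import dataclass

FIRST_YEAR, LAST_YEAR = 1923, 1963   # date range that needs copyright research
CATALOG_SPLIT_YEAR = 1978            # records from 1978 on are in the Copyright Catalog


@dataclass
class IssueRecord:
    """One newspaper issue and what the catalog searches turned up (hypothetical)."""
    title: str
    year: int
    registered: bool
    renewed: bool = False


def renewal_catalogs(pub_year: int) -> list[str]:
    """Catalogs to search for a renewal of an issue published in pub_year.

    Assumption: a renewal would have been filed in calendar year
    pub_year + 27 or pub_year + 28, so both years are covered.
    """
    earliest, latest = pub_year + 27, pub_year + 28
    catalogs = []
    if earliest < CATALOG_SPLIT_YEAR:
        catalogs.append("Catalog of Copyright Entries")
    if latest >= CATALOG_SPLIT_YEAR:
        catalogs.append("Copyright Catalog (1978-Present)")
    return catalogs


def research_status(issue: IssueRecord) -> str:
    """Apply the registered / renewed questions to a single issue."""
    if not FIRST_YEAR <= issue.year <= LAST_YEAR:
        return "outside the 1923-1963 research range"
    if not issue.registered:
        return "public domain: no registration found"
    if not issue.renewed:
        return "public domain: registration not renewed"
    return "not cleared: registration was renewed"


if __name__ == "__main__":
    # Because newspapers are copyrighted by the issue, every year of a
    # title's run in the research range has to be checked.
    for year in (1949, 1950, 1953):
        print(year, renewal_catalogs(year))
```

Under these assumptions, only issues from about 1950 onward, such as the last few years of the Worcester Democrat and Ledger-Enterprise, would send a researcher to the 1978-Present catalog; everything earlier stays in the Catalog of Copyright Entries.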

You’re Invited to the Historic Maryland Newspapers Wikipedia Edit-a-thon on May 2!

Today’s post is by Amy Wickner, student assistant and iSchool field study for the Historic Maryland Newspapers Project.

As part of an ongoing initiative to connect digital collections with Wikipedia, the Historic Maryland Newspapers Project (HMNP) will co-host a Wikipedia Edit-a-thon (May 2, 1-4pm) focusing on Maryland newspapers. We’ve set up an event page and advance registration form (strongly recommended) with all the details.

Photo from HMNP’s last edit-a-thon on August 18, 2014, at UMD Libraries.

Liz Caringola and I are working with special collections staff at the Maryland State Archives in Annapolis, who have been kind enough to provide space, computers, and guided tours of their collections. Maria Day and Allison Rein from MSA will highlight historic newspapers in their collections, while Liz will introduce edit-a-thon participants to Chronicling America and HMNP’s ongoing work. I’ll give short tutorials on editing Wikipedia and adding images to Wikimedia Commons. We’re hoping to draw participants from across the state and DC / Baltimore metro areas. All are welcome, and word-of-mouth promotion would be much appreciated.

Many edit-a-thon pages have a Goals section, conventionally a list of articles needing to be drafted, added, or improved. Our page has such a list, but we’d also like to help participants depart with at least some impulse to continue editing Wikipedia. (We’ll have a day-of participant survey of some kind to get at what brings people to our event.) Sparking a lifelong passion for editing Wikipedia using archival material as evidence would of course be fire, but growing sustainable participation more realistically involves a lot of small steps. Which is why it’s exciting to see that this is just one of many DC-area Wikipedia events this spring, with themes ranging from accessibility to labor to #ColorOurHistory.

Stew of the month: November 2015

Welcome to a new issue of Stew of the Month, a monthly blog from Digital Systems and Stewardship (DSS) at the University of Maryland Libraries. This blog provides news and updates from the DSS Division. We welcome comments, feedback and ideas for improving our products and services.

Digitization Activities

Historic Maryland Newspapers Project

On November 13, Robin Pike and Liz Caringola visited Frostburg State University to discuss the digitization of the Frostburg Mining Journal and other Frostburg newspapers held in print by their Special Collections. Digitization of these important Western Maryland newspapers will move forward contingent on the award of a third NDNP grant, which would begin on September 1, 2016.

Other Digitization Activities

The vendor digitization projects went out, including more than 10,000 pages of SCUA collection materials sent to the Internet Archive and diaries from the William Kapell collection in IPAM. These projects were funded through the DIC project proposal process.

Eric Cartier worked with Cindy Frank, Director of the Visual Resources Collection in the School of Architecture, Planning, and Preservation, to arrange a Digital Data Services digitization request with an architecture professor. Digitization assistants are scanning more than 100 color slides featuring images of buildings across the French countryside.

GA David Durden completed a reference spreadsheet of the most prominent grants that support digitization and digital projects. Robin will use this resource as she meets with librarians and staff to discuss funding sources for future digitization projects.

Digital Programs and Initiatives

Over the course of the fall, DPI carried out a pilot to test the technical feasibility of hosting the International Children’s Digital Library on DSS servers. ICDL is a free and open repository of Children’s Literature in various languages that was developed by faculty in the iSchool. The pilot was a success, so with Collections Strategies and Services having expressed their interest in supporting this important collection, the Libraries are now moving ahead to provide web hosting services for the ICDL. For more information on the ICDL, see http://childrenslibrary.org.

Transcribe Maryland is a pilot project to test the workflows and procedures for crowdsourced transcription of Digital Collections materials. In November, Josh Westgard carried out the migration of more than 17,000 images making up over 800 documents from our digital collections repository to a platform to support public transcriptions of those documents. The pilot project will take place in the spring semester 2016 in support of a course being offered in the English Department.

DPI, with help from DSS colleagues, is about to launch REDCap, an open source web application created by Vanderbilt University for building and managing online surveys and databases. REDCap will be offered as part of Research Data Services and will be available to UMD faculty and researchers. Please contact lib-research-data@umd.edu for more information.

Software Development

Hippo CMS has been successfully upgraded to version 7.9. The primary improvements for content creators are the new CKEditor for making HTML content changes and the channel manager options to preview pages on various device screen sizes. Also, automatic updates for the database finder and the staff directory have been restored.

The project to move the website to a Responsive Web Design template is now entering its final phases. The majority of the template development work has been completed and is being prepared for promotion to production. We are also working with the Web Advisory Committee to test the new template and create training opportunities for staff on how to update their content in preparation for the January 18 release date.

Initial development of the Fedora 4 authorization module, based on the emerging Web Access Control (WebAC) standard for RDF-based access control, has been completed. This new feature is being incorporated into the design for our Fedora 4 repository instance and the new Digital Collections administrative interface based on Hydra.

Staffing

Barbara Percival joined DCMR in November. A first-year iSchool student, she is currently producing digital files, and she’ll take over quality assurance inspections in 2016.

Conferences, workshops and professional development

Liz Caringola was appointed to the MARAC Web Editing Team, effective January 1, 2016, for a two-year term.

Chronicling America surpasses 10 million pages!

The University of Maryland Libraries joins the Library of Congress and the National Endowment for the Humanities in celebrating a major milestone for Chronicling America, a free, searchable database of historic U.S. newspapers. The Library of Congress announced on October 7 that more than 10 million pages have been posted to the site. This number includes 117,082 pages of Maryland newspapers digitized by the Historic Maryland Newspapers Project and its content partners, the Maryland State Archives and Maryland Historical Society, from the following titles:

Titles are added on a rolling basis, so check back often, or subscribe to Chronicling America’s RSS feed to receive alerts when new titles are added.

For more information about the Historic Maryland Newspapers Project, please visit our website: http://ter.ps/newspapers.

Reusing Newspaper Data from Chronicling America

The National Digital Newspaper Program’s (NDNP) goal in digitizing U.S. newspapers from microfilm isn’t to simply create digital copies of the film—it’s to make the content of the digitized newspapers more usable and reusable. This is made possible through the creation of different kinds of metadata during digitization. (You can read my post from 2013 for the nitty gritty details of NDNP metadata, or go straight to the source.) The addition of robust metadata means that the Library of Congress’ Chronicling America website isn’t just a digital collection of newspapers—it’s a rich data set—and our project’s contributions to Chronicling America represent Maryland in this data.

Newspaper data is being used in exciting ways by scholars, students, and software developers. Here are a few of my favorite examples:

Data Visualization: Journalism’s Journey West
Bill Lane Center for the American West, Stanford University
http://www.stanford.edu/group/ruralwest/cgi-bin/drupal/visualizations/us_newspapers

Map of Maryland showing newspapers that were publishing in the 1790s.
Image from http://www.stanford.edu/group/ruralwest/cgi-bin/drupal/visualizations/us_newspapers

This visualization plots the 140,000+ newspapers that are included in Chronicling America’s U.S. Newspaper Directory. Read about the history of newspaper publication in the U.S., and watch as newspapers spread across the country from 1690 through the present.

An Epidemiology of Information: Data Mining the 1918 Influenza Pandemic
Virginia Tech
http://www.flu1918.lib.vt.edu/

Excerpt from newspaper reads

The 1918 influenza pandemic, or Spanish flu, killed 675,000 in the U.S. and 50 million worldwide. An Epidemiology of Information used two text-mining methods to examine patterns in how the disease was reported in newspapers and the tone of the reports (e.g., alarmist, warning, reassuring, explanatory). Visit the project website for more information, or read the project’s January 2014 article in Perspectives on History.

Image from http://www.flu1918.lib.vt.edu/wp-content/uploads/2012/11/NLM-Presentation-Ewing-30April2013.pdf

Bookworm
The Cultural Observatory, Harvard University
http://bookworm.culturomics.org/ChronAm/

Graph that shows the occurrence of the word
Image from https://twitter.com/1918FluSeminar/status/577082239479115776

Bookworm is a tool that allows you to “visualize trends in repositories of digitized texts,” including Chronicling America. In the graph above, Tom Ewing of the aforementioned Epidemiology of Information project used Bookworm to visualize instances of the word “influenza” in the New York Tribune between 1911 and 1921. You can create your own visualizations of Chronicling America data using this tool.

Viral Texts: Mapping Networks of Reprinting in 19th-Century Newspapers and Magazines
NULab for Texts, Maps, and Networks, Northeastern University
http://viraltexts.org/

A visualization of the networks that exist between newspapers based on how the poem
Image from http://networks.viraltexts.org/1836to1860-Inquiry/

In the 19th century, the content published in newspapers was not protected by copyright as it is today. As a result, newspaper editors often “borrowed” and reprinted content from other papers. This project seeks to uncover why particular news stories, works of fiction, and poetry “went viral” using the Optical Character Recognition (OCR) text of the newspapers in Chronicling America and magazines in Cornell University Library’s Making of America.

Everyone is welcome to use Chronicling America as a dataset for their research. There’s no special key or password needed. Information about the Chronicling America API can be found here. For additional projects and tools that use Chronicling America data, see this list compiled by the Library of Congress.
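To give a sense of what querying Chronicling America as a dataset can look like, here is a minimal Python sketch that builds a page-search URL and, optionally, fetches the JSON results. The endpoint and the parameter names (andtext, state, dateFilterType, date1, date2, rows, format) are my recollection of the site's public search interface and should be treated as assumptions to check against the current API documentation. The function names are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed search endpoint; confirm against the API documentation before relying on it.
SEARCH_URL = "https://chroniclingamerica.loc.gov/search/pages/results/"


def build_search_url(text, state=None, year_from=None, year_to=None, rows=20):
    """Build a full-text page search URL that asks for JSON output."""
    params = {"andtext": text, "format": "json", "rows": rows}
    if state:
        params["state"] = state
    if year_from is not None and year_to is not None:
        params["dateFilterType"] = "yearRange"
        params["date1"] = year_from
        params["date2"] = year_to
    return SEARCH_URL + "?" + urlencode(params)


def fetch_results(url):
    """Download and parse one page of results (requires network access)."""
    with urlopen(url) as response:
        return json.load(response)


if __name__ == "__main__":
    # No key or password is needed; the request is an ordinary HTTP GET.
    url = build_search_url("influenza", state="Maryland",
                           year_from=1918, year_to=1919)
    print(url)
    # results = fetch_results(url)  # uncomment to run the query
```

Printing the URL first makes it easy to paste into a browser and see what comes back before writing any code that depends on the shape of the response.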

If you reuse Chronicling America data, especially from Maryland newspapers, in your research, please leave a comment or drop us a line. We’d love to hear from you!

Historic Maryland Newspapers Project receives funding for Phase 2

It’s our pleasure to announce that the Historic Maryland Newspapers Project at the University of Maryland Libraries has received funding for Phase 2 and will continue through August 2016 thanks to a generous $290,000 National Digital Newspaper Program (NDNP) grant from the National Endowment for the Humanities.

The Historic Maryland Newspapers Project was first awarded an NDNP grant in 2012 to digitize 100,000 pages of newsprint published between 1836 and 1922. To date, approximately 107,375 pages of Maryland newspapers have been digitized and nearly 86,000 are available on the Library of Congress database Chronicling America. The bulk of these pages is from the prominent German-language Baltimore paper Der Deutsche Correspondent. The time frame of the digitized Correspondent spans 1858 to 1913. The following titles were also digitized during Phase 1 of the project:

Baltimore

  • The American Republican and Baltimore Daily Clipper, 1844-1846
  • The Baltimore Commercial Journal, and Lyford’s Price-Current, 1847-1849
  • Baltimore Daily Commercial, 1865-1866
  • The Daily Exchange, 1858-1861
  • The Pilot and Transcript, 1840-1841

Western Maryland

  • Civilian and Telegraph (Cumberland), 1859-1865
  • The Maryland Free Press (Hagerstown), 1862-1868

During Phase 2, we will complete digitization of Der Deutsche Correspondent (1914-1918) and will digitize a variety of English papers that reflect the regional diversity of Maryland. We look forward to collaborating with our colleagues at the Maryland State Archives during the second phase of the project.

See the press release from NEH: http://www.neh.gov/news/press-release/2014-07-21.

Now Hiring: Wikipedian-in-Residence

The Historic Maryland Newspapers Project is hiring a Wikipedian-in-Residence for the summer months. Our overall goal in bringing a seasoned Wikipedian on board is to improve the quality of Wikipedia articles by increasing the number of relevant citations and links to the rich newspaper content of Chronicling America.

This position will be a little different from the typical Wikipedian-in-Residence gig. Most Wikipedians are brought into an organization in order to teach the staff how to edit Wikipedia, to edit and upload content to Wikipedia or Wikimedia, or to hold edit-a-thons–at least this is what I’ve gleaned while perusing other Wikipedian job listings. Our Wikipedian may do a little of this, but their work will mostly be research-based and will result in a written report of recommendations for our project and other National Digital Newspaper Program (NDNP) awardees to implement.

First, our Wikipedian will complete an analysis of how Chronicling America is currently being represented in Wikipedia. Linkypedia is one tool that could be used during the analysis. It will be important that our Wikipedian can utilize this and other tools–perhaps even tweak these tools–in order to gather relevant statistics.

The next step will be analyzing these statistics. This step is crucial because the conclusions drawn will guide the Wikipedian’s most significant responsibility–to explore different scenarios, tools, or methods for how we might effectively increase Chronicling America’s presence on Wikipedia. For example, a solution may be as simple and low tech as authoring a comprehensive guide for NDNP awardees to start editing Wikipedia; or it could require developers to add some code to the open source application behind Chronicling America in order to automatically generate the wiki markup needed to cite a newspaper page in Wikipedia. (The National Library of Australia has built this functionality into their digital repository, Trove.)

Screencap of a newspaper page from Trove, showing the site's ability to generate wiki markup to cite the newspaper page.
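As a rough idea of what automatically generating wiki markup could involve, the Python sketch below fills in Wikipedia's {{cite news}} template from a handful of page-level fields. It is not how Trove or the Chronicling America application does it; the function name, the choice of fields, and the sample values (including the LCCN in the URL) are all hypothetical.

```python
def cite_news_markup(title, newspaper, date, page, url,
                     location=None, access_date=None):
    """Return a <ref> tag wrapping a {{cite news}} template.

    Empty fields are left out, and any pipe character inside a value is
    replaced with {{!}} so it cannot break the template syntax.
    """
    fields = [
        ("title", title),
        ("newspaper", newspaper),
        ("location", location),
        ("date", date),
        ("page", page),
        ("url", url),
        ("via", "Chronicling America"),
        ("access-date", access_date),
    ]
    parts = [
        f"{name}={str(value).replace('|', '{{!}}')}"
        for name, value in fields
        if value is not None and value != ""
    ]
    return "<ref>{{cite news |" + " |".join(parts) + "}}</ref>"


if __name__ == "__main__":
    # All values below are placeholders, not a real newspaper page.
    print(cite_news_markup(
        title="Example headline",
        newspaper="Der Deutsche Correspondent",
        location="Baltimore, Md.",
        date="1914-01-01",
        page=1,
        url="https://chroniclingamerica.loc.gov/lccn/sn00000000/1914-01-01/ed-1/seq-1/",
        access_date="2015-05-02",
    ))
```

Most of these fields already exist in NDNP metadata, which is what would make a "cite this page" button on each newspaper page plausible; the article title is the one piece an editor would likely still have to type in.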

The Wikipedian will also have to investigate the cost and resources needed to realize their proposed solutions. The Wikipedian will prioritize and make recommendations for which tools should be implemented in upcoming months based on their feasibility and estimated effectiveness.

In order to accomplish all this in four short months, the Wikipedian will have to have experience conducting research and analyzing data; knowledge of existing tools and APIs for Wikipedia; and a firm understanding of the written–and more importantly, the unwritten–rules of editing Wikipedia. This is a part-time, paid position and cannot be performed remotely.

To view the complete job posting and apply, see https://ejobs.umd.edu/postings/25127. We hope to hear from you soon!