Wednesday, November 10, 2010

HATHITRUST 101: An Introduction to the Shared Digital Repository

Presented by Maria Bonn, University of Michigan at Ann Arbor

The seed for HathiTrust was Michigan's agreement with Google, which wanted to digitize the university's research collection. As part of the agreement, the university received the digital copies from Google so that it could share them with its peer institutions. In addition, UM has a large amount of Internet Archive content and locally digitized content. HathiTrust focuses exclusively on text content. The system was developed from scratch and the interface is open source. The goal of the collection development policy was to replicate a research library.

HathiTrust is a partnership that includes Columbia, Cornell, Dartmouth, the New York Public Library, Yale and the University of California system. It is a universal digital library, a single repository governed by an executive committee and a strategic advisory board.

The repository has over 7 million volumes; 24 percent are in the public domain, and 86 percent are in 10 different languages. It has 75 volumes in the Hawaiian language. In 2008, it contained 2.5 million volumes, and HathiTrust projects a total of 14 million by 2012.

Services include bit-level preservation. It is primarily a digital preservation archive, but it also has an access system. It includes a rights database and copyright review. It focuses on scholarly resources and supports bibliographic search using MARC records, full-text search, the creation of personal collections and full-text PDF download.

It has an interface for users with disabilities. Any book, whether in copyright or not, can be checked out by users with disabilities.

One of its services is the collection builder. Faculty can create collections and share them with their students. If you create an account, you can make your collections permanent.

HathiTrust partners have the rights to full PDF downloads. Digital storage is in two places in Michigan and one in Indiana.

The effort is supported by project managers, a communications working group and a metadata group.

In June 2009 research libraries showed 19 percent duplication among their collections. In 2010 this was up to 31 percent duplication among American research libraries. Currently there is a substantial overlap in shared repositories.

HathiTrust is very good at ingesting content from Google and from the Internet Archive. Its future directions include developing usage reporting and methods of quality assessment, offering more services through Shibboleth and focusing on born-digital content.

-- Sunny Pai
