Serials Solutions Partners with HathiTrust to Expand Web-Scale Discovery through Full-Text Books

Smart Libraries Newsletter [May 2011]

Breeding, Marshall.

Copyright (c) 2011 ALA TechSource


Serials Solutions continues its ambitious plan to populate its Summon discovery service with the widest possible array of content representing what libraries hold in their collections. The company has created a massive index drawing on a wide range of e-journal publishers and providers, e-books, reference works, and other materials, and it continues to form partnerships and collaborations to grow the Summon index.

An agreement made in March 2011 with HathiTrust propels Summon into the realm of full-text book content. Led by the University of Michigan, HathiTrust brings together the materials scanned through the Google Book Search (GBS) project into a single massive digital collection. As Google scans the materials from each of the GBS partner libraries, each library receives a digital copy. How those digital copies can be used depends on the copyright status of the material and the terms of the library's agreement with Google.

Through its arrangement with HathiTrust, Summon will gain access to this vast collection of full-text content to exploit in its discovery service. The addition of the HathiTrust materials will significantly improve Summon's exposure of book content, much as the service has already done for scholarly articles.

Although Summon can already ingest metadata from the MARC records of a library's physical collection, it will soon let library users discover that book collection through full-text searching. The HathiTrust content that will be available through the Summon service includes:

  • A total of 8.4 million volumes
  • 4.6 million books
  • 200,000 serial titles
  • 3 billion pages of text

According to HathiTrust Executive Director John Wilkin, HathiTrust, motivated by the desire to provide maximum exposure to the scanned digital materials, will expose its Solr index to Serials Solutions to extend Summon, which can in turn make those materials discoverable through its own service. The terms of the agreement specify that use of the HathiTrust index does not allow extraction of the text to display snippets or other representations of the text. Rather, once an item is discovered within Summon, the user will be linked to an appropriate source for it. Full-text searching of book content profiled to the library's holdings should increase use of the library's physical collection, since every word and phrase in a book becomes an access point. For books in the public domain, users may be directed to an electronic copy of the book at HathiTrust. Items in search results that are still in copyright may direct users to an electronic copy if the library owns one, or to physical copies in the library if it does not.
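
The linking rules described above can be sketched as a small decision function. This is only an illustration in Python, not the actual Summon or HathiTrust implementation: the names Hit, Target, and resolve_link, and the fields on Hit, are hypothetical, and the article says only that users "may" be directed to these sources.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    """Possible destinations for a full-text match (labels are illustrative)."""
    HATHITRUST_EBOOK = "electronic copy at HathiTrust"
    LIBRARY_EBOOK = "electronic copy owned by the library"
    LIBRARY_PRINT = "physical copy in the library"


@dataclass
class Hit:
    # Hypothetical fields; the real index schema is not described in the article.
    title: str
    public_domain: bool
    library_owns_ebook: bool


def resolve_link(hit: Hit) -> Target:
    """Choose where to send the user; no snippet of the text is ever returned."""
    if hit.public_domain:
        # Public-domain books can be read online at HathiTrust.
        return Target.HATHITRUST_EBOOK
    if hit.library_owns_ebook:
        # In copyright, but the library holds an electronic copy.
        return Target.LIBRARY_EBOOK
    # In copyright with no electronic copy: fall back to the print holding.
    return Target.LIBRARY_PRINT
```

For example, resolve_link(Hit("Sample title", public_domain=False, library_owns_ebook=False)) returns Target.LIBRARY_PRINT. The sketch assumes results are already limited to items the library holds, as the article's phrase "profiled to the library's holdings" suggests.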

HathiTrust's arrangement with Serials Solutions is not exclusive. According to Wilkin, the organization aims to provide the highest exposure possible to the content and is willing to partner with other organizations able to agree to the terms about how the index may be used. The arrangement between HathiTrust and Serials Solutions does not involve financial compensation; providing access to the index does not involve significant cost to HathiTrust.

The expansion of Summon to include full-text indexing from HathiTrust reflects Serials Solutions' ambitious strategy for discovery. Summon was the first commercial discovery service based on a massive aggregated index of primarily article content, and thus the first entrant into what is now a growing genre of services that use this approach; EBSCO Discovery Service and Primo Central follow a similar strategy.

The addition of this enormous body of full-text material, primarily books, stands to catapult the exposure of a library's book content through Summon. Full-text book searching has been commonplace on commercial sites such as Amazon.com for quite some time. Most library book search relies on indexes of MARC records, which are better suited to structured searches than to keyword retrieval, yet most online catalogs and discovery systems offer keyword searches by default. Combining the existing MARC-based representation of a library's collection with the upcoming full-text indexing shows promise as a very powerful approach to library book search. This approach will also come with significant challenges. The announcement of the collaboration is just the beginning of the process of delivering a discovery service with this capability, and executing the integration will not be a trivial undertaking.

Many details remain to be seen about how Summon will blend this new full-text book content into its existing discovery service. If successful, this development will represent a significant advance in the state of the art of library discovery services.

Publication Year: 2011
Type of Material: Article
Language: English
Published in: Smart Libraries Newsletter
Publication Info: Volume 31 Number 05
Issue: May 2011
Page(s): 5-6
Publisher: ALA TechSource
Place of Publication: Chicago, IL
Company: ProQuest
Products: Summon
ISSN: 1541-8820
Permalink: http://librarytechnology.org/ltg-displaytext.pl?RC=16133
Record Number: 16133
Last Update: 2014-01-08 08:11:29
Date Created: 2011-10-07 11:35:37