The pirate website called Anna's Archive is illicitly hosting and distributing a large portion of the HathiTrust collection, and is offering a financial bounty to parties who can obtain and provide additional HathiTrust data to them. Their possession and distribution of this material are unauthorized by HathiTrust, may have been enabled through unlawful activity, and may constitute illegal use. Some of the data was posted to Anna's Archive after a targeted attack on HathiTrust systems in June 2025. HathiTrust has not been subjected to a ransomware attack, and the HathiTrust collection preserved in its digital repository has not been altered or damaged in any way.
The 8.5 million HathiTrust items now in Anna's Archive represent both public-domain and copyrighted material. The majority of the items are believed to be in the public domain in the United States and originated from Google's digitization efforts. Anna's Archive appears to have obtained data from HathiTrust on different dates and via different methods.
The first and smallest dataset was posted in December 2024, but may have been obtained earlier through undetected web-scraping activity in violation of HathiTrust's terms of use.
The second and largest dataset, comprising exclusively public-domain materials, may have been obtained legitimately from HathiTrust in early 2025 via our established researcher dataset request service. However, we are not certain how the data ultimately made its way to Anna's Archive without HathiTrust's knowledge or authorization.
The third dataset was posted to Anna's Archive after the targeted attack on HathiTrust systems in June 2025. HathiTrust resolved the initial attack quickly, but not before a subset of the collection data was copied and transferred to Anna's Archive.
Late in 2025, HathiTrust became aware that Anna's Archive possessed HathiTrust data and was soliciting such attacks. HathiTrust immediately began working with security experts and the Office of General Counsel at the University of Michigan, its host institution, to investigate these incidents. HathiTrust has deferred or paused some planned projects in order to ensure that data remains secure and to strengthen infrastructure and protocols across all its systems and services. HathiTrust has notified the member libraries whose deposited collections are found on Anna's Archive. Anna's Archive has previously been sued by OCLC, Spotify, and a group of publishers for distributing materials it obtained through illicit means.
"Our community has long contended with shadow libraries, but despite their rhetoric, shadow libraries like Anna's Archive undermine the preservation of the published record by operating unlawfully and disrupting the operations of legitimate libraries," Executive Director Mike Furlough says. "Many of us are aware of the impact of web-crawling bots on our digital services, but Anna's Archive goes further by soliciting and financially rewarding attackers who can illicitly obtain books, and they have successfully targeted and obtained data from other digital library services. HathiTrust was founded to empower a diverse collective of research and teaching libraries around the globe to preserve the published record for long-term, lawful access. We remain deeply committed to our mission, even in the face of illicit activity to undermine it."