This article is Part 2 of a four-part series on potential paths for opening content in HathiTrust.

Close to one million volumes in HathiTrust have been assigned an “undetermined” copyright status by the HathiTrust bibliographic rights algorithm because their metadata is missing or inaccurate, and they are therefore closed for reading access. Half of these were deposited by the University of California.

In addition, there are potentially tens of thousands of US federal government documents that are closed on HathiTrust simply because the correct marker is missing from their metadata records. By remediating these metadata records, UC could open significantly more HathiTrust content for reading access.

Missing or Malformed Metadata

The metadata describing digitized volumes varies greatly in accuracy and completeness across the records that contributing member institutions submit to HathiTrust. If the publication year and country do not appear in the expected locations in a volume’s record, the HathiTrust bibliographic rights determination algorithm does not have enough information to determine whether the volume is in the public domain, and it assigns an undetermined rights status. While the majority of the volumes with undetermined rights status are likely in copyright, it’s possible that a good chunk of them are not.
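To make the idea of "expected record locations" concrete, here is a minimal sketch of the kind of check involved. It assumes the record is in MARC 21 format and that the year and country are read from the 008 fixed field (positions 07-10 for the first date, 15-17 for the place of publication). Those locations are our assumption for illustration; this article does not describe the actual HathiTrust implementation, and the function name is hypothetical.

```python
import re
from typing import List


def missing_rights_inputs(field_008: str) -> List[str]:
    """Return the inputs (year, country) that cannot be read from a MARC 008 string.

    Hypothetical helper for illustration only; not HathiTrust's code.
    """
    padded = field_008.ljust(40)      # a complete 008 field is 40 characters
    date1 = padded[7:11]              # positions 07-10: first date
    place = padded[15:18].strip()     # positions 15-17: place of publication

    missing = []
    # A usable year is four digits; values such as "19uu" or blanks are not.
    if not re.fullmatch(r"\d{4}", date1):
        missing.append("publication year")
    # Blank, "xx" (no place or unknown), and "|||" (not coded) carry no country.
    if place in {"", "xx", "|||"}:
        missing.append("publication country")
    return missing


complete = "850101" + "s" + "1923" + "    " + "nyu" + " " * 17 + "eng" + " d"
incomplete = "850101" + "n" + "19uu" + "    " + "xx " + " " * 17 + "eng" + " d"

print(missing_rights_inputs(complete))    # []
print(missing_rights_inputs(incomplete))  # ['publication year', 'publication country']
```

Under these assumptions, a record that returns a non-empty list is one that would likely end up with an undetermined status, and is a candidate for metadata remediation.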

If you suspect that the metadata for a volume you want to access is missing essential publication date or location information, or that its information is incorrect and the volume should be in the public domain, file a ticket and CDL staff will see whether we can help get it opened.

US Federal Government Documents

With a few categorical exceptions, United States federal documents are in the public domain. The HathiTrust bibliographic rights determination algorithm looks for both the country of publication and an indication of federal document status in expected locations in submitted records for digitized volumes. Without these, the described volumes will not be considered US federal documents for rights determination purposes, resulting in some volumes being incorrectly closed.
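The two-part test described above can be sketched as follows. This is an illustration under stated assumptions, not the HathiTrust algorithm: we assume a MARC 21 record for a book, where 008 position 28 holds the government publication code ("f" for federal/national) and positions 15-17 hold the place of publication. We also assume that three-character place codes ending in "u" (such as "dcu" or "xxu") denote the United States. The function name is hypothetical.

```python
def looks_like_us_federal_document(field_008: str) -> bool:
    """Return True only if both the US place code and the federal marker are present.

    Hypothetical helper for illustration only; not HathiTrust's code.
    """
    padded = field_008.ljust(40)   # a complete 008 field is 40 characters
    place = padded[15:18]          # positions 15-17: place of publication
    gov_pub = padded[28]           # position 28: government publication code

    # Assumption: three-character codes ending in "u" are US codes.
    published_in_us = len(place.strip()) == 3 and place.endswith("u")
    is_federal = gov_pub == "f"
    return published_in_us and is_federal


# Positions 18-34 are 17 characters; position 28 is the 11th of them.
with_marker = ("850101" + "s" + "1975" + "    " + "dcu"
               + " " * 10 + "f" + " " * 6 + "eng" + " d")
without_marker = ("850101" + "s" + "1975" + "    " + "dcu"
                  + " " * 17 + "eng" + " d")

print(looks_like_us_federal_document(with_marker))     # True
print(looks_like_us_federal_document(without_marker))  # False
```

The second record differs from the first only in the missing marker, which is the situation that leaves a federal document closed.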

Even with perfectly formatted metadata, there are a number of reasons why some US federal government documents may be closed for access in HathiTrust. The publications of some US federal institutions (and of some organizations often incorrectly assumed to be part of the federal government) are not in the public domain, and the HathiTrust algorithm correctly classifies these as in copyright. These include:

  • The Smithsonian
  • The National Technical Information Service
  • The National Standard Reference Data Series
  • The Federal Reserve
  • The National Research Council
  • The Armed Forces Communications Association
  • The Armed Forces Communications and Electronics Association
  • The National Gallery of Art

In addition, some US federal government documents must remain closed because they contain copyrighted material from contractors who contributed to the publication. Others are closed because they contain sensitive information, such as social security numbers or the locations of archeological digs. Unlike federal works, the works of state and local governments vary in their rights status and are not automatically opened by the HathiTrust algorithm.

If you suspect that a digitized HathiTrust volume you want to access has not been correctly identified as a public domain US federal government document, please submit a ticket. We might be able to help you gain access.


This article is Part 2 of a four-part series on Potential Paths to Opening Content in HathiTrust:

  1. Overview
  2. Remedying Copyright Determinations With Updated Metadata
  3. Copyright Review
  4. HathiTrust Creative Commons Declaration Form