Digital Preservation Framework Linked Open Data

The idea behind the “Semantic Web” is that organizations and individuals can share information in a format that computers can read automatically and that links similar types of information across multiple websites. This collection of related datasets on the web is referred to as Linked Open Data.

The National Archives and Records Administration (NARA) has a long history of releasing information about its collections as datasets and making that information machine-accessible through Application Programming Interfaces (APIs).

NARA’s Digital Preservation Unit has been collaborating closely with colleagues at other national archives and libraries, as well as with the Wikidata for Digital Preservation initiative, to make our Digital Preservation Framework available as a Linked Open Data resource. This resource complements the Preservation Plans made available on GitHub. The Framework provides Preservation Plans for several hundred formats, including links to specifications and standards, links to community documentation about the formats, and NARA’s proposed preservation actions and tools.

You can currently make use of our Linked Open Data in three ways:

  1. Download in bulk the full set of Digital Preservation Framework File Format Plans, along with the supporting documentation needed for dataset research use, as Linked Open Data in the RDF Turtle (ttl) format:
  2. Browse the full list of file formats to reach the Linked Open Data file for a specific format. Several formats are part of multiple categories.
  3. Browse the lists of formats by Record Category:
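
Whatever the access route, RDF data such as a Turtle file comes down to subject-predicate-object statements. The following is a minimal sketch of that idea, showing how one format can appear under several categories. Every URI, predicate name, format name, and category label in it is hypothetical and invented for illustration; none of them is taken from NARA's actual dataset.

```python
from collections import defaultdict

# Hypothetical namespace and identifiers, invented for illustration only.
# They are NOT NARA's actual URIs, predicates, format IDs, or categories.
EX = "https://example.org/dpf/"

# Each RDF statement is a (subject, predicate, object) triple.
TRIPLES = [
    (EX + "format/F001", EX + "formatName", "Example Image Format"),
    (EX + "format/F001", EX + "category", "Still Images"),
    (EX + "format/F001", EX + "category", "Design Files"),
    (EX + "format/F002", EX + "formatName", "Example Audio Format"),
    (EX + "format/F002", EX + "category", "Audio"),
]


def formats_by_category(triples):
    """Group format names under every category they are linked to."""
    # First pass: map each format subject to its human-readable name.
    names = {s: o for s, p, o in triples if p == EX + "formatName"}
    # Second pass: file each format under each of its categories.
    grouped = defaultdict(list)
    for s, p, o in triples:
        if p == EX + "category":
            grouped[o].append(names.get(s, s))
    return dict(grouped)


if __name__ == "__main__":
    for category, names in sorted(formats_by_category(TRIPLES).items()):
        print(category, "->", ", ".join(names))
```

To work with the real bulk download, an RDF library is a better fit than hand-built tuples. For example, the third-party rdflib package can load Turtle with `Graph().parse(path, format="turtle")`, and the resulting graph can be iterated as triples in much the same way.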

Several related resources are available from NARA about file formats:


We always welcome feedback. Please use the issues feature on our GitHub page to leave a specific comment or question, or just to start a discussion. You can read more about how to contribute on GitHub here. Alternatively, you can email us, and NARA staff will respond as quickly as they can.