Frequently Asked Questions about Selecting Sustainable Formats for Electronic Records
This FAQ is provided to Federal agencies to assist them in meeting their records management responsibilities under 44 U.S.C. Ch. 31. Agencies can use this information when selecting and implementing formats for long-term electronic records, and when business needs necessitate formats not specifically addressed by existing NARA guidance. Visit the Toolkit for Managing Electronic Records to access existing guidance under the topic of "Specific Records Technologies" (e.g., digital photographic records).
1. Why should agencies use sustainable formats for electronic records?
Agencies create and maintain increasing volumes of Federal records in electronic formats. Typically, agencies select electronic formats for Federal records based on business needs and current technical requirements. Once selected, those formats must be sustainable, that is, accessible both throughout the records' lifecycle and as technology evolves. Formats that are not sustainable may cause Federal records to become obsolete and inaccessible before they are eligible for deletion as authorized in the approved records schedule.
2. What is a sustainable format?
A "sustainable format" is a format that allows an electronic record to be accessed throughout its lifecycle, regardless of the technology used when the record was originally created. Using a sustainable format increases the likelihood that a record will remain accessible in the future.
3. What are characteristics of a sustainable format?
When records need to be maintained over the long term, agencies should consider each of the following format characteristics:
a. Published Documentation and Open Disclosure
Publicly and openly documented formats adhere to specifications that are published and accessible. Publicly accessible specifications allow developers to create a wide variety of applications to read, process, and validate files. Openly documented specifications assist developers in creating tools to access the information in obsolete formats, and/or assist in migrating files to future formats.
Tagged Image File Format (TIFF) is an example of a format based on a publicly available, authoritative specification for scanned images.
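As an illustration of how a published specification lets anyone build validation tools, the following minimal Python sketch checks whether a file begins with the signature defined in the public TIFF 6.0 specification (a byte-order mark followed by the number 42). The function names are hypothetical and chosen for this example; the check covers only the classic TIFF header and is not a full validator.

```python
# The two classic TIFF signatures: a byte-order mark ("II" for little-endian,
# "MM" for big-endian) followed by the number 42 stored in that byte order.
TIFF_SIGNATURES = (b"II*\x00", b"MM\x00*")


def looks_like_tiff(header: bytes) -> bool:
    """Return True if the first four bytes match a classic TIFF signature."""
    return header[:4] in TIFF_SIGNATURES


def file_looks_like_tiff(path: str) -> bool:
    """Read the first four bytes of a file and test them (hypothetical helper)."""
    with open(path, "rb") as f:
        return looks_like_tiff(f.read(4))
```

A signature match only suggests that a file claims to be TIFF. Confirming that the file is well formed would require parsing the rest of its structure against the specification.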
b. Widespread Adoption and Use
Formats adopted for widespread use have a higher probability of being sustainable over time. When a format has been widely adopted by users, multiple software tools are created to open, read, and access the records, and the market supports ongoing sustainability of the file format. This extends the time that the information can be maintained in the format using readily available tools. The adoption of a file format by information creators, disseminators, and users is an indicator of sustainability.
Hypertext Markup Language (HTML) is an example of a format that has been widely adopted for Internet use.
c. Self-describing Formats
Self-describing formats contain metadata needed to interpret the content, context, and/or structure of the record. Metadata embedded within the format minimizes reliance on external documentation and the risk of disassociation of metadata from the file over time. While self-describing formats provide the capability for including metadata (e.g., in the file header or through tags within the file structure), they may not necessarily mandate it in the format specification. If present, the metadata should be easily accessed. This helps ensure that descriptive information about the record remains available for as long as the record itself.
Extensible Markup Language (XML) is an example of a self-describing format because it describes its structure and field names.
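To show what "describes its structure and field names" can mean in practice, here is a minimal Python sketch that recovers field names and values from an XML record without any external documentation. The sample record and its element names (record, title, creator, dateCreated) are hypothetical and are not drawn from any NARA schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record; the element names are illustrative only.
SAMPLE = """<record>
  <title>Quarterly Report</title>
  <creator>Example Office</creator>
  <dateCreated>2008-01-15</dateCreated>
</record>"""


def describe(xml_text: str) -> dict:
    """Map each top-level field name to its text, using only the file itself."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


fields = describe(SAMPLE)
```

Because the tags travel with the data, a future reader can learn that the record has a title, a creator, and a creation date even if the originating application no longer exists.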
When agencies use formats that exhibit these characteristics, they increase the likelihood that the information will be accessible over the long term.
4. How do agencies enhance the sustainability of formats?
When creating electronic records or converting source data, agencies can enhance sustainability by maintaining the original quality of source data. The following methods are typically applied through software settings and vary depending on the format being used.
a. Technical Protection Mechanisms
Long-term records should be unrestricted and/or unencrypted so that user IDs and/or passwords are not needed to maintain the file. User IDs and passwords can be lost over time. For more information, see NARA's Bulletin 2007-02, guidance concerning the use of Enterprise Rights Management (ERM) and other encryption-related software on Federal records.
b. Maintain Integrity of Source Data
When using compression to reduce file size, agencies should use lossless compression to maintain the integrity of source data. Lossless compression produces smaller file sizes without removing any information. Maintaining the original quality of source data can facilitate future migration and conversion. Minimizing subsequent modification of the records after production is also recommended to maintain integrity. See Frequently Asked Questions (FAQ) about Digital Audio and Video Records for a discussion of lossless compression.
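The defining property of lossless compression, that the original data can be recovered exactly, can be demonstrated with a short Python sketch. It uses the standard-library zlib module (DEFLATE) purely as one example of a lossless method; the function name and sample data are assumptions made for illustration, and agencies would normally rely on the compression options built into their chosen format and software.

```python
import hashlib
import zlib


def is_lossless_round_trip(data: bytes) -> bool:
    """Compress, decompress, and confirm the result is bit-for-bit identical."""
    compressed = zlib.compress(data, 9)  # 9 = highest compression level
    restored = zlib.decompress(compressed)
    # Comparing checksums mirrors how integrity is often verified in practice.
    return hashlib.sha256(restored).digest() == hashlib.sha256(data).digest()


# Hypothetical, highly repetitive sample content, so it compresses well.
source = b"Federal record content " * 1000
compressed = zlib.compress(source, 9)
```

Here the compressed copy is smaller than the source, yet decompressing it returns exactly the original bytes. A lossy method, by contrast, discards information that cannot be restored later.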
5. Will selecting appropriate formats guarantee sustainability?
No. In addition to selecting appropriate formats, agencies need to follow records policies and procedures governing the creation and management of electronic records, and follow approved records schedules.
Section 7 of ISO 15489, "Information and Documentation -- Records management -- Part 1: General," discusses other factors that should be considered.
6. Who can I contact for further assistance?
- In the Washington, DC, area, the NARA RM Life Cycle Management Division can provide assistance. See List of NARA Contacts for Your Agency at http://www.archives.gov/records-mgmt/appraisal/
- Outside the immediate Washington, DC, area, the NARA Records Management Staff in NARA's regional offices can provide assistance. A complete list of NARA's regional facilities can be found at http://www.archives.gov/locations/index.html.