NARA Guidance on Scheduling Web Records
January 2005
SCHEDULING WEB RECORDS
- What records should be covered in a web schedule?
- What purposes does a web schedule serve?
- How is a web schedule developed?
- How does risk assessment help with developing a records schedule?
- What is the structure of a web schedule?
5.1 Single Item Schedule for Web Content and Site Management and Operations Records
5.2 Multiple Item Schedule for Web Content and Site Management and Operations Records
5.3 Web Snapshots
5.4 Examples of Scheduling Options
- How are retention periods for web site-related records determined?
- Can the GRS or existing agency schedules be used for web records?
Introduction
This guide is intended to assist agency staff, especially records officers and webmasters, in developing disposition schedules for the records relating to agency Internet and intranet web sites. This guidance will discuss such matters as
- the types of records that must be scheduled,
- how risk assessments may be used in making scheduling decisions,
- how web schedules should be structured, and
- the factors records officers should consider in determining retention
periods for web records.
- What records should be covered in a web site schedule?
A web schedule should cover web content records that document the information on the site itself. A web schedule should also include web site management and operations records, which provide the site's context and structure.
More detailed information concerning the types of records agencies accumulate in connection with their web sites is available in GENERAL BACKGROUND, RESPONSIBILITIES, AND REQUIREMENTS, section 5.
Web content records include:
- the content pages that compose the site, inclusive of the HTML markup;
- records generated when a user interacts with a site; and
- if the agency chooses to document its site this way, lists of the URLs referenced by the site's hyperlinks.
Web management and operations records that provide context to the site include:
- web site design records,
- records that specify an agency's web policies and procedures by addressing
such matters as how records are selected for the site and when and how
they may be removed,
- records documenting the use of copyrighted material on a site,
- records relating to the software applications used to operate the site,
and
- records that document user access and when pages are placed on the site, updated, and/or removed.
Web management and operations records that provide structure related to the site include:
- site maps that show the directory structure into which content pages
are organized and
- COTS software configuration files used to operate the site and establish its look and feel, including server environment configuration specifications.
- What purposes does a web schedule serve?
A web schedule fulfills an agency's statutory responsibilities as spelled out in the Federal Records Act. In addition, a web schedule mitigates the risks associated with the agency web site by ensuring that records needed to prove its trustworthiness are maintained for an appropriate period of time. A web schedule also provides you the legal authority to destroy web records at the end of their NARA-approved retention period. Finally, the scheduling process will identify any web-related records that warrant permanent retention and eventual transfer to the National Archives.
- How is a web schedule developed?
Developing a web schedule involves several distinct steps. The most important of these are:
- determining the structure of the web schedule,
- describing the specific series to be included, and
- specifying retention periods for each series.
You will use the risk assessment as a key tool in performing these steps.
Developing a web schedule is the responsibility of the agency records officer (or his or her designee), who takes the lead in carrying out these steps. At all stages of schedule development, the records officer works closely with program staff (who are responsible for the site's content and who are most familiar with the business needs and risks associated with the site) and with webmasters and IT staff (who are responsible for web operations). In the first stages, they provide the records officer with key data concerning the site and how it is used. They continue to work with the records officer in carrying out risk assessments, and at the end of the process, they must review the final schedule to ensure that it meets business needs and mitigates risks.
Scheduling of databases supporting content management systems is a sufficient means of addressing back-end, dynamically created content. Back-end programmatic databases for which a web page serves as the interface are normally scheduled as program records, separate from the web schedule(s).
- How does risk assessment help with developing a records schedule?
The analysis that is performed in a risk assessment helps you gather information about the web site, the agency programs that the web site supports, and the records that must be scheduled. Some of the information you will gather includes:
- how your agency uses its web site (see GENERAL BACKGROUND, RESPONSIBILITIES,
AND REQUIREMENTS, section 1, for examples),
- how often the site and specific portions of the site are changed or updated,
- the degree to which the information on a web site is unique or is readily
available in other agency records, and
- whether the web site or portions of the site are considered high risk.
More detailed information concerning risk and risk assessment is included in MANAGING WEB RECORDS, section 2. Trustworthy records and their characteristics (i.e., reliability, authenticity, integrity, and usability) are discussed in MANAGING WEB RECORDS, section 1.
- What is the structure of a web schedule?
Deciding on the level of analysis for risk assessment will resolve one aspect of schedule structure: whether the schedule describes records at the level of the entire web site or whether individual portions of the site are scheduled separately. If an agency chooses the latter approach, the schedule items should describe pages or groupings of pages broadly in terms of their content or function. Describing portions of the site too narrowly increases the likelihood that the schedule will become out of date as the site changes over time. Drastic changes in site content and/or function would likely require revisions to the schedule, just as significant changes in the content or function of a traditional record series typically warrant changes to a previously approved schedule.
There is no hard and fast rule as to what number of items is appropriate in a web schedule. Web management and operations records should be grouped, based on business needs and level of risk, into no more than three or four series for the entire site or applicable unit of analysis (i.e., portion of the site).
The following are two approaches to structuring a web schedule.
5.1 Single Item Schedule for Web Content and Site Management and Operations Records
You can use a single schedule item to describe the web content records (either the entire site or portions of it) along with the related records that pertain to site management and operations. This option would be appropriate if all of the records related to the site warrant the same retention period in order to meet business needs and mitigate risks. For the sake of simplicity and ease of management of the web site, an agency may also choose to use a single item and retention period for web records even if there are variations in business needs and risk. In this case, the retention period chosen must ensure that records are retained and remain usable for the appropriate period of time. This approach would require that some records with shorter business needs and lower levels of risk be maintained for a period of time longer than is necessary. See APPENDIX C, option A, for an example.
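The trade-off of the single-item approach can be illustrated with a small sketch. It assumes, purely for illustration, that each group of records has been assigned a retention need in whole years; the `RecordGroup` type, the function names, and the year values are hypothetical and do not come from NARA guidance or APPENDIX C.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordGroup:
    """A group of web records and the years it is needed (illustrative values only)."""
    name: str
    retention_years: int


def single_item_retention(groups):
    """Retention period for one schedule item covering every group.

    The longest individual need governs, so that no record is
    destroyed before its business need and risk period have ended.
    """
    if not groups:
        raise ValueError("at least one record group is required")
    return max(g.retention_years for g in groups)


def excess_retention(groups):
    """Years each group is kept beyond its own need under a single item."""
    period = single_item_retention(groups)
    return {g.name: period - g.retention_years for g in groups}


# Hypothetical retention needs, not taken from any approved schedule.
groups = [
    RecordGroup("web content pages", 3),
    RecordGroup("site design records", 1),
    RecordGroup("user access logs", 5),
]

print(single_item_retention(groups))
print(excess_retention(groups))
```

In this sketch the single item would carry the longest period among the groups, and `excess_retention` shows how much longer than necessary the lower-risk groups would be kept, which is the cost an agency accepts in exchange for simpler management.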
5.2 Multiple Item Schedule for Web Content and Site Management and Operations Records
If business needs and the mitigation of risk mandate different retention periods for the site content records and the management and operations records, you can schedule them separately. Follow these guidelines:
- If all management and operations records associated with a site (or individual portions) are needed for a uniform period of time in order to mitigate risk, then a single item for all such records might be appropriate. Web content records would be covered by one or more separate items. APPENDIX C, option B provides an example of multiple items for web content records and one item for all web management and operations records.
- If the assessment of risk dictates that records are needed for different
periods of time, then records that need to be kept for the same amount of
time in order to mitigate risk should be grouped together and each grouping
assigned an appropriate retention period. Such variation is likely in higher
risk situations. Multiple items should then be developed for the site management
and operations records, regardless of whether web content records are included
in a single item or in multiple items. See APPENDIX C, option C and APPENDIX
C, option D for examples.
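The grouping described in the second guideline can be sketched as follows. This is a minimal illustration that assumes a risk assessment has already produced a retention need, in years, for each series of management and operations records; the series names, year values, and the `group_into_items` function are hypothetical.

```python
from collections import defaultdict


def group_into_items(retention_by_series):
    """Group record series that share a retention period into one schedule item each.

    Returns a dict mapping retention period (years) to the sorted
    list of series covered by that schedule item.
    """
    items = defaultdict(list)
    for series, years in retention_by_series.items():
        items[years].append(series)
    return {years: sorted(names) for years, names in sorted(items.items())}


# Hypothetical retention needs, not taken from any approved schedule.
management_records = {
    "web site design records": 2,
    "web policies and procedures": 5,
    "copyright permission records": 5,
    "software application records": 2,
    "user access and page change logs": 7,
}

schedule_items = group_into_items(management_records)
for years, series in schedule_items.items():
    print(f"{years} years: {', '.join(series)}")
```

Here five series collapse into three schedule items, which is in line with the suggestion that management and operations records be grouped into no more than three or four series.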
5.3 Web Snapshots
When setting up the structure of a web schedule, decide if it should include an item for web snapshots that capture the content pages and related site map as they existed at particular points in time. Business needs and the need to lessen risk determine whether or not such snapshots are warranted and their frequency. However, in determining when snapshots should be taken, an agency should also consider how frequently the information on a site changes. A snapshot should be taken each time the site changes significantly.
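One way an agency might operationalize "changes significantly" is sketched below. The guidance does not define a threshold, so the approach here is an assumption: each page is fingerprinted with a SHA-256 digest, and a snapshot is flagged when the share of added, removed, or modified pages since the last snapshot reaches a configurable fraction. The function names, page paths, and the 25 percent default are all hypothetical.

```python
import hashlib


def fingerprint(pages):
    """Map each page path to a SHA-256 digest of its content."""
    return {
        path: hashlib.sha256(body.encode("utf-8")).hexdigest()
        for path, body in pages.items()
    }


def changed_fraction(previous, current):
    """Fraction of pages added, removed, or modified since the previous snapshot."""
    all_paths = set(previous) | set(current)
    if not all_paths:
        return 0.0
    changed = sum(1 for p in all_paths if previous.get(p) != current.get(p))
    return changed / len(all_paths)


def snapshot_due(previous, current, threshold=0.25):
    """True when the change since the last snapshot meets the assumed threshold."""
    return changed_fraction(previous, current) >= threshold


# Hypothetical site content at the last snapshot and today.
last_snapshot = fingerprint({
    "/index.html": "Welcome",
    "/about.html": "About the agency",
    "/forms.html": "Form A",
    "/news.html": "News item 1",
})
today = fingerprint({
    "/index.html": "Welcome",
    "/about.html": "About the agency",
    "/forms.html": "Form A (revised)",
    "/news.html": "News item 2",
})

print(changed_fraction(last_snapshot, today))
print(snapshot_due(last_snapshot, today))
```

The threshold is a policy choice that would follow from the agency's own business needs and risk assessment; a higher-risk site might warrant a lower threshold or a snapshot on every change.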
5.4 Examples of Scheduling Options
APPENDIX C contains examples of the different options that agencies may employ in scheduling their web site records.
- How are retention periods for web site-related records determined?
When determining retention periods for web site-related records, as with other records, the agency needs to assess how long the information will be needed to satisfy business needs and mitigate risk, taking into account Government accountability and the protection of legal rights. If specific web content is available in places other than the web, consider whether the existence of the information in other records affects the retention needs for the web records. In the case of information unique to the web site, the web version is the only recordkeeping copy.
In many cases, particularly where the risk is low, the web content and the related site management and operation records should be assigned a retention period that allows disposal as soon as records are no longer needed in the conduct of agency business.
In instances where risk levels are higher, web content and the related web management and operations records would probably warrant retention for a period of time that exceeds the time needed to satisfy all business requirements. The extra time needed in order to mitigate risk would usually not be more than 3-5 years beyond the retention period mandated by business needs alone. However, the mitigation of risk may require an even longer retention period in selected instances.
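The arithmetic in the two preceding paragraphs can be sketched as a simple calculation. The mapping from risk level to additional years is an assumption made for illustration: the guidance says only that the extra time would usually not exceed 3-5 years and may be longer in selected instances. The level names and the `retention_years` function are hypothetical.

```python
# Assumed, illustrative mapping of risk level to extra years of retention.
# The guidance gives a usual upper range of 3-5 years, not fixed values.
RISK_EXTENSION_YEARS = {"low": 0, "moderate": 3, "high": 5}


def retention_years(business_need_years, risk_level):
    """Retention period = years needed for business + years added to mitigate risk."""
    if business_need_years < 0:
        raise ValueError("business_need_years must not be negative")
    try:
        extra = RISK_EXTENSION_YEARS[risk_level]
    except KeyError:
        raise ValueError(f"unknown risk level: {risk_level!r}") from None
    return business_need_years + extra


# Low risk: dispose as soon as the business need ends.
print(retention_years(2, "low"))
# Higher risk: keep beyond the business need to mitigate risk.
print(retention_years(2, "high"))
```

An agency would replace the assumed table with values drawn from its own risk assessment, and could add a longer extension for the selected instances where risk warrants it.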
As with other agency records, most web records do not warrant permanent retention and should be scheduled for disposal in accordance with the guidance provided above. In instances where NARA determines that a site or portions of a site have long-term historical value, NARA will work with the creating agency to develop procedures to preserve the records and provide for their transfer to the National Archives.
- Can the GRS or existing agency schedules be used for web records?
There are currently no items in the General Records Schedules that were developed to specifically cover web records. However, some items in the GRS may be applied to web site management and operations records. For example, GRS 14, items 1 and 2 (Information Service Records - Information Request Files and Acknowledgment Files, respectively) can be used to cover transactions of this sort that are generated via an agency web site.
Another GRS item that may be used for web records is GRS 21, item 6 (Audiovisual Records - Graphic Arts, Routine Artwork for Handbills, Flyers, Posters, Letterhead, and Other Graphics).
Likewise, records relating to training staff regarding the agency web site might be covered by subitems in GRS 1, item 29 (Civilian Personnel Records - Training Records).
A variety of items included in GRS 24 (Information Technology Operations and Management Records) may be relevant to web management and operations records. Examples include item 1, oversight and compliance records; item 3, IT asset and configuration files; item 5, files related to maintaining the security of systems and data; and item 6, user ID, profiles, authorizations, and password files.
As these examples demonstrate, a GRS item may be used for web management and operation records if the basic content and function of the records is consistent with the GRS and the retention period is appropriate to meet business needs and mitigate risk. This principle also governs whether or not an existing agency-specific schedule item can be used for web management and operation records. For example, a schedule item covering graphics design can be used for records generated in connection with this function as it relates to the agency web site. Likewise, an agency schedule item for internal committee records could be applied to records accumulated by a committee established to advise the agency webmaster.