Frequently Asked Questions About Imaged Records
ATTENTION! This page has been superseded. The information listed below is no longer accurate. For NARA's list of current FAQs please visit http://www.archives.gov/records-mgmt/faqs. Please note that this page is available only as a technical and historical reference.
- What is imaging?
- What factors must be considered in determining whether to image records?
- What are possible advantages and disadvantages of maintaining records as images?
- What are some of the cost factors that should be considered before starting an imaging project?
- What are the recordkeeping requirements for imaged records?
- When do records converted to images need to be scheduled?
- Do indexes created for imaged records need to be scheduled?
This FAQ provides a high-level overview and general outline of imaging issues relating to the management of Federal government records. Since this is an evolving area, additional answers will be posted as they are developed. Before actually choosing an imaging project or system, a detailed review of the issues and the literature would be required, as well as an analysis of specific business needs.
Imaging is a process by which a document (primarily on paper, although any medium can be used) is converted from a human-readable format to a computer-readable digital image file. A digital image consists of pixels (picture elements or tonal values in binary code) arranged in columns and rows. The number of pixels per inch determines the image's resolution (clarity and definition of the image, expressed as height by width in pixels for image files or as dots per inch (dpi) for prints).
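As a rough illustration of how resolution drives image size, the sketch below computes the pixel dimensions and uncompressed size of a scanned page. The function names, the 8.5 x 11 inch page, and the 300 dpi setting are assumptions chosen for illustration; they are not NARA requirements.

```python
def pixel_dimensions(width_in, height_in, dpi):
    """Return (width_px, height_px) for a page scanned at the given dpi."""
    return round(width_in * dpi), round(height_in * dpi)


def uncompressed_bytes(width_px, height_px, bits_per_pixel):
    """Return the raw image size in bytes, rounding each row up to a whole byte."""
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


# Hypothetical example: a letter-size page scanned at 300 dpi.
w, h = pixel_dimensions(8.5, 11, 300)
print(w, h)                           # 2550 3300
print(uncompressed_bytes(w, h, 1))    # bitonal (1 bit per pixel): 1052700
print(uncompressed_bytes(w, h, 24))   # 24-bit color: 25245000
```

Under these assumptions, the same page takes roughly 1 MB as a bitonal image and roughly 25 MB in 24-bit color before compression, which is why resolution and bit depth matter when estimating storage.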
These imaged pictures of documents can be stored on a variety of media. The most common types of storage are magnetic media (such as tapes, disks, and magnetic cartridges) or optical media (such as CD-ROM and other removable disks known as "platters"). When combined with effective indexing, imaging the files can shorten information retrieval time and allow access to materials for multiple users at various locations.
Image files come in many different types of software-dependent formats, such as .gif, .jpg, and .tif. Most formats are proprietary, so computers need software to convert the images back to a human-readable format. Proprietary file formats may not be supported long term by manufacturers and may vary from vendor to vendor. Many file formats use compression to fit more data into less storage space and to speed image processing, storage, and transmission. Compression may be lossless (less compression but no data loss) or lossy (deep compression with subsequent data loss). Images in lossy file formats, such as JPEG (.jpg), don't necessarily look the same after compression.
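The difference between lossless and lossy compression can be sketched with Python's standard zlib module. The pixel data below is made up, and the "lossy" step is only a stand-in (it discards the low 4 bits of each pixel); it is not JPEG and is not drawn from any NARA guidance.

```python
import zlib

# Hypothetical 8-bit grayscale data: a 0-255 ramp repeated 64 times.
original = bytes(range(256)) * 64

# Lossless: zlib (DEFLATE) restores every byte exactly.
packed = zlib.compress(original, 9)
restored = zlib.decompress(packed)
lossless_ok = restored == original

# Lossy stand-in: throw away the low 4 bits of each pixel, then compress.
# The discarded detail cannot be recovered after decompression.
quantized = bytes(p & 0xF0 for p in original)
packed_lossy = zlib.compress(quantized, 9)
restored_lossy = zlib.decompress(packed_lossy)
max_error = max(abs(a - b) for a, b in zip(original, restored_lossy))

print(len(original), len(packed), len(packed_lossy))
print(lossless_ok, max_error)
```

The lossless round trip returns the original bytes unchanged, while the lossy stand-in leaves each pixel off by as much as 15 gray levels.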
Without special software, computers generally cannot use the informational content of a raw image file to search for or retrieve a specific image. Search and retrieval normally depends on some form of indexing, which assigns specific metadata to each document, such as author, recipient, date, title, and content keywords. This index, or metadata, can be simple or sophisticated, and is typically an electronic database that is linked to the images. Useful indexing requires careful planning and forethought before any actual imaging begins.
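A minimal sketch of such an index, assuming a small database table that links each image file to the metadata fields named above. The table name, column names, file paths, and sample entries are all hypothetical; the example uses Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE image_index ("
    " image_path TEXT PRIMARY KEY,"
    " author TEXT, recipient TEXT, doc_date TEXT, title TEXT, keywords TEXT)"
)

# Hypothetical index entries, one per imaged document.
rows = [
    ("scans/0001.tif", "J. Smith", "A. Jones", "1999-03-14", "Budget memo", "budget,fy2000"),
    ("scans/0002.tif", "A. Jones", "J. Smith", "1999-04-02", "Reply to budget memo", "budget,reply"),
    ("scans/0003.tif", "R. Lee", "J. Smith", "2000-01-20", "Travel request", "travel"),
]
conn.executemany("INSERT INTO image_index VALUES (?, ?, ?, ?, ?, ?)", rows)


def find_images(conn, author=None, keyword=None):
    """Return image paths whose metadata matches the given author and/or keyword."""
    sql = "SELECT image_path FROM image_index WHERE 1=1"
    params = []
    if author is not None:
        sql += " AND author = ?"
        params.append(author)
    if keyword is not None:
        # Wrap the comma-separated list in commas so only whole keywords match.
        sql += " AND (',' || keywords || ',') LIKE ?"
        params.append(f"%,{keyword},%")
    sql += " ORDER BY image_path"
    return [row[0] for row in conn.execute(sql, params)]


print(find_images(conn, keyword="budget"))    # ['scans/0001.tif', 'scans/0002.tif']
print(find_images(conn, author="A. Jones"))   # ['scans/0002.tif']
```

The search never looks inside the image files themselves; retrieval depends entirely on the quality of the metadata captured at indexing time.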
Images of textual records can be converted to searchable electronic text using optical character recognition (OCR) technology. OCR is accomplished by a software application that reads the images and produces text based on recognized patterns in those images. The electronic text can then be stored in computer-readable form for search and retrieval purposes. OCR doesn't work for all kinds of documents, particularly handwritten documents, documents with poor contrast or unusual typefaces, and documents that mix text and images. Even with good-quality originals, there will be costs for post-scanning cleanup of raw OCR text.
The decision to adopt a document imaging system should be based on business needs. Agencies must justify the implementation of a system based on an analysis of their work processes and business needs balanced against costs. The decision to implement an imaging system should be based on improvements in productivity and efficiency or quality of service. Simply automating an existing process may not lead to significant savings or improved performance. Costs include more than just the initial purchase of an imaging system. You may also incur migration costs if the information has to be retained for periods longer than five to ten years.
Before starting any imaging project, know the project's mission, users, priorities (speed, image quality, and quantity), and functional goals (reference, web use, publication, other). Additionally, assess staff expertise and availability (to do scanning, manage infrastructure, migrate data, and build metadata), and address content issues, such as physical condition, format, nature and attributes to be captured.
There are advantages to instituting imaging systems, such as increased storage capacity, elimination of "out-of-file" problems, shortened retrieval times, improved retrieval by multiple users, and ease of information dissemination.
There are also disadvantages, such as expensive hardware and resource-intensive indexing requirements, as well as rapid technological changes that require frequent upgrades of hardware and software. Migration and conversion of records in imaged format may also be needed to protect the information in records not yet eligible for disposal.
In addition to the cost factors, other factors that should be considered when determining whether to image documents include:
- Volume of records. Imaging is generally used for large volumes of records.
- Reference use. Imaging is most effective on highly referenced collections where a short retrieval time is important or where there are multiple users accessing the same records. Combined with effective indexing, imaging records can facilitate retrieval.
- Relationship to records on other media. Consider whether the records to be imaged have to be used with records on other media.
- Records and information usage. Consider how the information is used and how long the record is needed. Required retention periods are specified in records schedules.
- Legal acceptability. Following established procedures and maintaining the documentation of audit trails and other business practices will ensure that information is kept that may be needed to document record authenticity and reliability.
- Ease of maintenance. Balance storage costs and capacity with indexing, conversion, quality control, and migration costs.
- Staffing requirements. Increased imaging and indexing of records and quality control procedures may require additional staff training.
- Work process and information flow. Would imaging facilitate the work process? Considerations include how records are routed, how information is added to records or files, and when records (finals or drafts) need to be captured.
- Verification of signatures. If signature verification is a requirement, consider that forensic analysis of signatures is not possible with imaged records.
- Document preparation. Determine how much work needs to be done to make the files ready for imaging. Document preparation for voluminous files may be significant.
- Quality control issues. QC procedures must be instituted both while preparing documents for imaging and while verifying and validating imaged information.
- Condition of original records. The condition of the records will affect their handling during imaging as well as the quality of the imaged record that can be produced. This will particularly be a factor for records that are:
- In-house operation versus contracting operations with a service bureau.
- Image requirements (resolution, compression, headers, etc.) will vary depending on how images will be used.
- Indexing requirements and metadata fields are determined by analyzing how users will access images.
- Requirement to convert permanent records to an acceptable format prior to transfer to the National Archives of the United States.
Several states, including Kansas, Minnesota, and Missouri, have developed good products that provide additional information on digital imaging systems and issues.
- Electronic Recordkeeping Resources, Kansas State Historical Society
- Minnesota Historical Society, Digital Imaging FAQs
- Missouri Secretary of State's Office, Records Management and Archives Division, Draft Digital Imaging Guidelines, April 2000 http://mosl.sos.state.mo.us/rec-man/resource.html
What are possible advantages and disadvantages of maintaining records as images?
- Ability to use very high-density storage media.
- Shorter retrieval time than hard copy when the images are well indexed.
- Multiple users and access levels are possible.
- Low shipping costs and ease of information dissemination.
- Ease of use of imaged copies of records in vital records and disaster recovery plans.
- Legal uses. Organizations that need to retrieve information efficiently during discovery and litigation may find that using imaged records can assist in the effort.
- Ease of making copies of the imaged records.
- Digital images don't lose quality from generation to generation. Well-made copies and derivatives can be as good as the original images.
- Digital images are not human-readable without computer equipment.
- Significant equipment costs, including hardware and software.
- Potential for hardware and software obsolescence. Generally, systems change every 18 months to 5 years, software changes every 2-3 years, and the life expectancy of media is relatively short.
- Indexing requirements may be more extensive than is required with other formats. Unless records are arranged in a logical sequence or clearly indexed, it may be difficult to identify a series or use groups of records as a series.
- Different types of scanners must be used to scan text, oversize items, photographic prints, slides, and other formats.
- Digital quality control, metadata capture and management, and image capture and management are complex, time-consuming processes requiring expertise and constant vigilance.
- Complex disposition and potential problems in implementing dispositions.
- If records are stored without regard to retention periods on an individual disk or in an individual directory, each record must be selected for destruction or to move to off-line storage.
- When agencies use write-once-read-many (WORM) optical media, records should be grouped by like retention periods on individual disks or in individual directories.
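The grouping described above can be sketched as follows, assuming each record carries an identifier and a retention period in years. The record IDs, retention values, and directory naming are hypothetical.

```python
from collections import defaultdict


def group_by_retention(records):
    """Map each retention period (in years) to a sorted list of record IDs."""
    groups = defaultdict(list)
    for record_id, retention_years in records:
        groups[retention_years].append(record_id)
    return {years: sorted(ids) for years, ids in sorted(groups.items())}


# Hypothetical records as (record ID, retention period in years).
records = [("A-101", 3), ("B-202", 7), ("A-102", 3), ("C-303", 7), ("D-404", 10)]

for years, ids in group_by_retention(records).items():
    # One disk or directory per retention period.
    print(f"retain_{years}y/", ids)
```

With records laid out this way, an entire disk or directory becomes eligible for disposition at once, rather than each record having to be selected individually.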
A cost benefit analysis should be completed before choosing to implement any imaging system. Your organization may already have a standard method of conducting cost-benefit analysis, for example, as part of your Information Technology systems development lifecycle methodology or as part of your standard procedures for purchases and acquisitions.
For more information, review the Fast Track product, Analysis of Costs and Benefits for ERM/ERK Projects at http://www.nara.gov/records/fasttrak/prod8.html. You may also find it helpful to review the cost-benefit analysis methods and guidelines used by various Federal agencies that are cited below.
- Office of Management and Budget. OMB Circular No. A-94 provides general guidance for conducting cost-benefit analysis for all types of projects.
- General Accounting Office. The GAO Information Technology Investment Evaluation Guide describes the analysis of benefits, costs and risks in the context of an overall IT investment management process.
- National Institutes of Health. The Cost Benefit Analysis Guide for NIH IT Projects is a comprehensive guide for performing cost-benefit analysis. The NIH IT Cost-Benefit Analysis web page provides access to examples of cost-benefit analysis.
- U.S. Patent and Trademark Office. The USPTO Economic Analysis (Technical Guideline IT-212.3-10) provides the framework and requirements for completing a business case and economic analysis of IT investments, including analysis of benefits and costs and evaluation of alternatives.
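As a simple worked example of the arithmetic behind a cost-benefit comparison, the sketch below computes a net present value from yearly costs and benefits. The dollar figures and the 7 percent discount rate are made-up assumptions for illustration; consult the guides cited above for the method and rates your agency should actually use.

```python
def net_present_value(cash_flows, discount_rate):
    """Sum (benefit - cost) for each year, discounted back to year 0.

    cash_flows is a list of (cost, benefit) pairs, one per year, starting at year 0.
    """
    return sum(
        (benefit - cost) / (1 + discount_rate) ** year
        for year, (cost, benefit) in enumerate(cash_flows)
    )


# Hypothetical imaging project: purchase in year 0, then three years of operation.
flows = [
    (100_000, 0),       # year 0: system purchase, no benefits yet
    (10_000, 50_000),   # year 1: operating cost vs. estimated savings
    (10_000, 50_000),   # year 2
    (10_000, 50_000),   # year 3
]

npv = net_present_value(flows, 0.07)
print(round(npv, 2))
```

A positive result suggests that the discounted benefits exceed the discounted costs under these assumptions. Adding later years with migration costs, which the FAQ notes may be incurred for longer retention periods, would lower the figure.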
"Recordkeeping requirements" are statements in statutes, regulations, and agency directives that provide general and specific requirements for Federal agency personnel on particular records to be created and maintained in the file (36 CFR Part 1220.14). NARA has not yet developed requirements for images; however, NARA requirements for electronic systems can be found in 36 CFR Part 1234. Each agency must develop its own recordkeeping requirements for imaging systems based on its specific business needs for the system, agency IT architecture, and overall agency mission.
Each agency should review its particular records, take note of specific agency uses of the records, and determine which records are needed to document its activities. Then, each agency must develop a plan to protect the information in those records for the entire retention period. When imaging records, document the procedures and provide audit trails to serve as the record that the images were created properly and validated, similar to the steps required by 36 CFR 1230.12(a)-(c). A migration plan may also be needed to ensure that the information in the images can be accessed throughout the entire retention period of the records. NARA must approve retention periods.
For additional information on recordkeeping requirements, consult Agency Recordkeeping Requirements: A Management Guide, on the NARA web site. Since this is a rapidly evolving area, agencies should consult with their appraisal archivist to determine whether NARA knows of any pilot projects or best practices that might apply to their imaging implementation.
Imaged copies of records already scheduled as temporary do not need to be scheduled if the nature and content of the records remain identical to the description in the schedule.
- Apply the disposition authority approved by NARA for the paper records to the image files.
- Retain the paper copies of temporary records that have been imaged only when there is a compelling business reason.
- Retention of the paper copies after imaged copies have been verified adds costs by requiring:
- Organization of the files
- Periodic file cutoffs
- Retirement to a records storage facility or disposal.
Retention schedule items for temporary textual records must be added or revised whenever the nature or content of records changes.
- Reengineered work processes or other changes may result in creation of a different series of records.
- When the nature or content of a records series changes, both the paper and imaged copies of the records must be scheduled, even if all of the records are temporary.
When unscheduled records are imaged, the image file and the paper records must both be scheduled, and the paper copies may not be disposed of until a schedule covering the records has been approved. The schedule should provide for the disposition of both the paper and imaged copies and specify which version is the recordkeeping copy.
When paper records that are scheduled as permanent are imaged, the imaged files must also be scheduled. The agency may not dispose of the paper copies until NARA has approved a new schedule for them. The schedule should provide for the disposition of both the paper and imaged copies and specify which version is the recordkeeping copy.
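The scheduling rules above can be summarized as a small decision function. This is only an informal sketch of the text in this answer: the function name, the status labels, and the wording of the returned messages are illustrative assumptions, not an official NARA procedure.

```python
def scheduling_action(status, content_changed=False):
    """Summarize what must be scheduled when paper records are imaged.

    status is the current status of the paper records:
    "temporary", "permanent", or "unscheduled".
    content_changed applies to temporary records whose nature or content
    no longer matches the description in the approved schedule.
    """
    if status == "temporary":
        if content_changed:
            return "Schedule both the paper and imaged copies."
        return ("No new schedule needed; apply the approved disposition "
                "authority for the paper records to the image files.")
    if status == "unscheduled":
        return ("Schedule both the paper and imaged copies; keep the paper "
                "until a schedule covering the records is approved.")
    if status == "permanent":
        return ("Schedule the imaged files; keep the paper until NARA "
                "approves a new schedule.")
    raise ValueError(f"unknown status: {status!r}")


for case in [("temporary", False), ("temporary", True),
             ("unscheduled", False), ("permanent", False)]:
    print(case, "->", scheduling_action(*case))
```

In the unscheduled and permanent cases, the resulting schedule should also specify which version is the recordkeeping copy.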
- Indexes and other finding aids for temporary records are disposable under General Records Schedule 23, item 9, unless the finding aid contains an abstract or other information that can be used independently of the related records.
- Indexes and finding aids for permanent records that have been imaged must be scheduled.
- Indexes and finding aids for unscheduled records that have been imaged must be scheduled along with both the paper and imaged copies of the records.