National Archives Applied Research

National Center for Supercomputing Applications (NCSA)

  • Bajcsy, Peter. Technologies for Appraising and Managing Electronic Records. National Center for Supercomputing Applications (NCSA), University of Illinois at Urbana-Champaign, Urbana, IL. Invited Lecture, National Archives, College Park, MD. September 23, 2009. PowerPoint slides as PDF.

Abstract: Covers discovery of relationships among digital file collections (file2learn); Doc2Learn: a comprehensive comparison of contemporary documents; automated file format conversion software; Polyglot: conversion quality assessment tool; and design of technologies for appraising and managing electronic records.

  • Bajcsy, Peter. Appraisal of 3D Data Conversions and Visualization Software Packages. National Center for Supercomputing Applications (NCSA), University of Illinois at Urbana-Champaign, Urbana, IL. January 21, 2009. PowerPoint Presentation PPT.

Abstract: Discusses appraisal of 3D digital data; managing 3D file formats; scalability of appraisals; 3D data conversions; components of Polyglot; challenges with conversion software; Polyglot as a web service; and international collaborations with the PRONOM/DROID/JHOVE projects. Project URLs: http://isda.ncsa.uiuc.edu/NARA/index.html and http://isda.ncsa.uiuc.edu/CompTradeoffs/ Other NCSA publications: http://isda.ncsa.uiuc.edu/publications

  • Bajcsy, Peter. To Preserve or Not to Preserve? How Can Computers Help with Appraisals. National Center for Supercomputing Applications (NCSA), University of Illinois at Urbana-Champaign, Urbana, IL. October 16, 2008. PPT.

Abstract: Discusses past and current research; computer-assisted appraisal of documents; approach and methodology; PDF documents; experimental results; grouping, ranking, and integrity verification; and computational scalability. Project URL: http://isda.ncsa.uiuc.edu/CompTradeoffs/

  • Bajcsy, Peter and Kooper, Rob. Comprehensive Appraisals of Contemporary Documents. In 5th International IEEE eScience conference (IEEE e-Science 2009). National Center for Supercomputing Applications (NCSA), University of Illinois at Urbana-Champaign, Urbana, IL. Oxford, UK, December, 2009. PDF.

Abstract: Describes problems related to contemporary document analyses. Contemporary documents contain multiple digital objects of different types. These digital objects have to be extracted from document containers, represented as data structures, and described by features suitable for comparing digital objects. In many archival and machine learning applications, documents are compared using multiple metrics, checked for integrity and authenticity, and grouped based on similarity. The objective of our work is to design methodologies for contemporary document processing, visual exploration, grouping, and integrity verification.

  • Bajcsy, Peter, Kooper, Rob, and McHenry, Kenton. Towards a Universal, Quantifiable, and Scalable File Format Converter. National Center for Supercomputing Applications (NCSA), University of Illinois at Urbana-Champaign, Urbana, IL. In 2009 Fifth IEEE International Conference on e-Science, pp. 140–147. Oxford, UK. PDF.

Abstract: Addresses the problem of designing a universal file format converter. Discusses NCSA’s Polyglot system for data conversion.

  • Bajcsy, Peter, Kooper, Rob, McHenry, Kenton, McFadden, William, Ondrejcek, Michal, and Yahja, Alex. Advanced Information Systems for Archival Appraisals of Contemporary Documents. National Center for Supercomputing Applications (NCSA), University of Illinois at Urbana-Champaign, Urbana, IL. In IEEE Fourth International Conference on eScience, 2008. eScience'08, pp. 440–441. PDF.

Abstract: Addresses the problem of designing a scalable framework for archival appraisals of contemporary PDF documents. Discusses methodologies on information types to one comprehensive analytical framework; small/large scale computational studies; comparisons of