CS 501
Software Engineering
Spring 2002

Project Concepts

Tools for Preservation of Online Materials


Client

Nancy McGovern, Cornell University Library (nm84@cornell.edu)

Project outlines

The Cornell University Library has a strong group working to save digital content on the Internet for future generations.  They have proposed two projects:

Digital Collection Registry System 

A core component of the central depository that the Cornell University Library will develop and implement is a centralized registry for digital collections, which may themselves be distributed.  For preservation, a digital collection should be registered from the point of creation.  The registry should contain metadata about the project or other source of creation, the collection, and the digital content stored as objects.  In the long term, the aim is centralized control over distributed storage, though the first phase will build a central store for digital images.  The registry system must contain active pointers to the locations of the digital content and should ideally provide ongoing monitoring of the digital objects.  The aim is to plug the registry component into the Central Depository System, so it will probably have to interoperate with existing systems.  One issue will be keeping metadata synchronized between systems when all or part of the stored metadata is replicated to support creation, preservation, and access.
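To make the registry requirements concrete, here is a minimal sketch of one possible data model, written in Python purely for illustration.  It is not the library's design.  The class names (DigitalObject, Collection, Registry), the use of a SHA-256 checksum for monitoring, and the caller-supplied fetch function are all assumptions made for this example.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DigitalObject:
    object_id: str
    location: str   # active pointer to where the content is stored
    sha256: str     # checksum recorded at registration time (assumed approach)


@dataclass
class Collection:
    collection_id: str
    title: str
    source_project: str   # project or other source of creation
    objects: List[DigitalObject] = field(default_factory=list)


class Registry:
    """Hypothetical in-memory registry of collections and their objects."""

    def __init__(self) -> None:
        self._collections: Dict[str, Collection] = {}

    def register_collection(self, collection: Collection) -> None:
        if collection.collection_id in self._collections:
            raise ValueError(f"already registered: {collection.collection_id}")
        self._collections[collection.collection_id] = collection

    def register_object(self, collection_id: str, object_id: str,
                        location: str, content: bytes) -> DigitalObject:
        # Record the pointer and a checksum of the content as it is today.
        digest = hashlib.sha256(content).hexdigest()
        obj = DigitalObject(object_id, location, digest)
        self._collections[collection_id].objects.append(obj)
        return obj

    def monitor(self, fetch: Callable[[str], bytes]) -> Dict[str, str]:
        """Follow every pointer; report 'ok', 'changed', or 'missing' per object id."""
        report: Dict[str, str] = {}
        for collection in self._collections.values():
            for obj in collection.objects:
                try:
                    data = fetch(obj.location)
                except (OSError, LookupError):
                    report[obj.object_id] = "missing"
                    continue
                same = hashlib.sha256(data).hexdigest() == obj.sha256
                report[obj.object_id] = "ok" if same else "changed"
        return report
```

Passing fetch in as a parameter keeps the registry independent of where the content actually lives, which matters if storage is later distributed.  A real system would also need persistent storage and a way to synchronize replicated metadata, neither of which this sketch attempts.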

Preservation Technology Watch

The objective is to develop a set of tools for identifying, capturing, mapping, analyzing, and presenting information objects from a wide range of sources.  Objects might contain information in any format, including images, office documents, XML documents, and URLs with annotations.  The information store needs to be extensible, searchable and persistent.  The sources may be dynamic, the capture should be iterative and schedulable, and the presentation of the information should be flexible and easily modified.  The tools should support statistical analysis and other analytical methods, allow users to view captured information using a variety of diagramming and mapping options, provide ad hoc and set reports, and support distributed, comprehensive, current access to the information store.  
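As one illustration of "iterative and schedulable" capture from dynamic sources, the following Python sketch re-captures each source at a fixed interval and stores a new snapshot only when the content has changed.  The names (Snapshot, CaptureScheduler, run_until), the fixed-interval policy, and the change detection by SHA-256 digest are assumptions for this example, not requirements stated by the client.

```python
import hashlib
import heapq
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Snapshot:
    source: str
    captured_at: float
    digest: str
    content: bytes


@dataclass
class CaptureScheduler:
    """Hypothetical scheduler: a priority queue of (due time, source, interval)."""
    fetch: Callable[[str], bytes]
    _queue: List[Tuple[float, str, float]] = field(default_factory=list)
    history: Dict[str, List[Snapshot]] = field(default_factory=dict)

    def add_source(self, source: str, interval: float, start: float = 0.0) -> None:
        if interval <= 0:
            raise ValueError("interval must be positive")
        heapq.heappush(self._queue, (start, source, interval))

    def run_until(self, now: float) -> int:
        """Capture every source due at or before `now`; return new snapshots stored."""
        stored = 0
        while self._queue and self._queue[0][0] <= now:
            due, source, interval = heapq.heappop(self._queue)
            content = self.fetch(source)
            digest = hashlib.sha256(content).hexdigest()
            versions = self.history.setdefault(source, [])
            # Keep a new version only if the source changed since the last capture.
            if not versions or versions[-1].digest != digest:
                versions.append(Snapshot(source, due, digest, content))
                stored += 1
            # Reschedule the source for its next capture.
            heapq.heappush(self._queue, (due + interval, source, interval))
        return stored
```

Time is passed in explicitly rather than read from the system clock, so the schedule can be exercised in tests; a deployed tool would call run_until with the current time from a timer or cron-style job.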

While it would be ideal for a software project to deliver the full range of tools and utilities that will be needed, the library expects to obtain grant funding to support the full implementation of the Preservation Technology Watch.  Tools that address Watch functions need to be easy to maintain and use, and easy to integrate into open environments.

Technical

You can select the technical environment for this project.



William Y. Arms

(wya@cs.cornell.edu)
Last changed: January 18, 2002