CS 501
Software Engineering
Spring 2002

Project Concepts

Cornell Institute for Digital Collections


Client

The Cornell Institute for Digital Collections is based in the Cornell University Library.  Peter Hirtle, the Director, has proposed three projects:

Project outlines

Development of a METS-compatible Document Structuring Tool
 
When converting paper pages to scanned images, data about how one page image relates to the others must also be collected. For example, when scanning a book, one would like to know that a particular image is of page 14, and that it is the first page of the Table of Contents. Several years ago, a CS student developed a Visual Basic front end to an Access database for CIDC to record rudimentary information about images that had been scanned from microfilm. Using the tool, a staff member could check the quality of all of the scanned images and tag each image with the appropriate structure. More recently, an XML schema entitled the "Metadata Encoding and Transmission Standard" (METS) has been developed to codify document structuring (see <http://www.loc.gov/standards/mets/>).
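
As an illustration of the kind of structuring record described above, the following is a minimal sketch in Python, not a definitive implementation. It writes a METS-style file section and structural map for a short run of page images. The element and attribute names (mets, fileSec, fileGrp, file, FLocat, structMap, div, fptr, ORDER, ORDERLABEL, FILEID) come from the METS schema; the function name build_mets, the input dictionary keys, the file names, and the TYPE and USE values are assumptions made for this example, and the output has not been validated against the schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def q(tag):
    """Return the namespace-qualified name of a METS element."""
    return f"{{{METS_NS}}}{tag}"


def build_mets(pages):
    """Build a METS-style tree from a list of page records.

    Each record is a dict with the hypothetical keys 'file' (image
    location), 'page_label' (printed page number) and 'section'
    (structural unit the page belongs to).
    """
    root = ET.Element(q("mets"))
    file_sec = ET.SubElement(root, q("fileSec"))
    file_grp = ET.SubElement(file_sec, q("fileGrp"), {"USE": "master"})
    struct_map = ET.SubElement(root, q("structMap"), {"TYPE": "physical"})
    book = ET.SubElement(struct_map, q("div"), {"TYPE": "book"})

    current_name, current_div = None, None
    for order, page in enumerate(pages, start=1):
        # Register the image file and give it an ID the structure can cite.
        file_id = f"FILE{order:04d}"
        file_el = ET.SubElement(file_grp, q("file"), {"ID": file_id})
        ET.SubElement(
            file_el,
            q("FLocat"),
            {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": page["file"]},
        )
        # Open a new section div whenever the section name changes.
        if page["section"] != current_name:
            current_name = page["section"]
            current_div = ET.SubElement(
                book, q("div"), {"TYPE": "section", "LABEL": current_name}
            )
        # One div per page: scan order, printed label, pointer to the file.
        page_div = ET.SubElement(
            current_div,
            q("div"),
            {"TYPE": "page", "ORDER": str(order), "ORDERLABEL": page["page_label"]},
        )
        ET.SubElement(page_div, q("fptr"), {"FILEID": file_id})
    return root


pages = [
    {"file": "0001.tif", "page_label": "13", "section": "Preface"},
    {"file": "0002.tif", "page_label": "14", "section": "Table of Contents"},
    {"file": "0003.tif", "page_label": "15", "section": "Table of Contents"},
]
print(ET.tostring(build_mets(pages), encoding="unicode"))
```

In this sketch the second image is recorded as printed page 14 and as the first page inside the "Table of Contents" division, which is the relationship the example in the text calls for.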
 
We would like to see the development of software that could be used to create METS-compatible structuring documents in XML during an image quality control phase. Students would need to identify the implementation requirements for document structuring, determine the appropriate platform and underlying technology for the structuring tool, and build it for distribution by CIDC. Part of the task will involve incorporating image reformatting and resizing routines into the software. In addition, the students will need to master the METS XML schema.
 
Client: Peter B. Hirtle, Director, Cornell Institute for Digital Collections (pbh6@cornell.edu; ark3@cornell.edu).
 
Linking Photo Ordering and Image Display software
 
Currently, when someone requests a copy of a photograph from the Rare and Manuscripts collection, they complete a paper order form. The information on the photo order form is transferred to a photo order/tracking system in FileMaker Pro (FMP). Either a photographic or digital copy of the image is made, and the copy is delivered to the patron. We would like to move to a system that would allow us to add digital copies to our web-accessible image data program, Insight from Luna Imaging. That means figuring out a way that information from the photo ordering system can be automatically transferred to the SQL Server database underlying the Insight program. To accomplish this task, students would have to master the data structures of both the FMP and Insight databases, figure out an appropriate workflow to move data and images between the two systems, and ideally develop mechanisms to make this transfer automatic. Successful software might be used by other institutions using the Luna Insight software. It may require a redesign of the current photo ordering system.
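
One possible shape for the transfer step is sketched below in Python. It is only an illustration under stated assumptions: the order data is assumed to be exported from the ordering system as CSV text, an in-memory SQLite database stands in for the SQL Server database, and every column name, the table name image_records, and the function transfer_orders are hypothetical. The real field names would have to come from studying the two databases.

```python
import csv
import io
import sqlite3

# Hypothetical mapping: export column in the ordering system -> target column.
FIELD_MAP = {
    "Order Number": "order_id",
    "Image Title": "title",
    "Digital File": "image_path",
}


def transfer_orders(export_text, conn):
    """Copy orders that have a digital copy into the target table.

    Returns the number of rows inserted.
    """
    rows = []
    for record in csv.DictReader(io.StringIO(export_text)):
        if not record["Digital File"].strip():
            continue  # photographic copy only, so there is no image to add
        rows.append(tuple(record[source].strip() for source in FIELD_MAP))
    columns = ", ".join(FIELD_MAP.values())
    marks = ", ".join("?" for _ in FIELD_MAP)
    conn.executemany(
        f"INSERT INTO image_records ({columns}) VALUES ({marks})", rows
    )
    conn.commit()
    return len(rows)


# Stand-in for the target database; the schema is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE image_records "
    "(order_id TEXT PRIMARY KEY, title TEXT, image_path TEXT)"
)
export = (
    "Order Number,Image Title,Digital File\n"
    "101,Campus view,scans/101.tif\n"
    "102,Portrait,\n"
)
print(transfer_orders(export, conn))
```

Keeping the field mapping in one table, separate from the copying logic, is a design choice that would let another institution adapt the transfer to its own order form without touching the rest of the code.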
 
Clients: Peter B. Hirtle, Director, Cornell Institute for Digital Collections (pbh6@cornell.edu; ark3@cornell.edu) and Elaine Engst, Director & University Archivist, Division of Rare and Manuscript Collections (ee11@cornell.edu).
 
Development of a Copyright Investigation Tracking Database
 
As part of the new distance learning courses, and in many digital imaging projects, it is necessary to seek permission from the copyright owner to use text, images, and other files. This means identifying the material for which copyright permission is sought, tracking to whom permission requests have been sent and their responses, and recording whatever fees or limitations are required. While software programs to track copyright permission for specific types of material (e.g., coursepacks, course reserves) have been developed, there is no general software to track a wide variety of materials (images, text, film, music) for a variety of purposes (distance learning, course reserves, digital publication). Students on this project would work with the library's copyright service to identify the functional requirements for such a system and then build it. Ideally, there would be some way to make those images that have been determined to be in the public domain broadly available, perhaps through existing image databases.
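
The records described above could be modeled along the following lines. This is a minimal Python sketch, not a proposed design: the class names, field names, and status values are all assumptions for illustration, and the actual requirements would come from the library's copyright service.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PermissionRequest:
    """One request sent to a (presumed) copyright owner."""
    rights_holder: str
    date_sent: str                  # ISO date, e.g. "2002-02-01"
    response: Optional[str] = None  # "granted", "denied", or None while pending
    fee: float = 0.0
    limitations: str = ""


@dataclass
class Material:
    """An item for which copyright permission is sought."""
    title: str
    kind: str      # e.g. "image", "text", "film", "music"
    purpose: str   # e.g. "distance learning", "course reserves"
    public_domain: bool = False
    requests: List[PermissionRequest] = field(default_factory=list)

    def status(self):
        """Summarize where the investigation for this item stands."""
        if self.public_domain:
            return "public domain"
        if not self.requests:
            return "not requested"
        responses = [r.response for r in self.requests]
        if "granted" in responses:
            return "cleared"
        if None in responses:
            return "pending"
        return "denied"


photo = Material("Library tower photograph", "image", "digital publication")
photo.requests.append(PermissionRequest("Photographer's estate", "2002-02-01"))
print(photo.status())
```

A status of "public domain" is the flag that a later step could use to select images for release through an existing image database.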
 
Clients: Peter B. Hirtle, Director, Cornell Institute for Digital Collections (pbh6@cornell.edu; ark3@cornell.edu) and Oya Rieger, Coordinator of Distributed Learning, Cornell University Library (oyr1@cornell.edu).

Technical

You can select the technical environment for these projects in conjunction with the clients.



William Y. Arms

(wya@cs.cornell.edu)
Last changed: January 22, 2002