CS 501
Software Engineering
Spring 2006
Project Suggestion: Data Tracking System for the Web Library
Client
William Arms, wya@cs.cornell.edu.

The Web Library
The Web Library is a Cornell project to build a research library based on the Web crawls that have been collected by the Internet Archive since 1996. It is described at: http://www.infosci.cornell.edu/SIN/WebLib/index.html.

By the end of 2007 the Web Library is planned to contain 10 billion Web pages, occupying 240 TB of disk storage. To achieve this, millions of files have to be transferred, indexed, and stored. In fall 2005, two M.Eng. students designed and began implementation of a system to manage these files and their transfer to Cornell. Their report is at:
William Y. Arms
(wya@cs.cornell.edu)
Last changed: January 21, 2006