
Computer Science and the Cornell Genomics Initiative
Since the early 1980s, a succession of technological
advances has made it possible to perform large-scale DNA sequencing. New genomic data has
been generated at an explosive rate: the volume has doubled every 15 months since about
1984. Extracting biological insight from this raw data is one of the central challenges of
biology today.
The ability to relate genomic data to biological
function would constitute a major step in understanding the process of life. Such an
advance would likely have a significant impact on medicine and agriculture. Recognizing
the importance of this challenge, a Genomics Task Force consisting of 40 Cornell faculty
was convened last year to plan the course of genomics research at Cornell into the next
century. The result of their labors is the Cornell Genomics Initiative, an ambitious plan
for a major new interdisciplinary research program involving all the biological sciences,
Engineering, and the Cornell Medical College. Areas of concentration will be:
- computational genomics and bioinformatics
- mammalian genomics
- plant genomics
- microbial genomics
- nanofabrication and bioengineering
The program has been endorsed by the University
administration, and major resources have been committed. The Cornell Genomics Initiative
is underway.
It is clear why Computer Science is a key
participant in this effort. Dealing with the overwhelming flood of genomic information
presents a bewildering array of computational challenges. There is a desperate need for
tools to retrieve, compare, filter, visualize, and analyze massive quantities of genomic
data spread across several sources and stored in different formats. The biological research
community alone is not in a position to deal with the enormous technological problems
involved in the production of these tools; expertise in high-performance scientific
computing, data management, information retrieval, software engineering, statistics,
stochastic processes, and graphics is required as well.
The plan for computational genomics and
bioinformatics involves (among other things) the appointment of two new faculty members in
Computer Science. In order to truly bridge the gap between fields, it was considered
important to recruit someone whose primary training and research interests were in a
biology-related field but who was conversant with computational methods and would feel
comfortable in a computer science department. With the help of colleagues in the
biological sciences, CS recently identified a senior computational biochemist, Ron Elber,
formerly of the Hebrew University in Jerusalem. Professor Elber will join the Department
in January 1999. A search for an additional computational biologist is underway.
In addition, a Laboratory for Computational Genomics
and Bioinformatics will be established under the aegis of the Cornell Theory Center. A
core group of research personnel will be appointed, whose primary responsibility will be
to develop computational tools and provide support under the direction of faculty in the
biological sciences and CS. Most of the physical infrastructure for the laboratory is
already in place. In addition to the facilities available in the respective academic departments, the
Cornell Theory Center will serve as the central hub of computational activity. Housed in
Rhodes Hall, it is close to the Computer Science Department and the School of Operations
Research and contains classrooms, computer labs, and offices for support personnel. It
also houses the SP2, a high-performance supercomputer that will be available for
computation-intensive applications. To coordinate activities between the Medical College
and the Ithaca campus, a high-speed data link will be installed, which will allow the
sharing of data between the two campuses and provide a medium for Web-based remote
instruction.