Computer Science 628: Biological Sequence Analysis
Fall, 2004
Alignment of biological sequences (DNA, RNA, protein) features
prominently in modern biology and computational biology research. In
this course we will study in detail the statistical and algorithmic
challenges that one faces in designing tools for alignment. For
example, how do we find an optimal local alignment between a query
sequence and a genomic database and how do we know whether or not such
an alignment is statistically significant? Following Durbin et al.'s
textbook, we will start by presenting sequence and multiple sequence
alignment algorithms in the context of probabilistic models (extensions
of HMMs, covariance models). We will also go over the work of Karlin,
Altschul, and others on the statistical analysis of alignments. Building on
these "classical" results we will address more current topics such as
seed design for the seeded alignment paradigm and alignment questions
that came up in recent whole-genome comparisons such as the
rat-human-mouse comparison.
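
To make the local alignment question above concrete, here is a minimal sketch of the standard Smith-Waterman dynamic program. It is not taken from the course materials: the function name and the scoring values (match, mismatch, linear gap penalty) are illustrative assumptions, and it says nothing about whether the resulting score is statistically significant.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one optimal local alignment.

    Scoring values are illustrative assumptions; gaps use a linear penalty.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                    # start a new alignment here
                H[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,    # gap in b
                H[i][j - 1] + gap,    # gap in a
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

Calling smith_waterman(query, target) on two strings returns the best local score together with the two aligned substrings. The table takes time and memory proportional to the product of the sequence lengths, which is one reason database-scale search relies on heuristics such as seeding.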
Instructor: Uri Keich
Lectures: Tuesdays & Thursdays 2:55-4:10
Location: Theory Center 484
Prerequisites: Nothing is set in stone, but some familiarity with
algorithms, statistics, and probability would make the course easier to
digest.
Grade: Your grade will be based on your submitted homework assignments
(which will include some programming) as well as on the final exam.
UPDATE 9/3/04: You can now download the relevant slides and papers.
Note that you can now only access these links from a Cornell address.
UPDATE 9/8/04: HW assignment #1 is now posted.
UPDATE 9/29/04: HW assignment #2 is now posted.
UPDATE 11/14/04: HW assignment #3 was updated. The file you need
is a 0-1 MATLAB vector.
UPDATE 11/25/04: HW assignment #4 is now posted, and the slides have
been updated.