Cornell Department of Computer Science Colloquium
4:15pm, November 15th, 2001
B17 Upson Hall

On the Power of Universal Bases in Sequencing by Hybridization

Eli Upfal
Brown University

Sequencing by hybridization (SBH) is a novel DNA sequencing technique in which an array (the SBH chip) of short sequences of nucleotides (probes) is brought into contact with a solution of (replicas of) the target DNA sequence. A biochemical method determines the subset of probes that bind to the target sequence (the spectrum of the sequence), and a combinatorial method is used to reconstruct the DNA sequence from the spectrum. Since technology limits the number of probes on the SBH chip, a challenging combinatorial question is how to design a smallest set of probes that can sequence almost all DNA strings of a given length. Based on a novel combinatorial design, we show that the use of universal bases (bases that bind to any nucleotide) can drastically improve the performance of the SBH process. In particular, we present a probe design and sequencing algorithm whose performance asymptotically approaches the information-theoretic bound and which, for any number of probes, is significantly better than previously analyzed probe patterns.

(Joint work with A. Frieze and F. Preparata.)
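
To make the terms in the abstract concrete, here is a minimal Python sketch of a spectrum and a naive reconstruction. It is an illustration only and not the probe design or algorithm presented in the talk. The function names, the '1'/'0' pattern notation for natural and universal positions, the '*' placeholder, and the assumption that the first k-1 bases of the target are known are all hypothetical choices made for this example.

```python
def spectrum(target, k):
    """Set of all length-k substrings of target (ungapped probes that bind)."""
    return {target[i:i + k] for i in range(len(target) - k + 1)}


def gapped_spectrum(target, pattern):
    """Spectrum for a gapped probe pattern (hypothetical notation).

    pattern is a string over '1' (natural base) and '0' (universal base).
    A universal position binds to any nucleotide, so it is written as '*'.
    """
    w = len(pattern)
    return {
        "".join(c if p == "1" else "*" for c, p in zip(target[i:i + w], pattern))
        for i in range(len(target) - w + 1)
    }


def reconstruct(spec, seed, length):
    """Naively extend a known non-empty seed one base at a time.

    Uses an ungapped k-mer spectrum with k = len(seed) + 1.
    Returns None as soon as the next base is ambiguous or missing.
    """
    k = len(seed) + 1
    seq = seed
    while len(seq) < length:
        suffix = seq[-(k - 1):]
        candidates = [b for b in "ACGT" if suffix + b in spec]
        if len(candidates) != 1:
            return None  # the spectrum does not determine the next base
        seq += candidates[0]
    return seq


target = "ACGTTGCA"
spec = spectrum(target, 3)
rebuilt = reconstruct(spec, target[:2], len(target))
gapped = gapped_spectrum("ACGTA", "101")
```

With ungapped probes the naive extension fails whenever a (k-1)-mer repeats in the target with different successors; reducing such ambiguities without adding probes is the kind of question the probe designs in the abstract address.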