The Disk-Covering Method for Phylogenetic Tree Reconstruction
Tandy Warnow
Department of Computer Science
The University of Texas at Austin
Abstract
Phylogenetic trees, also known as evolutionary trees, model the evolution of
biological species or genes
from a common ancestor. Most computational problems associated with phylogenetic
tree reconstruction are
very hard (specifically, they are NP-hard, and are practically hard, as real
datasets can take years
of analysis, without provably optimal solutions being found). Finding ways of
speeding up the
solutions to these problems is of major importance to systematic biologists.
Other approaches take only
polynomial time and have provable performance guarantees under Markov models of
evolution; however, our recent work shows
that the sequence lengths that suffice for these methods to be accurate with
high probability grows exponentially
in the diameter of the underlying tree.
In this talk, we will describe new dataset decomposition techniques, called the
Disk-Covering Methods, for phylogenetic tree reconstruction. This basic
algorithmic technique uses interesting graph theory, and can be used to reduce
the sequence length requirement of polynomial time methods, so that polynomial
length sequences suffice for accuracy with high probability (instead of
exponential). We also use this technique to speed up the solution of NP-hard
optimization problems, such as maximum likelihood and maximum parsimony.