Bayesian Gene/Species Tree Reconciliation and Orthology Analysis Using MCMC

Friday, October 24, 2003 - 11:00am - 11:50am
Keller 3-180
Jens Lagergren (Royal Institute of Technology (KTH))
Comparative genomics in general and orthology analysis in particular are becoming increasingly important parts of gene function prediction. Previously, orthology analysis and reconciliation have been performed only with respect to the parsimony model. This discards many plausible solutions and sometimes precludes finding the correct one. In many other areas of bioinformatics, probabilistic models have proven to be both more realistic and more powerful than parsimony models.

We introduce a probabilistic gene evolution model based on a birth-death process in which a gene tree evolves inside a species tree. Based on this model, we develop a tool for practical orthology analysis, based on Fitch's original definition, and more generally for reconciling pairs of gene and species trees. Our gene evolution model is biologically sound and intuitively attractive. We develop a Bayesian analysis based on MCMC that approximates the posterior distribution over reconciliations. That is, we can find the most probable reconciliations and estimate the probability of any reconciliation, given the observed gene tree. This also gives a way to estimate the probability that a pair of genes are orthologs. To the best of our knowledge, this is the first successful introduction of this type of probabilistic method, which flourishes in phylogeny analysis, into reconciliation and orthology analysis.
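
As a rough illustration of how MCMC can approximate a posterior over a finite set of candidates, the sketch below runs a generic Metropolis sampler over three made-up "reconciliation" labels. It is not the speaker's algorithm: the labels `R1` to `R3`, the unnormalized scores, and the function name `metropolis_posterior` are all hypothetical, and the scores merely stand in for likelihood times prior under some gene evolution model.

```python
import random
from collections import Counter


def metropolis_posterior(weights, n_steps=20000, seed=0):
    """Estimate normalized probabilities over a finite set of states.

    weights: dict mapping a state label to an unnormalized positive score
    (a hypothetical stand-in for likelihood * prior of a reconciliation).
    """
    rng = random.Random(seed)
    states = list(weights)
    current = rng.choice(states)
    visits = Counter()
    for _ in range(n_steps):
        # Symmetric proposal: pick any state uniformly at random.
        proposal = rng.choice(states)
        # Metropolis rule: accept with probability min(1, score ratio).
        accept_prob = min(1.0, weights[proposal] / weights[current])
        if rng.random() < accept_prob:
            current = proposal
        visits[current] += 1
    # Visit frequencies approximate the normalized posterior.
    return {s: visits[s] / n_steps for s in states}


# Toy scores (assumed for illustration only); the exact normalized
# values would be 0.6, 0.3 and 0.1.
toy_weights = {"R1": 6.0, "R2": 3.0, "R3": 1.0}
estimate = metropolis_posterior(toy_weights)
print(estimate)
```

The sampler only ever uses ratios of scores, so the normalizing constant is never needed. This is what makes the approach practical when the set of candidates is far too large to enumerate, in which case the proposal would be a local move between neighboring candidates rather than a uniform draw.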

The MCMC algorithm has been implemented and performs very well on synthetic as well as biological data. Using standard correspondences, our results carry over to allele trees as well as biogeography.