Reconstructing the Ancestor of a Modern Genome with Multigene Families

Monday, October 20, 2003 - 11:00am - 11:50am
Keller 3-180
Nadia El-Mabrouk (University of Montreal)
Given a particular model of evolution and an optimization criterion, the problem is to recover an ancestor of a modern genome modeled as an ordered sequence of signed genes. One direct application is to infer gene orders at the ancestral nodes of a phylogenetic tree. The rearrangement literature implicitly assumes that each gene is present exactly once in each genome. This assumption clearly cannot be guaranteed for divergent species whose genomes contain several copies of paralogous and orthologous genes. In this presentation, we consider models of genome evolution that take multigene families into account.
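To make the "ordered sequence of signed genes" concrete, here is a minimal sketch in Python. The integer encoding (sign for strand, absolute value for gene family) and the names `inversion` and `has_unique_genes` are illustrative assumptions, not notation taken from the talk.

```python
from typing import List

# Assumed encoding: +3 is gene 3 on the forward strand, -3 the same gene
# on the reverse strand. A chromosome is an ordered list of such integers.
Genome = List[int]


def inversion(genome: Genome, i: int, j: int) -> Genome:
    """Reverse the segment genome[i..j] (inclusive) and flip each sign."""
    if not (0 <= i <= j < len(genome)):
        raise ValueError("need 0 <= i <= j < len(genome)")
    flipped = [-g for g in reversed(genome[i:j + 1])]
    return genome[:i] + flipped + genome[j + 1:]


def has_unique_genes(genome: Genome) -> bool:
    """True if every gene family occurs exactly once (the classical assumption)."""
    families = [abs(g) for g in genome]
    return len(families) == len(set(families))


example = inversion([1, 2, 3, 4, 5], 1, 3)  # inverts the block 2 3 4
```

A genome such as `[1, 2, -1]` fails `has_unique_genes`, which is exactly the situation the multigene-family models are meant to handle.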

We first consider genome-wide doubling. Genome duplication is an important source of new gene functions and novel physiological pathways. Initially, a duplicated genome contains two identical copies of each chromosome, but genomic rearrangements subsequently disrupt this simple doubled structure. At the time of observation, each of the chromosomes resulting from the accumulated rearrangements can be decomposed into a succession of conserved segments, such that each segment appears exactly twice in the genome. We present exact linear-time algorithms for reconstructing the ancestral doubled genome, minimizing the number of inversions and/or translocations required to derive the observed order of genes along the present-day chromosomes.
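The structural property described above can be sketched in a few lines of Python. This only illustrates the doubled-genome model, not the reconstruction algorithms of the talk; the function names and the integer gene encoding are assumptions made for the example.

```python
from collections import Counter
from typing import List, Tuple

Chromosome = List[int]  # assumed encoding: signed integers, one per gene


def double_genome(chromosomes: List[Chromosome]) -> List[Chromosome]:
    """Return a genome holding two identical copies of each chromosome."""
    return [list(c) for c in chromosomes for _ in range(2)]


def canonical(chromosome: Chromosome) -> Tuple[int, ...]:
    """Pick one representative of the two reading directions of a chromosome."""
    forward = tuple(chromosome)
    backward = tuple(-g for g in reversed(chromosome))
    return min(forward, backward)


def is_perfectly_doubled(chromosomes: List[Chromosome]) -> bool:
    """True if the chromosomes can be paired off into identical copies."""
    counts = Counter(canonical(c) for c in chromosomes)
    return all(n % 2 == 0 for n in counts.values())


def every_family_appears_twice(chromosomes: List[Chromosome]) -> bool:
    """True if each gene family occurs exactly twice across the genome."""
    counts = Counter(abs(g) for c in chromosomes for g in c)
    return all(n == 2 for n in counts.values())


doubled = double_genome([[1, 2, 3], [4, 5]])
# One inversion of genes 2 3 on the first copy breaks the doubled structure
# while every family still appears exactly twice.
rearranged = [[1, -3, -2], [1, 2, 3], [4, 5], [4, 5]]
```

The reconstruction problem starts from a genome like `rearranged` and asks for a perfectly doubled genome that is closest to it under the chosen rearrangement operations.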

The second part of the presentation concerns a model of duplication at the regional level. In this model, chromosomal regions (one or more genes) are duplicated from one location of the genome to another. Studies of human genomic sequence indicate that many such segments have been duplicatively transposed in very recent evolutionary time. The implicit hypothesis is that a genome with multigene families has an ancestor containing exactly one copy of each gene, and that it has evolved from this ancestor through a series of duplication transpositions and substring inversions (reversals). We present an algorithm for reconstructing an ancestral genome that minimizes the number of duplication transpositions and reversals required. We then show how to use this algorithm to recover gene orders at the ancestral nodes of a phylogenetic tree.
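As an illustration of the regional model, the following Python sketch applies a single duplication transposition to an ancestor with one copy of each gene. The function names, the index convention (positions refer to the genome before the operation), and the choice to forbid inserting the copy inside its own source segment are assumptions of this example, not details from the talk.

```python
from collections import Counter
from typing import Dict, List

Genome = List[int]  # assumed encoding: signed integers, one per gene


def duplication_transposition(genome: Genome, i: int, j: int, k: int) -> Genome:
    """Copy the segment genome[i..j] (inclusive) and insert the copy before
    position k, where k indexes the genome as it is before the operation."""
    if not (0 <= i <= j < len(genome)):
        raise ValueError("need 0 <= i <= j < len(genome)")
    if not (0 <= k <= len(genome)):
        raise ValueError("need 0 <= k <= len(genome)")
    if i < k <= j:
        raise ValueError("insertion point lies inside the source segment")
    segment = genome[i:j + 1]
    return genome[:k] + segment + genome[k:]


def family_sizes(genome: Genome) -> Dict[int, int]:
    """Number of copies of each gene family, ignoring orientation."""
    return dict(Counter(abs(g) for g in genome))


ancestor = [1, 2, 3, 4]  # exactly one copy of each gene
descendant = duplication_transposition(ancestor, 1, 2, 4)  # copy 2 3 to the end
```

Here `family_sizes(descendant)` reports two copies of families 2 and 3, giving the kind of multigene-family genome from which a single-copy ancestor is to be reconstructed.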