IMA Tea/Reception (with POSTER SESSION)

Monday, October 20, 2003 - 5:10pm - 6:00pm
Lind 400
  • Duplicative and Conservative Transpositions of the Larval Serum Protein 1 Genes in the Genus Drosophila
    Josefa Gonzalez (Autonomous University of Barcelona)
    Joint work with Ferran Casals and Alfredo Ruiz.

    In the genus Drosophila, homologous chromosomal elements show a remarkable conservation of gene content but not of gene order, indicating that paracentric inversions are the most common kind of genomic change. Detailed physical maps of chromosomes X, 2 and 4 of Drosophila repleta and D. buzzatii, both belonging to the Drosophila subgenus, were constructed and their gene arrangements compared with the homologous chromosomes in D. melanogaster. We estimated that 393 paracentric inversions have been fixed in the whole genome since the divergence between D. repleta and D. melanogaster, which amounts to an average rate of 0.053 disruptions/Mb/myr. Only two exceptions to the chromosomal homologies were found and we have further analyzed one of them: the transposition of the Larval serum protein 1 (Lsp1) genes. Comparative molecular analysis of the transposed genes and their flanking regions can help to elucidate the time, direction and mechanism of gene transposition. In the D. melanogaster genome, three Lsp1 genes, alpha, beta and gamma, are present and each is located on a different chromosome. We have characterized the molecular organization of Lsp1 genes in D. buzzatii and in D. pseudoobscura, a species of the Sophophora subgenus. Our results show that only two Lsp1 genes (beta and gamma) exist in these two species, suggesting that the duplicative transposition generating Lsp1alpha took place
  • Reconstructing the Genomic Architecture of Mammalian Ancestors Using Multispecies Comparative Maps
    Bill Murphy (National Cancer Institute)
    Rapidly developing comparative maps in selected mammal species are providing an opportunity to reconstruct the genomic architecture of mammalian ancestors and study rearrangements that transformed this ancestral genome into existing mammalian genomes. Here we apply the recently developed Multiple Genome Rearrangement algorithm (MGR) to human, mouse, cat and cattle comparative maps (with 311-470 shared markers) to impute the ancestral mammalian genome. Reconstructed ancestors consist of 70-100 conserved segments shared across the genomes that have been exchanged by rearrangement events along the ordinal lineages leading to modern species genomes. Genomic distances between species, dominated by inversions (reversals) and translocations, are presented in a first multispecies attempt using ordered mapping data to reconstruct the evolutionary exchanges that preceded modern placental mammal genomes.

    Joint work with Guillaume Bourque (Centre de Recherches Mathématiques, Université de Montréal, Montréal, Canada H3C 3J7), Glenn Tesler, Pavel Pevzner (Department of Computer Science and Engineering, University of California, San Diego La Jolla, California 92093-0114), and Stephen J. O'Brien (Laboratory of Genomic Diversity, National Cancer Institute Frederick, Maryland 21702).
  • Evolution of the Hsp70 Gene Superfamily in Two Sibling Species of Nematodes Caenorhabditis elegans and C. briggsae
    Nikolas Nikolaidis (The Pennsylvania State University)
    Joint work with Masatoshi Nei.

    The Hsp70 gene superfamily of C. briggsae was characterized in order to investigate its evolutionary relationship to the Hsp70 superfamily of its sibling species C. elegans. The phylogenetic analyses also included genes from Drosophila melanogaster and Saccharomyces cerevisiae to clarify the long-term evolution of hsp70s. The Hsp70s are classified into three monophyletic groups according to their sub-cellular localization, namely, cytoplasm (CYT), endoplasmic reticulum (ER) and mitochondria (MT). The Hsp110 genes can be classified into the polyphyletic CYT group and the monophyletic ER group. The two nematode species encode two Hsp70 and two Hsp110 proteins localized in the ER and their highly heat-inducible genes contain introns. The different Hsp70 and Hsp110 groups appear to evolve following the model of independent or divergent evolution. These models can also explain the evolution of the ER and MT genes. On the other hand, the CYT genes are divided into heat-inducible and constitutively expressed genes. The constitutively expressed genes probably have evolved by the birth-and-death process and the rates of gene birth-and-death are different among all organisms studied. The heat-inducible genes show an intra-species phylogenetic clustering, suggesting sequence homogenization, probably by gene conversion-like events. In addition, these genes show high levels of sequence conservation in both intra- and inter-species comparisons, and in most comparisons the amino acid sequence similarity was higher than the nucleotide sequence similarity. These results suggest that purifying selection also played a crucial role in sequence conservation of the Hsp70s. Therefore, we suggest that the CYT heat-inducible genes have followed a mixed evolutionary pattern with a combination of purifying selection, birth and death, and gene conversion-like events.
  • A Simpler 1.5-Approximation Algorithm for Sorting by Transpositions
    Tzvika Hartman (Weizmann Institute of Science)
    Joint work with Ron Shamir (School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel).

    In this work we study the problem of sorting by transpositions. First, we prove that the problem of sorting circular permutations by transpositions is equivalent to the problem of sorting linear ones. Hence, all algorithms for sorting linear permutations by transpositions can be used to sort circular permutations. Then, we derive our main result: A new quadratic 1.5-approximation algorithm, which is considerably simpler than the extant algorithms of Bafna and Pevzner (1998) and Christie (1999). Thus, the algorithm achieves running time which is equal to the best known, with the advantage of being much simpler. Moreover, the analysis of the algorithm is significantly less involved, and provides a good starting point for studying related open problems.
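
    To make the operation concrete, the following is a minimal Python sketch of a transposition on a linear permutation, together with a breakpoint count and the simple lower bound that follows from it. It is not the 1.5-approximation algorithm of the poster; the function names and the block-index convention are assumptions chosen for illustration.

```python
from math import ceil

def transpose(perm, i, j, k):
    """Exchange the adjacent blocks perm[i:j] and perm[j:k]."""
    if not 0 <= i < j < k <= len(perm):
        raise ValueError("need 0 <= i < j < k <= len(perm)")
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]

def breakpoints(perm):
    """Count adjacencies (x, y) with y != x + 1 in the framed permutation 0, perm, n+1."""
    framed = [0] + list(perm) + [len(perm) + 1]
    return sum(1 for x, y in zip(framed, framed[1:]) if y != x + 1)

def lower_bound(perm):
    """A transposition changes three adjacencies, so at least ceil(b / 3) are needed."""
    return ceil(breakpoints(perm) / 3)

perm = [3, 4, 1, 2]
print(breakpoints(perm), lower_bound(perm))  # 3 breakpoints, so at least 1 transposition
print(transpose(perm, 0, 2, 4))              # [1, 2, 3, 4]
```

    The breakpoint bound is usually far from tight, which is why approximation algorithms for this problem work with finer structures than raw breakpoint counts.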
  • New Methods for Estimating Amino Acid Replacement Rates
    Lars Arvestad (Royal Institute of Technology (KTH))
    Two new methods for estimating replacement rate matrices from protein sequence alignments are presented and shown to perform better than another recent method, Müller-Vingron's resolvent method, in a variety of settings. Furthermore, the best method is demonstrated to be robust on small datasets and practical also on very large datasets of real data. Neither short nor divergent sequence pairs have to be discarded, making the method economical with data.
  • Comparative Chloroplast Genomics of Seed Plants: Annotation and Analysis of Genomic Sequences
    Stacia Wyman (The University of Texas at Austin)
    Joint work with Romey Haberle, Tim Chumley, Jeff Boore, and Robert Jansen.

    Our research group is performing a comparative study of seed plant chloroplast genomes, which involves sequencing plastid genomes of 55 taxa representing all of the major lineages of seed plants, with more intensive sampling in groups with highly rearranged genomes. During the first two years we have completed sequencing or have nearly complete drafts for 10 plastid genomes and an additional 12 genomes are in various stages of progress. Most of the focus on the project so far has been on the highly rearranged chloroplast genomes of the angiosperm families Campanulaceae and Geraniaceae.

    We have also developed an annotation program and we have designed and tested several new computational methods for using whole genomes for phylogeny reconstruction. DOGMA (Dual Organellar GenoMe Annotator) is a web-based program for annotating organellar (currently chloroplast and animal mitochondrial) genomes. Given a whole genome sequence (or fragment) in FASTA format, DOGMA semi-automates the annotation process. DOGMA uses a custom database of the complete set of genes for 16 green plants. Biological expertise is still needed in order to identify start and stop codons as well as intron boundaries. The result is an annotated genome which can be saved in Sequin format for direct submission to GenBank.

    DOGMA, which is in the beta-testing phase, has already been used in the preliminary analysis of several sequenced chloroplast genomes. The complete sequences of the Trachelium (Campanulaceae) and Pelargonium (Geraniaceae) chloroplast genomes have identified numerous repeated sequences that are associated with extensive changes in gene order and they suggest that transposition may also be responsible for several genomic rearrangements in Trachelium.
  • Inferring Orthologous Regions via a Pseudo-Gibbs Sampler: Finding the Pieces of the Rearrangement Puzzle
    Bob Mau (University of Wisconsin, Madison)
    Joint work with Aaron Darling, Frederick R. Blattner, and Nicole T. Perna.
  • Genome Halving Problem
    Max Alekseyev (University of California, San Diego)
    Joint work with Pavel Pevzner.

    The Genome Halving Problem is motivated by an evolutionary mechanism that duplicates the entire genome. The result of such a duplication, a so-called perfectly duplicated genome, contains two identical copies of each chromosome. The genome is then subject to reversal and/or translocation rearrangement operations. For a given rearranged duplicated genome, the Genome Halving Problem asks for its closest perfectly duplicated ancestor. A solution to this problem is used as a building block for more sophisticated genome rearrangement algorithms.

    The Genome Halving Problem was first introduced and solved in a series of papers by Nadia El-Mabrouk and David Sankoff. Their algorithm is rather complex and, to the best of our knowledge, has never been implemented as a computer program. In our work we present a new, simpler and more general algorithm for the Genome Halving Problem, as well as its implementation in C++.
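
    The objects involved can be illustrated with a short Python sketch: a signed reversal on a chromosome and a test for whether a genome is perfectly duplicated. This is only an illustration of the definitions above, not the halving algorithm itself (the poster's implementation is in C++); the representation of chromosomes as tuples of signed integers and all function names are assumptions.

```python
from collections import Counter

def reverse(chrom, i, j):
    """Reversal: flip both the order and the signs of the genes in chrom[i:j]."""
    segment = tuple(-g for g in reversed(chrom[i:j]))
    return chrom[:i] + segment + chrom[j:]

def canonical(chrom):
    """A linear chromosome and its reverse complement describe the same molecule,
    so pick one fixed representative of the two readings."""
    flipped = tuple(-g for g in reversed(chrom))
    return min(chrom, flipped)

def is_perfectly_duplicated(genome):
    """True if every chromosome occurs exactly twice, up to reading direction."""
    counts = Counter(canonical(c) for c in genome)
    return all(n == 2 for n in counts.values())

ancestor = [(1, 2, 3), (1, 2, 3), (4, 5), (4, 5)]
rearranged = [reverse(ancestor[0], 0, 2)] + ancestor[1:]

print(is_perfectly_duplicated(ancestor))    # True
print(rearranged[0])                        # (-2, -1, 3)
print(is_perfectly_duplicated(rearranged))  # False
```

    Halving then amounts to searching, among all genomes that pass this test, for one that minimizes the rearrangement distance to the given genome.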
  • Easy Ways to Clear Hurdles
    Anne Bergeron (University of Quebec)
  • Application of MoBIoS for Conserved Primer Pair Discovery
    Daniel Miranker (The University of Texas at Austin)
    Joint work with Weijia Xu, Wenguo Liu, and C. Randal Linder.

    MoBIoS, a Molecular Biological Information System, is a next-generation database management system focused on scalable retrieval and mining of unorthodox biological data types that are poorly supported by relational database systems. MoBIoS comprises built-in data types for biological sequences and mass spectra. The MoBIoS storage manager extends traditional database systems by including built-in support for hierarchical clustering and for nearest-neighbor and range search in metric spaces. In addition to built-in metrics to support sequence homology and protein identification, users may add their own metrics.
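
    As a rough illustration of what range and nearest-neighbor queries in a metric space mean, here is a Python sketch that uses edit distance as the metric and a plain linear scan as a stand-in for an index. It does not reflect the MoBIoS API or its storage manager; every name below is hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance, a true metric on strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]

def range_search(db, query, radius, metric=edit_distance):
    """All items within `radius` of the query (linear scan, no index)."""
    return [s for s in db if metric(query, s) <= radius]

def nearest_neighbor(db, query, metric=edit_distance):
    """The item closest to the query under the metric."""
    return min(db, key=lambda s: metric(query, s))

db = ["ACGT", "ACGA", "TTTT", "AGT"]
print(range_search(db, "ACGT", 1))               # ['ACGT', 'ACGA', 'AGT']
print(nearest_neighbor(["TTTT", "ACGA"], "ACGT"))  # ACGA
```

    A metric-space index answers the same queries while using the triangle inequality to skip most distance computations; passing `metric` as a parameter mirrors the idea that users may supply their own metrics.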

    We report on the first biological application of MoBIoS: a comparative study of the entire genomes of the plants rice and Arabidopsis to determine conserved pairs of strings of DNA that could be used to prime polymerase chain reactions (PCRs). Identification of such a set of paired conserved primers would allow amplification of evolutionarily homologous DNA regions in a taxonomically broad set of seed plants. The ability to amplify homologous regions in a widely divergent set of species has a number of applications, e.g., phylogenetic reconstruction and comparison of protein evolution in a broad set of organisms. Ultimately, this approach to identifying conserved primer pairs could provide the community of systematists with a universal set of DNA sequences that can be used for assembling the tree of life.
  • Phylogenetic Analysis of Formin Homology Proteins in Arabidopsis Thaliana and Oryza Sativa
    Dimitra Chalkia (The Pennsylvania State University)
    Joint work with Tatiana Bibikova, Simon Gilroy, Wojciech Makalowski.

    The plant cell cytoskeleton plays an important role in many cellular processes, including cell polarity establishment and cytokinesis. Proteins that regulate cytoskeletal assembly are likely to be a part of the signaling cascade that governs plant cell morphogenesis. Formins are members of a large protein family that is defined by the presence of the highly conserved Formin Homology II (FH2) domain. In a wide range of organisms, including vertebrates, arthropods, nematodes and fungi, formins have been implicated in the regulation of cytoskeletal assembly and in the control of cytokinesis and cell polarity establishment and maintenance. The genomes of Arabidopsis thaliana and Oryza sativa contain putative formin-like proteins based on the presence of an FH2 domain. Arabidopsis thaliana formins have been tentatively sub-divided into two clades, Type I and Type II, based on the FH2 domain alignment. We have extended this analysis to cover both Arabidopsis and rice and have provided an evolutionary context for these plant formin families. Our phylogenetic analysis shows that formins are divided into two distinct clades in plants. This phylogenetic clustering is also supported by the structural features of these proteins. This division of plant formins into two distinct groups seems to predate the monocot/eudicot split. The detailed evolutionary relationships of plant formins remain unclear. The placement of fungal formins at the basal position of the tree is in accordance with the most recently proposed phylogenetic scheme for eukaryotes. Animal and plant formins cluster together, and split into two major groups. This clustering may suggest that their last common ancestor already had at least two different types of formins.
  • Induced CYP1A1 Gene Expression in Lung Cancer Cell Lines
    Amal Shervington (University of Central Lancashire)
    Joint work with Kulthum Mohammed.

    The gene CYP1A1 (cytochrome P450, family 1, subfamily A, polypeptide 1) encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases that catalyze numerous reactions involved in drug metabolism and synthesis of cholesterol, steroids and lipids. The enzyme is reported to be present predominantly in extrahepatic tissues in humans and in experimental animals (1). CYP1A1 is of toxicological importance because it catalyses the bioactivation of polyaromatic hydrocarbon (PAH) constituents, e.g. benzo[a]pyrene and other combustion products abundant in tobacco smoke, to mutagens and carcinogens (2).

    Several studies of the oncogenic significance of CYP1A1 have found a correlation between inducibility of the enzyme and lung cancer susceptibility in smokers (3). The expression and activity of CYP1A1 were examined using either peripheral blood lymphocytes as a surrogate for lung cancer tissue (3) or lung biopsy specimens from human subjects. CYP1A1 transcripts were detected in lung cancer tissue either by reverse transcription polymerase chain reaction (RT-PCR) or northern blot hybridization (4).

    In our laboratory we used four different lung cell lines: A549 adenocarcinoma; H460 large cell carcinoma; COR-L23/5010 drug-resistant large cell carcinoma; and CCD-32Lu normal lung cells as a control. We measured the level of CYP1A1 transcript using the LightCycler (quantitative PCR). mRNA extracted from 10^6 cells using an mRNA capture kit (Roche) was used to generate cDNA with the Reverse Transcription System (Roche) and CYP1A1 primers (designed using the Primer3 web site), which was then amplified by the LightCycler using CYP1A1

    The expected size of the CYP1A1 amplicon was 166 bp. It was expressed at a highly induced level in the A549 adenocarcinoma and to a lesser extent in the H460 large cell carcinoma. Very faint bands can be seen in the COR-L23/5010 drug-resistant large cell carcinoma. No CYP1A1 can be detected in the normal lung cells. An amplicon of 300 bp was amplified only in the control and not in the cancerous cell lines. Further work is required to characterise the 300 bp band and to identify its significance.

    Our results have shown an induced level of CYP1A1 in the adenocarcinoma cell line which is absent from the control, indicating that CYP1A1 is expressed at an elevated level in some cancer cell lines but not in the control.

    Numerous reports have emphasised the induction level of CYP1A1 in peripheral blood lymphocytes and lung cancer tissue, but there have been few or no reports on the level of CYP1A1 in established cancer, such as cancerous cell lines. Our study has shown an elevated level of CYP1A1 in some of the cancerous cell lines, which may suggest an active role for CYP1A1 in the maintenance of cancer.
  • An Objective View of Humankind's Place in Primate Evolution
    Derek Wildman (Wayne State University)
    Joint work with Monica Uddin, Guozhen Liu, Lawrence I. Grossman, and Morris Goodman.

    In order to accurately place humankind in a phylogenetic classification of Primates it is necessary to know the phylogenetic relationships among all members of the order. We present the phylogenetic relationships and times of divergence for extant members of the order as determined by DNA nucleotide sequence data, and we focus particularly on the relationships within the family Hominidae. Local molecular clock analyses using fossil calibrations calculate that the time of origin for the order Primates as a crown group is 63 million years ago. Anthropoid primates (New World monkeys, Old World monkeys, and apes including humans) originated approximately 40 million years ago.

    Phylogenetic and local molecular clock analyses from a sample of 97 genes show that humans and chimpanzees form a clade that most recently shared a common ancestor between 5 and 6 million years ago. These coding DNA data separate the human-chimpanzee clade from the gorilla clade between 6 and 7 million years ago. This African ape clade separated from the orangutan clade between 13 and 15 million years ago. We calculated the percent nonsynonymous DNA identity between humans and chimpanzees to be 99.4%, synonymous identity to be 98.4%, and total DNA sequence identity to be 99.1%. Interestingly, phylogenetic analysis grouped humans and chimpanzees together when only nonsynonymous sites were analyzed. This result suggests that at the protein level humans and chimpanzees are functionally more similar to each other than either taxon is to any other ape. Additionally, of these 97 genes, 30 show evidence of positive selection during the descent of catarrhine primates. An equal number (n=14) of these genes show elevated nonsynonymous rates of substitution on the human and chimpanzee lineages.

    Divergences between humans and chimpanzees are placed in perspective by comparing their date of divergence with those found across the class Mammalia. The age of genus level crown groups for mammals ranged from 2 to 21 million years old. The mean crown group time of origin is approximately 8 million years ago, and the 95% confidence interval falls between 6.61 and 9.71 million years ago. Thus, humans and chimpanzees more recently share a common ancestor than do many congeneric groups of mammals.
  • Distinguishing Orthologs from Paralogs by Integrating Comparative Genome Data and Gene Phylogenies
    Steven Cannon (University of Minnesota, Twin Cities)
    Background: In eukaryotic genomes, most genes are members of gene families. When comparing genes from two species, therefore, most genes in one species will be homologous to multiple genes in the second. This often makes it difficult to distinguish orthologs (separated through speciation) from paralogs (separated by other types of gene duplication). Combining phylogenetic relationships and genomic position in both genomes helps to distinguish between these scenarios. This kind of comparison can also help to describe how gene families have evolved within a single genome that has undergone polyploidy or other large-scale duplications, as in the case of Arabidopsis thaliana and probably most plant genomes.

    Results: We describe a suite of programs called OrthoParaMap that makes genomic comparisons, identifies syntenic regions, determines whether sets of genes in a gene family are related through speciation or internal chromosomal duplications, maps this information onto phylogenetic trees, and infers internal nodes within the phylogenetic tree that may represent local as opposed to speciation or segmental duplication. We describe the application of the software using three examples: the melanoma-associated antigen (MAGE) gene family on the X chromosomes of mouse and human; the 20S proteasome subunit gene family in Arabidopsis, and the major latex protein gene family in Arabidopsis.
    Conclusion: OrthoParaMap combines comparative genomic positional information and phylogenetic reconstructions to identify which gene duplications are likely to have arisen through internal genomic duplications (such as polyploidy), through speciation, or through local duplications (such as unequal crossing-over). The software is freely available at

    Joint work with Georgiana May (1,2) and Nevin D. Young (1,3).

    1 Plant Biology Department, University of Minnesota, St. Paul, MN 55108, USA
    2 Ecology, Evolution, and Behavior Department, University of Minnesota, St. Paul, MN 55108, USA
    3 Plant Pathology Department, University of Minnesota, St. Paul, MN 55108, USA
  • Reconstructing Reticulate Evolution in Species
    Luay Nakhleh (The University of Texas at Austin)
    In 1997, Wayne Maddison made an important observation that led to a separate analysis approach for phylogeny reconstruction. In his seminal paper, Maddison observed that gene trees that are related by reticulation can be combined into a network via the computation of the minimum number of certain branch moves; this number is called the SPR (for Subtree Prune and Regraft) distance. The two main challenges for Maddison's approach are

    (1) computational: computing the SPR distance between two trees is hard.

    (2) systematic: in practice, it is very hard to obtain the correct gene trees.

    In this poster we present our solutions to these two challenges. We address phylogenetic networks with constrained reticulation. For such networks, and trees induced by them, we present an efficient algorithm for measuring the SPR distance, as well as for reconstructing the network from the given trees. We address the systematic challenge by considering a set of good gene trees instead of a single gene tree. We present results from extensive simulation studies that we conducted. Those results show a significant improvement of our method over Maddison's, and show that it clearly outperforms methods based on combined analysis of datasets.

    Joint work with Tandy Warnow and Randy Linder.
  • Mammalian Promoter Database: A Computational Platform for Comparative Genomics of Mammalian Transcriptional Regulation
    Dannie Durand (Carnegie-Mellon University)
    Joint work with Hao Sun, Saranyan K. Palaniswamy, Twyla T. Pohar, and Victor Jin.

    Transcription in mammalian cells is a highly complex process that involves multiple layers of general and gene-specific transcription factors. Although extensive molecular research has been providing important details about several transcription factors and their binding sites in the target gene promoters, the information generated over the years is highly fragmented. In order to better integrate this vast amount of information with the genome sequences, we have developed a new database called MPromDb (Mammalian Promoter Database), an information resource of mammalian gene regulatory regions. MPromDb (Version 1.0) contains 28,306 experimentally supported and 32,121 computationally annotated promoters, and mapping of 4,231 experimentally known binding sites, with links to published literature. Each promoter sequence in MPromDb is presented in the form of an image map with annotations of first exon, cis-regulatory elements and plots of CpG scores, with interactive contextual menus for easy navigation. MPromDb provides a platform for comparative genomics of transcriptional regulation, since promoters of orthologous genes are linked with each other and displayed in the same record. The current version contains 9,331 human-mouse orthologous pairs. The database can be searched for promoter sequences, transcription factors, and their direct target genes, through a user-friendly web interface at
  • Distributional Approximations in Genome Reconstruction
    Anant Godbole (East Tennessee State University)
  • Graph Compression Algorithms for Efficiently Comparing Genomes
    Steve Goldstein (University of Wisconsin, Madison)
    Joint work with Adam Briska, Shiguo Zhou, and David C. Schwartz.

    Optical Mapping is a system capable of producing genome-wide ordered restriction maps. Such a restriction map provides a description of an organism's genome, a description not unlike the sequence of the genome, albeit at a coarser resolution. Just as comparisons of whole genome sequences are leading to an exciting array of biological advances, comparisons of optical maps will provide a wealth of valuable information.

    Now that optical mapping has entered the high-throughput era, there is a need for software to compare restriction maps of closely related organisms. We present an algorithmic framework for this task, closely modeled after DNA sequence comparison algorithms. The major challenge lies in adapting the exact matching phase of the sequence algorithms to handle the imprecision inherent in determining restriction fragment lengths. Our graph-based approach not only overcomes this challenge, but also can be applied to sequence algorithms, providing advantages over suffix-tree approaches.
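
    The difficulty with "exact" matching of measured fragment lengths can be seen in a small Python sketch. It finds the longest run of consecutive fragments shared by two restriction maps when lengths are allowed to differ by a relative tolerance. This is a simple dynamic-programming stand-in, not the graph-based method of the poster; the tolerance model, the 10% default, and the function names are assumptions.

```python
def fragments_match(x, y, rel_tol=0.1):
    """Treat two fragment lengths as equal if they differ by at most
    rel_tol times the larger of the two."""
    return abs(x - y) <= rel_tol * max(x, y)

def longest_common_run(map_a, map_b, rel_tol=0.1):
    """Longest run of consecutive fragments that match within tolerance.
    Returns (run_length, start_index_in_a, start_index_in_b)."""
    best = (0, 0, 0)
    prev = [0] * (len(map_b) + 1)
    for i, x in enumerate(map_a, 1):
        curr = [0] * (len(map_b) + 1)
        for j, y in enumerate(map_b, 1):
            if fragments_match(x, y, rel_tol):
                curr[j] = prev[j - 1] + 1  # extend the run ending at (i-1, j-1)
                if curr[j] > best[0]:
                    best = (curr[j], i - curr[j], j - curr[j])
        prev = curr
    return best

# Fragment lengths in kb for two hypothetical ordered restriction maps.
map_a = [12.0, 5.1, 30.2, 8.0, 4.4]
map_b = [7.0, 5.0, 29.0, 8.3, 20.0]
print(longest_common_run(map_a, map_b))  # (3, 1, 1)
```

    Because tolerance-based equality is not transitive (a may match b and b may match c without a matching c), fragments cannot simply be treated as letters of an alphabet, which is one way to see why exact-match index structures need adapting.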
  • Searching for Optimal Trees Under Maximum Parsimony
    Tiffani Williams (University of New Mexico)
  • Evaluating a Class of Length-Sensitive Algorithms for Sorting by Reversal
    Ron Pinter (Technion-Israel Institute of Technology)
    Sorting by reversal (SBR) has been used extensively in comparative genomic studies [3]. Traditionally, bioinformaticians have tried to minimize the number of reversals, and have evaluated results by looking at the trace generated by the algorithm and asking whether it makes biological sense. We have introduced a length-sensitive cost measure in an attempt to model the likelihood of reversals based on their length. In this model the cost f(x) of each reversal depends on the length, x, of the reversed sequence; the overall cost of the SBR process is the sum of the individual reversal costs.

    Initially [4] we looked at f(x)=x, offering a QuickSort-like algorithm which guarantees a provably good approximation of the minimal SBR cost (finding the minimal cost is NP-hard). In response, several biologists suggested we look at the family of functions f(x)=x**alpha. We have developed a class of algorithms [1] that find an approximate cost for any positive value of the exponent alpha, but the question of which value of alpha is best is of great interest.
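
    The cost model itself is easy to state in code. The Python sketch below applies a given sequence of unsigned reversals to a permutation and totals the cost f(x)=x**alpha of each one. It only evaluates a supplied sorting scenario; it is not one of the approximation algorithms of [1] or [4], and the function names and index convention are assumptions.

```python
def reversal_cost(length, alpha=1.0):
    """Cost f(x) = x**alpha of reversing a segment of the given length."""
    return length ** alpha

def apply_reversals(perm, reversals, alpha=1.0):
    """Apply reversals given as (i, j) pairs, each reversing perm[i:j],
    and return the resulting permutation with the total cost."""
    perm = list(perm)
    total = 0.0
    for i, j in reversals:
        if not 0 <= i < j <= len(perm):
            raise ValueError("need 0 <= i < j <= len(perm)")
        perm[i:j] = perm[i:j][::-1]
        total += reversal_cost(j - i, alpha)
    return perm, total

scenario = [(0, 3), (3, 5)]  # reverse a block of length 3, then one of length 2
print(apply_reversals([3, 2, 1, 5, 4], scenario, alpha=1.0))  # ([1, 2, 3, 4, 5], 5.0)
print(apply_reversals([3, 2, 1, 5, 4], scenario, alpha=2.0))  # ([1, 2, 3, 4, 5], 13.0)
```

    Larger values of alpha penalize long reversals more heavily, while alpha = 0 would make every reversal cost 1 and so recover the classical reversal count.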

    We decided to make this evaluation by using the cost of sorting one genome into another as a distance between the genomes, feeding these distances to a tool that builds phylogenetic trees, and then comparing the results to evolutionary trees found using other methods. This gives rise to numerous methodological and algorithmic issues, such as:
    - How many common genes are necessary to draw meaningful conclusions?
    - How do we deal with duplicate genes?
    - If the number of common genes for the whole dataset under study is too low, how do we put together partial results (i.e. combining trees that were built on subsets of the sample) and how small can the subsets be?
    - Do we really need to rebuild the whole tree or can we accumulate the scores of matches of the partial trees with the reference tree?
    - What similarity score between trees is appropriate for this study?
    - How do we cope with the fact that our algorithms produce only approximate costs?
    But the ultimate question is: how do we scan for the best value of alpha?

    The poster will describe the method and the results on two datasets, including the one from [2] which includes 15 genomes, and discuss their merits.


    [1] Michael A. Bender, Dongdong Ge, Simai He, Haodong Hu, Ron Y. Pinter, Steven Skiena, and Firas Swidan. Improved Bounds on Sorting with Length-Weighted Reversals. To appear in the Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA'04), January 2004.

    [2] William Martin, Tamas Rujan, Erik Richly, Andrea Hansen, Sabine Cornelsen, Thomas Lins, Dario Leister, Bettina Stoebe, Masami Hasegawa, and David Penny. Evolutionary analysis of Arabidopsis, cyanobacterial, and chloroplast genomes reveals plastid phylogeny and thousands of cyanobacterial genes in the nucleus. Proc. Natl. Acad. Sci. USA. September 17, 2002; 99 (19): 12246-12251.

    [3] Pavel A. Pevzner: Computational Molecular Biology - an Algorithmic Approach, MIT Press, 2000.

    [4] Ron Y. Pinter and Steven Skiena. Genomic Sorting with Length-Weighted Reversals. Genome Informatics 13: 103-111 (2002).

    Joint work with Michael A. Bender*, Yaniv Berliner**, Dongdong Ge*, Simai He*, Haodong Hu*, Michael Shmoish**, Meir Shoham**, Steven Skiena*, and Firas Swidan**.

    * Dept. of Computer Science, SUNY Stony Brook, NY 11794-4400.
    ** Dept. of Computer Science, Technion, Israel Institute of Technology, Haifa 32000, Israel