Phylogenetic reconstruction from arbitrary gene-order data

Phylogenetic reconstruction from gene-order data has attracted attention from both biologists and computer scientists over the last few years. To date, our software suite GRAPPA has been the most accurate approach, but it requires that all genomes have identical gene content, with each gene appearing exactly once in each genome. Some progress has been made in handling genomes with unequal gene content, both in computing pairwise genomic distances and in reconstruction. In this paper, we present a new approach for computing the median of three arbitrary genomes and apply it to the reconstruction of phylogenies from arbitrary gene-order data. We implemented these methods within GRAPPA and tested them on simulated datasets under various conditions as well as on a real dataset of chloroplast genomes. We report the results of our simulations and of our analysis of the real dataset, and compare them to two alternatives applied to the same genomes with equalized gene contents: neighbor-joining and the original GRAPPA. Our new approach is remarkably accurate both in simulations and on the real dataset, in contrast to the distance-based approaches and to reconstructions using the original GRAPPA on equalized gene contents.
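As an illustration of the "equalized gene contents" baseline mentioned above, the following is a minimal sketch, not GRAPPA's actual procedure. It assumes each genome is a list of signed integers (the sign encodes strand) and that equalizing means keeping only the genes that occur exactly once in every genome; the function name `equalize_gene_content` and the example data are hypothetical.

```python
from collections import Counter


def equalize_gene_content(genomes):
    """Keep only genes that occur exactly once in every genome.

    Assumption: a genome is a list of signed integers, where the
    absolute value identifies the gene and the sign its orientation.
    """
    shared = None
    for genome in genomes:
        counts = Counter(abs(g) for g in genome)
        # Genes present exactly once in this genome.
        singletons = {g for g, c in counts.items() if c == 1}
        shared = singletons if shared is None else shared & singletons
    shared = shared or set()
    # Drop every other gene, preserving order and sign.
    return [[g for g in genome if abs(g) in shared] for genome in genomes]


# Hypothetical input: gene 2 is duplicated in the second genome and
# missing from the third; gene 4 is missing from the second genome.
genomes = [
    [1, 2, -3, 4, 5],
    [1, -3, 2, 2, 5],
    [5, 4, 1, -3],
]
equalized = equalize_gene_content(genomes)
print(equalized)
```

The sketch shows why equalization loses information: every gene that is duplicated or absent anywhere is discarded from all genomes, which is the limitation that working directly with arbitrary gene content is meant to avoid.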
