Bioinformatics methods for the comparative analysis of metazoan mitochondrial genome sequences.

This review surveys bioinformatics methods and tools for the analysis of metazoan mitochondrial genomes. We compare the available dedicated databases and present current tools for accurate genome annotation, identification of protein-coding genes, and determination of tRNA and rRNA models. We also evaluate tools and models for phylogenetic tree inference from gene order or sequence data. For gene order data, we compare rearrangement-based and gene-cluster-based methods of rearrangement analysis. For sequence data, we place special emphasis on substitution models and data treatments that reduce systematic biases typical of metazoan mitogenomes, such as within-genome and/or among-lineage compositional heterogeneity.
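As a rough illustration of what among-lineage compositional heterogeneity means in practice, the sketch below computes AT content and strand skews per taxon, using the common definitions AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C). It is not one of the tools discussed in the review; the function name `composition_stats` and the toy sequences are hypothetical and serve only as an example.

```python
from collections import Counter


def composition_stats(seq: str) -> dict:
    """Return AT content, AT skew and GC skew for one nucleotide sequence."""
    counts = Counter(seq.upper())
    a, c, g, t = (counts[base] for base in "ACGT")
    total = a + c + g + t  # ambiguous symbols such as N are ignored
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    at, gc = a + t, g + c
    return {
        "at_content": at / total,
        # Skews are set to 0.0 when the denominator would be zero.
        "at_skew": (a - t) / at if at else 0.0,
        "gc_skew": (g - c) / gc if gc else 0.0,
    }


# Hypothetical toy sequences, not real mitogenome data.
toy = {"taxon_A": "ATATATGCAT", "taxon_B": "GCGCGCATGC"}
stats = {name: composition_stats(s) for name, s in toy.items()}
for name, values in stats.items():
    print(name, values)
```

Large differences in these values between taxa are one simple warning sign that a compositionally homogeneous substitution model may be a poor fit for the alignment.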
