Estimation of Phylogeny Using a General Markov Model
Vivek Jayaswal¹, Lars S. Jermiin², John Robinson³
¹School of Mathematics and Statistics, University of Sydney, NSW 2006, Australia; Sydney University Biological Informatics and Technology Centre, University of Sydney, NSW 2006, Australia.
²School of Biological Sciences, University of Sydney, NSW 2006, Australia; Sydney University Biological Informatics and Technology Centre, University of Sydney, NSW 2006, Australia. Unité de Biologie Moléculaire de Gène chez les Extrêmophiles, Institut Pasteur, 75724 Paris Cedex 15, France.
³School of Mathematics and Statistics, University of Sydney, NSW 2006, Australia.
Abstract: The non-homogeneous model of nucleotide substitution proposed by Barry and Hartigan (Stat Sci, 2: 191-210) is the most general model of DNA evolution that assumes an independent and identical process at each site. We present a computational solution for this model and use it to analyse two data sets, each of which violates one or more of the assumptions of stationarity, homogeneity, and reversibility. We compare the log-likelihood values returned by programs based on the F84 model (J Mol Evol, 29: 170-179), the general time-reversible model (J Mol Evol, 20: 86-93), and Barry and Hartigan’s model to determine the validity of the assumptions made by the first two models. In addition, we present a method for assessing whether sequences have evolved under reversible conditions, and find that neither data set did. Finally, we determine the most likely tree under each of the three models of DNA evolution and compare these trees with the one favoured by the tests for symmetry.