Evolutionary Bioinformatics

Maximum Likelihood Analyses of 3,490 rbcL Sequences: Scalability of Comprehensive Inference Versus Group-Specific Taxon Sampling

Authors: Alexandros Stamatakis, Markus Göker and Guido W. Grimm

Publication Date: 24 May 2010

Evolutionary Bioinformatics 2010:6 73-90

Alexandros Stamatakis¹, Markus Göker² and Guido W. Grimm³

¹The Exelixis Lab, Dept. of Computer Science, Technische Universität München, Germany. ²German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany. ³Department of Palaeobotany, Swedish Museum of Natural History, Stockholm, Sweden.

Abstract

The constant accumulation of sequence data poses new computational and methodological challenges for phylogenetic inference, since multiple sequence alignments grow in both the horizontal (number of base pairs, phylogenomic alignments) and the vertical (number of taxa) dimension. Putting aside the ongoing controversy about appropriate models, partitioning schemes, and assembly methods for phylogenomic alignments, coupled with the high computational cost of analyzing them, for many organismic groups a sufficient number of taxa is often available from only one or a few genes (e.g., rbcL, matK, rDNA). In this paper we address the scalability of Maximum-Likelihood-based phylogeny reconstruction with respect to the number of taxa, using as an example several large nested single-gene rbcL alignments comprising 400 up to 3,491 taxa. To test the effect of taxon sampling, we employ an appropriately adapted taxon jackknifing approach. In contrast to standard jackknifing, this taxon subsampling procedure is not conducted entirely at random; instead, subsamples are drawn from empirical taxon groups, which can either be user-defined or determined using taxonomic information from databases. Our results indicate that, despite an unfavorable ratio of number of sequences to number of base pairs (i.e., many relatively short sequences), Maximum Likelihood tree searches and bootstrap analyses scale well on single-gene rbcL alignments with a dense taxon sampling of up to several thousand sequences. Moreover, the newly implemented taxon subsampling procedure can be beneficial for inferring higher-level relationships and for interpreting bootstrap support from comprehensive analyses.
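The group-based taxon subsampling described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation and does not reproduce any RAxML option; the function name `group_jackknife`, the parameters `keep_fraction` and `min_per_group`, and the placeholder taxon and group names are all assumptions made for illustration. The sketch only shows the general idea: rather than deleting taxa uniformly at random, a subsample is drawn separately within each predefined taxon group, so that every group remains represented.

```python
import random
from collections import defaultdict


def group_jackknife(taxon_to_group, keep_fraction=0.5, min_per_group=1, seed=None):
    """Draw a taxon subsample group by group (illustrative sketch only).

    taxon_to_group: dict mapping taxon name -> group label (hypothetical input).
    keep_fraction:  fraction of each group's taxa to retain (assumed parameter).
    min_per_group:  lower bound on taxa kept per group (assumed parameter).
    """
    rng = random.Random(seed)

    # Collect the taxa belonging to each group.
    groups = defaultdict(list)
    for taxon, group in taxon_to_group.items():
        groups[group].append(taxon)

    subsample = []
    for group in sorted(groups):
        members = sorted(groups[group])  # sorted for reproducibility
        # Keep a fixed fraction of the group, but never fewer than
        # min_per_group and never more than the group actually contains.
        k = max(min_per_group, int(keep_fraction * len(members)))
        k = min(k, len(members))
        subsample.extend(rng.sample(members, k))
    return subsample


# Placeholder taxa and groups, not data from the study.
example = {
    "taxon_A1": "group_A", "taxon_A2": "group_A",
    "taxon_A3": "group_A", "taxon_A4": "group_A",
    "taxon_B1": "group_B", "taxon_B2": "group_B", "taxon_B3": "group_B",
    "taxon_C1": "group_C",
}
replicate = group_jackknife(example, keep_fraction=0.5, seed=1)
print(sorted(replicate))
```

Repeating the call with different seeds would yield a set of reduced taxon lists, each of which could then be used to extract a reduced alignment for a separate tree search. How group sizes and retention fractions should be chosen is a design decision that this sketch leaves open.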

Categories: Evolutionary bioinformatics, Phylogenetics

Keywords: RAxML, phylogenetic inference, many taxon analyses, taxon jackknifing


Dennis Wall, Editor in Chief
