Classifying Coding DNA with Nucleotide Statistics

Nicolas Carels; Diego Frías

Authors: Nicolas Carels and Diego Frías

Publication Date: 28 Oct 2009

Type: Original Research

Journal: Bioinformatics and Biology Insights

Citation: Bioinformatics and Biology Insights 2009:3 141-154

doi: 10.4137/BBI.S3030

Abstract

In this report, we compared the success rate of classifying coding sequences (CDS) vs. introns by the Codon Structure Factor (CSF) and by a method that we call the Universal Feature Method (UFM). UFM is based on the scoring of purine bias (Rrr) and stop codon frequency. We show that the success rate of CDS/intron classification is higher with UFM than with CSF. UFM classifies ORFs as coding or non-coding through a score based on (i) the stop codon distribution, (ii) the product of purine probabilities in the three positions of nucleotide triplets, (iii) the product of Cytosine (C), Guanine (G), and Adenine (A) probabilities in the 1st, 2nd, and 3rd positions of triplets, respectively, (iv) the probabilities of G in the 1st and 2nd positions of triplets, and (v) the distance of their GC3 vs. GC2 levels to the regression line of the universal correlation. More than 80% of the CDSs (true positives) of Homo sapiens (>250 bp), Drosophila melanogaster (>250 bp) and Arabidopsis thaliana (>200 bp) are successfully classified with a false positive rate lower than or equal to 5%. The method returns coding sequences in their coding strand and coding frame, which allows their automatic translation into protein sequences with 95% confidence. The method is a natural consequence of the compositional bias of nucleotides in coding sequences.
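
The raw quantities named in items (i) to (v) can be illustrated with a minimal sketch that computes position-specific nucleotide frequencies for one reading frame. This is not the authors' UFM implementation: the function names (`codons`, `position_frequencies`, `frame_features`) and the returned dictionary keys are hypothetical, the abstract does not say how the features are combined into a score, and the regression line of the universal correlation is not given here, so only the GC2 and GC3 levels are reported rather than a distance to that line.

```python
from collections import Counter

# Standard stop codons on the coding strand (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq, frame=0):
    """Split seq into complete triplets starting at offset frame (0, 1 or 2)."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def position_frequencies(triplets):
    """Return three dicts with A/C/G/T frequencies at codon positions 1, 2, 3."""
    n = len(triplets)
    freqs = []
    for pos in range(3):
        counts = Counter(t[pos] for t in triplets)
        # Ambiguous bases (e.g. N) stay in the denominator but get no entry.
        freqs.append({base: counts[base] / n for base in "ACGT"})
    return freqs


def frame_features(seq, frame=0):
    """Compute illustrative per-frame features (hypothetical names, not UFM itself)."""
    triplets = codons(seq, frame)
    if not triplets:
        raise ValueError("sequence too short for this frame")
    freqs = position_frequencies(triplets)
    purine = [f["A"] + f["G"] for f in freqs]  # purine probability per position
    gc = [f["G"] + f["C"] for f in freqs]      # GC level per position
    return {
        "stop_codons": sum(t in STOP_CODONS for t in triplets),          # item (i)
        "purine_product": purine[0] * purine[1] * purine[2],             # item (ii)
        "cga_product": freqs[0]["C"] * freqs[1]["G"] * freqs[2]["A"],    # item (iii)
        "g1": freqs[0]["G"],                                             # item (iv)
        "g2": freqs[1]["G"],                                             # item (iv)
        "gc2": gc[1],                                                    # input to item (v)
        "gc3": gc[2],                                                    # input to item (v)
    }


if __name__ == "__main__":
    toy = "ATGGCGTAA"  # toy sequence, not real data
    for frame in range(3):
        print(frame, frame_features(toy, frame))
```

Calling `frame_features` on each of the three frames of a sequence and of its reverse complement would give six feature sets to compare, which is one plausible way to pick a coding strand and frame; how the published method weighs these features is not specified in the abstract.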
