Predicting Consensus Structures for RNA Alignments Via Pseudo-Energy Minimization
Junilda Spirollari1, Jason T.L. Wang1,2, Kaizhong Zhang3, Vivian Bellofatto2, Yongkyu Park4 and Bruce A. Shapiro5
1Bioinformatics Program, Department of Computer Science, New Jersey Institute of Technology, University Heights, Newark, NJ 07102, U.S.A. 2Department of Microbiology and Molecular Genetics, University of Medicine and Dentistry of New Jersey-New Jersey Medical School, International Center for Public Health, 225 Warren Street, Newark, NJ 07103, U.S.A. 3Department of Computer Science, University of Western Ontario, London, Ontario, N6A 5B7, Canada. 4Department of Cell Biology and Molecular Medicine, University of Medicine and Dentistry of New Jersey-New Jersey Medical School, Newark, NJ 07103, U.S.A. 5Center for Cancer Research Nanobiology Program, National Cancer Institute, Frederick, MD 21702, U.S.A.
Abstract
Algorithms that predict the secondary structure of a single RNA sequence often rely on thermodynamic models with free energy parameters and solve a free energy minimization problem. While results from these algorithms are promising, single-sequence methods have only moderate accuracy, and additional information, such as covariance scores obtained from multiple sequence alignments, is needed to improve RNA secondary structure prediction. In this paper we present a new approach to predicting the consensus secondary structure of a set of aligned RNA sequences via pseudo-energy minimization. Our tool, called RSpredict, takes sequence covariation into account and employs effective heuristics to improve accuracy. RSpredict accepts as input a multiple sequence alignment in FASTA or ClustalW format and outputs the consensus secondary structure of the input sequences in both the Vienna-style Dot Bracket format and the Connectivity Table format. We compared our method with several widely used tools, including KNetFold, Pfold and RNAalifold. A comprehensive test on different datasets, including Rfam sequence alignments and a multiple sequence alignment obtained from our study on the Drosophila X chromosome, shows that RSpredict is competitive with the existing tools on the tested datasets. RSpredict is freely available online as a web server and also as a jar file for download at http://datalab.njit.edu/biology/RSpredict.
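To illustrate how the two output formats mentioned above relate to each other, the following minimal sketch converts a Dot Bracket string into connectivity-table rows. It is not taken from RSpredict: the function names `dot_bracket_to_partners` and `to_ct_rows` are hypothetical, the example sequence is invented, and the six-column row layout (index, base, previous index, next index, partner, index) follows the commonly used connect-table convention, which is assumed here and may differ in detail from RSpredict's actual output.

```python
def dot_bracket_to_partners(structure):
    """Map each position to its 1-based pairing partner (0 if unpaired)."""
    partners = [0] * len(structure)
    stack = []  # 0-based positions of '(' still waiting for a ')'
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unmatched ')' at position %d" % (i + 1))
            j = stack.pop()
            partners[i] = j + 1
            partners[j] = i + 1
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unmatched '(' at position %d" % (stack[-1] + 1))
    return partners


def to_ct_rows(sequence, structure):
    """Build connect-table style rows (assumed six-column layout)."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure lengths differ")
    partners = dot_bracket_to_partners(structure)
    n = len(sequence)
    rows = []
    for i, base in enumerate(sequence, start=1):
        next_index = i + 1 if i < n else 0  # assumed: 0 marks the last base
        rows.append((i, base, i - 1, next_index, partners[i - 1], i))
    return rows


if __name__ == "__main__":
    # Hypothetical toy hairpin, not data from the paper.
    for row in to_ct_rows("GGAACC", "((..))"):
        print("\t".join(str(field) for field in row))
```

The stack-based scan handles only nested pairs written with round brackets; pseudoknot notation with other bracket types is outside the scope of this sketch.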