Bioinformatics and Biology Insights 2015:9 103-109
Original Research
Published on 05 Jul 2015
DOI: 10.4137/BBI.S26864
O-glycosylation is one of the main types of mammalian protein glycosylation; it occurs at specific serine (S) or threonine (T) residues. Several O-glycosylation site predictors have been developed, but better prediction tools are still needed. One challenge in training the classifiers is that the available datasets are highly imbalanced, which leads to unsatisfactory classification accuracy for the minority class. In our previous work, we proposed a classification approach based on particle swarm optimization (PSO) and random forest (RF) that takes the imbalanced dataset problem into account. The PSO parameter settings used during training affect the classification accuracy. In this paper, we therefore optimize the parameters of the PSO algorithm with a genetic algorithm in order to increase the classification accuracy. The proposed genetic algorithm-based approach outperformed existing predictors in terms of area under the receiver operating characteristic curve. In addition, we implemented a glycosylation predictor tool based on this approach and demonstrated that it could successfully identify candidate glycosylation sites in a case study protein.
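To illustrate why the area under the receiver operating characteristic curve is a more informative measure than plain accuracy on imbalanced data, the following is a minimal sketch. The toy labels and scores are hypothetical and are not taken from the study, and the function names `accuracy` and `roc_auc` are illustrative only. The area is computed here through its rank interpretation: the probability that a randomly chosen positive site receives a higher score than a randomly chosen negative site, with ties counted as one half.

```python
def accuracy(labels, predictions):
    """Fraction of examples whose predicted class equals the true class."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts, over all (positive, negative) pairs, how often the positive
    example has the higher score; ties contribute 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy set: 2 glycosylated sites (1) among 18 non-glycosylated (0).
labels = [1, 1] + [0] * 18

# A trivial classifier that labels every site as negative.
always_negative = [0] * 20
constant_scores = [0.0] * 20

# A hypothetical scorer that ranks most positives above most negatives.
informative_scores = [0.9, 0.4] + [0.1] * 16 + [0.5, 0.6]

print("accuracy, always negative:", accuracy(labels, always_negative))
print("AUC, constant scores:", roc_auc(labels, constant_scores))
print("AUC, informative scores:", roc_auc(labels, informative_scores))
```

In this toy case the trivial classifier reaches high accuracy without finding a single positive site, while its area under the curve stays at chance level, which is why the latter is the more useful comparison on imbalanced data.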