Comparison of Two Output-Coding Strategies for Multi-Class Tumor Classification Using Gene Expression Data and Latent Variable Model as Binary Classifier
Sandeep J. Joseph1, Kelly R. Robbins1, Wensheng Zhang1 and Romdhane Rekaya1,2,3
1Rhodes Centre for Animal and Dairy Science, 2Institute of Bioinformatics, 3Department of Statistics, University of Georgia, Athens, GA-30605, USA.
Abstract
Multi-class cancer classification based on microarray data is described. A generalized output-coding scheme based on One Versus One (OVO), combined with a Latent Variable Model (LVM) as the binary classifier, is used. Results from the proposed OVO output-coding strategy are compared with those obtained from the generalized One Versus All (OVA) method, and the efficiency of both strategies for multi-class tumor classification is studied. The comparison was carried out on two microarray gene expression datasets: the Global Cancer Map (GCM) dataset and a brain cancer (BC) dataset. Primary feature selection was based on fold change and penalized t-statistics, and evaluation was conducted with varying numbers of features. The OVO coding strategy worked quite well with the BC data, while OVO and OVA gave similar results for the GCM data. The choice of output-coding method for combining binary classifiers in multi-class tumor classification depends on the number of tumor types considered, the discrepancies between the tumor samples used for training, and the heterogeneity of expression within the cancer subtypes used as training data.
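The difference between the two output-coding strategies can be illustrated with a minimal sketch. It assumes a simple nearest-centroid rule as a stand-in binary classifier (this is not the latent variable model used in the study), decodes OVA by taking the largest score, and decodes OVO by majority vote. These decoding rules, and all class and function names below, are hypothetical choices made for illustration only.

```python
from itertools import combinations

import numpy as np


class CentroidBinary:
    """Stand-in binary classifier (NOT the paper's latent variable model)."""

    def fit(self, X, y):
        # y holds 0/1 labels; store one centroid per label.
        self.mu0 = X[y == 0].mean(axis=0)
        self.mu1 = X[y == 1].mean(axis=0)
        return self

    def score(self, X):
        # Positive score: sample is closer to the label-1 centroid.
        d0 = np.linalg.norm(X - self.mu0, axis=1)
        d1 = np.linalg.norm(X - self.mu1, axis=1)
        return d0 - d1


def fit_ova(X, y, classes):
    """One Versus All: K binary problems, each class against the rest."""
    return {c: CentroidBinary().fit(X, (y == c).astype(int)) for c in classes}


def predict_ova(models, X, classes):
    # Assign each sample to the class whose classifier scores highest.
    scores = np.column_stack([models[c].score(X) for c in classes])
    return np.asarray(classes)[scores.argmax(axis=1)]


def fit_ovo(X, y, classes):
    """One Versus One: K(K-1)/2 binary problems, one per pair of classes."""
    models = {}
    for a, b in combinations(classes, 2):
        mask = (y == a) | (y == b)  # train only on samples of the two classes
        models[(a, b)] = CentroidBinary().fit(X[mask], (y[mask] == b).astype(int))
    return models


def predict_ovo(models, X, classes):
    # Each pairwise classifier casts one vote per sample; most votes wins.
    index = {c: i for i, c in enumerate(classes)}
    votes = np.zeros((X.shape[0], len(classes)), dtype=int)
    for (a, b), model in models.items():
        b_wins = model.score(X) > 0
        votes[b_wins, index[b]] += 1
        votes[~b_wins, index[a]] += 1
    return np.asarray(classes)[votes.argmax(axis=1)]
```

With K tumor types, OVA trains K classifiers on all samples, whereas OVO trains K(K-1)/2 classifiers, each on the samples of only two types. This is one reason the number of tumor types and the within-class heterogeneity can affect which strategy performs better.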