Cancer Informatics


Prediction of Breast Cancer Metastasis by Gene Expression Profiles: A Comparison of Metagenes and Single Genes


Publication Date: 10 Dec 2012

Type: Original Research

Journal: Cancer Informatics

Citation: Cancer Informatics 2012:11 193-217

doi: 10.4137/CIN.S10375

Abstract

Background: The widespread adoption of microarray technology in cancer research has led to the development of predictive and prognostic gene expression profiles. However, the diversity of microarray platforms has made full validation of such profiles and their associated gene lists across studies difficult, and classification accuracies have rarely been validated in multiple independent datasets. Frequently, although the individual genes in such lists do not match between studies, genes with the same function appear across lists. Development of such lists does not exploit the fact that genes can be grouped into metagenes (MGs) on the basis of shared characteristics such as pathway membership, regulation, or genomic location. Such MGs might be used as features in building a predictive model applicable for classifying independent data. It is therefore of interest to systematically compare independent validation of gene lists and classifiers built from metagene (MG) versus single gene (SG) features.

Methods: In this study we compared the performance of feature sets and classifiers based on either metagenes or single genes, using random forest and two support vector machines for classifier building. Performance within the same dataset, feature-set validation performance, and validation performance of entire classifiers in strictly independent datasets were assessed by 10 times repeated 10-fold cross-validation, leave-one-out cross-validation, and one-fold validation, respectively. To test the significance of performance differences between MG and SG features/classifiers, we used a repeated down-sampled binomial test approach.
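As a rough illustration of the evaluation scheme described above, the following sketch runs 10 times repeated 10-fold cross-validation comparing a random forest and a support vector machine with scikit-learn. The synthetic expression matrix, outcome labels, and all parameter choices are placeholders for illustration, not the study's actual data or settings:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for a gene-expression feature matrix:
# 100 patients x 20 features (e.g. metagenes or single genes).
rng = np.random.RandomState(0)
X = rng.normal(size=(100, 20))
# Synthetic binary metastasis outcome, loosely tied to the first feature.
y = (X[:, 0] + rng.normal(size=100) > 0).astype(int)

# 10 times repeated 10-fold cross-validation, as in the Methods.
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=10, random_state=0)

for name, clf in [
    ("random forest", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("SVM (RBF kernel)", SVC(kernel="rbf")),
]:
    scores = cross_val_score(clf, X, y, cv=cv)  # 100 accuracy values per model
    print(f"{name}: mean accuracy {scores.mean():.3f} (+/- {scores.std():.3f})")
```

Each model is scored on the same 100 train/test splits, so the per-split accuracies can later be compared pairwise.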

Results: MG and SG feature sets are transferable and perform well for training and testing metastasis-outcome prediction in strictly independent datasets, both between different and within similar microarray platforms, whereas entire classifiers performed more poorly when validated in strictly independent datasets. MG and SG feature sets performed equally well in classifying independent data. Furthermore, SG classifiers significantly outperformed MG classifiers when validation was conducted between datasets using similar platforms, while no significant performance difference was found when validation was performed between different platforms.
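The significance statements above rest on the binomial test mentioned in the Methods. A minimal sketch of the underlying idea, with invented counts and omitting the repeated down-sampling step: restrict attention to the samples on which the two classifiers disagree, and test whether one classifier is correct on them more often than chance (p = 0.5) would allow:

```python
from math import comb

# Hypothetical counts for illustration: of 40 samples on which the SG and MG
# classifiers disagree, suppose the SG classifier is the correct one 28 times.
n_discordant = 40
k_sg_correct = 28

# One-sided exact binomial tail probability P(X >= k) under p = 0.5.
p_value = sum(
    comb(n_discordant, i) for i in range(k_sg_correct, n_discordant + 1)
) / 2 ** n_discordant
print(f"one-sided binomial p-value: {p_value:.4f}")
```

A small p-value indicates that the winning classifier's advantage on the discordant samples is unlikely to be due to chance alone.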

Conclusion: Prediction of metastasis outcome in lymph node–negative patients by MG and SG classifiers showed that SG classifiers performed significantly better than MG classifiers when validated in independent data based on the same microarray platform as that used to develop the classifier. However, MG and SG classifiers performed similarly when classifier validation was conducted in independent data based on a different microarray platform. The latter was also true when only sets of MG and SG features were validated in independent datasets, both within and between similar and different platforms.

