Pingzhao Hu1, Celia M.T. Greenwood1,2 and Joseph Beyene2,3
1Program in Genetics and Genomic Biology, The Hospital for Sick Children Research Institute, 15–706 TMDT, 101 College Street, Toronto, ON, M5G 1L7, Canada. 2Department of Public Health Sciences, University of Toronto, Health Sciences Building, 155 College St, Toronto, ON, M5T 3M7, Canada. 3Child Health Evaluative Sciences, The Hospital for Sick Children Research Institute, 555 University Ave, Toronto, ON, M5G 1X8, Canada.
Abstract
Background: Microarray technology has previously been used to identify genes that are differentially expressed between tumour and normal samples in a single study, as well as in syntheses involving multiple studies. When integrating results from several Affymetrix microarray datasets, previous studies summarized probeset-level data, which may lead to a loss of information available at the probe level. In this paper, we present an approach for integrating results across studies while taking probe-level data into account. Additionally, we follow a new direction in the analysis of microarray expression data, namely a focus on the variation of expression phenotypes in predefined gene sets, such as pathways. This targeted approach can help reveal information that is not easily visible from changes in individual genes.
Results: We used a recently developed method to integrate Affymetrix expression data across studies. The idea is based on a probe-level test statistic developed to detect differentially expressed genes in individual studies. We incorporated this test statistic into a classic random-effects model for integrating data across studies. Subsequently, we used a gene set enrichment test to evaluate the significance of biological pathways enriched among the differentially expressed genes identified from the integrative analysis. We compared the statistical and biological significance of the prognostic gene expression signatures and pathways identified in the probe-level model (PLM) with those in the probeset-level model (PSLM). Our integrative analysis of Affymetrix microarray data from 110 prostate cancer samples obtained from three studies reveals thousands of genes significantly correlated with tumour cell differentiation. The bioinformatics analysis, which maps these genes to the publicly available KEGG database, reveals evidence that tumour cell differentiation is significantly associated with many biological pathways. In particular, we observed that integrating information from the insulin signalling pathway into our prediction model gave better prediction of prostate cancer.
Conclusions: Our data integration methodology provides an efficient way to identify biologically sound and statistically significant pathways from gene expression data. The significant gene expression phenotypes identified in our study have the potential to characterize complex genetic alterations in prostate cancer.
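The two statistical steps named in the Results (pooling per-study effects with a random-effects model, then testing pathways for enrichment) can be illustrated with a minimal sketch. It rests on several assumptions not stated in the abstract: the DerSimonian-Laird estimator stands in for the "classic random-effects model", an upper-tail hypergeometric test stands in for the "gene set enrichment test", and the function names `random_effects_pool` and `enrichment_pvalue` are hypothetical. The probe-level test statistic that supplies the per-study effect sizes is not reproduced here.

```python
import math


def random_effects_pool(effects, variances):
    """Pool per-study effect sizes for one gene (DerSimonian-Laird, assumed).

    effects:   one effect size per study
    variances: the within-study variance of each effect size
    Returns (pooled effect, its standard error, between-study variance tau^2).
    """
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q measures heterogeneity between studies.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Method-of-moments estimate of tau^2, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


def enrichment_pvalue(n_genes, n_in_set, n_selected, n_overlap):
    """Upper-tail hypergeometric p-value, P(X >= n_overlap) (assumed test).

    n_genes:    genes measured in total
    n_in_set:   genes annotated to the pathway
    n_selected: genes called differentially expressed
    n_overlap:  selected genes that fall in the pathway
    """
    total = math.comb(n_genes, n_selected)
    upper = min(n_in_set, n_selected)
    hits = sum(
        math.comb(n_in_set, x) * math.comb(n_genes - n_in_set, n_selected - x)
        for x in range(n_overlap, upper + 1)
    )
    return hits / total
```

In a workflow of this kind, `random_effects_pool` would be called once per gene with that gene's effect sizes from the three studies, and `enrichment_pvalue` once per KEGG pathway using the genes selected from the pooled results. Multiple-testing correction would still be needed and is omitted from the sketch.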
Copyright © 2010 Libertas Academica Ltd (except open access articles and accompanying metadata and supplementary files.)