Unsupervised Meta-Analysis on Diverse Gene Expression Datasets Allows Insight into Gene Function and Regulation
Julia C. Engelmann¹, Roland Schwarz¹, Steffen Blenk, Torben Friedrich, Philipp N. Seibel, Thomas Dandekar and Tobias Müller
Department of Bioinformatics, Biocenter, University of Würzburg, Am Hubland, D-97074 Würzburg, Germany. ¹Both authors contributed equally.
Abstract
In recent years, microarray databases have grown rapidly in size. While they offer a wealth of data, integrating data from different studies remains challenging. Here we propose an unsupervised approach to large-scale meta-analysis of Arabidopsis thaliana whole-genome expression datasets to gain additional insight into the function and regulation of genes. Applying kernel principal component analysis and hierarchical clustering, we found three major groups of experimental contrasts sharing a common biological trait. Genes associated with two of these clusters are known to play an important role in indole-3-acetic acid (IAA)-mediated plant growth and development or in pathogen defense. Novel functions could be assigned to genes, including a cluster of serine/threonine kinases that carry two uncharacterized domains (DUF26) in their receptor part implicated in host defense. With the approach shown here, hidden interrelations between genes regulated under different conditions can be unraveled.