An Effective Tri-Clustering Algorithm Combining Expression Data with Gene Regulation Information
Ao Li and David Tuck
Department of Pathology, Yale University School of Medicine, New Haven, Connecticut 06510, U.S.A.
Abstract
Motivation: Bi-clustering algorithms aim to identify sets of genes that share similar expression patterns across a subset of conditions. However, because only gene expression data are used, direct interpretation or prediction of gene regulatory mechanisms may be difficult. Information about gene regulators may also be available, most commonly about which transcription factors may bind to the promoter region of a gene and thus control its expression level. A method that integrates gene expression and gene regulation information is therefore desirable for clustering and analysis.
Methods: By incorporating gene regulatory information with gene expression data, we define regulated expression values (REV) as indicators of how a gene is regulated by a specific factor. We extend existing bi-clustering methods to a three-dimensional data space by developing a heuristic TRI-Clustering algorithm. An additional approach, the Automatic Boundary Searching (ABS) algorithm, is introduced to determine the boundary threshold automatically.
Results: Experiments incorporating ChIP-chip data, which represent transcription factor-gene interactions, show that the algorithms are efficient and robust for detecting tri-clusters. Detailed analysis of the tri-cluster extracted from yeast sporulation REV data shows that genes in this cluster exhibit significant differences during the middle and late stages. The implicated regulatory network was then reconstructed for further study of defined regulatory mechanisms. Topological and statistical analysis of this network provides evidence of significant changes in transcription factor (TF) activity during the different stages of yeast sporulation, and suggests that this approach might be a general way to study regulatory networks undergoing transformations.
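To make the three-dimensional data space concrete, the following is a minimal sketch of how a gene x condition x factor array could be assembled. The abstract does not state the REV formula, so the elementwise product used here is purely an illustrative assumption, and the function name `build_rev_tensor` and the toy values are hypothetical rather than taken from the paper.

```python
import numpy as np


def build_rev_tensor(expression, binding):
    """Combine expression and regulation data into a 3-D array.

    expression: genes x conditions matrix of expression values.
    binding:    genes x factors matrix (e.g. 0/1 ChIP-chip calls).
    Returns a genes x conditions x factors array.

    ASSUMPTION: the value for (gene, condition, factor) is taken to be
    expression * binding. This is an illustrative stand-in, not the
    REV definition used by the authors.
    """
    expression = np.asarray(expression, dtype=float)
    binding = np.asarray(binding, dtype=float)
    if expression.shape[0] != binding.shape[0]:
        raise ValueError("expression and binding must have the same number of genes")
    # Broadcast to genes x conditions x factors.
    return expression[:, :, np.newaxis] * binding[:, np.newaxis, :]


# Toy data: 2 genes, 3 conditions, 2 transcription factors (made-up values).
expression = np.array([[2.0, -1.0, 0.5],
                       [1.5, -0.5, 0.0]])
binding = np.array([[1, 0],
                    [1, 1]])

rev = build_rev_tensor(expression, binding)
print(rev.shape)
```

Under this assumption, a tri-cluster would correspond to a sub-block of `rev` selected by a subset of genes, a subset of conditions, and a subset of factors, which is the natural three-dimensional analogue of a bi-cluster.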