Expression a La Bimode
James C. Willey
Editor-in-Chief, Cancer Informatics.
The recently published article by Wang, commented on by Ertel in this journal, describes the development of a method, named the Bimodality Index, that objectively identifies and ranks meaningful and reliable bimodal patterns in large-scale gene expression datasets. This is an important step because genes with a bimodal distribution may be prime candidates for development as molecular diagnostic tests that distinguish clinically important groups. In the introduction, the authors make the point that a bimodal expression pattern may be observed when two distinct subgroups of samples are measured as one group, with each mode representing the mean expression of a gene in one of the subgroups. Recently, it was reported that bimodal gene expression may represent another important phenomenon.
Specifically, two clinically important groups may be distinguished on the basis of a gene being unimodal in one group and bimodal in the other. For example, certain key antioxidant, DNA repair, and transcription factor genes each display a unimodal pattern in non-lung cancer subjects but a bimodal pattern in lung cancer subjects. Importantly, when two groups are distinguished on the basis of a unimodal vs bimodal distribution of an analyte (in this case, gene expression value), biomarker development requires the use of two cut-points, one on each side of the unimodal distribution. It is timely, then, that an improved method for separating groups on the basis of two cut-points was described recently.
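The two cut-point idea can be illustrated with a minimal sketch. It is not the improved method referred to above; it simply assumes, for illustration, that the cut-points are placed a fixed number of standard deviations on each side of the mean of the unimodal (reference) group. The function names, the multiplier `k`, and the expression values are all hypothetical.

```python
from statistics import mean, stdev


def cut_points_from_reference(reference_values, k=2.0):
    """Return (lower, upper) cut-points around a unimodal reference group.

    Assumption for illustration only: the cut-points sit k sample standard
    deviations below and above the reference mean.
    """
    centre = mean(reference_values)
    spread = stdev(reference_values)
    return centre - k * spread, centre + k * spread


def outside_cut_points(value, lower, upper):
    """True when a value falls below the lower or above the upper cut-point."""
    return value < lower or value > upper


# Hypothetical expression values for one gene in the unimodal group.
reference = [4.8, 5.0, 5.2, 5.1, 4.9]
lower, upper = cut_points_from_reference(reference)

# Hypothetical test values: one low, one central, one high.
flags = [outside_cut_points(v, lower, upper) for v in (3.0, 5.0, 7.0)]
```

A single cut-point would flag only one tail. With two cut-points, values from both the low mode and the high mode of the bimodal group fall outside the interval, while values near the centre of the unimodal distribution do not.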
Thus, the analytical method for discovering genes with a bimodal expression distribution described by Wang, together with recent statistical methods that enable separation of groups on the basis of two cut-points, is likely to accelerate both diagnostic test discovery and mechanistic understanding of disease and disease risk. It is important to recognize that transcript abundance methods vary in their linear dynamic range, and that the ability to identify a bimodal distribution of gene expression will depend to a significant degree on the linear dynamic range of the method used.
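To make the ranking criterion concrete, the sketch below computes a bimodality score from the parameters of a two-component normal mixture. It assumes the index takes the form BI = sqrt(pi(1 - pi)) |mu1 - mu2| / sigma, where mu1 and mu2 are the component means, sigma is a common standard deviation, and pi is the proportion of samples in one component. The formula is not stated in this editorial, so it should be checked against the original article. Fitting the mixture is omitted; the parameter values are hypothetical.

```python
from math import sqrt


def bimodality_index(mu1, mu2, sigma, pi):
    """Score bimodality from already-estimated mixture parameters.

    Assumed form: sqrt(pi * (1 - pi)) * |mu1 - mu2| / sigma, for a
    two-component normal mixture with a shared standard deviation.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if not 0 <= pi <= 1:
        raise ValueError("pi must lie between 0 and 1")
    delta = abs(mu1 - mu2) / sigma  # standardized distance between the modes
    return sqrt(pi * (1 - pi)) * delta


# Hypothetical genes with the same mode separation but different group balance.
balanced = bimodality_index(mu1=0.0, mu2=3.0, sigma=1.0, pi=0.5)
skewed = bimodality_index(mu1=0.0, mu2=3.0, sigma=1.0, pi=0.1)
```

Under this form, a gene scores higher when its modes are well separated relative to the noise and when the two subgroups are of similar size. A method with a narrow linear dynamic range compresses the separation between modes and so lowers the score.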