Publication Date: 14 Jan 2015
Type: Review
Journal: Cancer Informatics
Citation: Cancer Informatics 2014:Suppl. 7 55-65
doi: 10.4137/CIN.S16350
Variable selection methods play an important role in high-dimensional statistical modeling and analysis. Computational cost and estimation accuracy are the two main concerns for statistical inference from ultrahigh-dimensional data. In particular, genome-wide association studies (GWAS), which aim to identify single nucleotide polymorphisms (SNPs) associated with a disease of interest, produce ultrahigh-dimensional data. Numerous methods have been proposed to handle GWAS data. Most adopt a two-stage approach: pre-screening for dimension reduction, followed by variable selection to identify causal SNPs. The pre-screening step selects SNPs by their P-values or by the absolute values of their regression coefficients in single-SNP analysis. Penalized regressions, such as ridge, lasso, adaptive lasso, and elastic-net regression, are commonly used for the variable selection step. In this paper, we investigate which combination of pre-screening method and penalized regression performs best on a quantitative phenotype using two real GWAS datasets.
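The two-stage approach described above can be illustrated with a minimal sketch. This is not the implementation evaluated in the paper: the genotypes are simulated, the function names (`prescreen`, `lasso_cd`, `two_stage_demo`) are hypothetical, and the sample size, number of SNPs retained, effect sizes, and penalty `lam` are arbitrary assumptions chosen for illustration. Stage 1 ranks SNPs by the absolute value of their single-SNP regression coefficient; stage 2 fits a lasso to the retained SNPs by cyclic coordinate descent.

```python
import numpy as np

def standardize(X):
    # Center and scale each column; assumes no monomorphic (constant) SNP columns.
    return (X - X.mean(axis=0)) / X.std(axis=0)

def prescreen(X, y, keep):
    # Stage 1: single-SNP analysis. With standardized columns, the marginal
    # least-squares slope for SNP j is x_j' (y - mean(y)) / n.
    slopes = standardize(X).T @ (y - y.mean()) / len(y)
    top = np.argsort(-np.abs(slopes))[:keep]  # largest |slope| first
    return np.sort(top)                       # return column indices in order

def lasso_cd(X, y, lam, n_iter=200):
    # Stage 2: lasso via cyclic coordinate descent, minimizing
    # (1 / 2n) * ||y - Xb||^2 + lam * ||b||_1 on standardized columns.
    n, p = X.shape
    Xs = standardize(X)
    b = np.zeros(p)
    r = y - y.mean()                          # residual for b = 0
    for _ in range(n_iter):
        for j in range(p):
            r += Xs[:, j] * b[j]              # remove SNP j's current contribution
            rho = Xs[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0)  # soft-threshold
            r -= Xs[:, j] * b[j]              # add the updated contribution back
    return b

def two_stage_demo(seed=0):
    # Simulated data (an assumption, not the paper's GWAS datasets):
    # additive genotype coding 0/1/2 with minor allele frequency 0.3.
    rng = np.random.default_rng(seed)
    n, p, keep = 200, 2000, 50
    X = rng.binomial(2, 0.3, size=(n, p)).astype(float)
    causal = [10, 500, 1500]                  # hypothetical causal SNP indices
    y = X[:, causal] @ np.array([1.5, -1.5, 1.5]) + rng.normal(size=n)
    kept = prescreen(X, y, keep)
    coef = lasso_cd(X[:, kept], y, lam=0.3)
    selected = kept[coef != 0]                # SNPs with nonzero lasso coefficient
    return causal, kept, selected
```

Calling `two_stage_demo()` returns the simulated causal indices, the SNPs that survive pre-screening, and the subset given nonzero lasso coefficients. Swapping the stage 1 ranking for single-SNP P-values, or the stage 2 penalty for ridge, adaptive lasso, or elastic-net, gives the other combinations the abstract mentions.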