Tue Jan 01 2013

A powerful and efficient set test for genetic markers that handles confounders

Jennifer Listgarten et al. Bioinformatics. 2013.

Abstract

Motivation: Approaches for testing sets of variants, such as a set of rare or common variants within a gene or pathway, for association with complex traits are important. In particular, set tests allow for aggregation of weak signal within a set, can capture interplay among variants and reduce the burden of multiple hypothesis testing. Until now, these approaches did not address confounding by family relatedness and population structure, a problem that is becoming more important as larger datasets are used to increase power.

Results: We introduce a new approach for set tests that handles confounders. Our model is based on the linear mixed model and uses two random effects: one to capture the set association signal and one to capture confounders. We also introduce a computational speedup for models with two random effects that makes this approach feasible even for extremely large cohorts. Using this model with both the likelihood ratio test and the score test, we find that the likelihood ratio test yields more power while controlling type I error. Application of our approach to the richly structured Genetic Analysis Workshop 14 data demonstrates that our method successfully corrects for population structure and family relatedness. Application to a 15,000-individual Crohn's disease case-control cohort demonstrates that it additionally recovers genes not recoverable by univariate analysis.
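
To make the two-random-effects idea concrete, the following is a minimal sketch of a likelihood ratio set test, not the authors' implementation and without their computational speedup. It assumes linear kernels built from genotype matrices (K_c from background markers for confounders, K_s from the markers in the tested set) and the covariance sigma2 * [(1 - delta) * ((1 - tau) * K_c + tau * K_s) + delta * I], where the null hypothesis is tau = 0. The function names, the parametrization, and the coarse grid search are all illustrative assumptions.

```python
import numpy as np


def profiled_nll(y, X, V0):
    """Negative log-likelihood of y ~ N(X beta, sigma2 * V0).

    beta and sigma2 are profiled out in closed form (maximum likelihood).
    """
    n = y.shape[0]
    L = np.linalg.cholesky(V0)           # V0 = L L^T
    Xw = np.linalg.solve(L, X)           # whitened covariates
    yw = np.linalg.solve(L, y)           # whitened phenotype
    beta = np.linalg.lstsq(Xw, yw, rcond=None)[0]
    resid = yw - Xw @ beta
    sigma2 = resid @ resid / n           # ML estimate of the overall scale
    logdet = 2.0 * np.log(np.diag(L)).sum()
    return 0.5 * (n * np.log(2.0 * np.pi * sigma2) + logdet + n)


def set_lrt(y, X, K_c, K_s, deltas=None, taus=None):
    """Likelihood ratio statistic for H0: tau = 0 (no set effect).

    Hypothetical helper: optimises delta and tau on a coarse grid.
    """
    n = y.shape[0]
    deltas = np.linspace(0.05, 0.95, 19) if deltas is None else deltas
    taus = np.linspace(0.0, 0.9, 10) if taus is None else taus
    eye = np.eye(n)

    def best_nll(tau_grid):
        return min(
            profiled_nll(y, X, (1 - d) * ((1 - t) * K_c + t * K_s) + d * eye)
            for d in deltas
            for t in tau_grid
        )

    nll_null = best_nll([0.0])   # confounder random effect only
    nll_alt = best_nll(taus)     # confounder + set random effects
    return 2.0 * (nll_null - nll_alt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 60
    G_c = rng.standard_normal((n, 200))   # simulated background markers
    G_s = rng.standard_normal((n, 5))     # simulated markers in the set
    K_c = G_c @ G_c.T / G_c.shape[1]
    K_s = G_s @ G_s.T / G_s.shape[1]
    X = np.ones((n, 1))                   # intercept only
    y = G_s @ rng.standard_normal(5) + rng.standard_normal(n)
    print("LRT statistic:", set_lrt(y, X, K_c, K_s))
```

Because the tested variance component lies on the boundary of the parameter space under the null, the statistic does not follow a plain chi-square distribution; calibrating its null distribution is outside the scope of this sketch. The grid search is used only to keep the example dependency-light and is far slower than a method intended for large cohorts.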

Availability: A Python-based library implementing our approach is available at http://mscompbio.codeplex.com.
