pubmed.ncbi.nlm.nih.gov

Translating non-coding genetic associations into a better understanding of immune-mediated disease - PubMed


Review

Dis Model Mech. 2023 Mar 1;16(3):dmm049790.

doi: 10.1242/dmm.049790. Epub 2023 Mar 7.

Christina T Stankey et al. Dis Model Mech. 2023.

Abstract

Genome-wide association studies have identified hundreds of genetic loci that are associated with immune-mediated diseases. Most disease-associated variants are non-coding, and a large proportion of these variants lie within enhancers. As a result, there is a pressing need to understand how common genetic variation might affect enhancer function and thereby contribute to immune-mediated (and other) diseases. In this Review, we first describe statistical and experimental methods to identify causal genetic variants that modulate gene expression, including statistical fine-mapping and massively parallel reporter assays. We then discuss approaches to characterise the mechanisms by which these variants modulate immune function, such as clustered regularly interspaced short palindromic repeats (CRISPR)-based screens. We highlight examples of studies that, by elucidating the effects of disease variants within enhancers, have provided important insights into immune function and uncovered key pathways of disease.

Keywords: Genetics; Genomics; Immune-mediated disease; Immunology.

© 2023. Published by The Company of Biologists Ltd.

Conflict of interest statement

Competing interests: The authors declare no competing or financial interests.

Figures

Fig. 1.

Haplotype blocks and linkage disequilibrium. Haplotype blocks (blue rectangles, top; 1, 2 and 3) are regions on a chromosome (top) that are inherited as a unit, such that variants or single-nucleotide polymorphisms (SNPs; a, b, c, d, e and f) within the blocks tend to be inherited together. This gives rise to linkage disequilibrium (LD), where inheritance of common genetic variants within a haplotype block is highly correlated (red diamonds; high LD/co-inheritance, e.g. Block 1 vs 1), whereas inheritance of variants in different haplotype blocks is weakly correlated (light-red/light-blue diamonds; weak LD/co-inheritance, e.g. Block 1 vs 2) or not correlated (blue diamonds; low LD/co-inheritance, e.g. Block 1 vs 3).
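The correlation described in this caption is commonly summarised as r², the squared correlation between alleles at two SNPs. The following is a minimal sketch of that calculation, assuming phased haplotypes coded as 0/1 alleles; the function name and the toy haplotypes are hypothetical and are not taken from the Review.

```python
def ld_r2(hap_a, hap_b):
    """Return r^2 between two biallelic SNPs from phased haplotypes (0/1 alleles)."""
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("haplotype vectors must be non-empty and of equal length")
    n = len(hap_a)
    p_a = sum(hap_a) / n                      # frequency of allele 1 at SNP a
    p_b = sum(hap_b) / n                      # frequency of allele 1 at SNP b
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d * d / denom


# Hypothetical toy data: eight haplotypes, one allele per haplotype per SNP.
snp_a = [1, 1, 1, 1, 0, 0, 0, 0]
snp_b = [1, 1, 1, 1, 0, 0, 0, 0]   # always co-inherited with snp_a
snp_c = [1, 1, 1, 0, 0, 0, 0, 1]   # partially co-inherited with snp_a
snp_e = [1, 0, 1, 0, 1, 0, 1, 0]   # inherited independently of snp_a

for name, other in [("b", snp_b), ("c", snp_c), ("e", snp_e)]:
    print(f"r2(a, {name}) = {ld_r2(snp_a, other):.2f}")
```

In this toy example, the pair that is always co-inherited behaves like two SNPs within one haplotype block, and the independent pair behaves like SNPs in unlinked blocks.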

Fig. 2.

Statistical methods to identify putative causal variants. (A) First, genome-wide association studies (GWAS) identify loci associated with a disease. Single-nucleotide polymorphisms (SNPs) within these loci that reach a corrected P-value threshold (dashed line) are associated with the disease. An association signal (for example, on chromosome 13) can be refined through various methods to identify putative causal SNPs, as shown in B. (B) Statistical fine-mapping (left) can be used to refine disease associations. This technique typically uses Bayesian methods and data from SNP microarrays and whole-genome sequencing to evaluate the probability that each variant is causal given the haplotype structure across the locus. SNPs (dots) are coloured according to the r² value, which denotes linkage disequilibrium (LD) with the lead SNP at the locus (r² values closer to 1 indicate a higher LD). Corrected P-values are adjusted for multiple testing. Quantitative trait loci (QTL; middle) are loci in which genetic variation is associated with a cellular trait. In the plots, the corrected P-value indicates the association of SNPs with the trait of interest, with higher associations in disease than in non-disease contexts. Chromatin annotations (right) denote, for example, chromatin accessibility or histone marks that are indicative of enhancer activity. In a disease context, higher corrected P-values indicate higher chromatin accessibility.
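One simple form of Bayesian fine-mapping can be sketched as follows. The sketch assumes exactly one causal variant at the locus, equal prior probability for every SNP, and Wakefield-style approximate Bayes factors computed from GWAS effect sizes and standard errors. The prior variance of 0.04, the SNP names and the summary statistics are illustrative assumptions, and this is not presented as the method of any particular study discussed in the Review.

```python
import math


def log_abf(beta, se, prior_var=0.04):
    """Log approximate Bayes factor (association vs. no association) for one SNP."""
    v = se * se                                # sampling variance of the estimate
    z2 = (beta / se) ** 2                      # squared z-score
    shrink = prior_var / (v + prior_var)
    return 0.5 * math.log(1.0 - shrink) + 0.5 * z2 * shrink


def posterior_probabilities(stats, prior_var=0.04):
    """Posterior probability that each SNP is the single causal variant."""
    log_bfs = {snp: log_abf(b, s, prior_var) for snp, (b, s) in stats.items()}
    top = max(log_bfs.values())                # subtract the max for stability
    weights = {snp: math.exp(value - top) for snp, value in log_bfs.items()}
    total = sum(weights.values())
    return {snp: w / total for snp, w in weights.items()}


def credible_set(pips, coverage=0.95):
    """Smallest set of top-ranked SNPs whose probabilities sum to >= coverage."""
    chosen, cumulative = [], 0.0
    for snp, pip in sorted(pips.items(), key=lambda kv: kv[1], reverse=True):
        chosen.append(snp)
        cumulative += pip
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical summary statistics: SNP -> (effect size, standard error).
locus = {
    "snp_a": (0.30, 0.04),
    "snp_b": (0.28, 0.04),
    "snp_c": (0.10, 0.04),
    "snp_d": (0.02, 0.04),
}
pips = posterior_probabilities(locus)
for snp in credible_set(pips):
    print(snp, round(pips[snp], 3))
```

Methods that allow several causal variants per locus also need the LD matrix across the locus; the single-variant assumption is used here only to keep the sketch short.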

Fig. 3.

Experimental methods to identify variants that modulate gene expression. (A,B) Massively parallel reporter assay (MPRA; A) and self-transcribing active regulatory region sequencing (STARR-seq; B) are used to directly assess the transcriptional effects of putative regulatory elements and/or putative causal variants. A library of barcoded putative enhancer sequences (MPRA) or genomic fragments (STARR-seq) is cloned into empty vectors (circles). A promoter (grey) and enhanced green fluorescent protein (eGFP; green arrow) are also inserted. The resulting plasmid is transfected into cells, RNA is extracted, and mRNA barcode counts (MPRA) or mRNA genomic fragment counts (STARR-seq) are obtained by high-throughput sequencing. mRNA counts are then normalised to their corresponding DNA counts within the transfected plasmid pool. This provides a measure of expression-modulating activity (mRNA/DNA).
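The mRNA/DNA normalisation described in this caption can be illustrated with a short sketch. It assumes per-barcode count tuples, pools barcodes by element, scales by library size and adds a pseudocount before taking log2. The element names, counts and pseudocount are hypothetical, and published analysis pipelines typically add replicate handling and statistical testing that are omitted here.

```python
import math
from collections import defaultdict


def mpra_activity(barcodes, pseudocount=1.0):
    """Return log2(mRNA/DNA) per element from (element, rna_count, dna_count) tuples."""
    barcodes = list(barcodes)
    rna_total = sum(rna for _, rna, _ in barcodes)   # mRNA library size
    dna_total = sum(dna for _, _, dna in barcodes)   # plasmid DNA library size
    if rna_total == 0 or dna_total == 0:
        raise ValueError("both libraries must contain at least one read")

    rna_by_element = defaultdict(float)
    dna_by_element = defaultdict(float)
    for element, rna, dna in barcodes:               # pool barcodes per element
        rna_by_element[element] += rna
        dna_by_element[element] += dna

    activity = {}
    for element in rna_by_element:
        rna_fraction = (rna_by_element[element] + pseudocount) / rna_total
        dna_fraction = (dna_by_element[element] + pseudocount) / dna_total
        activity[element] = math.log2(rna_fraction / dna_fraction)
    return activity


# Hypothetical counts: two barcodes per allele of one enhancer, plus a control.
counts = [
    ("enh1_ref", 200, 100),
    ("enh1_ref", 180, 90),
    ("enh1_alt", 90, 100),
    ("enh1_alt", 110, 95),
    ("neg_ctrl", 50, 100),
]
activity = mpra_activity(counts)
allelic_skew = activity["enh1_ref"] - activity["enh1_alt"]
print({name: round(value, 2) for name, value in activity.items()})
print("allelic skew (ref - alt):", round(allelic_skew, 2))
```

Comparing the activity of the reference and alternative alleles of the same element, as in the last lines, is one way to flag a variant as expression-modulating.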

Fig. 4.

Methods to map three-dimensional chromatin interactions. Methods to map three-dimensional chromatin interactions are based on chromatin conformation capture, where chromatin (grey line with red and blue interacting segments) is cross-linked. (A) In chromatin interaction analysis with paired-end tag (ChIA-PET), cross-linked chromatin is sonicated to obtain complexes of DNA fragments and proteins. The protein of interest (blue oval) is immunoprecipitated, along with the cross-linked DNA, using an antibody specific for that protein. Linkers are added and the fragments are ligated (grey shape). The cross-links are then removed, the proteins are digested, and the DNA is prepared for sequencing to identify fragments of DNA interacting with the protein of interest. (B) In Hi-C, following chromatin digestion to generate cross-linked DNA fragments, biotin is added at the digested ends of DNA fragments, and the fragments are ligated. The cross-links are reversed, then the fragments are sonicated. Streptavidin beads are used to pull down biotinylated fragments. Biotin is removed, and the DNA is prepared for sequencing to identify all interacting pairs of loci. (C,D) In Capture-C and 4C, chromatin is digested to generate cross-linked DNA fragments, which are self-ligated, and the cross-links are removed. (C) In Capture-C, ligated fragments are sonicated and amplified by PCR. Biotinylated oligonucleotide probes are added to hybridise to DNA sequences of interest, which are pulled down using streptavidin beads and sequenced to identify fragments interacting with these loci. (D) In 4C, ligated fragments are circularised and then amplified using PCR primers (black arrows) for a locus of interest. High-throughput sequencing is then performed to identify all fragments interacting with this locus.
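Downstream of sequencing, Hi-C read pairs are usually summarised as a binned contact matrix. The sketch below shows that step only, under simplifying assumptions: read pairs are already mapped, lie on one chromosome, and are given as coordinate pairs. The function name, region, bin size and coordinates are hypothetical, and normalisation steps used in real analyses are not shown.

```python
def contact_matrix(pairs, region_start, region_end, bin_size):
    """Count ligated read pairs per pair of fixed-size bins (symmetric matrix)."""
    if bin_size <= 0 or region_end <= region_start:
        raise ValueError("need a positive bin size and a non-empty region")
    n_bins = -(-(region_end - region_start) // bin_size)   # ceiling division
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos1, pos2 in pairs:
        inside = (region_start <= pos1 < region_end
                  and region_start <= pos2 < region_end)
        if not inside:
            continue                                       # skip pairs outside the region
        i = (pos1 - region_start) // bin_size
        j = (pos2 - region_start) // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1                              # keep the matrix symmetric
    return matrix


# Hypothetical mapped read pairs (base-pair coordinates on one chromosome).
read_pairs = [
    (1_000, 35_000),
    (2_500, 31_000),
    (12_000, 18_000),
    (5_000, 15_000),
    (50_000, 1_000),   # falls outside the region and is ignored
]
for row in contact_matrix(read_pairs, 0, 40_000, 10_000):
    print(row)
```

A single row of this matrix, taken at the bin containing a locus of interest, gives a viewpoint-style interaction profile of the kind targeted by Capture-C and 4C.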


