pubmed.ncbi.nlm.nih.gov

  • ️Wed Jan 01 2014

Conservation of trans-acting circuitry during mammalian regulatory evolution

Andrew B Stergachis et al. Nature. 2014.

Abstract

The basic body plan and major physiological axes have been highly conserved during mammalian evolution, yet only a small fraction of the human genome sequence appears to be subject to evolutionary constraint. To quantify cis- versus trans-acting contributions to mammalian regulatory evolution, we performed genomic DNase I footprinting of the mouse genome across 25 cell and tissue types, collectively defining ∼8.6 million transcription factor (TF) occupancy sites at nucleotide resolution. Here we show that mouse TF footprints conjointly encode a regulatory lexicon that is ∼95% similar with that derived from human TF footprints. However, only ∼20% of mouse TF footprints have human orthologues. Despite substantial turnover of the cis-regulatory landscape, nearly half of all pairwise regulatory interactions connecting mouse TF genes have been maintained in orthologous human cell types through evolutionary innovation of TF recognition sequences. Furthermore, the higher-level organization of mouse TF-to-TF connections into cellular network architectures is nearly identical with human. Our results indicate that evolutionary selection on mammalian gene regulation is targeted chiefly at the level of trans-regulatory circuitry, enabling and potentiating cis-regulatory plasticity.

Conflict of interest statement

The authors declare no competing financial interests.

Figures

Figure 1. Footprinting the mouse genome and comparison with human footprints.

a, Derivation of 8.6 million differentially occupied DNase I footprints from 25 mouse cell and tissue types. b, Per-nucleotide DNase I cleavage across three gene promoters in both mouse and human cell types; shared TF occupancy sites are indicated by faded boxes. c, Percentage of mouse DNase I footprints with sequence aligning to the human genome but not occupied in any human cell type (grey) versus aligning footprints that are occupied in one or more human cell types (red).

Figure 2. Mouse TF footprints define a conserved cis-regulatory lexicon.

a, Average per-nucleotide DNase I cleavage at occupied TF recognition sites within mouse and human DHSs. b, Of 604 motif models derived de novo from mouse footprints, 355 match curated databases. c, Comparison of 249 novel mouse motif models with models derived from human footprints. d, DNase I footprinting pattern at a novel mouse-selective motif instance. e, Preferential occupancy of 16 out of 22 mouse-selective motifs (red); occupancy of pluripotency-related TFs is shown in blue. f, Average human nucleotide diversity (π) in different classes of human DNase I footprints partitioned by matches to mouse-derived motifs (mean ± 95% confidence interval (CI); bootstrap resampling). NS, not significant.
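Panel f reports a mean with a 95% CI obtained by bootstrap resampling. The sketch below shows one common way to compute such an interval, the percentile bootstrap. The caption does not state which bootstrap variant was used, so the method, the number of resamples, and the input values here are all illustrative assumptions.

```python
import random
import statistics

def bootstrap_mean_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean (assumed variant, not confirmed by the source)."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.fmean(values), lower, upper

# Hypothetical per-footprint diversity values, for illustration only.
pi_values = [0.0008, 0.0011, 0.0009, 0.0013, 0.0010, 0.0007, 0.0012, 0.0009]
mean, ci_low, ci_high = bootstrap_mean_ci(pi_values)
```

Two classes of footprints whose intervals overlap substantially would be candidates for the "NS" label, although the caption does not say which test produced it.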

Figure 3. Evolutionary dynamics of cis-regulatory logic.

a, Schematic for construction of cell-type regulatory networks using TF footprints: TF genes = network nodes; occupied TF motifs = directed network edges. b, TF genes regulated by OTX2 in fetal brain and retina networks. Symbols indicate known roles of target genes in brain versus retina development. c, Clustering of cell/tissue TF regulatory networks using Jaccard distances between regulatory networks. Cell/tissue types are coloured using physioanatomical and/or functional properties. d, Heat map showing network similarity (Jaccard index) between human and mouse cell-type regulatory networks. e, Pairwise similarities (Jaccard index) between the regulatory networks of all human and mouse cell/tissue types.
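Panels c to e compare networks by Jaccard index and Jaccard distance. As a minimal sketch, a regulatory network can be treated as a set of directed (regulator, target) edges, with the Jaccard index being the shared edges divided by all edges seen in either network. The TF names and edges below are placeholders, not data from the paper, and treating two empty networks as identical is a convention assumed here.

```python
def jaccard_index(edges_a, edges_b):
    """Shared directed edges divided by the union of directed edges."""
    a, b = set(edges_a), set(edges_b)
    union = a | b
    if not union:
        return 1.0  # assumed convention: two empty networks are identical
    return len(a & b) / len(union)

def jaccard_distance(edges_a, edges_b):
    """Distance used for clustering: 1 minus the Jaccard index."""
    return 1.0 - jaccard_index(edges_a, edges_b)

# Placeholder networks; each tuple is (regulator TF, target TF).
network_1 = {("TF_A", "TF_B"), ("TF_A", "TF_C"), ("TF_B", "TF_C")}
network_2 = {("TF_A", "TF_B"), ("TF_A", "TF_D"), ("TF_B", "TF_C")}

similarity = jaccard_index(network_1, network_2)   # 2 shared / 4 total
distance = jaccard_distance(network_1, network_2)
```

Because edges are directed, ("TF_A", "TF_B") and ("TF_B", "TF_A") count as different edges.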

Figure 4. Conservation of TF-to-TF regulatory circuitry.

a, Four categories of regulatory interactions identified by comparative analysis of mouse and human TF networks. Functionally conserved connections can be mediated by TF occupancy at orthologous (red) or non-orthologous (blue) binding sites. b, Categorization and overall conservation of TF-to-TF connections between orthologous mouse and human cell types. On average 44% of TF-to-TF edges are conserved (P < 0.001; empirically calculated using shuffled networks).
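The empirical P value in panel b comes from comparing the observed edge conservation with that of shuffled networks. The caption does not describe the shuffling scheme, so the sketch below assumes one simple option: randomly relabelling the nodes of the human network, which keeps its wiring pattern but breaks the correspondence with mouse. The edges are placeholders, not data from the paper.

```python
import random

def conserved_fraction(mouse_edges, human_edges):
    """Fraction of mouse TF-to-TF edges also present in the human network."""
    mouse_edges = set(mouse_edges)
    return len(mouse_edges & set(human_edges)) / len(mouse_edges)

def empirical_p(mouse_edges, human_edges, n_shuffles=1000, seed=0):
    """One-sided empirical P value against node-relabelled human networks (assumed null)."""
    rng = random.Random(seed)
    nodes = sorted({n for edge in human_edges for n in edge})
    observed = conserved_fraction(mouse_edges, human_edges)
    hits = 0
    for _ in range(n_shuffles):
        shuffled = nodes[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(nodes, shuffled))
        null_edges = {(relabel[s], relabel[t]) for s, t in human_edges}
        if conserved_fraction(mouse_edges, null_edges) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_shuffles + 1)

# Placeholder networks; each tuple is (regulator TF, target TF).
mouse = {("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "A")}
human = {("A", "B"), ("A", "C"), ("B", "C"), ("C", "E"), ("E", "A")}
observed, p_value = empirical_p(mouse, human)
```

With 1,000 shuffles the smallest reportable value is about 0.001, so a bound such as P < 0.001 would correspond to no shuffled network matching the observed conservation.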

Figure 5. Conserved organizing principles of mammalian TF regulatory networks.

a, Enrichment of three-node circuits in each mouse (red lines) and human (black lines) TF regulatory network (expanded in Extended Data Fig. 5). b, Left: frequency with which individual three-node circuits are identically maintained between the mouse and human Treg network. Middle: percentage of specific three-node circuits identically maintained between the mouse and human Treg network. Right: enrichment of three-node circuits in a network constructed using edges present in both mouse and human Treg networks. c, d, Frequency with which TFs from six functional classes occupy different positions (driver, first passenger, second passenger) within FFL (c) or RM (d) circuits in different mouse and human cell-type networks (hfBrain and hfHeart refer to human fetal brain and heart, respectively).
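Panels c and d tally which position each TF occupies within a circuit. As an illustration, the sketch below finds feed-forward loops (FFLs) in a directed edge set and counts positions. It assumes the usual FFL definition (X regulates Y, and both regulate Z) and assumes that driver, first passenger and second passenger correspond to X, Y and Z in that order; the caption does not spell out that mapping. It also matches the FFL edge pattern without requiring that no other edges exist among the three nodes.

```python
from collections import Counter
from itertools import permutations

def find_ffls(edges):
    """Return (x, y, z) triples with edges x->y, y->z and x->z."""
    edges = set(edges)
    nodes = sorted({n for edge in edges for n in edge})
    return [
        (x, y, z)
        for x, y, z in permutations(nodes, 3)
        if (x, y) in edges and (y, z) in edges and (x, z) in edges
    ]

def position_counts(ffls):
    """Count how often each TF sits in each FFL position (assumed mapping)."""
    counts = {
        "driver": Counter(),
        "first_passenger": Counter(),
        "second_passenger": Counter(),
    }
    for x, y, z in ffls:
        counts["driver"][x] += 1
        counts["first_passenger"][y] += 1
        counts["second_passenger"][z] += 1
    return counts

# Placeholder network; each tuple is (regulator TF, target TF).
edges = {("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("A", "D")}
ffls = find_ffls(edges)
counts = position_counts(ffls)
```

Enrichment as plotted in panel a would additionally require comparing such counts with those from randomized networks, which is not shown here.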

Figure 6. Hierarchy of evolutionary constraint on cis- versus trans-regulatory features.

Shown are: overall proportion of conserved DNA bases between mouse and human; proportion of orthologous TF footprints (from data shown in Fig. 1c); average proportion of individual conserved TF-to-TF regulatory connections across orthologous mouse and human cell types (from data shown in Fig. 4); and similarity in overall TF regulatory network architecture (from data shown in Figs 2 and 5).

Extended Data Figure 1. Cell-selectivity and reproducible detection of DNase I footprints.

a, Distribution of the number of mouse cell types in which each of the 8.6 million distinct footprinted cis-regulatory elements in mouse is contained within a DNase I footprint. b, For each mouse and human cell type, shown is the percentage of DNase I footprints identified in that cell type that are observed in at least one other mouse or human cell type respectively (data represents median ± 25% and 75% quartiles). c, Red: percentage of mouse DNase I footprints with sequence aligning to the human genome that are occupied in one or more human cell types. Brown: percentage of human DNase I footprints with sequence aligning to the mouse genome that are occupied in one or more mouse cell types.

Extended Data Figure 2. Negligible impact of intrinsic DNase I cleavage biases on delineation of DNase I footprints.

a, Box-and-whisker plot displaying the percentage of DNase I footprints found in each of the mouse and human samples that are potentially better explained by intrinsic DNase I cleavage specificity (box represents mean ± 25% and 75% quartiles and whiskers represent minimum and maximum values across all human and mouse samples, respectively). b, Effects of protein occupancy and sequence context on DNase I cleavage profiles. Top: heat maps of per-nucleotide DNase I cleavages; the ratio of the observed cleavages to expected cleavages computed using empirically-modelled DNase I cleavage bias; and discovered 1% FDR DNase I footprints surrounding Sp1, Ctf1 and Nrf1 recognition sequences in MEL cells. Each heat map pixel row corresponds to an individual motif instance within a DNase I hotspot. Each blue tick mark under the ‘footprint’ column denotes whether (tick) or not (blank) that motif instance overlaps a called FDR 1% DNase I footprint. Bottom: aggregated DNase I cleavage profiles of occupied (that is, within DNase I footprints) Sp1, Ctf1 and Nrf1 recognition sequences in MEL cells shown side-by-side with log2 ratio of observed versus expected (from intrinsic cleavage preferences) DNase I cleavage. Note that in all cases the cleavage profile of occupied elements differs markedly from expectation.

Extended Data Figure 3. DNase I footprints accurately recapitulate ChIP-seq data.

For five different TFs with corresponding ChIP-seq data in MEL cells, displayed are (left) heat maps showing per-nucleotide DNase I cleavage and (right) vertebrate conservation by phyloP for all motif instances of that TF within MEL DNase I hotspots (irrespective of whether they overlap a DNase I footprint), ranked by the local density of DNase I cleavages. The number of motif instances for that TF is indicated to the left of the heat map. Purple ticks indicate the presence of the corresponding TF ChIP-seq peaks at each motif instance. Green ticks indicate the presence of DNase I footprints at each motif instance. Below each graph is indicated the percentage of TF footprints that reside outside of a ChIP-seq verified binding site, as well as the percentage of ChIP-seq peaks that do not contain a DNase I footprint for that TF (indicating indirect TF occupancy). Of note, occupied motifs within DNase I footprints accurately recapitulate sites of direct TF occupancy, as 99% of DNase I footprinted motifs for a given TF overlap a cognate ChIP-seq peak. In contrast, for most TFs the majority of ChIP-seq peaks arise from indirect TF occupancy events (and thus lack DNase I footprinted sequence elements for their cognate TF).

Extended Data Figure 4. Annotation of the de novo mouse motif models.

a, Left: bar chart showing the percentage of the motif models within different experimentally grounded motif databases that match our de novo mouse motif models. Right: bar chart showing the number of novel de novo motif models in mouse that match de novo motif models in human. b, The proportion of mouse-selective motif model DNase I footprints within distal regulatory regions.

Extended Data Figure 5. Conserved organizing principles of the mammalian TF regulatory network.

a, b, Shown is the relative enrichment or depletion of the 13 three-node network motifs in each of the mouse (a) and human (b) regulatory networks. c, Shown is the relative enrichment or depletion of the 13 three-node network motifs in each of the mouse regulatory networks compared with the relative enrichment of the same motifs in the C. elegans neuronal connectivity network.
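The "13 three-node network motifs" are the 13 distinct ways three nodes can be connected by directed edges, once graphs that differ only by relabelling the nodes are treated as the same. The sketch below is a brute-force check of that count, not the authors' pipeline: it enumerates every directed graph on three nodes without self-loops, keeps the connected ones, and groups them by a canonical form.

```python
from itertools import permutations, product

NODES = (0, 1, 2)
# The six possible directed edges among three nodes (no self-loops).
POSSIBLE_EDGES = [(s, t) for s in NODES for t in NODES if s != t]

def is_weakly_connected(edges):
    """With three nodes, edges between at least two distinct node pairs connect all three."""
    return len({frozenset(edge) for edge in edges}) >= 2

def canonical_form(edges):
    """Smallest sorted edge tuple over all node relabellings."""
    return min(
        tuple(sorted((perm[s], perm[t]) for s, t in edges))
        for perm in permutations(NODES)
    )

def three_node_motif_classes():
    classes = set()
    for mask in product((0, 1), repeat=len(POSSIBLE_EDGES)):
        edges = [e for e, keep in zip(POSSIBLE_EDGES, mask) if keep]
        if is_weakly_connected(edges):
            classes.add(canonical_form(edges))
    return classes

motif_classes = three_node_motif_classes()
```

The feed-forward loop is one of these classes; its canonical form here is the edge set (0, 1), (0, 2), (1, 2).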

Extended Data Figure 6. The conservation of individual three-node circuit types.

a, Examples of three-node circuits formed by TFs in both mouse and human regulatory T (Treg) cells. b, For each of eight orthologous mouse and human cell-type pairings shown is the percentage of three-node circuits in the mouse cell type that are maintained as any three-node circuit in the orthologous human cell type. c, For each of seven orthologous mouse and human cell-type pairings shown is: (left) heat map showing the overall propensity of individual three-node circuits in the mouse cell-type regulatory network to form the same or other three-node circuits in the human cell-type regulatory network; (middle) bar plot showing the percentage of specific three-node circuits in the mouse cell-type regulatory network to be maintained as the same three-node circuits in the human cell-type regulatory network; (right) the relative enrichment or depletion of the 13 three-node network motifs in a regulatory network constructed using the subset of edges present in both mouse and human cell-type regulatory networks.

Extended Data Figure 7. TF position propensities and cell selectivity of conserved network motifs.

a, Shown is the propensity of all TFs within the ES cell regulatory network to occupy the different positions within a FFL. FFL positions are defined in panel c. b, Shown is the GO term enrichment of TFs that preferentially occupy position C within FFLs as opposed to TFs that preferentially occupy positions A and B within FFLs. Asterisk indicates a q value less than 0.05. P values and q values calculated using the Gene Ontology enrichment analysis and visualization tool (GOrilla). c, For all instances of FFLs in mouse ES cells, shown is the tissue specificity of each component edge across the other 24 mouse cell types. P values were calculated using a Wilcoxon rank sum test. d, Same as c but for regulating mutual motifs.

Extended Data Figure 8. Polarity of TF genes and regulatory networks during development.

a, Schematic illustrating the definition of and contrasting effector-facing and TF-facing TFs. b, Top: a box-and-whisker plot shows the distribution of the relative log enrichment of TF-facing to effector-facing TFs in mouse ES cells. Bottom: relative target landscape enrichments for individual TFs grouped together based on their functional categories. c, Shown is the GO term enrichment of TFs that preferentially regulate TFs (TF-facing) as opposed to TFs that preferentially regulate effector genes (effector-facing). Asterisk indicates a q value less than 0.05. P values and q values calculated using the Gene Ontology enrichment analysis and visualization tool (GOrilla). d, For each cell type, shown is the average propensity of each TF within the regulatory network to regulate TF genes versus effector genes. Relative enrichment values were calculated such that 0 indicates a cell-type regulatory network that is equally geared towards regulating TF genes and effector genes. Cell types are grouped/coloured according to their developmental origin. P values were calculated using a Wilcoxon rank sum test. e, Same as b but for human iPS cells. For box-and-whisker plots, box represents mean ± 25% and 75% quartiles, whiskers represent minimum and maximum values excluding outliers, and outliers (indicated by open circles) are defined as values more than 1.5 times the interquartile range above the upper quartile or below the lower quartile.
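The outlier rule for the box-and-whisker plots can be written out directly. This is a minimal sketch using the 1.5 × IQR fences described in the caption. The quartile interpolation method is not given in the source, so the use of Python's "inclusive" method is an assumption, and the input values are made up.

```python
import statistics

def box_plot_summary(values):
    """Quartiles, whiskers and outliers using 1.5 x IQR fences."""
    # Quartile method is an assumption; the caption does not specify one.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inliers = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    # Whiskers span the minimum and maximum of the non-outlier values.
    return {
        "q1": q1,
        "q3": q3,
        "whiskers": (min(inliers), max(inliers)),
        "outliers": outliers,
    }

# Made-up values for illustration only.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 30])
```

Here the quartiles are 3 and 7, so the fences sit at -3 and 13, and only the value 30 falls outside them.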

Similar articles

  • Mouse regulatory DNA landscapes reveal global principles of cis-regulatory evolution.

    Vierstra J, Rynes E, Sandstrom R, Zhang M, Canfield T, Hansen RS, Stehling-Sun S, Sabo PJ, Byron R, Humbert R, Thurman RE, Johnson AK, Vong S, Lee K, Bates D, Neri F, Diegel M, Giste E, Haugen E, Dunn D, Wilken MS, Josefowicz S, Samstein R, Chang KH, Eichler EE, De Bruijn M, Reh TA, Skoultchi A, Rudensky A, Orkin SH, Papayannopoulou T, Treuting PM, Selleri L, Kaul R, Groudine M, Bender MA, Stamatoyannopoulos JA. Science. 2014 Nov 21;346(6212):1007-12. doi: 10.1126/science.1246426. PMID: 25411453 Free PMC article.

  • Principles of regulatory information conservation between mouse and human.

    Cheng Y, Ma Z, Kim BH, Wu W, Cayting P, Boyle AP, Sundaram V, Xing X, Dogan N, Li J, Euskirchen G, Lin S, Lin Y, Visel A, Kawli T, Yang X, Patacsil D, Keller CA, Giardine B; mouse ENCODE Consortium; Kundaje A, Wang T, Pennacchio LA, Weng Z, Hardison RC, Snyder MP. Nature. 2014 Nov 20;515(7527):371-375. doi: 10.1038/nature13985. PMID: 25409826 Free PMC article.

  • An expansive human regulatory lexicon encoded in transcription factor footprints.

    Neph S, Vierstra J, Stergachis AB, Reynolds AP, Haugen E, Vernot B, Thurman RE, John S, Sandstrom R, Johnson AK, Maurano MT, Humbert R, Rynes E, Wang H, Vong S, Lee K, Bates D, Diegel M, Roach V, Dunn D, Neri J, Schafer A, Hansen RS, Kutyavin T, Giste E, Weaver M, Canfield T, Sabo P, Zhang M, Balasundaram G, Byron R, MacCoss MJ, Akey JM, Bender MA, Groudine M, Kaul R, Stamatoyannopoulos JA. Nature. 2012 Sep 6;489(7414):83-90. doi: 10.1038/nature11212. PMID: 22955618 Free PMC article.

  • Evolution of transcriptional control in mammals.

    Wilson MD, Odom DT. Curr Opin Genet Dev. 2009 Dec;19(6):579-85. doi: 10.1016/j.gde.2009.10.003. Epub 2009 Nov 11. PMID: 19913406 Review.

  • Deep conservation of cis-regulatory elements in metazoans.

    Maeso I, Irimia M, Tena JJ, Casares F, Gómez-Skarmeta JL. Philos Trans R Soc Lond B Biol Sci. 2013 Nov 11;368(1632):20130020. doi: 10.1098/rstb.2013.0020. Print 2013 Dec 19. PMID: 24218633 Free PMC article. Review.
