Conservation of trans-acting circuitry during mammalian regulatory evolution
Nature. 2014 Nov 20;515(7527):365-70.
doi: 10.1038/nature13972.
Shane Neph 1 , Richard Sandstrom 1 , Eric Haugen 1 , Alex P Reynolds 1 , Miaohua Zhang 2 , Rachel Byron 2 , Theresa Canfield 1 , Sandra Stehling-Sun 1 , Kristen Lee 1 , Robert E Thurman 1 , Shinny Vong 1 , Daniel Bates 1 , Fidencio Neri 1 , Morgan Diegel 1 , Erika Giste 1 , Douglas Dunn 1 , Jeff Vierstra 1 , R Scott Hansen 3 , Audra K Johnson 1 , Peter J Sabo 1 , Matthew S Wilken 4 , Thomas A Reh 4 , Piper M Treuting 5 , Rajinder Kaul 3 , Mark Groudine 6 , M A Bender 7 , Elhanan Borenstein 8 , John A Stamatoyannopoulos 3
- PMID: 25409825
- PMCID: PMC4405208
- DOI: 10.1038/nature13972
Conservation of trans-acting circuitry during mammalian regulatory evolution
Andrew B Stergachis et al. Nature. 2014.
Abstract
The basic body plan and major physiological axes have been highly conserved during mammalian evolution, yet only a small fraction of the human genome sequence appears to be subject to evolutionary constraint. To quantify cis- versus trans-acting contributions to mammalian regulatory evolution, we performed genomic DNase I footprinting of the mouse genome across 25 cell and tissue types, collectively defining ∼8.6 million transcription factor (TF) occupancy sites at nucleotide resolution. Here we show that mouse TF footprints conjointly encode a regulatory lexicon that is ∼95% similar with that derived from human TF footprints. However, only ∼20% of mouse TF footprints have human orthologues. Despite substantial turnover of the cis-regulatory landscape, nearly half of all pairwise regulatory interactions connecting mouse TF genes have been maintained in orthologous human cell types through evolutionary innovation of TF recognition sequences. Furthermore, the higher-level organization of mouse TF-to-TF connections into cellular network architectures is nearly identical with human. Our results indicate that evolutionary selection on mammalian gene regulation is targeted chiefly at the level of trans-regulatory circuitry, enabling and potentiating cis-regulatory plasticity.
Conflict of interest statement
The authors declare no competing financial interests.
Figures

a, Derivation of 8.6 million differentially occupied DNase I footprints from 25 mouse cell and tissue types. b, Per-nucleotide DNase I cleavage across three gene promoters in both mouse and human cell types; shared TF occupancy sites are indicated by faded boxes. c, Percentage of mouse DNase I footprints with sequence aligning to the human genome but not occupied in any human cell type (grey) versus aligning footprints that are occupied in one or more human cell types (red).

a, Average per-nucleotide DNase I cleavage at occupied TF recognition sites within mouse and human DHSs. b, Of 604 motif models derived de novo from mouse footprints, 355 match curated databases. c, Comparison of 249 novel mouse motif models with models derived from human footprints. d, DNase I footprinting pattern at a novel mouse-selective motif instance. e, Preferential occupancy of 16 out of 22 mouse-selective motifs (red); occupancy of pluripotency-related TFs is shown in blue. f, Average human nucleotide diversity (π) in different classes of human DNase I footprints partitioned by matches to mouse-derived motifs (mean ± 95% confidence interval (CI); bootstrap resampling). NS, not significant.

a, Schematic for construction of cell-type regulatory networks using TF footprints: TF genes = network nodes; occupied TF motifs = directed network edges. b, TF genes regulated by OTX2 in fetal brain and retina networks. Symbols indicate known roles of target genes in brain versus retina development. c, Clustering of cell/tissue TF regulatory networks using Jaccard distances between regulatory networks. Cell/tissue types are coloured using physioanatomical and/or functional properties. d, Heat map showing network similarity (Jaccard index) between human and mouse cell-type regulatory networks. e, Pairwise similarities (Jaccard index) between the regulatory networks of all human and mouse cell/tissue types.
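The network comparison described in this caption can be illustrated with a minimal sketch. It assumes each regulatory network is stored as a set of directed (regulator, target) edges between TF genes; the function name and the toy edges below are hypothetical and are not taken from the paper's data.

```python
def jaccard_index(edges_a, edges_b):
    """Jaccard index of two directed edge sets: |A & B| / |A | B|."""
    a, b = set(edges_a), set(edges_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty networks
    return len(a & b) / len(union)


# Toy networks (hypothetical edges, for illustration only).
mouse_edges = {("OTX2", "PAX6"), ("OTX2", "CRX"), ("SOX2", "PAX6")}
human_edges = {("OTX2", "PAX6"), ("SOX2", "PAX6"), ("SOX2", "OTX2")}

similarity = jaccard_index(mouse_edges, human_edges)  # 2 shared / 4 total
distance = 1.0 - similarity  # Jaccard distance, as used for clustering
```

Because edges are directed, ("SOX2", "OTX2") and ("OTX2", "SOX2") would count as different edges in this representation.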

a, Four categories of regulatory interactions identified by comparative analysis of mouse and human TF networks. Functionally conserved connections can be mediated by TF occupancy at orthologous (red) or non-orthologous (blue) binding sites. b, Categorization and overall conservation of TF-to-TF connections between orthologous mouse and human cell types. On average 44% of TF-to-TF edges are conserved (P < 0.001; empirically calculated using shuffled networks).
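An empirical P value from shuffled networks, as mentioned in this caption, might be computed along the following lines. This is a sketch under stated assumptions: the caption does not say how the networks were shuffled, so the null model here (each regulator keeps its out-degree and receives randomly drawn targets) is an assumption, and all function names are hypothetical.

```python
import random


def conserved_fraction(mouse_edges, human_edges):
    """Fraction of mouse edges that are also present in the human network."""
    mouse_edges = set(mouse_edges)
    if not mouse_edges:
        return 0.0
    return len(mouse_edges & set(human_edges)) / len(mouse_edges)


def shuffle_network(edges, nodes, rng):
    """Assumed null model: keep each source's out-degree, randomise its targets."""
    targets_by_source = {}
    for source, target in edges:
        targets_by_source.setdefault(source, []).append(target)
    shuffled = set()
    for source, targets in sorted(targets_by_source.items()):
        # rng.sample draws distinct targets, so the out-degree is preserved.
        shuffled.update((source, t) for t in rng.sample(nodes, len(targets)))
    return shuffled


def empirical_p_value(mouse_edges, human_edges, nodes, n_shuffles=1000, seed=0):
    """Return (observed fraction, P value) against shuffled human networks."""
    rng = random.Random(seed)
    nodes = sorted(nodes)
    observed = conserved_fraction(mouse_edges, human_edges)
    hits = sum(
        conserved_fraction(mouse_edges, shuffle_network(human_edges, nodes, rng)) >= observed
        for _ in range(n_shuffles)
    )
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_shuffles + 1)
```

With 1,000 shuffles the smallest reportable value is 1/1001, which is consistent with reporting a result as P < 0.001 when no shuffled network reaches the observed conservation.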

a, Enrichment of three-node circuits in each mouse (red lines) and human (black lines) TF regulatory network (expanded in Extended Data Fig. 5). b, Left: frequency with which individual three-node circuits are identically maintained between the mouse and human Treg network. Middle: percentage of specific three-node circuits identically maintained between the mouse and human Treg network. Right: enrichment of three-node circuits in a network constructed using edges present in both mouse and human Treg networks. c, d, Frequency with which TFs from six functional classes occupy different positions (driver, first passenger, second passenger) within FFL (c) or RM (d) circuits in different mouse and human cell-type networks (hfBrain and hfHeart refer to human fetal brain and heart, respectively).
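Counting one kind of three-node circuit, the feed-forward loop (FFL), can be sketched as follows. The sketch assumes the standard FFL definition (X regulates Y, Y regulates Z, and X also regulates Z, with no other edges among the three nodes) and a network stored as a set of directed edges; it only counts occurrences and does not reproduce the enrichment calculation, which would additionally need a randomised reference network. The function name is hypothetical.

```python
from itertools import permutations


def count_feed_forward_loops(edges):
    """Count induced feed-forward loops X->Y, Y->Z, X->Z in a directed network."""
    edge_set = set(edges)
    nodes = sorted({node for edge in edge_set for node in edge})
    count = 0
    for x, y, z in permutations(nodes, 3):
        has_ffl_edges = (
            (x, y) in edge_set and (y, z) in edge_set and (x, z) in edge_set
        )
        # Any reverse edge would turn the triple into a different circuit type.
        has_reverse_edges = (
            (y, x) in edge_set or (z, y) in edge_set or (z, x) in edge_set
        )
        if has_ffl_edges and not has_reverse_edges:
            count += 1
    return count
```

Each feed-forward loop is counted once, because only one ordering of its three nodes satisfies the edge pattern. Enumerating all ordered triples is cubic in the number of nodes, which is acceptable for a sketch on a network of a few hundred TF genes but not efficient.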

Shown are: overall proportion of conserved DNA bases between mouse and human; proportion of orthologous TF footprints (from data shown in Fig. 1c); average proportion of individual conserved TF-to-TF regulatory connections across orthologous mouse and human cell types (from data shown in Fig. 4); and similarity in overall TF regulatory network architecture (from data shown in Figs 2 and 5).

a, Distribution of the number of mouse cell types in which each of the 8.6 million distinct footprinted cis-regulatory elements in mouse is contained within a DNase I footprint. b, For each mouse and human cell type, shown is the percentage of DNase I footprints identified in that cell type that are observed in at least one other mouse or human cell type respectively (data represents median ± 25% and 75% quartiles). c, Red: percentage of mouse DNase I footprints with sequence aligning to the human genome that are occupied in one or more human cell types. Brown: percentage of human DNase I footprints with sequence aligning to the mouse genome that are occupied in one or more mouse cell types.

a, Box-and-whisker plot displaying the percentage of DNase I footprints found in each of the mouse and human samples that are potentially better explained by intrinsic DNase I cleavage specificity (box represents mean ± 25% and 75% quartiles and whiskers represent minimum and maximum values across all human and mouse samples, respectively). b, Effects of protein occupancy and sequence context on DNase I cleavage profiles. Top: heat maps of per-nucleotide DNase I cleavages; the ratio of the observed cleavages to expected cleavages computed using empirically-modelled DNase I cleavage bias; and discovered 1% FDR DNase I footprints surrounding Sp1, Ctf1 and Nrf1 recognition sequences in MEL cells. Each heat map pixel row corresponds to an individual motif instance within a DNase I hotspot. Each blue tick mark under the ‘footprint’ column denotes whether (tick) or not (blank) that motif instance overlaps a called FDR 1% DNase I footprint. Bottom: aggregated DNase I cleavage profiles of occupied (that is, within DNase I footprints) Sp1, Ctf1 and Nrf1 recognition sequences in MEL cells shown side-by-side with log2 ratio of observed versus expected (from intrinsic cleavage preferences) DNase I cleavage. Note that in all cases the cleavage profile of occupied elements differs markedly from expectation.

For five different TFs with corresponding ChIP-seq data in MEL cells, displayed are (left) heat maps showing per-nucleotide DNase I cleavage and (right) vertebrate conservation by phyloP for all motif instances of that TF within MEL DNase I hotspots (irrespective of whether they overlap a DNase I footprint), ranked by the local density of DNase I cleavages. The number of motif instances for that TF is indicated to the left of the heat map. Purple ticks indicate the presence of the corresponding TF ChIP-seq peaks at each motif instance. Green ticks indicate the presence of DNase I footprints at each motif instance. Below each graph is indicated the percentage of TF footprints that reside outside of a ChIP-seq verified binding site, as well as the percentage of ChIP-seq peaks that do not contain a DNase I footprint for that TF (indicating indirect TF occupancy). Of note, occupied motifs within DNase I footprints accurately recapitulate sites of direct TF occupancy, as 99% of DNase I footprinted motifs for a given TF overlap a cognate ChIP-seq peak. In contrast, for most TFs the majority of ChIP-seq peaks arise from indirect TF occupancy events (and thus lack DNase I footprinted sequence elements for their cognate TF).

a, Left: bar chart showing the percentage of the motif models within different experimentally grounded motif databases that match our de novo mouse motif models. Right: bar chart showing the number of novel de novo motif models in mouse that match de novo motif models in human. b, The proportion of mouse-selective motif model DNase I footprints within distal regulatory regions.

a, b, Shown is the relative enrichment or depletion of the 13 three-node network motifs in each of the mouse (a) and human (b) regulatory networks. c, Shown is the relative enrichment or depletion of the 13 three-node network motifs in each of the mouse regulatory networks compared with the relative enrichment of the same motifs in the C. elegans neuronal connectivity network.

a, Examples of three-node circuits formed by TFs in both mouse and human regulatory T (Treg) cells. b, For each of eight orthologous mouse and human cell-type pairings shown is the percentage of three-node circuits in the mouse cell type that are maintained as any three-node circuit in the orthologous human cell type. c, For each of seven orthologous mouse and human cell-type pairings shown is: (left) heat map showing the overall propensity of individual three-node circuits in the mouse cell-type regulatory network to form the same or other three-node circuits in the human cell-type regulatory network; (middle) bar plot showing the percentage of specific three-node circuits in the mouse cell-type regulatory network to be maintained as the same three-node circuits in the human cell-type regulatory network; (right) the relative enrichment or depletion of the 13 three-node network motifs in a regulatory network constructed using the subset of edges present in both mouse and human cell-type regulatory networks.

a, Shown is the propensity of all TFs within the ES cell regulatory network to occupy the different positions within a FFL. FFL positions are defined in panel c. b, Shown is the GO term enrichment of TFs that preferentially occupy position C within FFLs as opposed to TFs that preferentially occupy positions A and B within FFLs. Asterisk indicates a q value less than 0.05. P values and q values calculated using the Gene Ontology enrichment analysis and visualization tool (GOrilla). c, For all instances of FFLs in mouse ES cells, shown is the tissue specificity of each component edge across the other 24 mouse cell types. P values were calculated using a Wilcoxon rank sum test. d, Same as c but for regulating mutual motifs.

a, Schematic illustrating the definitions of, and contrast between, effector-facing and TF-facing TFs. b, Top: a box-and-whisker plot shows the distribution of the relative log enrichment of TF-facing to effector-facing TFs in mouse ES cells. Bottom: relative target landscape enrichments for individual TFs grouped together based on their functional categories. c, Shown is the GO term enrichment of TFs that preferentially regulate TFs (TF-facing) as opposed to TFs that preferentially regulate effector genes (effector-facing). Asterisk indicates a q value less than 0.05. P values and q values calculated using the Gene Ontology enrichment analysis and visualization tool (GOrilla). d, For each cell type, shown is the average propensity of each TF within the regulatory network to regulate TF genes versus effector genes. Relative enrichment values were calculated such that 0 indicates a cell-type regulatory network that is equally geared towards regulating TF genes and effector genes. Cell types are grouped/coloured according to their developmental origin. P values were calculated using a Wilcoxon rank sum test. e, Same as b but for human iPS cells. For box-and-whisker plots, box represents mean ± 25% and 75% quartiles, whiskers represent minimum and maximum values excluding outliers, and outliers indicated by open circles are defined as values outside 1.5 times the interquartile range above the upper quartile and below the lower quartile.
Grants and funding
- R37DK44746/DK/NIDDK NIH HHS/United States
- RC2HG005654/HG/NHGRI NIH HHS/United States
- U54HG004592/HG/NHGRI NIH HHS/United States
- F30 DK095678/DK/NIDDK NIH HHS/United States
- FDK095678A/PHS HHS/United States
- U54HG007010/HG/NHGRI NIH HHS/United States
- U01ES01156/ES/NIEHS NIH HHS/United States
- R37 DK044746/DK/NIDDK NIH HHS/United States
- U54 HG007010/HG/NHGRI NIH HHS/United States
- T32 GM007266/GM/NIGMS NIH HHS/United States
- RC2 HG005654/HG/NHGRI NIH HHS/United States
- U54 HG004592/HG/NHGRI NIH HHS/United States
- R01 EY021482/EY/NEI NIH HHS/United States