Architecture of the human regulatory network derived from ENCODE data - PubMed

Sun Jan 01 2012

Nature. 2012 Sep 6;489(7414):91-100.

doi: 10.1038/nature11245.

Mark B Gerstein, Manoj Hariharan, Stephen G Landt, Koon-Kiu Yan, Chao Cheng, Xinmeng Jasmine Mu, Ekta Khurana, Joel Rozowsky, Roger Alexander, Renqiang Min, Pedro Alves, Alexej Abyzov, Nick Addleman, Nitin Bhardwaj, Alan P Boyle, Philip Cayting, Alexandra Charos, David Z Chen, Yong Cheng, Declan Clarke, Catharine Eastman, Ghia Euskirchen, Seth Frietze, Yao Fu, Jason Gertz, Fabian Grubert, Arif Harmanci, Preti Jain, Maya Kasowski, Phil Lacroute, Jing Jane Leng, Jin Lian, Hannah Monahan, Henriette O'Geen, Zhengqing Ouyang, E Christopher Partridge, Dorrelyn Patacsil, Florencia Pauli, Debasish Raha, Lucia Ramirez, Timothy E Reddy, Brian Reed, Minyi Shi, Teri Slifer, Jing Wang, Linfeng Wu, Xinqiong Yang, Kevin Y Yip, Gili Zilberman-Schapira, Serafim Batzoglou, Arend Sidow, Peggy J Farnham, Richard M Myers, Sherman M Weissman, Michael Snyder

Mark B Gerstein et al. Nature. 2012.

Abstract

Transcription factors bind in a combinatorial fashion to specify the on-and-off states of genes; the ensemble of these binding events forms a regulatory network, constituting the wiring diagram for a cell. To examine the principles of the human transcriptional regulatory network, we determined the genomic binding information of 119 transcription-related factors in over 450 distinct experiments. We found the combinatorial, co-association of transcription factors to be highly context specific: distinct combinations of factors bind at specific genomic locations. In particular, there are significant differences in the binding proximal and distal to genes. We organized all the transcription factor binding into a hierarchy and integrated it with other genomic information (for example, microRNA regulation), forming a dense meta-network. Factors at different levels have different properties; for instance, top-level transcription factors more strongly influence expression and middle-level ones co-regulate targets to mitigate information-flow bottlenecks. Moreover, these co-regulations give rise to many enriched network motifs (for example, noise-buffering feed-forward loops). Finally, more connected network components are under stronger selection and exhibit a greater degree of allele-specific activity (that is, differential binding to the two parental alleles). The regulatory information obtained in this study will be crucial for interpreting personal genome sequences and understanding basic principles of human biology and disease.

Figures

Figure 1. TF Co-association

(a) The co-binding map for the GATA1 focus-factor context in K562 shows the binding intensity of peaks of all TFs in K562 (rows) that overlap each GATA1 peak (columns). The colored rectangles represent 8 key clusters consisting of different combinations of co-associating partner-factors. (b) The GATA1 context-specific relative importance scores (RI) of all partner-factors (top) and the matrix of co-association scores (CS) between all pairs of TFs (bottom). Primary and local partners of GATA have high RI scores. The co-association score matrix captures the 8 clusters observed in (a). (c) Different partner-factors are preferentially enriched at gene-distal (positive differential RI) and proximal (negative differential RI) GATA1 peaks. (d) The aggregate factor importance matrix, obtained by stacking the RI of all partner-factors (columns) from all focus-factor contexts (rows) in K562, shows 9 functionally distinct clusters (C1 to C9) of contexts that can be broadly grouped as distal, proximal, mixed, and repressive. The blue rectangles highlight representative partner-factors with high RI in the clusters. The arrow from (b) to (d) indicates that the GATA1 context-specific RI scores form one row in this matrix. (e) Co-association variability map of partners (columns) of GATA1 (left panel) and FOS (right panel) over all K562 focus-factor contexts (rows). TAL1 and GATA2 show consistently high CS with GATA1 over most focus-factor contexts, but JUND shows context-specific co-association. FOS shows dramatic changes in CS of partner-factors over different contexts (e.g. FOS-JUND in distal contexts and FOS-SP2 in proximal ones). (More details in Fig. S2c, S2f-1, S2d, S2l-2.)

Figure 2. Overall Network

(a) Close-up of the TF hierarchy. The nodes depict the TFs: TFSSs are triangles, and non-TFSSs are circles. At the left we show the proximal-edge hierarchy with downward-pointing edges colored in green, and upward-pointing ones colored in red. The nodes are shaded according to their out-degree in the full network (as described in Table 1). The right part shows the TFs placed in the same proximal hierarchy but now with edges corresponding to distal regulation colored green and red, and nodes recolored according to out-degree in the distal network. We see that the distal edges do not follow the proximal-edge hierarchy. (b) Close-up of TF-miRNA regulation. The outer circle contains the 119 TFs, while the inner circle contains miRNAs. Red edges correspond to miRNAs regulating TFs; green ones, TFs regulating miRNAs. TFs and miRNAs each are arranged by their out-degree, beginning at 12 o'clock and decreasing in order clockwise. Node sizes are proportional to out-degree. For TFs, the out-degree is as described in Table 1; for miRNAs, it is according to the out-degree in this network. Red nodes are enriched for miRNA-TF edges and green nodes are enriched for TF-miRNA edges. Gray nodes have a balanced number of edges (within ±1). (c) Average values of various properties (topological, dynamic, expression-related, and selection-related, ordered consistently with Table 1) for each level are shown for the proximal-edge hierarchy. The top, middle, and bottom rows correspond to the top, middle, and bottom of the hierarchy, respectively. The sizing of the grey circles indicates the relative ordering of the values for the three levels. Significantly different values (P<0.05) using the Wilcoxon-rank-sum test are indicated by black brackets. The proximal-edge hierarchy depicted on the right shows non-synonymous SNP density, where the shading corresponds to the density for the associated TF. (More details in Fig. S4.)
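
The edge coloring and node shading described for panel (a) can be illustrated with a minimal Python sketch. It assumes each TF has already been assigned a hierarchy level (a higher number meaning closer to the top) and uses hypothetical node names and edges; it is not the authors' pipeline and does not show how the levels themselves are derived.

```python
from collections import Counter

def out_degrees(edges):
    """Number of outgoing regulatory edges per source node."""
    return Counter(source for source, _ in edges)

def edge_direction(edge, level):
    """Classify an edge as pointing down, up, or within a hierarchy level."""
    source, target = edge
    if level[source] > level[target]:
        return "down"   # drawn in green in the caption's color scheme
    if level[source] < level[target]:
        return "up"     # drawn in red in the caption's color scheme
    return "same"

# Hypothetical toy network: one top (T1), one middle (M1), one bottom (B1) factor.
level = {"T1": 2, "M1": 1, "B1": 0}
edges = [("T1", "M1"), ("T1", "B1"), ("M1", "B1"), ("B1", "M1")]

degrees = out_degrees(edges)
directions = [edge_direction(e, level) for e in edges]
```

In this toy network, `degrees` gives the value that would drive node shading, and `directions` marks the single `B1` to `M1` edge as upward-pointing.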

Figure 3. Collaboration between Levels

(a) Enrichment of collaborating TF pairs from different levels (T,M,B). The TFs are represented by two nodes below each bar graph. The dashed orange line indicates the expected level of collaboration. Significant enrichment above or depletion below that level are marked by asterisks (P<0.05). (More details in SOM/G.1,2.) (b) Enrichment of proximal and distal co-regulatory pairs in the network hierarchy. Co-regulatory pairs from different levels are shown by the two nodes below each bar.

Figure 4. Motif Analysis

Motifs are accompanied by the occurrence frequency, N. Enriched motifs are highlighted in green, and depleted ones, in red. An occurrence frequency with a star means that the corresponding enrichment/depletion is statistically significant (P=1e-5). The motifs are sorted such that those at the ends have more significant p-values. (More details in Fig. S9h.) (a) Systematic search of 3-TF motifs. The most enriched motif is the FFL. A particular example formed by STAT1, STAT3 and RUNX1 is highlighted. Here, the “+” sign on an edge indicates that the correlation between the gene expression of the source and the target across tissues is positive. Other motifs containing a toggle-switch regulation on top of the basic FFL design are also indicated. (b) Proximal-Distal-PPI MIMs. Here we searched all motifs involving the co-regulation of two TFs (which could be either proximal or distal) with (or without) a protein-protein interaction between them. We found the motifs containing the protein-protein interaction tended to be enriched. (c) miRNA-SIMs. This figure shows the 2 enriched motifs resulting from enumerating all motifs in which a miRNA targets two TFs that are connected in various ways. These 2 motifs contain a protein complex of 2 TFs and a cooperative pair of promoter and distal regulatory TFs. (d) The auto-regulator motif is enriched in the TF-TF network: 28 of all TFs are auto-regulators. Moreover, auto-regulators are more likely to be repressors (-) relative to non-auto regulators, and they tend to have more ncRNAs as their targets.
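
As a rough illustration of the occurrence frequency N for two of the motifs named above, the following Python sketch counts feed-forward loops (FFLs) and auto-regulators in a small directed graph. The node names and edges are hypothetical, and the sketch only counts occurrences; judging enrichment or depletion would additionally need a null model, which the caption does not specify.

```python
from itertools import permutations

def count_feed_forward_loops(edges):
    """Count node triples (x, y, z) with edges x->y, y->z and x->z."""
    edge_set = {(s, t) for s, t in edges if s != t}  # self-loops are handled separately
    nodes = {n for edge in edge_set for n in edge}
    return sum(
        1
        for x, y, z in permutations(nodes, 3)
        if (x, y) in edge_set and (y, z) in edge_set and (x, z) in edge_set
    )

def auto_regulators(edges):
    """Nodes that carry an edge onto themselves."""
    return sorted({s for s, t in edges if s == t})

# Hypothetical toy network: A regulates C both directly and through B; C regulates itself.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "C"), ("C", "D")]

ffl_count = count_feed_forward_loops(toy_edges)
self_regulating = auto_regulators(toy_edges)
```

Here the only FFL is the triple A, B, C, and C is the only auto-regulator.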

Figure 5. Allelic Effects

(a) An “allelic effects network” depicting the increasing coordination between ASB and ASE as the number of TFs regulating a target increases. Central white nodes denote TFs, and peripheral nodes denote targets, which are blue (red) if they are expressed from the paternal (maternal) allele. Blue (red) edges denote ASB to the paternal (maternal) allele. This network represents the strongest differences between the paternal- and maternal-specific regulatory networks. As one goes around the larger circle counterclockwise (clockwise), each of the small circular clusters represents targets with progressively more paternal (maternal) regulation, indicated by the small blue (red) numbers to the side of the clusters. Moreover, within each of the clusters the fraction of predominantly paternally (maternally) expressed targets increases as one goes around the larger circle. As an illustration, this fraction is explicitly indicated by the ratios within three of the larger clusters at bottom right. (b) Relationship between TF allelicity and selection. The bar height is the ratio of the degree of selection (as measured by SNP density or average DAF) in those TF-binding peaks showing allelic behavior to the degree of selection in all other TF-binding peaks. Asterisks represent significant differences (P<0.05, Wilcoxon-rank-sum test). (More details in SOM/I.2 and Fig. S10b,c.)
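
The bar height in panel (b) is a simple ratio, which the following Python sketch makes concrete. It assumes the "degree of selection" for a group of peaks is summarized as the mean of a per-peak measure (for example SNP density); that choice of summary, the function name, and the numbers are illustrative assumptions, and the significance test is not shown.

```python
from statistics import mean

def selection_ratio(allelic_values, other_values):
    """Ratio of the mean selection measure in allelic peaks to that in all other peaks."""
    if not allelic_values or not other_values:
        raise ValueError("both groups need at least one peak")
    return mean(allelic_values) / mean(other_values)

# Hypothetical per-peak SNP densities, for illustration only.
allelic_peaks = [0.2, 0.4]
other_peaks = [0.1, 0.2, 0.3]

bar_height = selection_ratio(allelic_peaks, other_peaks)
```

A ratio above 1 means the measure is higher in the allelic peaks than in the remaining peaks; with these made-up values the ratio is about 1.5.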
