In vivo enhancer analysis of human conserved non-coding sequences - Nature
- Letter
- Published: 05 November 2006
- Len A. Pennacchio1,2,
- Nadav Ahituv2,
- Alan M. Moses2,
- Shyam Prabhakar2,
- Marcelo A. Nobrega2,5,
- Malak Shoukry2,
- Simon Minovitsky2,
- Inna Dubchak1,2,
- Amy Holt2,
- Keith D. Lewis2,
- Ingrid Plajzer-Frick2,
- Jennifer Akiyama2,
- Sarah De Val4,
- Veena Afzal2,
- Brian L. Black4,
- Olivier Couronne1,2,
- Michael B. Eisen2,3,
- Axel Visel2 &
- Edward M. Rubin1,2
Nature volume 444, pages 499–502 (2006)
Abstract
Identifying the sequences that direct the spatial and temporal expression of genes and defining their function in vivo remain a significant challenge in the annotation of vertebrate genomes. One major obstacle is the lack of experimentally validated training sets. In this study, we made use of extreme evolutionary sequence conservation as a filter to identify putative gene regulatory elements, and characterized the in vivo enhancer activity of a large group of non-coding elements in the human genome that are conserved in human–pufferfish, Takifugu (Fugu) rubripes, or ultraconserved (ref. 1) in human–mouse–rat. We tested 167 of these extremely conserved sequences in a transgenic mouse enhancer assay. Here we report that 45% of these sequences functioned reproducibly as tissue-specific enhancers of gene expression at embryonic day 11.5. While directing expression in a broad range of anatomical structures in the embryo, the majority of the 75 enhancers directed expression to various regions of the developing nervous system. We identified sequence signatures enriched in a subset of these elements that targeted forebrain expression, and used these features to rank all ∼3,100 non-coding elements in the human genome that are conserved between human and Fugu. The testing of the top predictions in transgenic mice resulted in a threefold enrichment for sequences with forebrain enhancer activity. These data dramatically expand the catalogue of human gene enhancers that have been characterized in vivo, and illustrate the utility of such training sets for a variety of biological applications, including decoding the regulatory vocabulary of the human genome.
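The abstract does not specify how the sequence signatures were derived or how elements were scored, so the following is only a minimal sketch of one common way to rank sequences by enriched short motifs. It assumes fixed-length k-mers, a smoothed log-odds weight against a background set, and toy sequences invented for illustration. The function names (`kmer_counts`, `log_odds_weights`, `score`, `rank`) are hypothetical, and this is not the authors' method.

```python
from collections import Counter
from math import log


def kmer_counts(seqs, k):
    """Count every overlapping k-mer across a list of DNA strings."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def log_odds_weights(positives, background, k, pseudo=1.0):
    """Smoothed log-odds of each k-mer in the positive set versus background.

    Assumption: add-one style pseudocounts over the observed vocabulary.
    """
    pc, bc = kmer_counts(positives, k), kmer_counts(background, k)
    vocab = set(pc) | set(bc)
    p_total = sum(pc.values()) + pseudo * len(vocab)
    b_total = sum(bc.values()) + pseudo * len(vocab)
    return {
        m: log(((pc[m] + pseudo) / p_total) / ((bc[m] + pseudo) / b_total))
        for m in vocab
    }


def score(seq, weights, k):
    """Mean log-odds weight over the k-mers of seq (unseen k-mers count as 0)."""
    seq = seq.upper()
    n = len(seq) - k + 1
    if n <= 0:
        return 0.0
    return sum(weights.get(seq[i:i + k], 0.0) for i in range(n)) / n


def rank(candidates, weights, k):
    """Return candidate names sorted from highest to lowest score."""
    return sorted(candidates, key=lambda name: score(candidates[name], weights, k),
                  reverse=True)


# Toy data, invented for illustration only (not from the paper).
K = 4
positives = ["TAATTAGCTAATTA", "GGTAATTACC"]    # stand-ins for validated enhancers
background = ["GCGCGCGCGCGC", "ACGTACGTACGT"]   # stand-ins for inactive elements
candidates = {"a": "CCTAATTAGG", "b": "GCGCGCACGT"}

weights = log_odds_weights(positives, background, K)
print(rank(candidates, weights, K))
```

In a realistic setting the positive set would be the experimentally validated forebrain enhancers and the candidates would be the remaining conserved elements. The choice of k, the background model, and the scoring rule are all free parameters of this sketch.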
References
Bejerano, G. et al. Ultraconserved elements in the human genome. Science 304, 1321–1325 (2004)
Roeder, R. G. & Rutter, W. J. Multiple forms of DNA-dependent RNA polymerase in eukaryotic organisms. Nature 224, 234–237 (1969)
Goldberg, M. L. Sequence Analysis of Drosophila Histone Genes. Ph.D. thesis, Stanford Univ. (1979)
Stathopoulos, A. & Levine, M. Genomic regulatory networks and animal development. Dev. Cell 9, 449–462 (2005)
Levine, M. & Tjian, R. Transcription regulation and animal diversity. Nature 424, 147–151 (2003)
Emison, E. S. et al. A common sex-dependent mutation in a RET enhancer underlies Hirschsprung disease risk. Nature 434, 857–863 (2005)
Kleinjan, D. A. & van Heyningen, V. Long-range control of gene expression: emerging mechanisms and disruption in disease. Am. J. Hum. Genet. 76, 8–32 (2005)
Lettice, L. A. et al. A long-range Shh enhancer regulates expression in the developing limb and fin and is associated with preaxial polydactyly. Hum. Mol. Genet. 12, 1725–1735 (2003)
Boffelli, D. et al. Phylogenetic shadowing of primate sequences to find functional regions of the human genome. Science 299, 1391–1394 (2003)
Nobrega, M. A., Ovcharenko, I., Afzal, V. & Rubin, E. M. Scanning human gene deserts for long-range enhancers. Science 302, 413 (2003)
Prabhakar, S. et al. Close sequence comparisons are sufficient to identify human cis-regulatory elements. Genome Res. 16, 855–863 (2006)
Woolfe, A. et al. Highly conserved non-coding sequences are associated with vertebrate development. PLoS Biol. 3, e7 (2005)
Kothary, R. et al. Inducible expression of an hsp68-lacZ hybrid gene in transgenic mice. Development 105, 707–714 (1989)
Rojas, A. et al. Gata4 expression in lateral mesoderm is downstream of BMP4 and is activated directly by Forkhead and GATA transcription factors through a distal enhancer element. Development 132, 3405–3417 (2005)
Rossant, J., Zirngibl, R., Cado, D., Shago, M. & Giguere, V. Expression of a retinoic acid response element-hsplacZ transgene defines specific domains of transcriptional activity during mouse embryogenesis. Genes Dev. 5, 1333–1344 (1991)
Yamagishi, H. et al. Tbx1 is regulated by tissue-specific forkhead proteins through a common Sonic hedgehog-responsive enhancer. Genes Dev. 17, 269–281 (2003)
Boffelli, D., Nobrega, M. A. & Rubin, E. M. Comparative genomics at the vertebrate extremes. Nature Rev. Genet. 5, 456–465 (2004)
Ahituv, N., Prabhakar, S., Poulin, F., Rubin, E. M. & Couronne, O. Mapping cis-regulatory domains in the human genome using multi-species conservation of synteny. Hum. Mol. Genet. 14, 3057–3063 (2005)
Kohlhase, J., Wischermann, A., Reichenbach, H., Froster, U. & Engel, W. Mutations in the SALL1 putative transcription factor gene cause Townes-Brocks syndrome. Nature Genet. 18, 81–83 (1998)
Buck, A., Kispert, A. & Kohlhase, J. Embryonic expression of the murine homologue of SALL1, the gene mutated in Townes–Brocks syndrome. Mech. Dev. 104, 143–146 (2001)
Carroll, S. B. Evolution at two levels: on genes and form. PLoS Biol. 3, e245 (2005)
Davidson, E. H. Genomic Regulatory Systems: In Development and Evolution (Academic, San Diego, 2001)
Lee, T. I. et al. Control of developmental regulators by Polycomb in human embryonic stem cells. Cell 125, 301–313 (2006)
Bard, J. L. et al. An internet-accessible database of mouse developmental anatomy based on a systematic nomenclature. Mech. Dev. 74, 111–120 (1998)
Gray, P. A. et al. Mouse brain organization revealed through direct genome-scale TF expression analysis. Science 306, 2255–2257 (2004)
Poulin, F. et al. In vivo characterization of a vertebrate ultraconserved enhancer. Genomics 85, 774–781 (2005)
van Helden, J., Andre, B. & Collado-Vides, J. Extracting regulatory sites from the upstream region of yeast genes by computational analysis of oligonucleotide frequencies. J. Mol. Biol. 281, 827–842 (1998)
Kurokawa, D. et al. Regulation of Otx2 expression and its functions in mouse forebrain and midbrain. Development 131, 3319–3331 (2004)
Zhou, J., Zwicker, J., Szymanski, P., Levine, M. & Tjian, R. TAFII mutations disrupt Dorsal activation in the Drosophila embryo. Proc. Natl Acad. Sci. USA 95, 13483–13488 (1998)
Acknowledgements
Research was conducted at the E. O. Lawrence Berkeley National Laboratory, under the Programs for Genomic Application, funded by the National Heart, Lung, and Blood Institute, USA as well as the National Human Genome Research Institute, USA, and performed under a Department of Energy Contract with the University of California.
Author information
Author notes
Marcelo A. Nobrega
Present address: Department of Human Genetics, University of Chicago, Chicago, Illinois, 60637, USA
Authors and Affiliations
US Department of Energy Joint Genome Institute, Walnut Creek, California, 94598, USA
Len A. Pennacchio, Inna Dubchak, Olivier Couronne & Edward M. Rubin
Genomics Division, MS 84-171, Lawrence Berkeley National Laboratory, Berkeley, California, 94720, USA
Len A. Pennacchio, Nadav Ahituv, Alan M. Moses, Shyam Prabhakar, Marcelo A. Nobrega, Malak Shoukry, Simon Minovitsky, Inna Dubchak, Amy Holt, Keith D. Lewis, Ingrid Plajzer-Frick, Jennifer Akiyama, Veena Afzal, Olivier Couronne, Michael B. Eisen, Axel Visel & Edward M. Rubin
Molecular and Cellular Biology Department, University of California-Berkeley, California, 94720, USA
Michael B. Eisen
Cardiovascular Research Institute, University of California, San Francisco, California, 94143-2240, USA
Sarah De Val & Brian L. Black
Authors
- Len A. Pennacchio
- Nadav Ahituv
- Alan M. Moses
- Shyam Prabhakar
- Marcelo A. Nobrega
- Malak Shoukry
- Simon Minovitsky
- Inna Dubchak
- Amy Holt
- Keith D. Lewis
- Ingrid Plajzer-Frick
- Jennifer Akiyama
- Sarah De Val
- Veena Afzal
- Brian L. Black
- Olivier Couronne
- Michael B. Eisen
- Axel Visel
- Edward M. Rubin
Corresponding author
Correspondence to Len A. Pennacchio.
Ethics declarations
Competing interests
Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interests.
Supplementary information
Supplementary Table 1.
A summary of all the human conserved noncoding fragments tested for enhancer activity at embryonic day 11.5. Enhancer ID refers to a unique identifier defined at http://enhancer.lbl.gov. (XLS 37 kb)
Supplementary Table 2.
A compilation of human–Fugu conserved non-coding elements in the human genome. (XLS 208 kb)
Supplementary Table 3.
The top 30 forebrain enhancer predictions in the human genome. The strategy to generate this list can be found in the Supplementary Methods. (XLS 18 kb)
Supplementary Methods.
An expanded version of the Materials and Methods. (DOC 61 kb)
About this article
Cite this article
Pennacchio, L., Ahituv, N., Moses, A. et al. In vivo enhancer analysis of human conserved non-coding sequences. Nature 444, 499–502 (2006). https://doi.org/10.1038/nature05295
Received: 14 June 2006
Accepted: 22 September 2006
Published: 05 November 2006
Issue Date: 23 November 2006
DOI: https://doi.org/10.1038/nature05295
Editorial Summary
Gene regulators unmasked
Identifying the non-coding DNA sequences that act at a distance to regulate patterns of gene expression is not a simple matter; one useful pointer is evolutionary sequence conservation. An in vivo analysis of 167 non-coding elements in the human genome that are extremely conserved, based on comparisons with the pufferfish, rat and mouse genomes, has identified 75 previously unknown tissue-specific enhancers. These are active in mouse embryos at embryonic day 11.5, most of them directing expression in the developing nervous system. The success of this method suggests that the further 5,500 non-coding sequences conserved between humans and pufferfish may yield another new batch of gene enhancers.