Transposable elements reveal a stem cell-specific class of long noncoding RNAs
David Kelley et al. Genome Biol. 2012.
Abstract
Background: Numerous studies over the past decade have elucidated a large set of long intergenic noncoding RNAs (lincRNAs) in the human genome. Research since has shown that lincRNAs constitute an important layer of genome regulation across a wide spectrum of species. However, the factors governing their evolution and origins remain relatively unexplored. One possible factor driving lincRNA evolution and biological function is transposable element (TE) insertions. Here, we comprehensively characterize the TE content of lincRNAs relative to genomic averages and protein coding transcripts.
Results: Our analysis of the TE composition of 9,241 human lincRNAs revealed that, in sharp contrast to protein coding genes, 83% of lincRNAs contain a TE, and TEs comprise 42% of lincRNA sequence. lincRNA TE composition varies significantly from genomic averages: L1 and Alu elements are depleted and broad classes of endogenous retroviruses are enriched. TEs occur in biased positions and orientations within lincRNAs, particularly at their transcription start sites, suggesting a role in lincRNA transcriptional regulation. Accordingly, we observed a dramatic example of HERVH transcriptional regulatory signals correlating strongly with stem cell-specific expression of lincRNAs. Conversely, lincRNAs devoid of TEs are expressed at greater levels than lincRNAs with TEs in all tissues and cell lines, particularly in the testis.
Conclusions: TEs pervade lincRNAs, dividing them into classes, and may have shaped lincRNA evolution and function by conferring tissue-specific expression from extant transcriptional regulatory signals.
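The two headline statistics in the Results (the share of lincRNAs containing a TE, and the share of lincRNA sequence that is TE-derived) come down to interval overlaps. The sketch below illustrates that arithmetic on invented toy coordinates. It assumes a single chromosome, half-open (start, end) intervals, and hypothetical function names; it is not the authors' pipeline.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def overlap_bases(exons, tes):
    """Count exon bases covered by at least one TE annotation."""
    tes = merge_intervals(tes)
    total = 0
    for ex_start, ex_end in merge_intervals(exons):
        for te_start, te_end in tes:
            total += max(0, min(ex_end, te_end) - max(ex_start, te_start))
    return total


def te_summary(transcripts, tes):
    """Return (fraction of transcripts with any TE base,
    fraction of all transcript bases that are TE-derived)."""
    with_te = 0
    te_bases = 0
    all_bases = 0
    for exons in transcripts.values():
        covered = overlap_bases(exons, tes)
        with_te += covered > 0
        te_bases += covered
        all_bases += sum(e - s for s, e in merge_intervals(exons))
    return with_te / len(transcripts), te_bases / all_bases


# Toy data (invented): two transcripts, two TE annotations.
toy_transcripts = {
    "lincA": [(100, 200), (300, 400)],
    "lincB": [(1000, 1100)],
}
toy_tes = [(150, 320), (390, 450)]

frac_with_te, frac_te_bases = te_summary(toy_transcripts, toy_tes)
print(frac_with_te, frac_te_bases)
```

On real annotations the same two ratios would be computed per chromosome and strand-aware tools would typically replace the nested loops.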
Figures

Transposable element composition of human lincRNAs. We intersected TE annotations with a catalog of 9,241 human lincRNAs. (a) TEs compose less lincRNA sequence than genomic background but much more than protein coding genes. Promoters for the two gene classes are more similar than the transcripts. (b) The lincRNA frequencies of many specific TE families differ significantly (based on a shuffling statistical test) from their genomic averages. Larger families are to the right. Enrichments are above zero on the y-axis, and depletions are below zero. ERV1 families (labeled in blue) are particularly enriched.
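The caption mentions a shuffling statistical test without giving its details here. As a rough illustration of the general idea, the sketch below randomly repositions features along a toy genome and compares the observed TE overlap with the shuffled overlaps. The genome length, coordinates, function names, and the empirical p-value formula are all assumptions made for illustration and may differ from the procedure the authors used.

```python
import random


def bases_overlapping(features, tes):
    """Sum of overlapping bases between features and TEs.
    Assumes the TE intervals do not overlap one another."""
    return sum(
        max(0, min(f_end, t_end) - max(f_start, t_start))
        for f_start, f_end in features
        for t_start, t_end in tes
    )


def shuffle_features(features, genome_length, rng):
    """Move each feature to a uniformly random position, keeping its length."""
    shuffled = []
    for start, end in features:
        length = end - start
        new_start = rng.randrange(0, genome_length - length + 1)
        shuffled.append((new_start, new_start + length))
    return shuffled


def shuffle_test(features, tes, genome_length, n_shuffles=1000, seed=0):
    """Return observed overlap, mean shuffled overlap, and empirical
    one-sided p-values for enrichment and depletion."""
    rng = random.Random(seed)
    observed = bases_overlapping(features, tes)
    null = [
        bases_overlapping(shuffle_features(features, genome_length, rng), tes)
        for _ in range(n_shuffles)
    ]
    mean_null = sum(null) / n_shuffles
    # Add-one correction so the empirical p-value is never exactly zero.
    p_enrich = (1 + sum(n >= observed for n in null)) / (n_shuffles + 1)
    p_deplete = (1 + sum(n <= observed for n in null)) / (n_shuffles + 1)
    return observed, mean_null, p_enrich, p_deplete


# Toy data (invented): a 10 kb genome, two TE copies, three features.
toy_tes = [(1000, 1500), (4000, 4300)]
toy_features = [(1100, 1300), (4050, 4250), (7000, 7200)]

observed, mean_null, p_enrich, p_deplete = shuffle_test(
    toy_features, toy_tes, genome_length=10_000
)
print(observed, mean_null, p_enrich, p_deplete)
```

A realistic version would shuffle within intergenic space and per TE family; this toy version ignores both for brevity.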

Example lincRNAs with TE annotations. lincRNA exons are drawn above in blue, with introns colored lighter. TEs are colored by family, matching the legend in Figure 1a. (a) TUG1 serves as a typical example of a lincRNA containing multiple TE families. (b) Alternatively, HOTAIR and 1,531 (17%) of the lincRNAs in our catalog are devoid of TEs. (c) Linc-ROR is almost entirely composed of TEs, including its TSS in the LTR of a HERVH element. (d) BC026300 also initiates transcription in a HERVH. The images were created using the software AnnotationSketch [93].

ERV1 LTRs associate with lincRNA TSSs. We plotted the coverage of various TE families approaching lincRNA TSSs. The prevalent L1 and Alu families are depleted in lincRNAs. Accordingly, their coverage drops throughout lincRNA promoters leading up to the TSS. Alternatively, ERV1 elements are enriched in lincRNAs, and coverage of the transcription-promoting ERV1 LTRs peaks at the TSS. This pattern was not observed for mRNAs (Figure S11 in Additional file 1).
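A coverage profile of this kind can be sketched as the fraction of TSSs whose upstream base at each offset lies inside a TE of a given family. The example below uses invented coordinates and a hypothetical `coverage_profile` helper; the window size and strand convention are assumptions, not values taken from the paper.

```python
def coverage_profile(tss_list, tes, window):
    """For offsets -window..-1 relative to each TSS, return the fraction of
    TSSs whose base at that offset falls inside a TE.

    tss_list: list of (position, strand) with strand "+" or "-".
    tes: list of half-open (start, end) intervals for one TE family.
    """
    counts = [0] * window
    for pos, strand in tss_list:
        for i in range(window):
            offset = i - window  # runs from -window up to -1
            # Upstream means lower coordinates on "+", higher on "-".
            genomic = pos + offset if strand == "+" else pos - offset
            if any(start <= genomic < end for start, end in tes):
                counts[i] += 1
    return [c / len(tss_list) for c in counts]


# Toy data (invented): one TSS per strand, two TE copies near them.
toy_tss = [(1000, "+"), (5000, "-")]
toy_tes = [(990, 1200), (5000, 5005)]

profile = coverage_profile(toy_tss, toy_tes, window=20)
print(profile)
```

In this toy case coverage rises toward the TSS, which is the shape the caption describes for ERV1 LTRs; a depleted family would show the opposite trend.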

HERVH elements associate with stem cell-specific lincRNA expression. (a) HERVH is a primate-specific 9 kb endogenous retrovirus containing the group specific antigen (Gag), protease (Pro), polymerase (Pol), and envelope (Env) proteins, surrounded on both sides by transcription-promoting LTRs. (b) 127 lincRNAs (columns) contain HERVH elements and expression of these lincRNAs (measured as log2(FPKM + 0.25)) across cell types (rows) is highly specific to the pluripotent H1-hESCs and iPSCs. (c) HERVH-lincRNAs are expressed at much greater levels than lincRNAs devoid of HERVH (dHERVH-lincRNAs) in ESCs, displayed here as the cumulative distribution of FPKM + 0.25. (d) ChIP-Seq read coverage indicates that HERVH-lincRNAs are marked by the activating histone modification H3K4me3 in H1-hESCs but not GM12878 where expression is low. (e) The transcription factor SP1 was previously found to be required for HERVH transcription. Accordingly, ChIP-Seq read coverage shows SP1 occupies the TSSs of HERVH-lincRNAs in H1-hESC but not GM12878.
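Panels (b) and (c) rely on two simple transformations: log2(FPKM + 0.25) for the heatmap and the cumulative distribution of FPKM + 0.25 for the comparison. A minimal sketch with invented FPKM values follows; the function names are hypothetical, and only the 0.25 pseudocount comes from the caption.

```python
import math

PSEUDOCOUNT = 0.25  # pseudocount stated in the caption


def log_expression(fpkm):
    """log2(FPKM + 0.25); the pseudocount keeps zero FPKM finite."""
    return math.log2(fpkm + PSEUDOCOUNT)


def ecdf(values):
    """Empirical cumulative distribution as (value, fraction <= value) pairs."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]


# Toy FPKM values (invented) for three transcripts.
toy_fpkm = [3.75, 0.0, 0.75]

log_values = [log_expression(x) for x in toy_fpkm]
cdf = ecdf([x + PSEUDOCOUNT for x in toy_fpkm])
print(log_values)
print(cdf)
```

With a 0.25 pseudocount an unexpressed transcript maps to log2(0.25) = -2, which sets the floor of the heatmap scale.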

Mouse lincRNAs share TE composition properties. (a) Similar to the human genome, in a catalog of 981 mouse lincRNAs, TEs are depleted overall relative to the genomic background frequency, but still a substantial 33% of sequence is TE-derived. (b) TEs also exhibit biased composition in mouse lincRNAs, with strong L1 depletion and ERV1 enrichment, matching observations in human.