
Lineage and species-specific long noncoding RNAs during erythro-megakaryocytic development


Blood. 2014 Mar 20;123(12):1927-37.

doi: 10.1182/blood-2013-12-544494. Epub 2014 Feb 4.



Vikram R Paralkar et al. Blood. 2014.

Abstract

Mammals express thousands of long noncoding (lnc) RNAs, a few of which are known to function in tissue development. However, the entire repertoire of lncRNAs in most tissues and species is not defined. Indeed, most lncRNAs are not conserved, raising questions about function. We used RNA sequencing to identify 1109 polyadenylated lncRNAs expressed in erythroblasts, megakaryocytes, and megakaryocyte-erythroid precursors of mice, and 594 in erythroblasts of humans. More than half of these lncRNAs were unannotated, emphasizing the opportunity for new discovery through studies of specialized cell types. Analysis of the mouse erythro-megakaryocytic polyadenylated lncRNA transcriptome indicates that ~75% arise from promoters and 25% from enhancers, many of which are regulated by key transcription factors including GATA1 and TAL1. Erythroid lncRNA expression is largely conserved among 8 different mouse strains, yet only 15% of mouse lncRNAs are expressed in humans and vice versa, reflecting dramatic species-specificity. RNA interference assays of 21 abundant erythroid-specific murine lncRNAs in primary mouse erythroid precursors identified 7 whose knockdown inhibited terminal erythroid maturation. At least 6 of these 7 functional lncRNAs have no detectable expression in human erythroblasts, suggesting that lack of conservation between mammalian species does not predict lack of function.


Figures

Figure 1

Identification and characterization of mouse erythro-megakaryocytic lncRNAs. (A) Bioinformatic pipeline for identification of lncRNAs. See “Materials and methods” for details. (B) Venn diagram showing protein coding potential of low-stringency lncRNAs as assessed by 3 different bioinformatic tools. Note that bioinformatic tools, even after calibration with test datasets (supplemental Figure 1C), show limited correlation with each other in determining coding potential. (C) Pie chart showing composition of the mouse erythro-megakaryocytic polyA+ transcriptome; 8.4% of genes are candidate lncRNA genes (low + high stringency). (D) Most erythromegakaryocytic lncRNAs are not annotated in RefSeq, UCSC, or Ensembl datasets. (E) Expression measured as FPKM for gene categories. LncRNA genes have overall lower expression than coding genes. (F) Cell specificity of RNAs according to type. Note that approximately one-third of high-stringency lncRNAs are detectable in only 1 of the cell types indicated at the top of panel A, as compared with 90% of coding genes, most of which are detectable in all 3 types. (G) Percentage of lncRNA genes in orientations relative to nearby coding genes. ORF, open reading frame.
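Panel E reports expression as FPKM (fragments per kilobase of transcript per million mapped fragments). The following is a minimal sketch of that normalization only. The function name `fpkm` and the example numbers are hypothetical and are not taken from the study, whose actual pipeline is described in its Materials and methods.

```python
def fpkm(fragments: int, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("transcript length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical values, for illustration only (not data from the study).
example_fpkm = fpkm(fragments=500, transcript_length_bp=2_000,
                    total_mapped_fragments=25_000_000)
print(example_fpkm)
```

Because the value is normalized to both transcript length and library size, it allows the kind of comparison between gene categories that panel E describes.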

Figure 2

Most mouse polyA+ erythro-megakaryocytic lncRNAs arise from conventional promoters and are regulated by key transcription factors. (A) Histone modifications H3K36me3, H3K4me1, and H3K4me3 were determined globally by ChIP-seq and analyzed at gene TSSs. Composite profiles are shown for promoter signatures (H3K4me3-high/H3K4me1-low) and enhancer signatures (H3K4me3-low/H3K4me1-high). The x-axis shows distance from TSS; y-axis shows average peak score for the indicated modifications. (B) Relative proportions of enhancer and promoter signatures at the TSSs of erythro-megakaryocytic coding and lncRNA genes. (C) Transcription factor (TF) occupancy at erythro-megakaryocytic coding and lncRNA genes measured by ChIP-seq. The genes are subclassified according to whether they are up- or downregulated in erythroblasts or megakaryocytes compared with MEPs, as measured by RNA-seq. (D) The expression of erythroid coding and noncoding genes in the Gata1-null mouse erythroblast line G1E-ER4 after activation of an estradiol-induced form of GATA1. (E) Model for megakaryocytic developmental priming. Genes with binding of a heptad of TFs in stem/progenitor cells are “primed” for megakaryocytic differentiation, and are activated during megakaryopoiesis and repressed during erythropoiesis. (F) Percentage of genes in each expression group showing occupancy by the TF heptad in the HPC-7 cell line. For both coding genes and lncRNAs, genes that are upregulated during megakaryopoiesis and either downregulated or unchanged during erythropoiesis show higher levels of genomic locus heptad occupancy in progenitor cells. Dn, downregulated; N, no change; Up, upregulated.
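The promoter and enhancer signatures in panel A are defined by which of two marks is high at a TSS. A minimal sketch of that labeling rule follows. The cutoff `high`, the function name `classify_tss`, and the toy scores are all hypothetical; the caption does not state the thresholds the authors used.

```python
from collections import Counter


def classify_tss(h3k4me3: float, h3k4me1: float, high: float = 1.0) -> str:
    """Label a TSS from its average peak scores for the two marks.

    `high` is an assumed cutoff separating "high" from "low" signal.
    """
    me3_high = h3k4me3 >= high
    me1_high = h3k4me1 >= high
    if me3_high and not me1_high:
        return "promoter"      # H3K4me3-high / H3K4me1-low
    if me1_high and not me3_high:
        return "enhancer"      # H3K4me3-low / H3K4me1-high
    return "unclassified"      # both high or both low


# Toy (H3K4me3, H3K4me1) scores, for illustration only.
toy_tss_scores = [(5.0, 0.2), (4.1, 0.6), (0.3, 2.5), (0.1, 0.1)]
signature_counts = Counter(classify_tss(me3, me1) for me3, me1 in toy_tss_scores)
print(signature_counts)
```

Tallying the labels per gene class gives the kind of relative proportions summarized in panel B.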

Figure 3

Most lncRNA erythroid genes are not cross-expressed in mice and humans. (A) Boxplots showing the extent to which mature RNAs overlap with PhastConsElements, reflecting DNA sequence conservation in multispecies alignments. The degree of overlap for lncRNA genes is greater than random, but lower than for coding genes. (B) PhastConsElement overlap of lncRNA genes subclassified by orientation relative to coding genes (see Figure 1G). (C) Percentages of mouse erythroid genes transmapping to the human genome and expressed in the human erythroblasts (combined transcriptomes of 5 maturational stages, as detailed in the text). Less than 20% of murine erythroid lncRNA genes are expressed in humans despite identified transmapping orthologous regions for most. (D) Percentages of human erythroid genes transmapping to the mouse genome and expressed in mouse erythroblasts showing low levels of conserved lncRNA expression. HS, high stringency; LS, low stringency.
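Panels A and B measure how much of each mature RNA overlaps conserved elements. One way to compute such an overlap is sketched below, assuming half-open `(start, end)` coordinates on a single chromosome. The function names and the coordinates are hypothetical, and this is not presented as the authors' implementation.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end)


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals so no base is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def conserved_fraction(exons: List[Interval], elements: List[Interval]) -> float:
    """Fraction of exonic bases that fall inside conserved elements."""
    exons = merge_intervals(exons)
    elements = merge_intervals(elements)
    exonic_bases = sum(end - start for start, end in exons)
    if exonic_bases == 0:
        return 0.0
    covered = 0
    for exon_start, exon_end in exons:
        for el_start, el_end in elements:
            covered += max(0, min(exon_end, el_end) - max(exon_start, el_start))
    return covered / exonic_bases


# Hypothetical coordinates, for illustration only.
toy_exons = [(100, 200), (300, 400)]
toy_elements = [(150, 250), (390, 500)]
print(conserved_fraction(toy_exons, toy_elements))
```

The nested loop is quadratic and is kept for readability; a sweep over sorted intervals would be preferable at genome scale.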

Figure 4

Murine erythroid lncRNAs are largely conserved between different mouse strains. (A) Cross-expression of coding and lncRNA genes in splenic erythroblasts of 8 different mouse strains. Note that lncRNA expression is generally conserved between strains, although at a slightly lower level than coding genes. (B) Similar analysis as in panel A, with lowest expression quartile of lncRNAs excluded, shows more than 95% conserved lncRNA expression between most strains. (C) Comparisons of coding and lncRNA gene expression levels in 2 representative pairs of mouse strains. Note that the expression levels of both lncRNA and coding genes are conserved (2-tailed Spearman’s R >0.95 for coding mRNAs, >0.89 for low-stringency lncRNAs, and >0.80 for high-stringency lncRNAs).
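Panel C compares expression levels between strains using Spearman's rank correlation. A dependency-free sketch of that statistic is given below (average ranks for ties, then Pearson correlation of the ranks). The function names and the expression values are hypothetical and do not reproduce the study's data or software.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation between two paired samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Hypothetical expression values for the same genes in two strains.
strain_a = [0.5, 2.0, 8.0, 40.0, 120.0]
strain_b = [0.7, 1.5, 9.5, 35.0, 150.0]
print(spearman(strain_a, strain_b))
```

A rank-based statistic suits expression data because it is insensitive to the large dynamic range between lowly and highly expressed genes.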

Figure 5

Mouse lncRNAs selected for RNAi studies. Twenty-seven lncRNAs were selected according to criteria described in the text. The relative expression of these lncRNAs in purified hematopoietic lineages from adult mouse bone marrow was determined using a custom oligonucleotide array and is shown graphically. “Conserved” refers to the presence of assembled orthologous transcripts detected in human erythroblasts (Figure 3C). Candidate erythroid lncRNAs shown here are upregulated 2- to 2054-fold (median 41-fold) in erythroblasts compared with multipotent progenitors. Names assigned to lncRNAs that potentially function in erythropoiesis according to RNAi studies in Figure 6 are indicated. Lnc051 was previously named “Lincred1.” Ggnbp2os was not detectable in primary human erythroblasts, but transcription cannot be ruled out in human K562 erythroleukemia cells (supplemental Figure 8). CMP, common myeloid progenitor; EE, early erythroblast; Expr, absolute expression level by microarray; GMP, granulocyte macrophage progenitor; GR, granulocyte; HSC, hematopoietic stem cell; LE, late erythroblast.

Figure 6

RNAi of multiple mouse lncRNA genes inhibits erythroblast maturation. (A) Mouse fetal liver erythroid progenitors were infected with retroviruses encoding shRNAs against lncRNAs, cultured for 48 hours in an expansion medium to maintain the immature state, and then switched to a medium that facilitates terminal maturation. Representative flow cytometry plots quantifying the proportion of anucleate reticulocytes (black rectangles) after 48 hours of maturation are shown. Hoechst 33342 is a cell-permeable nuclear dye. Ter119 is an erythroid maturation marker. (B) Summary of multiple experiments showing percentage of reticulocytes in shRNA-expressing erythroid cultures after 48 hours of maturation, performed as shown in panel A. Vector-only and shLuciferase shRNAs are negative controls; shGATA1 and shFOG1 shRNAs are positive controls used to suppress expression of the essential erythroid TFs. The results of 2 different shRNAs used to knock down each candidate are shown as pairs of bars for each lncRNA. n = 3 biological replicates. (C) Representative flow cytometry plots assessing the maturation stages of nucleated (Hoechst-high) erythroblasts at 48 hours of maturation, using the immature erythroblast marker CD44 and forward scatter (FSC), which reflects cell size. Gates (from highest to lowest) mark early-, middle-, and late-stage erythroblasts; the percentage of cells in the late-stage gate is listed. (D) Summary of multiple experiments showing maturation stage distribution of nucleated erythroblasts in shRNA-expressing erythroid cultures at 24 and 48 hours, performed as shown in panel C. Knockdown of 2 lncRNAs (Erytha and Scarletltr, dotted black rectangles) reduced the proportion of mature erythroblasts by >25% at 48 hours by 2 different shRNAs. n = 3 biological replicates. (E) Single-molecule RNA fluorescence in situ hybridization (FISH) on primary mouse fetal liver erythroblasts for Cyclin A2 mRNA and for lncRNAs Bloodlinc, Galont, Redrum, and Lincred1. Single-slice images depict subcellular localization; composite images depict overall lncRNA abundance. Each composite image represents a maximum projection of a stack of z-slices taken through the volume of the cell. The nucleus is labeled blue with 4′,6-diamidino-2-phenylindole (DAPI), and single RNA molecules are visible as white punctate spots.
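Panel D applies a ">25% reduction by 2 different shRNAs" criterion. The sketch below shows one way such a rule could be expressed, assuming the reduction is measured relative to a negative-control culture (the caption does not state the reference explicitly). The function names, the threshold parameter, and the percentages are hypothetical.

```python
from typing import Sequence


def relative_reduction(control_pct: float, knockdown_pct: float) -> float:
    """Fractional drop in the mature-cell percentage versus the control."""
    if control_pct <= 0:
        raise ValueError("control percentage must be positive")
    return (control_pct - knockdown_pct) / control_pct


def is_hit(control_pct: float, shrna_pcts: Sequence[float],
           threshold: float = 0.25) -> bool:
    """True only if every independent shRNA exceeds the reduction threshold."""
    if not shrna_pcts:
        raise ValueError("at least one shRNA measurement is required")
    return all(relative_reduction(control_pct, pct) > threshold
               for pct in shrna_pcts)


# Hypothetical percentages of mature cells, for illustration only.
print(is_hit(control_pct=60.0, shrna_pcts=[40.0, 42.0]))
print(is_hit(control_pct=60.0, shrna_pcts=[40.0, 50.0]))
```

Requiring agreement between independent shRNAs guards against calling a hit on the basis of an off-target effect from a single hairpin.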

Figure 7

The mouse erythroid lncRNA Redrum is not expressed in human erythroblasts. UCSC Genome browser images of the mouse Redrum (Lnc046) locus (A) and the orthologous human region (B). Images for RNA-seq studies, histone modifications, and TF binding in erythroid cells are indicated. Note that active transcription (RNA-seq) and histone H3 chromatin marks occur in mice, but not humans. H3K4me3 marks the active promoter and H3K36me3 marks RNA polymerase II elongation. Mouse RNA-seq tracks are from primary fetal liver (Ter119+) erythroblasts, CD41+ megakaryocytes cultured from primary fetal liver erythroblasts, fluorescence-activated cell sorting–purified MEPs (all from this study), and mouse fetal liver erythroblasts analyzed by ENCODE. Mouse ChIP-seq tracks are from fetal liver (Ter119+) erythroblasts. Human RNA-seq tracks are from umbilical cord blood CD34+-derived erythroid cultures fractionated by flow cytometry into 5 different stages (pro, early basophilic, late basophilic, polychromatophilic, and orthochromatic erythroblasts). Histone modification tracks are from ENCODE K562 erythroleukemia cells; the GATA1 track is from ENCODE peripheral blood–derived erythroblasts.
