Single-cell RNA-seq analyses show that long non-coding RNAs are conspicuously expressed in Schistosoma mansoni gamete and tegument progenitor cell populations

  • Sat Jan 01 2022

David A Morales-Vicente et al. Front Genet. 2022.

Abstract

Schistosoma mansoni is a flatworm that causes schistosomiasis, a neglected tropical disease that affects over 200 million people worldwide. With only one drug available for treatment and no vaccine, new therapeutic targets are needed. Long non-coding RNAs (lncRNAs) are transcripts longer than 200 nucleotides with low or no protein-coding potential. In other organisms, they have been shown to be involved in reproduction, stem cell maintenance and drug resistance, and they tend to exhibit tissue-specific expression patterns. S. mansoni expresses thousands of lncRNA genes; however, the cell-type expression patterns of lncRNAs in the parasite remain uncharacterized. Here, we have re-analyzed publicly available single-cell RNA-sequencing (scRNA-seq) data obtained from adult S. mansoni to identify the lncRNA signatures of adult schistosome cell types. A total of 8023 lncRNAs (79% of all lncRNAs) were detected. The lncRNA expression profiles in the cells were analyzed using statistically stringent criteria, which identified 74 lncRNA genes as markers of cell clusters. The male gamete and tegument progenitor lineage clusters contained most of the cluster-specific lncRNA markers. We also identified lncRNA markers of specific neural clusters. Whole-mount in situ hybridization (WISH) and double fluorescence in situ hybridization were used to validate the cluster-specific expression of 13 out of 16 selected lncRNA genes (81%) in male and female adult parasite tissues; for one of these 16 gene loci, probes for two different lncRNA isoforms were used, which showed differential isoform expression in testis and ovary. An atlas of the expression profiles across the cell clusters of all lncRNAs detected in our analysis is available as a public website resource (http://verjolab.usp.br:8081). The results presented here strongly support tissue-specific expression and a regulated expression program of lncRNAs in S. mansoni. This will be the basis for further exploration of lncRNA genes as potential therapeutic targets.

Keywords: RNA sequencing; RNA-seq; Schistosoma mansoni adult worms; long non-coding RNAs; parasitology; single-cell expression profiles; single-cell sequencing data analysis.

Copyright © 2022 Morales-Vicente, Zhao, Silveira, Tahira, Amaral, Collins and Verjovski-Almeida.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1

S. mansoni atlas of single cells comprising expression data of protein-coding and lncRNA genes. UMAP plot of the 68 scRNA-seq clusters identified by Wendt et al. (2020) and projected onto our re-analyzed scRNA-seq data set. For each of the 48,094 cells recovered in our re-analysis, expression data for protein-coding and lncRNA genes were used, and all cells were assigned to one of the 68 clusters (see Methods). Cells are colored according to the cluster to which they were mapped. Original cell cluster mapping data from Wendt et al. (2020) are shown in the background, colored in light grey. This atlas is available as a public website resource at http://verjolab.usp.br:8081.

FIGURE 2

Percentage of cells mapped to each cluster in comparison to the number of cells in the original cluster annotation. The 68 clusters (indicated at left) are grouped according to the similarity of their gene expression patterns in the new, re-analyzed expression data set. The white bars indicate the percentage of cells that remained in the same cluster in the re-mapped data set, relative to the number of cells in the original cluster annotation. The black bars indicate the final percentage of cells in the cluster in the re-mapped data set, relative to the number of cells in the original cluster annotation. The numbers inside the white bars are the absolute numbers of cells that remained in the clusters after re-mapping, and on the right of the black bars are the absolute numbers of total cells in the clusters in the new, re-mapped data set. The vertical dotted red line goes through the 100% value in the x-axis.
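The two bar types described in this caption can be illustrated with a minimal sketch. It assumes each cell has an original and a re-mapped cluster label stored in plain dictionaries; the function name `cluster_remap_summary` and the input layout are hypothetical and are not taken from the authors' pipeline.

```python
from collections import Counter


def cluster_remap_summary(original, remapped):
    """Summarize re-mapping per cluster.

    original, remapped: dicts of cell_id -> cluster label (hypothetical inputs).
    Percentages are relative to the cluster size in the original annotation.
    """
    orig_sizes = Counter(original.values())
    final_sizes = Counter(remapped.values())
    # Cells whose re-mapped label equals their original label ("white bars").
    retained = Counter(
        label for cell, label in original.items() if remapped.get(cell) == label
    )
    summary = {}
    for cluster, n_orig in orig_sizes.items():
        summary[cluster] = {
            "retained_cells": retained[cluster],
            "final_cells": final_sizes[cluster],
            "retained_pct": 100.0 * retained[cluster] / n_orig,
            # "Black bars": can exceed 100% when a cluster gains cells.
            "final_pct": 100.0 * final_sizes[cluster] / n_orig,
        }
    return summary


original = {"c1": "neuron 1", "c2": "neuron 1", "c3": "muscle 2", "c4": "muscle 2"}
remapped = {"c1": "neuron 1", "c2": "muscle 2", "c3": "muscle 2", "c4": "muscle 2"}
summary = cluster_remap_summary(original, remapped)
```

In this toy input, one of the two "neuron 1" cells moves to "muscle 2", so "muscle 2" ends above the 100% line while "neuron 1" falls below it.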

FIGURE 3

Dot-plot of 74 lncRNA genes identified as markers of single-cell clusters. A dot-plot summarizing the cluster-specific expression of each of the 74 lncRNA genes identified as markers of cell clusters in adult S. mansoni. Cluster IDs are on the vertical axis and gene IDs are on the horizontal axis. Expression levels are colored by gene expression (blue = low, red = high). Percentage of cells in the cluster expressing the gene is indicated by the size of the circle (small = few, large = many). The colored boxes highlight the lncRNAs cited in the Results.
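The two quantities a dot-plot encodes for one gene can be sketched as follows. This is an illustration only: it assumes a cell counts as "expressing" when its count is above zero and uses the plain mean as the expression level, neither of which is stated in the caption. The function name `dotplot_stats` is hypothetical.

```python
from collections import defaultdict


def dotplot_stats(counts, clusters):
    """Per-cluster dot size and color values for a single gene.

    counts: per-cell expression values for the gene (hypothetical input).
    clusters: per-cell cluster labels, same order as counts.
    """
    grouped = defaultdict(list)
    for value, label in zip(counts, clusters):
        grouped[label].append(value)
    stats = {}
    for label, values in grouped.items():
        stats[label] = {
            # Dot size: share of cells in the cluster with a non-zero count.
            "pct_expressing": 100.0 * sum(v > 0 for v in values) / len(values),
            # Dot color: average expression over all cells in the cluster.
            "mean_expression": sum(values) / len(values),
        }
    return stats


stats = dotplot_stats(
    counts=[0, 2, 4, 0, 0, 1],
    clusters=["late male gametes"] * 3 + ["neuron 1"] * 3,
)
```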

FIGURE 4

The lncRNAs are markers of 51 different single-cell clusters. The UpSet intersection diagram shows on the upper panel the number of S. mansoni lncRNAs (y-axis) that have been detected in each of the intersection sets, indicated by the connected points in the lower part of the plot, as being markers of the indicated single-cell clusters. On the left-most part of the plot are the lncRNAs that are markers of a unique single-cell cluster. On the right-most part are the lncRNAs that are markers of a group of single-cell clusters, joined with the connected dots.

FIGURE 5

A lncRNA marker of 13 different neuron clusters co-localizes with neuroendocrine protein 7b2 message. (A) UMAP plot of lncRNA neurons cluster marker G39666 (left) and of general neuronal marker 7b2 (right). UMAP plots are colored by gene expression (blue = low, red = high) and the scale represents log10(UMIs+1). The regions enclosed by red dashed lines indicate the location of the relevant neuron clusters on the UMAP plots. (B,C) Whole-mount in situ hybridization (WISH) of lncRNA gene G39666 in the head (left, top) and body (left, bottom) of a male (B) or a mature female (C). Scale bars are 100 µm. Double FISH with G39666 lncRNA and 7b2 of male (B) head sides (middle) and trunk (right), and of female (C) head (middle) and vitellarium (right). Nuclei: blue.

FIGURE 6

A lncRNA marker of 8 different neuron clusters and of the muscle 2 cluster co-localizes with neuroendocrine protein 7b2 and muscle tropomyosin genes. (A) UMAP plot of lncRNA cluster marker G26764. UMAP plot is colored by gene expression (blue = low, red = high) and the scale represents log10(UMIs+1). The region enclosed by the red dashed line indicates the location of the relevant neuron cluster on the UMAP plot. (B) WISH of lncRNA G26764 in the head (top), trunk (middle) and tail (bottom) of a male [left] or a mature female [right]. Scale bars are 100 µm. (C–F) Double FISH with lncRNA G26764 and 7b2 gene in male head (C,D) and trunk (E,F); panels (D) and (F) show dFISH images of worm sections different from those in (C) and (E). (G,H) Double FISH with lncRNA G26764 and the general muscle marker gene tpm2 tropomyosin in the male head (G) and trunk (H). Nuclei: blue.

FIGURE 7

lncRNA markers of male and female gametes single-cell clusters are localized in testis and ovary. (A to C) UMAP plot (left) of the indicated lncRNA marker of male gametes cluster. WISH with the indicated lncRNA in a male head [right, top] and the ovary region of a female [right, bottom]. (D to F) UMAP plot (left) of the indicated lncRNA marker of female gametes cluster. WISH with the indicated lncRNA in a female region of the ovary and vitellarium [D, right, top] and male head [D, right, bottom]. WISH of the ovary region of a female [E,F, right, top] and male head [E,F, right, bottom]. UMAP plots are colored by gene expression (blue = low, red = high) and the scale represents log10(UMIs+1). The regions enclosed by the red dashed lines indicate the location of the relevant male or female gametes cluster on the UMAP plots. WISH scale bars are 100 µm.

FIGURE 8

lncRNA markers of tegument progenitor lineages co-localize with protein-coding genes known to mark those tegument progenitors. (A,E,I) UMAP plot of the indicated lncRNA marker of tegument progenitor lineages. (B,F,J) WISH with the indicated lncRNA in a male [left] and a female [right] head [top], trunk [middle] and tail [bottom]. (C,G,K) Double FISH in male head with the indicated lncRNA and the general tegument progenitor marker genes meg-1, zfp-1-1, egc (C), zfp-1-1 (G), zfp-1-1, meg-1 (K). (D,H,L) Double FISH in male trunk with the indicated lncRNA and the general tegument progenitor marker genes meg-1, zfp-1-1, egc (D), zfp-1-1 (H), zfp-1-1, meg-1 (L). UMAP plots are colored by gene expression (blue = low, red = high) and the scale represents log10(UMIs+1). The regions enclosed by the red dashed lines indicate the location of the relevant meg-1+ (A) or zfp-1-1+ (E,I) clusters on the UMAP plots. WISH scale bars are 100 µm.

FIGURE 9

Heterogeneity of expression of all lncRNAs or mRNAs that were detected in a given cluster of cells. For each cluster named at the top of each panel, the cumulative fraction of all lncRNAs or mRNAs that were detected as expressed in the cluster (y-axis) is shown as a function of the percentage of cells in which the lncRNAs or the mRNAs were detected (x-axis). For each cluster, the red curve shows the detected lncRNAs, the blue curve shows the set of mRNAs detected with expression levels in the same range as that of the lncRNAs, and the black curve shows the complete set of mRNAs detected in the cluster. The Kolmogorov-Smirnov (KS) statistical test False Discovery Rate (FDR) is shown for the comparison between the lncRNAs and the expression-matched set of mRNAs; in the seven panels where no KS FDR is shown, no statistically significant difference was found (FDR > 0.05). The three clusters with the most significant differences (lowest FDRs) are marked with orange rectangles. Nine clusters with fewer than 100 cells each were excluded from this analysis.
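The curves and comparison described in this caption can be sketched in plain Python: an empirical cumulative distribution, the two-sample KS statistic (the largest vertical gap between two such curves), and a multiple-testing adjustment across clusters. The caption does not say which FDR procedure was used, so the Benjamini-Hochberg adjustment below is an assumption, and all function names and toy values are hypothetical.

```python
from bisect import bisect_right


def ecdf(values):
    """Return sorted values and the cumulative fraction reached at each one."""
    xs = sorted(values)
    n = len(xs)
    return xs, [(i + 1) / n for i in range(n)]


def ks_statistic(a, b):
    """Two-sample KS statistic: maximum gap between the two ECDFs."""
    a, b = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in points
    )


def benjamini_hochberg(pvalues):
    """BH-adjusted p-values (assumed FDR method), in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy values: percentage of cells in one cluster in which each gene is detected.
lncrna_pct = [1.0, 2.0, 2.5, 4.0]
mrna_pct = [5.0, 10.0, 20.0, 40.0]
gap = ks_statistic(lncrna_pct, mrna_pct)
fdr = benjamini_hochberg([0.01, 0.04, 0.03])
```

A large gap means the lncRNAs are detected in a systematically different share of cells than the expression-matched mRNAs in that cluster.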

