Hybridization-based reconstruction of small non-coding RNA transcripts from deep sequencing data

Chikako Ragan et al. Nucleic Acids Res. 2012 Sep.

Abstract

Recent advances in RNA sequencing technology (RNA-Seq) enable comprehensive profiling of RNAs by producing millions of short sequence reads from size-fractionated RNA libraries. Although conventional tools for detecting and distinguishing non-coding RNAs (ncRNAs) from reference-genome data can be applied to sequence data, ncRNA detection can be improved by harnessing the full information content provided by this new technology. Here we present NorahDesk, the first unbiased and universally applicable method for small ncRNA detection from RNA-Seq data. NorahDesk uses the coverage distribution of small RNA sequence data, together with thermodynamic assessments of secondary structure, to reliably predict and annotate ncRNA classes. Using publicly available mouse sequence data from brain, skeletal muscle, testis and ovary, we evaluated our method with an emphasis on its performance for microRNAs (miRNAs) and piwi-interacting small RNAs (piRNAs). We compared our method with Dario and miRDeep2 and found that NorahDesk produces longer transcripts with higher read coverage. This feature makes it the first method particularly suitable for the prediction of both known and novel piRNAs.
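The abstract refers to the coverage distribution of mapped small RNA reads and to contigs built from them. As a rough illustration of that idea only, and not NorahDesk's actual algorithm, the sketch below merges read intervals on one strand into contigs wherever read depth stays at or above a threshold. The function name `coverage_contigs`, the half-open interval convention and the `min_depth` parameter are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on a single strand (assumed convention)


def coverage_contigs(reads: List[Interval], min_depth: int = 1) -> List[Tuple[int, int, int]]:
    """Return (start, end, max_depth) for each maximal region whose
    read depth is >= min_depth. Illustrative sketch, not the published method."""
    # Record +1 where a read starts and -1 where it ends.
    events: Dict[int, int] = {}
    for start, end in reads:
        events[start] = events.get(start, 0) + 1
        events[end] = events.get(end, 0) - 1

    contigs: List[Tuple[int, int, int]] = []
    depth = 0
    run_start = None  # start of the region currently being extended
    run_max = 0       # highest depth seen inside that region

    # Sweep positions left to right, tracking depth after each event.
    for pos in sorted(events):
        depth += events[pos]
        if depth >= min_depth:
            if run_start is None:
                run_start, run_max = pos, depth
            else:
                run_max = max(run_max, depth)
        elif run_start is not None:
            contigs.append((run_start, pos, run_max))
            run_start = None
    return contigs


# Toy read intervals (made-up coordinates, not data from the paper).
toy_reads = [(0, 20), (5, 25), (10, 30), (100, 120)]
all_contigs = coverage_contigs(toy_reads)
deep_contigs = coverage_contigs(toy_reads, min_depth=2)
```

Raising `min_depth` trims low-coverage flanks and drops singleton reads, which is one simple way coverage can be used to separate well-supported regions from background.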

Figures

Figure 1.

Steps to reconstruct small ncRNA transcripts.

Figure 2.

Size distribution of contigs and predicted transcripts. The shift in the size distribution between contigs (blue) and predicted transcripts (red) in brain, muscle, testis and ovary. The x-axis shows the size in nucleotides and the y-axis shows the corresponding density, given as the smoothed and normalized contig and transcript counts, respectively.

Figure 3.

Distribution of ncRNA types in different tissues. The panels show the fraction of reads overlapping known and novel classes of RNAs in brain, skeletal muscle, testis and ovary.

Figure 4.

Known versus predicted structure of mmu-let-7f1. The top panel shows the known structure of mmu-let-7f1 from miRBase and the bottom panel shows the predicted structure of the reconstructed transcript. The mature miRNA duplex is shown in purple.

Figure 5.

Example of a long miRNA transcript. The predicted miRNA transcript from chr17:17967156-17967398 (+ strand) overlaps the miR-99b precursor and contains one additional unannotated contig.

Figure 6.

Examples of predicted piRNA transcripts. (A) Predicted piRNA transcript from chr10:18517077-18517536 (- strand); 4 out of 7 contigs overlap known piRNAs and all contigs have a 5′ uridine. (B) Predicted piRNA transcript from chr6:128121896-128122316 (- strand); only 1 out of 6 contigs overlaps a known piRNA and 4 out of 6 contigs have a 5′ uridine.
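The Figure 6 caption reports how many contigs in a predicted piRNA transcript begin with a 5′ uridine. A minimal sketch of that count is shown below. The helper name `five_prime_u_fraction` and the toy sequences are hypothetical and are not taken from the paper; the helper treats a leading `T` as uridine because sequencing reads are usually reported as DNA.

```python
from typing import Iterable


def five_prime_u_fraction(contig_seqs: Iterable[str]) -> float:
    """Fraction of non-empty sequences whose first (5') base is U or T.
    Hypothetical helper for illustration only."""
    seqs = [s.strip().upper() for s in contig_seqs if s and s.strip()]
    if not seqs:
        return 0.0
    with_u = sum(1 for s in seqs if s[0] in ("U", "T"))
    return with_u / len(seqs)


# Made-up contig sequences: four start with T (U in RNA), two do not.
toy_contigs = [
    "TGACCTAGGATCCAGTTACGGATCAA",
    "TACGGTTAGCATCGGATTACAGGCAT",
    "AGGCTTACGATCCGATTAGCATCGGA",
    "TTGCAAGGCTAGCTAGGATCCATGCA",
    "CGATTAGGCATCGATCCGGATTACGA",
    "TCCGATAGCTTAGGCATCGATAGCTA",
]
fraction = five_prime_u_fraction(toy_contigs)
print(f"{fraction:.2f} of toy contigs start with a 5' uridine")
```

With these toy sequences the count is 4 of 6, the same proportion the caption gives for panel B, although the sequences themselves are invented.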
