Complete sequence verification of plasmid DNA using the Oxford Nanopore Technologies' MinION device

Sun Jan 01 2023

Scott D Brown et al. BMC Bioinformatics. 2023.

Abstract

Background: Sequence verification is essential for plasmids used as critical reagents or therapeutic products. Typically, high-quality plasmid sequence is achieved through capillary-based Sanger sequencing, requiring customized sets of primers for each plasmid. This process can become expensive, particularly for applications where the validated sequence needs to be produced within a regulated and quality-controlled environment for downstream clinical research applications.

Results: Here, we describe a cost-effective and accurate plasmid sequencing and consensus generation procedure using the Oxford Nanopore Technologies' MinION device as an alternative to capillary-based plasmid sequencing options. This procedure can verify the identity of a pure plasmid population, either confirming that it matches the known and expected sequence or identifying any mutations present in the plasmid. We use a full MinION flow cell per plasmid, maximizing available data and allowing for stringent quality filters. Pseudopairing reads for consensus base calling reduces read error rates from 5.3% to 0.53%, and our pileup consensus approach provides per-base counts and confidence scores, allowing for interpretation of the certainty of the resulting consensus sequences. For pure plasmid samples, we demonstrate 100% accuracy in the resulting consensus sequence, and the sensitivity to detect small mutations such as insertions, deletions, and single nucleotide variants. In test cases where the sequenced pool of plasmids contains subclonal templates, detection sensitivity is similar to that of traditional capillary sequencing.
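To make the idea of a pileup consensus with per-base counts concrete, here is a minimal Python sketch. It is not the authors' pipeline: it assumes the reads are already aligned to a common coordinate system and padded to equal length, and it uses a hypothetical confidence score defined as the fraction of reads supporting the most common base. The function name `pileup_consensus` and the toy reads are invented for illustration.

```python
from collections import Counter


def pileup_consensus(aligned_reads):
    """Build a consensus from equal-length, pre-aligned reads.

    Returns (consensus, per_position), where per_position holds one
    (counts, confidence) tuple per column. Confidence here is simply the
    fraction of reads supporting the top base; this definition is an
    assumption for illustration, not the score used in the paper.
    """
    if not aligned_reads:
        raise ValueError("need at least one read")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be pre-aligned to the same length")

    consensus = []
    per_position = []
    # zip(*reads) walks the pileup column by column.
    for column in zip(*aligned_reads):
        counts = Counter(column)
        base, support = counts.most_common(1)[0]
        consensus.append(base)
        per_position.append((dict(counts), support / len(column)))
    return "".join(consensus), per_position


# Toy data: the third read carries a single miscalled base at index 2.
reads = ["ACGT", "ACGT", "ACTT", "ACGT"]
consensus, detail = pileup_consensus(reads)
```

With these toy reads the consensus is `ACGT`, and the third column reports three reads for G and one for T, so a reader can see how much evidence backs each consensus base.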

Conclusions: Our pipeline can provide significant cost savings compared to outsourcing clinical-grade sequencing of plasmids, making generation of high-quality plasmid sequence for clinical sequence verification more accessible. While other long-read-based methods offer higher throughput at lower cost, our pipeline produces complete and accurate sequence verification for cases where absolute sequence accuracy is required.

Keywords: Cell therapy; Long-read sequencing; MinION; Plasmid sequencing.

© 2023. The Author(s).

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1

Strand-specific errors in base calling. Sequence logos are shown for one stretch of sequence from forward (top) and reverse (bottom) reads. The height of each letter gives the fraction of reads at that position that supported that base. Position 6 (and position 3, to a lesser extent) shows a strand-specific error – the correct base is C, but in the reverse reads a T is the most-evidenced base. When forward and reverse reads are pooled, there appear to be two bases at this position.
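The strand-specific effect described in this caption can be illustrated by tallying bases separately for each strand at a single position. The sketch below is an assumption-laden toy, not the authors' code: the `(base, strand)` input format, the function names, and the read counts are all invented for illustration.

```python
from collections import Counter


def strand_top_bases(observations):
    """Tally bases per strand for one pileup position.

    observations: iterable of (base, strand) pairs, strand being '+' or '-'.
    Returns (per_strand_counts, top_base_per_strand).
    """
    per_strand = {"+": Counter(), "-": Counter()}
    for base, strand in observations:
        per_strand[strand][base] += 1
    tops = {
        strand: (counts.most_common(1)[0][0] if counts else None)
        for strand, counts in per_strand.items()
    }
    return per_strand, tops


def is_strand_specific(observations):
    """True when both strands have data and disagree on the top base."""
    _, tops = strand_top_bases(observations)
    return None not in tops.values() and tops["+"] != tops["-"]


# Invented counts: forward reads mostly call C, reverse reads mostly call T.
column = (
    [("C", "+")] * 9 + [("T", "+")]
    + [("T", "-")] * 6 + [("C", "-")] * 4
)
flagged = is_strand_specific(column)
pooled = Counter(base for base, _ in column)
```

Here `flagged` is `True`, while the pooled tally (13 C, 7 T) looks like a mixed position, which is why counting each strand separately helps separate a systematic base-calling error from a real second base.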

Fig. 2

Overview of consensus read support across BCRxV.GagPolRev. Filled points denote the most-evidenced base at each position (x-axis) and hollow points denote the second-most-evidenced base. The y-axis shows the number of reads supporting each base. In general, nearly all reads support the top base, and there is good separation between signal (filled points) and noise (hollow points). For BCRxV.GagPolRev, the lowest signal-to-noise ratio is between the C and T at position 7554. We see a decrease in coverage at the extreme 5’ and 3’ ends of the plasmid.
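One way to locate the position with the weakest separation between signal and noise is to compare the read counts of the top and second-ranked bases at each position. The following sketch assumes a hypothetical input of per-position base counts and defines the ratio as top count divided by second count; the paper may compute this differently, and the numbers are invented.

```python
def lowest_signal_to_noise(counts_by_position):
    """Return (position, ratio) with the smallest top/second count ratio.

    counts_by_position: dict mapping position -> dict of base -> read count.
    Positions with no second base get an infinite ratio.
    """
    worst = None
    for position, counts in counts_by_position.items():
        ranked = sorted(counts.values(), reverse=True)
        top = ranked[0]
        second = ranked[1] if len(ranked) > 1 else 0
        ratio = float("inf") if second == 0 else top / second
        if worst is None or ratio < worst[1]:
            worst = (position, ratio)
    return worst


# Invented toy counts for three positions.
toy_counts = {
    1: {"A": 900, "G": 10},
    2: {"C": 500, "T": 400},
    3: {"G": 950},
}
worst_position, worst_ratio = lowest_signal_to_noise(toy_counts)
```

In this toy input, position 2 has the lowest ratio (500 / 400 = 1.25), so it is the one a reviewer would inspect first.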

Fig. 3

Decreased read support across homopolymer stretches. Points denote the most-evidenced base at each position across this region of BCRxV.VSVG with a 17mer homopolymer. Due to difficulty base calling the exact number of bases in homopolymer regions, many reads report fewer than 17 bases. This results in an apparent coverage drop-off across the homopolymer. The horizontal dashed grey line at 10% of the max coverage represents the minimum base coverage required to add a base to the consensus. Note that for this visualization, indels were right-aligned to show coverage dropping from left to right.
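The 10% coverage rule in this caption can be sketched as a simple filter over per-position read support. This is a hedged illustration only: the function name, the choice of `>=` at the boundary, and the support values are assumptions, not details taken from the paper.

```python
def passing_positions(top_base_support, min_fraction=0.10):
    """Return indexes whose support meets min_fraction of the maximum.

    top_base_support: list of read counts for the top base at each position.
    The inclusive comparison at the threshold is an assumption.
    """
    if not top_base_support:
        return []
    threshold = min_fraction * max(top_base_support)
    return [
        index
        for index, support in enumerate(top_base_support)
        if support >= threshold
    ]


# Invented support values that decay across a homopolymer-like stretch.
support = [1000, 980, 600, 300, 120, 90, 40]
kept = passing_positions(support)
```

With a maximum of 1000 reads the threshold is 100, so the first five positions are kept and the last two (90 and 40 reads) would not contribute a base to the consensus.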

Fig. 4

Subclonal variant template detection sensitivity. For three different mutation classes (SNV, insertion, deletion), the minimum percent abundance of variant reads (x-axis) that allowed detection of the mutation in the in silico generated mixed datasets is plotted. SNVs and insertions were detectable when around 30% of reads were from the variant template, while deletions were not detectable until 95% of the reads were from the variant template.
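An in silico mixed dataset of the kind described here can be approximated by sampling reads from a reference pool and a variant pool at a chosen variant fraction. The sketch below is one possible way to do this, not the authors' procedure; `mix_reads`, its parameters, and the placeholder read names are hypothetical.

```python
import random


def mix_reads(reference_reads, variant_reads, variant_fraction, total, seed=0):
    """Sample `total` reads with the requested fraction from the variant pool.

    Sampling is without replacement and seeded so the mixture is reproducible.
    """
    if not 0.0 <= variant_fraction <= 1.0:
        raise ValueError("variant_fraction must be between 0 and 1")
    n_variant = round(total * variant_fraction)
    n_reference = total - n_variant
    if n_variant > len(variant_reads) or n_reference > len(reference_reads):
        raise ValueError("not enough reads in one of the pools")

    rng = random.Random(seed)
    mixed = rng.sample(variant_reads, n_variant)
    mixed += rng.sample(reference_reads, n_reference)
    rng.shuffle(mixed)  # avoid ordering reads by origin
    return mixed


# Placeholder read identifiers standing in for real sequencing reads.
reference_pool = [f"ref_read_{i}" for i in range(100)]
variant_pool = [f"var_read_{i}" for i in range(100)]
mixture = mix_reads(reference_pool, variant_pool, variant_fraction=0.30, total=50)
n_variant_reads = sum(read.startswith("var_") for read in mixture)
```

Running a consensus caller over mixtures built at increasing `variant_fraction` values, and recording the first fraction at which the mutation is reported, would give a sensitivity estimate of the kind plotted in this figure.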
