Benchmarking the Effectiveness and Accuracy of Multiple Mitochondrial DNA Variant Callers: Practical Implications for Clinical Application
Sat Jan 01 2022
Eddie K K Ip et al. Front Genet. 2022.
Abstract
Mitochondrial DNA (mtDNA) mutations contribute to human disease across a range of severity, from rare, highly penetrant mutations causal for monogenic disorders to mutations with milder contributions to phenotypes. mtDNA variation can exist in all copies of mtDNA or in a percentage of mtDNA copies and can be detected at levels as low as 1%. The large number of copies of mtDNA and the possibility of multiple alternative alleles at the same nucleotide position make the task of identifying allelic variation in mtDNA very challenging. In recent years, specialized variant calling algorithms have been developed that are tailored to identify mtDNA variation from whole-genome sequencing (WGS) data. However, very few studies have systematically evaluated and compared these methods for the detection of both homoplasmy and heteroplasmy. A publicly available synthetic gold standard dataset was used to assess four mtDNA variant callers (Mutserve, mitoCaller, MitoSeek, and MToolBox) and the commonly used Genome Analysis Toolkit "best practices" pipeline, which is included in most current WGS pipelines. We also used WGS data from 126 trios and calculated the percentage of maternally inherited variants as a metric of calling accuracy, especially for homoplasmic variants. We additionally compared multiple pathogenicity prediction resources for mtDNA variants. Although the accuracy of homoplasmic variant detection was high for the majority of the callers, with high concordance across callers, we found a very low concordance rate between mtDNA variant callers for heteroplasmic variants, ranging from 2.8% to 3.6% for heteroplasmy thresholds of 5% and 1%. Overall, Mutserve showed the best performance using the synthetic benchmark dataset. The analysis of mtDNA pathogenicity resources also showed low concordance in prediction results.
We have shown that while homoplasmic variant calling is consistent between callers, there remains a significant discrepancy in heteroplasmic variant calling. We found that resources such as population frequency databases and pathogenicity predictors are now available for variant annotation but still need refinement. Because of their peculiarities, mitochondria require special consideration, and we advise caution when analyzing mtDNA variants from WGS data.
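The trio-based accuracy metric mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the definition below (share of a child's homoplasmic variants that are also called in the mother) is an assumption, and the function name and the example variants are hypothetical.

```python
def maternal_inheritance_rate(child_variants, mother_variants):
    """Percent of the child's variants that are also called in the mother.

    Assumed definition, not taken from the paper. Variants are hashable
    keys such as (position, ref, alt). Returns None if the child has no calls.
    """
    child = set(child_variants)
    mother = set(mother_variants)
    if not child:
        return None
    return 100.0 * len(child & mother) / len(child)


# Hypothetical homoplasmic calls for one trio (father omitted, since mtDNA
# is expected to be maternally inherited).
child = {(73, "A", "G"), (263, "A", "G"), (750, "A", "G"), (1438, "A", "G")}
mother = {(73, "A", "G"), (263, "A", "G"), (750, "A", "G")}
rate = maternal_inheritance_rate(child, mother)
```

Under this definition, a child variant absent from the mother is either a calling error in one of the two samples or a genuine de novo event, which is why a high rate is read as a sign of accurate homoplasmic calling.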
Keywords: benchmarking; heteroplasmic; homoplasmic; mitochondrial DNA; variant‐caller; whole-genome sequencing.
Copyright © 2022 Ip, Troup, Xu, Winlaw, Dunwoodie and Giannoulatou.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Figures

Accuracy comparison of mtDNA variant callers at heteroplasmy threshold of 1%, across different Taq polymerases and DNA extraction protocols using a synthetic benchmark dataset. (A) F1 scores of four variant callers using Clontech Taq polymerase and total DNA extraction. (B) F1 scores of four variant callers using Herk Taq polymerase and total DNA extraction. (C) F1 scores of four variant callers using NEB Taq polymerase and total DNA extraction. (D) F1 scores of four variant callers using Clontech Taq polymerase and PCR products. (E) F1 scores of four variant callers using Herk Taq polymerase and PCR products. (F) F1 scores of four variant callers using NEB Taq polymerase and PCR products.
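For readers unfamiliar with the metric in this figure, the F1 score is the harmonic mean of precision and recall. The sketch below computes it for one caller against a truth set; the function name and the example variants are illustrative assumptions, not data from the study.

```python
def f1_score(called, truth):
    """Return (precision, recall, F1) for called variants against a truth set."""
    called, truth = set(called), set(truth)
    tp = len(called & truth)   # called and present in the truth set
    fp = len(called - truth)   # called but absent from the truth set
    fn = len(truth - called)   # in the truth set but missed by the caller
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical variants as (position, ref, alt); not taken from the benchmark.
truth = {(73, "A", "G"), (263, "A", "G"), (3243, "A", "G"), (8993, "T", "G")}
called = {(73, "A", "G"), (263, "A", "G"), (3243, "A", "G"), (16519, "T", "C")}
precision, recall, f1 = f1_score(called, truth)
```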

Allelic distributions of mitochondrial variants according to four variant calling methods. (A) Allelic fraction distributions of heteroplasmic variants with default detection threshold. The heteroplasmic variant allele fractions for Mutserve, MitoSeek, and MToolBox are at the minimum and equal to their default heteroplasmy thresholds of 1%, 5%, and 20%, respectively. mitoCaller has no threshold, which is reflected in the low VAF values in the plot. With GATK Haplotypecaller, which is not a specific mitochondrial caller, the allelic fraction represents values typical of a heterozygous call from autosomal genomic variant calling (median values: GATK Haplotypecaller—29.9%; mitoCaller—0.5%; Mutserve—1.5%; MitoSeek—4.7%; MToolBox—99.7%). (B) Allelic fraction distributions for heteroplasmic variants at a detection threshold of 1% (median values: GATK Haplotypecaller—29.9%; mitoCaller—2.2%; Mutserve—1.5%; MitoSeek—1.5%; MToolBox—1.9%). (C) Allelic fraction distributions for heteroplasmic variants at a detection threshold of 5% (median values: GATK Haplotypecaller—9.9%; mitoCaller—54%; Mutserve—37.3%; MitoSeek—14.7%; MToolBox—99.6%). (D) Allelic fraction distributions for homoplasmic variants with default detection thresholds and excluding MitoSeek, which does not call homoplasmic variants. As expected for homoplasmic variants, the allelic fractions are almost all at the 100% level. (E) Allelic fraction distributions for homoplasmic variants at a detection threshold of 1%. There is no change for GATK Haplotypecaller and Mutserve calls. mitoCaller now includes homoplasmic variants that are in the 99% range (100%—1% heteroplasmy detection threshold). (F) Allelic fraction distributions for homoplasmic variants at a heteroplasmy detection threshold of 5%, showing an allele fraction change for mitoCaller to include 95% range.
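The caption implies a simple rule: with a heteroplasmy detection threshold t, allele fractions from t up to 100% minus t are treated as heteroplasmic, and fractions above that as homoplasmic. A minimal sketch of that rule follows; the exact boundary handling (inclusive or exclusive) is an assumption and may differ between callers, and the function name is hypothetical.

```python
def classify_variant(vaf, threshold=0.01):
    """Classify a variant allele fraction (0.0 to 1.0).

    Assumed convention: below the threshold the call is discarded, at or
    above (1 - threshold) it is homoplasmic, otherwise heteroplasmic.
    """
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must be between 0 and 1")
    if vaf < threshold:
        return "below_threshold"
    if vaf >= 1.0 - threshold:
        return "homoplasmic"
    return "heteroplasmic"


# The same allele fraction changes class when the threshold changes.
at_1_percent = classify_variant(0.97, threshold=0.01)
at_5_percent = classify_variant(0.97, threshold=0.05)
```

This is consistent with panels (E) and (F), where raising the threshold widens the range of allele fractions that are counted as homoplasmic.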

Concordance of mitochondrial variants between the variant callers, Mutserve, MitoSeek, mitoCaller, and GATK Haplotypecaller. (A) Concordance of heteroplasmic variants called by the various callers using their default parameters. The first value in the diagram represents the number of mitochondrial variants, and the second value is the percentage of total. (B) Concordance of homoplasmic variants called by the various callers using their default parameters, excluding MitoSeek that does not identify homoplasmic variants. (C) Concordance of heteroplasmic variants called by the various callers using a heteroplasmic threshold of 5%. (D) Concordance of homoplasmic variants called by the various callers using a heteroplasmic threshold of 5%.
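The count and percentage pairs in such a concordance diagram can be derived from per-caller variant sets. The following sketch assigns each variant to the exact combination of callers that reported it and expresses each region as a percentage of all distinct variants; the caller labels and variants are placeholders, and the study's own procedure may differ.

```python
from collections import Counter


def venn_regions(calls):
    """Map each exact combination of callers to (count, percent of all variants).

    calls: dict of caller name -> set of variant keys.
    """
    membership = {}
    for caller, variants in calls.items():
        for variant in variants:
            membership.setdefault(variant, set()).add(caller)
    total = len(membership)  # number of distinct variants across all callers
    counts = Counter(frozenset(callers) for callers in membership.values())
    return {region: (n, 100.0 * n / total) for region, n in counts.items()}


# Placeholder call sets keyed by mtDNA position; not results from the paper.
calls = {
    "caller_a": {73, 263, 3243},
    "caller_b": {263, 3243, 8993},
    "caller_c": {3243, 16519},
}
regions = venn_regions(calls)
all_three = regions[frozenset(calls)]  # variants reported by every caller
```

The percentage for the region shared by every caller corresponds to the overall concordance rate in this sketch.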