Accuracy of protein-level disorder predictions - PubMed
- ️Wed Jan 01 2020
Review
Brief Bioinform. 2020 Sep 25;21(5):1509-1522.
doi: 10.1093/bib/bbz100.
- PMID: 31616935
Akila Katuwawala et al. Brief Bioinform. 2020.
Abstract
Experimental annotations of intrinsic disorder are available for 0.1% of the 147 000 000 currently sequenced proteins. Over 60 sequence-based disorder predictors have been developed to help bridge this gap. Current benchmarks of these methods assess predictive performance on datasets of proteins; however, predictions are often interpreted for individual proteins. We demonstrate that protein-level predictive performance varies substantially from the dataset-level benchmarks. Thus, we perform a first-of-its-kind protein-level assessment of 13 popular disorder predictors using 6200 disorder-annotated proteins. We show that the protein-level distributions are substantially skewed toward high predictive quality while having long tails of poor predictions. Consequently, between 57% and 75% of proteins secure higher predictive performance than the currently used dataset-level assessment suggests, but as many as 30% of proteins that are located in the long tails suffer low predictive performance. These proteins typically have relatively high amounts of disorder, in contrast to the mostly structured proteins that are predicted accurately by all 13 methods. Interestingly, each predictor provides the most accurate results for some proteins, while the method that performs best at the dataset level is in fact the best for only about 30% of proteins. Moreover, the majority of proteins are predicted more accurately than the dataset-level performance of the most accurate tool by at least four disorder predictors. While these results suggest that disorder predictors outperform their current benchmark performance for the majority of proteins and that they complement each other, novel tools that accurately identify the hard-to-predict proteins and that make accurate predictions for these proteins are needed.
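The difference between dataset-level and protein-level assessment described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not state which metric the authors use, so plain per-residue accuracy stands in for it, and the function names and the four toy proteins are hypothetical and not taken from the study.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of residues whose binary disorder label is predicted correctly."""
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


def dataset_level_accuracy(proteins):
    """Pool the residues of all proteins, then compute a single score."""
    pooled_true = [t for true, _ in proteins.values() for t in true]
    pooled_pred = [p for _, pred in proteins.values() for p in pred]
    return accuracy(pooled_true, pooled_pred)


def protein_level_accuracies(proteins):
    """Compute one score per protein instead of one score for the dataset."""
    return {name: accuracy(true, pred) for name, (true, pred) in proteins.items()}


# Toy data invented for illustration only (1 = disordered, 0 = structured).
# Each entry maps a hypothetical protein name to (annotation, prediction).
toy_proteins = {
    "structured_A": ([0] * 10, [0] * 10),
    "structured_B": ([0] * 10, [0] * 9 + [1]),
    "structured_C": ([0] * 8 + [1] * 2, [0] * 8 + [1] * 2),
    "disordered_D": ([1] * 10, [0] * 7 + [1] * 3),
}

dataset_score = dataset_level_accuracy(toy_proteins)
per_protein = protein_level_accuracies(toy_proteins)

# Proteins that are predicted better than the pooled benchmark suggests.
above_benchmark = sorted(n for n, s in per_protein.items() if s > dataset_score)

print(f"dataset-level accuracy: {dataset_score:.2f}")
for name, score in per_protein.items():
    print(f"{name}: {score:.2f}")
print("above dataset-level score:", above_benchmark)
```

In this toy set the pooled score is 0.80, three of the four proteins score above it, and the single highly disordered protein sits far below it at 0.30. That mirrors the skewed, long-tailed shape the abstract reports, although the numbers themselves carry no empirical meaning.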
Keywords: accuracy; disorder content; intrinsic disorder; intrinsically disordered proteins; intrinsically disordered regions; prediction; predictive performance; protein sequence.
© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.