Improved sequence-based prediction of disordered regions with multilayer fusion of multiple information sources - PubMed
- Fri Jan 01 2010
Comparative Study
Marcin J Mizianty et al. Bioinformatics. 2010.
Abstract
Motivation: Intrinsically disordered proteins play a crucial role in numerous regulatory processes. Their abundance and ubiquity combined with a relatively low quantity of their annotations motivate research toward the development of computational models that predict disordered regions from protein sequences. Although the prediction quality of these methods continues to rise, novel and improved predictors are urgently needed.
Results: We propose a novel method, named MFDp (Multilayered Fusion-based Disorder predictor), that aims to improve over the current disorder predictors. MFDp is an ensemble of three Support Vector Machines specialized for the prediction of short, long and generic disordered regions. It combines three complementary disorder predictors with sequence, sequence profiles, predicted secondary structure, solvent accessibility, backbone dihedral torsion angles, residue flexibility and B-factors. Our method utilizes a custom-designed set of features that are based on raw predictions and aggregated raw values, and it recognizes various types of disorder. MFDp is compared at the residue level on two datasets against eight recent disorder predictors and the top-performing methods from the most recent CASP8 experiment. Despite using training chains with ≤25% similarity to the test sequences, our method consistently and significantly outperforms the other methods based on the MCC index. MFDp outperforms modern disorder predictors for the binary disorder assignment and provides competitive real-valued predictions. MFDp's outputs are also shown to outperform the other methods in the identification of proteins with long disordered regions.
Availability: http://biomine.ece.ualberta.ca/MFDp.html.
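The residue-level comparison above is reported in terms of the MCC (Matthews correlation coefficient) index. As a minimal sketch of how such a score can be computed for a binary disorder assignment, the Python below takes two equal-length strings that use the ‘–’/‘D’ style of annotation (written here with an ASCII `-`). The function names and the toy strings are hypothetical illustrations, not part of MFDp or its evaluation code, and returning 0.0 for an undefined denominator is an assumed convention.

```python
import math


def confusion_counts(actual: str, predicted: str) -> tuple[int, int, int, int]:
    """Count TP, TN, FP, FN over residues; 'D' = disordered, '-' = ordered."""
    if len(actual) != len(predicted):
        raise ValueError("annotation strings must have equal length")
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a not in "D-" or p not in "D-":
            raise ValueError(f"unexpected annotation symbol: {a!r}/{p!r}")
        if a == "D" and p == "D":
            tp += 1  # disordered residue predicted as disordered
        elif a == "-" and p == "-":
            tn += 1  # ordered residue predicted as ordered
        elif a == "-" and p == "D":
            fp += 1  # ordered residue predicted as disordered
        else:
            fn += 1  # disordered residue predicted as ordered
    return tp, tn, fp, fn


def mcc(actual: str, predicted: str) -> float:
    """Matthews correlation coefficient for a binary disorder assignment."""
    tp, tn, fp, fn = confusion_counts(actual, predicted)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: score is 0.0 when any marginal count is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical toy annotations, not taken from any dataset in the paper.
    actual = "DDDD------DD"
    predicted = "DDD-------DD"
    print(f"MCC = {mcc(actual, predicted):.3f}")
```

MCC ranges from -1 (every residue misassigned) through 0 (no better than chance) to 1 (perfect agreement), which makes it more informative than plain accuracy when disordered residues are a small minority of the data.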
Figures
![Fig. 1.](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f33/2935446/a0e2b4b714e0/btq373f1.gif)
Architecture of the MFDp method.
![Fig. 2.](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f33/2935446/5544eb85862d/btq373f2.gif)
ROCs for the predictions on the (A) MxD and (B) CASP8 datasets.
![Fig. 3.](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f33/2935446/e6b67bc1b712/btq373f3.gif)
ROCs for the predictions of proteins with long-disordered segments on the MxD dataset.
![Fig. 4.](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f33/2935446/ae7a639dc6f9/btq373f4.gif)
Comparison of predictions from MFDp, DISOPRED2 (DP2), IUPREDL (IUPL), IUPREDS (IUPS), DISOclust (DISOc) and MD, and from the two CASP8 predictors with the highest MCC, McGuffin (379) and GeneSilicoMetaServer (297), for CASP8 targets T0480 (on the left) and T0404 (on the right). The ‘–’ and ‘D’ denote the ordered and disordered residues, respectively. The actual disorder annotations are shown in the first line.
Similar articles
-
Mizianty MJ, Peng Z, Kurgan L. Mizianty MJ, et al. Intrinsically Disord Proteins. 2013 Apr 1;1(1):e24428. doi: 10.4161/idp.24428. eCollection 2013 Jan-Dec. Intrinsically Disord Proteins. 2013. PMID: 28516009 Free PMC article.
-
In-silico prediction of disorder content using hybrid sequence representation.
Mizianty MJ, Zhang T, Xue B, Zhou Y, Dunker AK, Uversky VN, Kurgan L. Mizianty MJ, et al. BMC Bioinformatics. 2011 Jun 17;12:245. doi: 10.1186/1471-2105-12-245. BMC Bioinformatics. 2011. PMID: 21682902 Free PMC article.
-
Disfani FM, Hsu WL, Mizianty MJ, Oldfield CJ, Xue B, Dunker AK, Uversky VN, Kurgan L. Disfani FM, et al. Bioinformatics. 2012 Jun 15;28(12):i75-83. doi: 10.1093/bioinformatics/bts209. Bioinformatics. 2012. PMID: 22689782 Free PMC article.
-
Structural protein descriptors in 1-dimension and their sequence-based predictions.
Kurgan L, Disfani FM. Kurgan L, et al. Curr Protein Pept Sci. 2011 Sep;12(6):470-89. doi: 10.2174/138920311796957711. Curr Protein Pept Sci. 2011. PMID: 21787299 Review.
-
Accuracy of protein-level disorder predictions.
Katuwawala A, Oldfield CJ, Kurgan L. Katuwawala A, et al. Brief Bioinform. 2020 Sep 25;21(5):1509-1522. doi: 10.1093/bib/bbz100. Brief Bioinform. 2020. PMID: 31616935 Review.
Cited by
-
High-throughput prediction of RNA, DNA and protein binding regions mediated by intrinsic disorder.
Peng Z, Kurgan L. Peng Z, et al. Nucleic Acids Res. 2015 Oct 15;43(18):e121. doi: 10.1093/nar/gkv585. Epub 2015 Jun 24. Nucleic Acids Res. 2015. PMID: 26109352 Free PMC article.
-
PROSPER: an integrated feature-based tool for predicting protease substrate cleavage sites.
Song J, Tan H, Perry AJ, Akutsu T, Webb GI, Whisstock JC, Pike RN. Song J, et al. PLoS One. 2012;7(11):e50300. doi: 10.1371/journal.pone.0050300. Epub 2012 Nov 29. PLoS One. 2012. PMID: 23209700 Free PMC article.
-
DBC1/CCAR2 and CCAR1 Are Largely Disordered Proteins that Have Evolved from One Common Ancestor.
Brunquell J, Yuan J, Erwin A, Westerheide SD, Xue B. Brunquell J, et al. Biomed Res Int. 2014;2014:418458. doi: 10.1155/2014/418458. Epub 2014 Dec 11. Biomed Res Int. 2014. PMID: 25610865 Free PMC article.
-
Polerovirus genomic variation.
LaTourrette K, Holste NM, Garcia-Ruiz H. LaTourrette K, et al. Virus Evol. 2021 Dec 4;7(2):veab102. doi: 10.1093/ve/veab102. eCollection 2021 Sep. Virus Evol. 2021. PMID: 35299789 Free PMC article.
-
DisPredict: A Predictor of Disordered Protein Using Optimized RBF Kernel.
Iqbal S, Hoque MT. Iqbal S, et al. PLoS One. 2015 Oct 30;10(10):e0141551. doi: 10.1371/journal.pone.0141551. eCollection 2015. PLoS One. 2015. PMID: 26517719 Free PMC article.