Protein disorder prediction at multiple levels of sensitivity and specificity

Comparative Study

Joshua Hecker et al. BMC Genomics. 2008.

Abstract

Background: Many protein regions and some entire proteins have no definite tertiary structure, existing instead as dynamic, disordered ensembles under different physicochemical circumstances. Identification of these disordered regions is important for protein production, protein structure prediction and determination, and protein function annotation. A number of disorder prediction software packages and web services have been developed since the first predictor was designed by Dunker's lab in 1997. However, most of these packages use a pre-defined threshold to classify residues as ordered or disordered. In many situations, users need to select ordered or disordered residues at different sensitivity and specificity levels.

Results: Here we benchmark a state-of-the-art disorder predictor, DISpro, on a large protein disorder dataset created from the Protein Data Bank and systematically evaluate the relationship between sensitivity and specificity. We also extend its functionality to allow users to trade off specificity and sensitivity by setting different decision thresholds. Moreover, we compare DISpro with seven other automated disorder predictors on the 95 protein targets used in the seventh edition of the Critical Assessment of Techniques for Protein Structure Prediction (CASP7). DISpro is ranked as one of the best predictors.
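The threshold trade-off described above can be illustrated with a minimal sketch. It is not DISpro's code: the function names, the toy inputs, and the choice to call a residue disordered when its probability is greater than or equal to the threshold are all assumptions made for illustration. The sweep from 0.01 to 0.99 in steps of 0.01 mirrors the range reported for Figure 1.

```python
from typing import List, Sequence, Tuple


def sensitivity_specificity(
    probs: Sequence[float], labels: Sequence[bool], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for one decision threshold.

    probs  : per-residue predicted probability of disorder
    labels : True where the residue is annotated as disordered
    Assumption: a residue is called disordered when prob >= threshold.
    """
    tp = fn = tn = fp = 0
    for p, is_disordered in zip(probs, labels):
        predicted_disordered = p >= threshold
        if is_disordered and predicted_disordered:
            tp += 1
        elif is_disordered:
            fn += 1
        elif predicted_disordered:
            fp += 1
        else:
            tn += 1
    # Guard against empty classes so the sketch never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def sweep_thresholds(
    probs: Sequence[float], labels: Sequence[bool]
) -> List[Tuple[float, float, float]]:
    """Evaluate thresholds 0.01, 0.02, ..., 0.99.

    Returns a list of (threshold, sensitivity, specificity) tuples.
    """
    thresholds = [i / 100 for i in range(1, 100)]
    return [(t, *sensitivity_specificity(probs, labels, t)) for t in thresholds]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy_probs = [0.9, 0.8, 0.3, 0.1]
    toy_labels = [True, False, True, False]
    for t, sens, spec in sweep_thresholds(toy_probs, toy_labels)[::20]:
        print(f"threshold={t:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Lowering the threshold labels more residues as disordered, which raises sensitivity at the cost of specificity; raising it does the reverse. A user who needs few false positives would pick a high threshold, while one who wants to miss few disordered residues would pick a low one.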

Conclusion: The evaluation and extension of DISpro make it a more valuable and useful tool for structural and functional genomics.

Figures

Figure 1. Sensitivity and specificity over a varying decision threshold from 0.01 to 0.99, in steps of 0.01.

Figure 2. Sensitivity vs. specificity over a varying threshold.

Figure 3. Example output from modified DISpro, showing the probability of disorder for each residue in a sequence.

Figure 4. ROC curves of eight predictors on the CASP7 dataset, consisting of 95 protein targets.

Figure 5. Frequency of lengths of disordered regions.
