A novel approach for predicting disordered regions in a protein sequence

Meijing Li et al. Osong Public Health Res Perspect. 2014 Aug.

Abstract

Objectives: Many published predictors of protein disorder are based on various algorithms and on properties of disordered protein sequences. Even so, the study of disordered region prediction is ongoing, because different prediction methods can find different disordered regions in a protein sequence.

Methods: We therefore used a new approach to find more varied disordered regions, aiming at more efficient and accurate prediction of protein structures. In this study, we propose a novel approach called "emerging subsequence (ES) mining" that does not use the characteristics of disordered proteins. First, we adapted the approach to generate emerging protein subsequences from public protein sequence data. Second, the disordered and ordered regions in a protein sequence were predicted by searching for the generated emerging protein subsequences with a sliding window; the resulting predicted regions tend to overlap. Third, the scores of the overlapping regions were calculated from the support and growth rate values in both classes. Finally, the score of the predicted regions in the target class was compared with the score in the source class, and the class with the higher score was selected.
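The four steps above can be sketched in Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not give the subsequence length, the thresholds, or the scoring formula, so the fixed length `k`, the cutoffs `min_support` and `min_growth`, the `score` function, and all function names are hypothetical. Support is taken here as the fraction of sequences in a class that contain a subsequence, and growth rate as the ratio of supports between the two classes, which is the usual definition in emerging pattern mining.

```python
from collections import Counter

INF = float("inf")


def kmer_support(sequences, k):
    """Fraction of sequences that contain each length-k subsequence."""
    counts = Counter()
    for seq in sequences:
        # A set, so each subsequence counts once per sequence.
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return {kmer: c / len(sequences) for kmer, c in counts.items()}


def mine_emerging(target_seqs, other_seqs, k=3, min_support=0.5, min_growth=2.0):
    """Return {kmer: (support, growth_rate)} for k-mers emerging in target_seqs."""
    target = kmer_support(target_seqs, k)
    other = kmer_support(other_seqs, k)
    emerging = {}
    for kmer, supp in target.items():
        if supp < min_support:
            continue
        other_supp = other.get(kmer, 0.0)
        growth = INF if other_supp == 0 else supp / other_supp
        if growth >= min_growth:
            emerging[kmer] = (supp, growth)
    return emerging


def score(supp, growth):
    """Assumed scoring rule: support weighted by growth / (growth + 1)."""
    weight = 1.0 if growth == INF else growth / (growth + 1.0)
    return supp * weight


def predict(sequence, es_disordered, es_ordered, k=3):
    """Label each residue 'D' (disordered), 'O' (ordered), or '-' (no match)."""
    d_scores = [0.0] * len(sequence)
    o_scores = [0.0] * len(sequence)
    # Slide a window of length k and credit every residue it covers.
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        for es, scores in ((es_disordered, d_scores), (es_ordered, o_scores)):
            if window in es:
                s = score(*es[window])
                for j in range(i, i + k):
                    scores[j] += s
    labels = []
    for d, o in zip(d_scores, o_scores):
        if d == 0 and o == 0:
            labels.append("-")
        else:
            # Where both classes match, the higher score wins.
            labels.append("D" if d > o else "O")
    return "".join(labels)


# Toy training data (made up for illustration, not from DisProt or PDB).
disordered = ["SSPEG", "PEGSS", "GSSPE"]
ordered = ["LVIAL", "VIALV", "IALVL"]
es_d = mine_emerging(disordered, ordered)
es_o = mine_emerging(ordered, disordered)
print(predict("SSPEVIAL", es_d, es_o))  # prints DDDDOOOO
```

Accumulating a score per residue is one simple way to resolve overlapping matches from the two classes; the paper's actual scoring of overlapping regions may differ.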

Results: In this experiment, disordered and ordered sequence data were extracted from DisProt 6.02 and PDB, respectively, and used as training data. The test data came from CASP 9 and CASP 10, where the disordered and ordered regions are known.

Conclusion: Compared with several published predictors, the proposed method showed higher accuracy in this experiment.

Keywords: amino acid sequence; disordered protein; emerging subsequence; protein structure.

Figures

Figure 1

Example of prediction on overlapping regions in protein sequences. (A) Predicted disordered regions. (B) Predicted ordered region. (C) Overlapping regions of disordered emerging subsequences and ordered emerging subsequences.

Figure 2

Example of performance result.
