Identifying Intrinsically Disordered Protein Regions through a Deep Neural Network with Three Novel Sequence Features - PubMed
- ️Sat Jan 01 2022
Jiaxiang Zhao et al. Life (Basel). 2022.
Abstract
The fast, reliable, and accurate identification of intrinsically disordered protein regions (IDPRs) is essential, because it has become increasingly clear in recent years that IDPRs affect many important physiological processes, such as molecular recognition and molecular assembly, the regulation of transcription and translation, protein phosphorylation, and cellular signal transduction. For the sake of cost-effectiveness, computational approaches for identifying IDPRs are needed. In this study, we develop a deep neural network in which a variant of VGG19 is placed between two multilayer perceptron (MLP) networks. In addition, three novel sequence features are introduced for identifying IDPRs for the first time: persistent entropy and the probabilities associated with two and with three consecutive amino acids of the protein sequence. The simulation results show that our network either performs considerably better than other known methods or attains similar performance while relying on a much smaller training set. These results indicate that the network, which exploits the VGG19 structure, is effective for identifying IDPRs, and that the three novel sequence features could be valuable in the further development of IDPR prediction methods.
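As a rough illustration of the "probabilities associated with two and three consecutive amino acids", the sketch below estimates the empirical frequency of each residue pair and triplet within a single sequence. The abstract does not define how these probabilities are computed, so the counting scheme (overlapping windows, normalized per sequence), the function name `kmer_probabilities`, and the toy sequence are all assumptions and not the authors' method.

```python
from collections import Counter


def kmer_probabilities(sequence: str, k: int) -> dict[str, float]:
    """Empirical probability of each run of k consecutive residues.

    Assumption: overlapping windows, normalized by the number of
    windows in this one sequence. The paper may define it differently.
    """
    if k < 1 or len(sequence) < k:
        return {}
    kmers = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    counts = Counter(kmers)
    total = len(kmers)
    return {kmer: n / total for kmer, n in counts.items()}


# Hypothetical toy sequence, used only to exercise the function.
seq = "MKKLLPTKKA"
pair_probs = kmer_probabilities(seq, 2)    # two consecutive amino acids
triple_probs = kmer_probabilities(seq, 3)  # three consecutive amino acids
```

In a per-residue setting, each position could then be assigned the probability of the pair or triplet that starts there, although that mapping is also an assumption here.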
Keywords: VGG19; intrinsically disordered proteins; the persistent entropy; the probabilities associated with two and three consecutive amino acids.
Conflict of interest statement
The authors declare no conflict of interest.
Figures

The overall framework for the prediction of intrinsically disordered proteins. (a) We extract five types of features from the protein sequence and obtain the feature matrix with 35 features for each amino acid. (b) The obtained feature matrix is input into the deep neural network. The output can be used to predict IDPRs.

The deep neural network configuration. (a) is the first part of the deep neural network configuration. The function of MLP1 is to convert the protein sequence features into a mode suitable for VGG19 input. (b) is the second part of the deep neural network configuration. We use a variant of VGG19 for further feature extraction and MLP2 for classification. In MLP2, a dropout algorithm is used.

The performance with different sliding window sizes on BACC and MCC.
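For reference, BACC (balanced accuracy) and MCC (Matthews correlation coefficient) can be computed from a binary confusion matrix as sketched below. These are the standard textbook definitions; the abstract does not spell out the formulas used in the paper, so treating them as identical is an assumption. The labels (1 = disordered, 0 = ordered) and the toy predictions are hypothetical.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, 1 = disordered (assumed)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def balanced_accuracy(tp, tn, fp, fn):
    """Mean of sensitivity and specificity."""
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return (sens + spec) / 2


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical per-residue labels and predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
bacc_value = balanced_accuracy(tp, tn, fp, fn)
mcc_value = mcc(tp, tn, fp, fn)
```

Both metrics are less sensitive to class imbalance than plain accuracy, which matters when disordered residues are a minority of the data.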