Prediction and evolutionary information analysis of protein solvent accessibility using multiple linear regression
Proteins. 2005 Nov 15;61(3):481-91.
doi: 10.1002/prot.20620.
- PMID: 16170780
Jung-Ying Wang et al. Proteins. 2005.
Abstract
A multiple linear regression method was applied to predict real values of solvent accessibility from the sequence and evolutionary information. This method allowed us to obtain coefficients of regression and correlation between the occurrence of an amino-acid residue at a specific target and its sequence neighbor positions on the one hand, and the solvent accessibility of that residue on the other. Our linear regression model based on sequence information and evolutionary models was found to predict residue accessibility with 18.9% and 16.2% mean absolute error respectively, which is better than or comparable to the best available methods. A correlation matrix for several neighbor positions to examine the role of evolutionary information at these positions has been developed and analyzed. As expected, the effective frequency of hydrophobic residues at target positions shows a strong negative correlation with solvent accessibility, whereas the reverse is true for charged and polar residues. The correlation of solvent accessibility with effective frequencies at neighboring positions falls abruptly with distance from target residues. Longer protein chains have been found to be more accurately predicted than their smaller counterparts.
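As a rough illustration of the approach described in the abstract, the following Python sketch fits a multiple linear regression to one-hot encoded sequence windows and reports the mean absolute error. It is not the authors' implementation: the window half-width, the toy sequence, the accessibility values, the small ridge term (added only to keep the normal equations solvable, since one-hot columns plus an intercept are collinear), and all function names are assumptions made for this example. Evolutionary information could be sketched the same way by replacing each one-hot vector with per-position residue frequencies from an alignment profile.

```python
from typing import List, Sequence

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def encode_window(seq: str, pos: int, half_width: int) -> List[float]:
    """Intercept plus one-hot vectors for the target and its sequence neighbors."""
    features = [1.0]  # intercept term
    for offset in range(-half_width, half_width + 1):
        one_hot = [0.0] * len(AMINO_ACIDS)
        i = pos + offset
        if 0 <= i < len(seq):  # positions past the chain ends stay all-zero
            idx = AMINO_ACIDS.find(seq[i])
            if idx >= 0:
                one_hot[idx] = 1.0
        features.extend(one_hot)
    return features


def fit_linear(X: Sequence[Sequence[float]], y: Sequence[float],
               ridge: float = 1e-6) -> List[float]:
    """Solve (X^T X + ridge * I) w = X^T y by Gaussian elimination.

    The ridge term is an assumption of this sketch, not part of the paper.
    """
    n = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) + (ridge if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(row[i] * t for row, t in zip(X, y)) for i in range(n)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            if factor == 0.0:
                continue
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution on the upper-triangular system.
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i] - sum(A[i][j] * w[j] for j in range(i + 1, n))
        w[i] = s / A[i][i]
    return w


def predict(w: Sequence[float], x: Sequence[float]) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))


def mean_absolute_error(pred: Sequence[float], true: Sequence[float]) -> float:
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)


# Hypothetical toy data: hydrophobic residues are treated as buried (10%)
# and all others as exposed (60%). These values are made up for illustration.
HYDROPHOBIC = set("AILV")
sequence = "LKVEALKDIVEL"
accessibility = [10.0 if aa in HYDROPHOBIC else 60.0 for aa in sequence]

X = [encode_window(sequence, i, half_width=1) for i in range(len(sequence))]
weights = fit_linear(X, accessibility)
predictions = [predict(weights, x) for x in X]
mae = mean_absolute_error(predictions, accessibility)
print(f"training MAE (percentage points): {mae:.4f}")
```

The error printed here is measured on the same toy residues used for fitting, so it says nothing about real predictive accuracy; the figures quoted in the abstract come from the authors' own evaluation.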
(c) 2005 Wiley-Liss, Inc.