DomSVR: domain boundary prediction with support vector regression from sequence information alone - Amino Acids

  • Wang, Bing
  • Thu Feb 18 2010
  • Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–3402

  • Baldi P, Brunak S, Chauvin Y, Andersen CA, Nielsen H (2000) Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics 16:412–424

  • Bryson K, McGuffin LJ, Marsden RL, Ward JJ, Sodhi JS, Jones DT (2005) Protein structure prediction servers at University College London. Nucleic Acids Res 33:W36–W38

  • Chen P, Wang B, Wong HS, Huang DS (2007) Prediction of protein B-factors using multi-class bounded SVM. Protein Pept Lett 14(2):185–190

  • Cheng J, Sweredoski MJ, Baldi P (2006) DOMpro: protein domain prediction using profiles, secondary structure, relative solvent accessibility, and recursive neural networks. Data Min Knowl Discov 13:1–10

  • Chivian D, Kim DE, Malmstrom L, Bradley P, Robertson T, Murphy P, Strauss CE, Bonneau R, Rohl CA, Baker D (2003) Automated prediction of CASP-5 structures using the Robetta server. Proteins 53(S6):524–533

  • Copley RR, Doerks T, Letunic I, Bork P (2002) Protein domain analysis in the era of complete genomes. FEBS Lett 513:129–134

  • Dovidchenko NV, Lobanov MY, Galzitskaya OV (2007) Prediction of number and position of domain boundaries in multi-domain proteins by use of amino acid sequence alone. Curr Protein Pept Sci 8(2):189–195

  • Drucker H, Burges CJC, Kaufman L, Smola AJ, Vapnik V (1996) Support vector regression machines. In: Proceedings of the NIPS, pp 155–161

  • Dumontier M, Feldman R, Yao HJ, Hogue CWV (2005) Armadillo: domain boundary prediction by amino acid composition. J Mol Biol 350:1061–1073

  • Edelman GM (1973) Antibody structure and molecular immunology. Science 180:830–840

  • Fukuchi S, Nishikawa K (2001) Protein surface amino acid compositions distinctively differ between thermophilic and mesophilic bacteria. J Mol Biol 309:835–843

  • Galzitskaya OV, Melnik BS (2003) Prediction of protein domain boundaries from sequence alone. Protein Sci 12:696–701

  • George RA, Heringa J (2002) Protein domain identification and improved sequence similarity searching using PSI-BLAST. Proteins: Struct Funct Gen 48:672–681

  • George RA, Heringa J (2002) SNAPDRAGON: a new method to predict protein structural domain boundaries from sequence data. J Mol Biol 316:839–851

  • Gewehr JE, Zimmer R (2006) SSEP-Domain: protein domain prediction by alignment of secondary structure elements and profiles. Bioinformatics 22:181–187

  • Goodall C (1990) Modern methods of data analysis. Sage Publications, Newbury Park, CA

  • Gunn SR (1998) Support vector machines for classification and regression. Faculty of Engineering and Applied Science, University of Southampton

  • Heger A, Holm L (2003) Exhaustive enumeration of protein domain families. J Mol Biol 328:749–767

  • Jolliffe IT (2002) Principal component analysis. Springer, New York

  • Kawashima S, Pokarowski P, Pokarowska M, Kolinski A, Katayama T, Kanehisa M (2008) AAindex: amino acid index database, progress report. Nucleic Acids Res 36:D202–D205

  • Levitt M, Chothia C (1976) Structural patterns in globular proteins. Nature 261:552–558

  • Lexa M, Valle G (2003) PRIMEX: rapid identification of oligonucleotide matches in whole genomes. Bioinformatics 19:2486–2488

  • Linding R, Russell RB, Neduva V, Gibson TJ (2003) GlobPlot: exploring protein sequences for globularity and disorder. Nucleic Acids Res 31:3701–3708

  • Liu J, Rost B (2004) Sequence-based prediction of protein domains. Nucleic Acids Res 32:3522–3530

  • Marchler-Bauer A, Anderson JB, Derbyshire MK, DeWeese-Scott C (2007) CDD: a conserved domain database for interactive domain family analysis. Nucleic Acids Res 35:D237–D240

  • Marsden RL, McGuffin LJ, Jones DT (2002) Rapid protein domain assignment from amino acid sequence using predicted secondary structure. Protein Sci 11:2814–2824

  • Miyazawa S, Jernigan RL (1999) Self-consistent estimation of inter-residue protein contact energies based on an equilibrium mixture approximation of residues. Proteins 34:49–68

  • Munoz V, Serrano L (1994) Intrinsic secondary structure propensities of the amino acids, using statistical phi–psi matrices: comparison with experimental scale. Proteins 20:301–311

  • Nagarajan N, Yona G (2004) Automatic prediction of protein domains from sequence information using a hybrid learning system. Bioinformatics 20:1335–1360

  • Nanduri S, Carpick BW, Yang Y, Williams BR, Qin J (1998) Structure of the double-stranded RNA-binding domain of the protein kinase PKR reveals the molecular basis of its dsRNA-mediated activation. EMBO J 17:5458–5465

  • Orengo CA, Michie AD, Jones DT, Swindells MB, Thornton JM (1997) CATH: a hierarchic classification of protein domain structures. Structure 5:1093–1108

  • Porter RR (1973) Structural studies of immunoglobulins. Science 180:713–716

  • Rackovsky S, Scheraga HA (1982) Differential geometry and polymer conformation. 4. Conformational and nucleation properties of individual amino acids. Macromolecules 15:1340–1346

  • Saini HK, Fischer D (2005) Meta-DP: domain prediction meta server. Bioinformatics 21:2917–2920

  • Sikder AR, Zomaya AY (2006) Improving the performance of DomainDiscovery of protein domain boundary assignment using inter-domain linker index. BMC Bioinform 7:S6

  • Sim J, Kim SY, Lee J (2005) PRODO: prediction of protein domain boundaries using neural networks. Proteins 59:627–632

  • Suyama M, Ohara O (2003) DomCut: prediction of inter-domain linker regions in amino acid sequences. Bioinformatics 19:673–674

  • von Ohsen N, Sommer I, Zimmer R, Lengauer T (2004) Arby: automatic protein structure prediction using profile-profile alignment and confidence measures. Bioinformatics 20:2228–2235

  • Wetlaufer DB (1973) Nucleation, rapid folding, and globular intrachain regions in proteins. Proc Natl Acad Sci USA 70:697–701

  • Ye L, Liu T, Wu Z, Zhou R (2007) Sequence-based protein domain boundary prediction using BP neural network with various property profiles. Proteins: Struct Funct Bioinform 71:300–307

  • Yoo PD, Sikder AR, Zhou BB, Zomaya AY (2008) Improved general regression network for protein domain boundary prediction. BMC Bioinform 9:S12

  • Zdobnov EM, Apweiler R (2001) InterProScan-an integration platform for the signature-recognition methods in InterPro. Bioinformatics 17:847–848

  • Zhou Y, Vitkup D, Karplus M (1999) Native proteins are surface-molten solids: application of the Lindemann criterion for the solid versus liquid state. J Mol Biol 285:1371–1375
