Integration of QUARK and I-TASSER for Ab Initio Protein Structure Prediction in CASP11

Proteins. 2016 Sep;84 Suppl 1(Suppl 1):76-86.

doi: 10.1002/prot.24930. Epub 2015 Sep 23.

Wenxuan Zhang et al. Proteins. 2016 Sep.

Abstract

We tested two pipelines developed for template-free protein structure prediction in the CASP11 experiment. First, the QUARK pipeline constructs structure models by reassembling fragments of continuously distributed lengths excised from unrelated proteins. QUARK successfully constructed models with a TM-score above 0.4 for five free-modeling (FM) targets, including the first model of T0837-D1, which has a TM-score of 0.736 and an RMSD of 2.9 Å to the native structure. Detailed analysis showed that this success is partly attributable to the high-resolution contact-map prediction derived from fragment-based distance profiles; the predicted contacts are mainly located between regular secondary structure elements and loops/turns and help guide the orientation of secondary structure assembly. Second, in the Zhang-Server pipeline, weakly scoring threading templates are re-ordered by their structural similarity to the ab initio folding models and are then reassembled by I-TASSER-based structure assembly simulations; compared to the QUARK pipeline, the I-TASSER pipeline successfully modeled 60% more domains, with lengths up to 204 residues, with a TM-score above 0.4. The robustness of the I-TASSER pipeline may stem from the composite fragment-assembly simulations, which combine structures from both ab initio folding and threading template refinements. Despite these promising cases, challenges remain in long-range beta-strand folding, domain parsing, and the uncertainty of secondary structure prediction; the last of these was found to affect nearly all aspects of FM structure prediction, from fragment identification, target classification, and structure assembly to final model selection. Significant efforts are needed to solve these problems before real progress on FM can be made. Proteins 2016; 84(Suppl 1):76-86. © 2015 Wiley Periodicals, Inc.
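The abstract uses a TM-score above 0.4 as its criterion for a successfully modeled domain. As a rough illustration of what that number measures, the sketch below evaluates the standard TM-score sum for a single, already superposed model. It is not the authors' code: the real TM-score is the maximum of this sum over all superpositions, and the function names, the 0.5 Å floor on d0, and the `is_successful` helper are assumptions made for this example.

```python
from typing import Sequence

# TM-score threshold the abstract uses to call a model "successful".
SUCCESS_CUTOFF = 0.4


def tm_d0(target_length: int) -> float:
    """Distance scale d0 in Å: 1.24 * (L - 15)^(1/3) - 1.8."""
    if target_length <= 15:
        # The cube-root formula is not usable for such short chains;
        # the 0.5 Å floor is an assumption of this sketch.
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances: Sequence[float], target_length: int) -> float:
    """TM-score sum for ONE fixed superposition (no search over rotations).

    distances: Cα-Cα distances in Å for the aligned residue pairs.
    target_length: number of residues in the native (target) structure.
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    d0 = tm_d0(target_length)
    # Each aligned pair contributes between 0 and 1; unaligned residues
    # contribute nothing because the sum is divided by the target length.
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


def is_successful(score: float) -> bool:
    """Apply the 0.4 cutoff quoted in the abstract."""
    return score > SUCCESS_CUTOFF
```

With this definition a perfect superposition (all distances zero over the full target length) scores 1.0, and a model whose aligned pairs all sit exactly d0 apart scores 0.5. Normalizing by the target length rather than the aligned length is what makes partial alignments score lower.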

Keywords: CASP11; I-TASSER; QUARK; ab initio folding; protein structure prediction.

Figures

Figure 1

The free-modeling pipelines in CASP11. (A) Flowchart for ‘QUARK’ and ‘Zhang-Server’ group predictions. (B) Flowchart of QUARK structure assembly simulations.

Figure 2

TM-score of the best models by Zhang-Server versus the length of the 35 FM protein domains. Dashed lines mark the TM-score cutoffs at 0.4 and 0.5 and the length cutoff of 100 to guide the eye. Circles, squares, stars, and triangles denote α-proteins, β-proteins, αβ-proteins, and proteins with little secondary structure (l), respectively.

Figure 3

Modeling procedure and the first model predicted for T0837-D1. (A) I-TASSER; (B) QUARK; (C) X-ray structure and the superposition of QUARK model1 on the X-ray structure. The black lines on the QUARK model indicate the contacts predicted by the fragment-based distance profiles. The black circle highlights the N-terminal domain, where I-TASSER model1 is the mirror image of the native structure, whereas I-TASSER model2 and QUARK model1 modeled it correctly.

Figure 4

Comparison of threading templates with and without sorting using QUARK models. (A) TM-score of the first LOMETS templates; (B) TM-score of the best LOMETS templates; (C) TM-score of the first LOMETS templates including QUARK models; (D) TM-score of the best LOMETS templates including QUARK models.
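The sorting compared in Figure 4 re-orders threading templates by their structural similarity to the ab initio models. A minimal sketch of that idea follows, under stated assumptions: the similarity values are precomputed TM-score-like numbers, each template is ranked by its best similarity to any model, and the identifiers `rerank_templates`, `tA`, `m1`, and so on are hypothetical. The source does not say how similarities are aggregated, so taking the maximum is only one plausible choice.

```python
from typing import Dict, List, Tuple


def rerank_templates(
    template_ids: List[str],
    similarity: Dict[Tuple[str, str], float],
) -> List[str]:
    """Re-order templates by best structural similarity to any ab initio model.

    template_ids: templates in their original threading order.
    similarity: maps (template_id, model_id) to a TM-score-like value.
    """

    def best_similarity(template_id: str) -> float:
        scores = [s for (t, _model), s in similarity.items() if t == template_id]
        # Templates with no recorded comparison sink to the bottom.
        return max(scores, default=0.0)

    # sorted() is stable, so templates with equal similarity keep
    # their original threading order.
    return sorted(template_ids, key=best_similarity, reverse=True)


# Hypothetical example: three weakly scoring templates, two ab initio models.
example_order = rerank_templates(
    ["tA", "tB", "tC"],
    {("tA", "m1"): 0.21, ("tB", "m1"): 0.45, ("tB", "m2"): 0.30, ("tC", "m1"): 0.21},
)
```

Here `tB` moves to the front because it resembles a model most closely, while `tA` and `tC` tie and stay in their threading order.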

Figure 5

The targets for which Zhang-Server generated models with a TM-score above 0.4, excluding T0837-D1, which is shown in Figure 3. Color runs from blue to red from the N- to the C-terminus. The left cartoon of each panel shows the experimental structure and the right cartoon shows the predicted model. L is the length of the sequence whose structure was solved by X-ray, which is usually shorter than the length of the target sequence.

Figure 6

Structure prediction for T0820-D1. (A) Predicted secondary structure (‘prd’) compared to the X-ray structure (‘exp’), where the α-helix region missed in secondary structure prediction is marked in red. ‘H’, ‘E’, and ‘-’ indicate helix, extended strand, and coil, respectively. (B) QUARK model4. (C) X-ray structure for T0820-D1.
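A comparison like the one in panel A of Figure 6 can be reproduced on any pair of three-state strings. The sketch below is an illustration only, not the authors' analysis: it reports the fraction of positions where ‘prd’ and ‘exp’ agree and lists the experimental segments of a given state that the prediction missed. The function names and the short example strings are hypothetical and are not the T0820-D1 data.

```python
from typing import List, Tuple


def ss_agreement(prd: str, exp: str) -> float:
    """Fraction of positions where predicted and experimental states match."""
    if len(prd) != len(exp) or not exp:
        raise ValueError("strings must be non-empty and of equal length")
    return sum(p == e for p, e in zip(prd, exp)) / len(exp)


def missed_segments(prd: str, exp: str, state: str = "H") -> List[Tuple[int, int]]:
    """0-based inclusive ranges where exp has `state` but prd does not."""
    if len(prd) != len(exp):
        raise ValueError("strings must be of equal length")
    segments: List[Tuple[int, int]] = []
    start = None
    for i, (p, e) in enumerate(zip(prd, exp)):
        missed = e == state and p != state
        if missed and start is None:
            start = i  # a missed run begins here
        elif not missed and start is not None:
            segments.append((start, i - 1))  # the run ended at the previous position
            start = None
    if start is not None:
        segments.append((start, len(exp) - 1))  # run extends to the end
    return segments


# Hypothetical strings using the caption's alphabet: H helix, E strand, - coil.
exp_example = "--HHHH--EE"
prd_example = "--HH----EE"
```

For these example strings, the second half of the helix (positions 4 to 5) is the missed region, which corresponds to the kind of span the figure marks in red.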

Figure 7

Structure prediction for T0793-D5. (A) Secondary structure prediction and LOMETS alignments. Z/Z_cut gives the Z-score and the program-specific Z-score cutoff used to classify templates as good or bad. β-strand regions missed in structure prediction are marked in red. A star (‘*’) indicates residues that are missing from the X-ray structure. (B) Consensus template 7r1rA, which was identified by multiple LOMETS programs. (C) X-ray structure for T0793-D5.
