Systematic discovery of new recognition peptides mediating protein interaction networks - PubMed

Systematic discovery of new recognition peptides mediating protein interaction networks

Victor Neduva et al. PLoS Biol. 2005 Dec.

Abstract

Many aspects of cell signalling, trafficking, and targeting are governed by interactions between globular protein domains and short peptide segments. These domains often bind multiple peptides that share a common sequence pattern, or "linear motif" (e.g., SH3 binding to PxxP). Many domains are known, though comparatively few linear motifs have been discovered. Their short length (three to eight residues) and the fact that they often reside in disordered regions of proteins make them difficult to detect through sequence comparison or experiment. Nevertheless, each new motif provides critical molecular details of how interaction networks are constructed, and can explain how one protein is able to bind to very different partners. Here we show that binding motifs can be detected using data from genome-scale interaction studies, thus avoiding the normally slow discovery process. Our approach, based on motif over-representation in non-homologous sequences, rediscovers known motifs and predicts dozens of others. Direct binding experiments reveal that two predicted motifs are indeed protein-binding modules: a DxxDxxxD protein phosphatase 1 binding motif with a K_D of 22 µM and a VxxxRxYS motif that binds Translin with a K_D of 43 µM. We estimate that there are dozens or even hundreds of linear motifs yet to be discovered that will give molecular insight into protein networks and greatly illuminate cellular processes.

Figures

**Figure 1. Schematic of the Linear Motif Discovery Strategy**
Interaction maps are probed for interaction sets (A): Partners of proteins with multiple interactions are clustered together when there are no known sequence features present (B). Domains and homologous regions are then identified (B) and removed prior to running exhaustive pattern discovery (C) to produce a list of motifs ranked by their probabilities P (D). Hypothetical motifs are shown as coloured squares in (C) and (D). “Proteins” in (D) gives the set of proteins containing at least one copy of the motif.
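The ranking step in (D) can be illustrated with a minimal sketch. The caption states only that motifs are ranked by a probability P; the binomial tail used here, the function names, and the toy counts and background probabilities are all assumptions made for illustration, not the authors' scoring scheme.

```python
from math import comb

def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p).

    Assumed model: each of n non-homologous sequences contains the motif
    independently with background probability p.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def rank_motifs(counts, n_sequences, background):
    """Sort motifs by ascending tail probability (most over-represented first).

    counts:     motif -> number of sequences in the set containing it
    background: motif -> assumed chance that a random sequence contains it
    """
    scored = [
        (binomial_tail(k, n_sequences, background[motif]), motif)
        for motif, k in counts.items()
    ]
    return sorted(scored)

# Toy numbers, invented purely for illustration.
toy_counts = {"motifA": 6, "motifB": 2}
toy_background = {"motifA": 0.05, "motifB": 0.20}
for prob, motif in rank_motifs(toy_counts, 10, toy_background):
    print(f"{motif}\tP = {prob:.3g}")
```

A motif seen in many partners despite a low background probability receives a small P and rises to the top of the list, which is the sense of "over-representation" used in the abstract.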

**Figure 2. Overview of Motifs Found in the Fly**
Significant predictions from the yeast two-hybrid set for the fly. Blue dots in the center of each cluster represent proteins with four or more interaction partners (red and white dots) containing at least one confidently predicted motif (p-value < 0.001; *S_cons* ≤ 8 × 10⁻¹⁵). Partner proteins containing the motif are represented by red dots, whereas proteins lacking the motif are indicated by white dots. Clusters are labelled as gene name→detected motif. Yellow circles enclose known motifs: SH3→PxxP [38], PP1→RVxF [22], C-terminal binding protein (CtBP)→PxDLS [52], SR splicing factors RS-rich segments [53], and CG6843→SxKSKxxK, a likely nuclear localization signal. The Translin→VxxxRxYS motif was experimentally tested (Figure 3). The grey circles enclose clusters with low-complexity patterns. Two additional known motifs were also found in the fly using more relaxed criteria than those used for the other motifs in the figure: Groucho→WRPW [7] and Dynein light chain→TQT [26] as the variant A(TI)QT(DE). The latter was also identified as significant in the domain sets. Proteins are denoted either by their FlyBase accession codes or protein names when available.

**Figure 3. A Novel Fly VxxxRxYS Motif That Binds Translin**
(A) Translin (left) shown surrounded by interaction partners containing the predicted motif VxxxRxYS. Proteins are shown as lines with domains (labelled shapes), predicted coiled coils (light blue/green segments), and the location of motifs (blue vertical bars). Sequences for the motif-containing region are shown aligned to the best homologues in closely related species. Amino acids are coloured according to residue type: blue, positive; red, negative; light blue, small; yellow, hydrophobic; green, aromatic; magenta, polar; and orange, proline. Those constituting the predicted motif are denoted by circles. Aga, *Anopheles gambiae;* Dme, *D. melanogaster;* Dps, *D. pseudoobscura*. (B) Saturation curves, showing bound fraction (fluorescently labelled peptides at saturation) as a function of Translin concentration. Polarization values (mP) at zero concentration and B_max were normalised to give the bound fraction. K_D was computed by non-linear regression on values from three independent experiments. The lower panel shows the alignment of the native and mutated peptides together with the arbitrary peptide (selected randomly). Black triangles show positions specifying the motif (VxxxRxYS). The alignment is coloured as described in (A).
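The normalisation and regression described in (B) might look like the following sketch. It assumes a simple one-site binding model, bound fraction = [P] / (K_D + [P]), which the caption does not specify, and it uses a plain ternary search instead of a statistics package. All function names are hypothetical, and the data are synthetic and noiseless, not measurements from the paper.

```python
from math import log10

def normalise(mp_values, mp_zero, b_max):
    """Rescale polarization readings (mP) to a bound fraction between 0 and 1."""
    return [(mp - mp_zero) / (b_max - mp_zero) for mp in mp_values]

def bound_fraction(conc, kd):
    """Assumed one-site model: fraction bound = [P] / (K_D + [P])."""
    return conc / (kd + conc)

def fit_kd(concs, fractions, lo=1e-3, hi=1e4, iterations=100):
    """Least-squares estimate of K_D via ternary search on log10(K_D).

    Assumes the squared error has a single minimum between lo and hi.
    """
    def sse(log_kd):
        kd = 10 ** log_kd
        return sum((f - bound_fraction(c, kd)) ** 2 for c, f in zip(concs, fractions))

    a, b = log10(lo), log10(hi)
    for _ in range(iterations):
        m1 = a + (b - a) / 3
        m2 = b - (b - a) / 3
        if sse(m1) < sse(m2):
            b = m2
        else:
            a = m1
    return 10 ** ((a + b) / 2)

# Synthetic example: an arbitrary K_D of 25 (same units as the concentrations).
concs = [1, 5, 10, 20, 50, 100, 200]
mp_zero, b_max = 50.0, 250.0
readings = [mp_zero + (b_max - mp_zero) * bound_fraction(c, 25.0) for c in concs]
print(fit_kd(concs, normalise(readings, mp_zero, b_max)))
```

With real, noisy replicates one would more likely pool the three experiments and use an established non-linear least-squares routine that also reports confidence intervals.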

**Figure 4. An Acidic Yeast PP1 Binding Motif**
(A) PP1 (Glc7) with the set of interaction partners containing the DxxDxxxD motifs. Details are as for Figure 3A. Here the location of RVxF motifs (defined as matches to (RK)x_0–1(VI)x(FW)) are shown as yellow bars, and low-complexity regions are magenta. The figure also shows the structure of PP1 bound to RVxF (red spheres) [54] with a hypothetical helix containing the motif. Blue spheres show the location of Arg or Lys residues, and the active site is circled with critical Arginines shown in ball-and-stick. Red arrows show hypothetical interactions of the motif either with sites on PP1 or elsewhere. Ani, *Aspergillus nidulans;* Cal, *Candida albicans;* Ego, *Eremothecium gossypii;* Gze, *Gibberella zeae;* Mgr, *Magnaporthe grisea;* Sce, *S. cerevisiae;* Spo, *Schizosaccharomyces pombe;* Str, *Salinospora tropicalis;* Uma, *Ustilago maydis;* Xla, *Xenopus laevis*. (B) Saturation curves, showing bound fraction as a function of PP1 concentration. The polarization values (mP) were normalized to an extrapolated B_max because B_max could not be reached experimentally. Other details are as given in Figure 3B. Red triangles in the lower panel show the location of the near match to the motif in the mutated sequence.
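The motif notation in this caption maps naturally onto regular expressions: `x` is any residue, parentheses list alternatives, and `x_0–1` is an optional residue. The sketch below is one possible translation; the constant names, the helper function, and the two toy sequences are hypothetical and are not taken from the paper.

```python
import re

# Assumed translations of the caption's notation into regular expressions.
RVXF = re.compile(r"[RK].{0,1}[VI].[FW]")  # (RK)x_0–1(VI)x(FW)
ACIDIC = re.compile(r"D..D...D")           # DxxDxxxD

def motif_positions(pattern, sequence):
    """Return 0-based start positions of every match, overlapping ones included.

    Wrapping the pattern in a zero-width lookahead lets matches overlap,
    which a plain finditer() scan would skip.
    """
    lookahead = re.compile(f"(?={pattern.pattern})")
    return [m.start() for m in lookahead.finditer(sequence)]

# Toy sequences, invented for illustration only.
print(motif_positions(RVXF, "AAKRVSFAA"))
print(motif_positions(ACIDIC, "MDAADKLLDE"))
```

Start positions like these are what one would need to draw the yellow (RVxF) and blue (DxxDxxxD) bars along each protein in panel (A).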

**Figure 5. A Lit-1 MAP Kinase SxPxxxS Motif**
The MAP kinase lit-1 surrounded by its interaction partners containing the SxPxxxS motif. Details are as for Figure 3. Yellow boxes show the location of deletion mutants known to affect the interaction. Cbr, *C. briggsae;* Cel, *C. elegans*.

Cited by

Cryptic sequence features within the disordered protein p27Kip1 regulate cell cycle signaling.
Das RK, Huang Y, Phillips AH, Kriwacki RW, Pappu RV. Proc Natl Acad Sci U S A. 2016 May 17;113(20):5616-21. doi: 10.1073/pnas.1516277113. Epub 2016 May 2. PMID: 27140628 Free PMC article.
Predict and Analyze Protein Glycation Sites with the mRMR and IFS Methods.
Liu Y, Gu W, Zhang W, Wang J. Biomed Res Int. 2015;2015:561547. doi: 10.1155/2015/561547. Epub 2015 Apr 15. PMID: 25961025 Free PMC article.
Rice_Phospho 1.0: a new rice-specific SVM predictor for protein phosphorylation sites.
Lin S, Song Q, Tao H, Wang W, Wan W, Huang J, Xu C, Chebii V, Kitony J, Que S, Harrison A, He H. Sci Rep. 2015 Jul 7;5:11940. doi: 10.1038/srep11940. PMID: 26149854 Free PMC article.
A screen for endocytic motifs.
Kozik P, Francis RW, Seaman MN, Robinson MS. Traffic. 2010 Jun;11(6):843-55. doi: 10.1111/j.1600-0854.2010.01056.x. Epub 2010 Mar 4. PMID: 20214754 Free PMC article.
Protein Abundance Biases the Amino Acid Composition of Disordered Regions to Minimize Non-functional Interactions.
Dubreuil B, Matalon O, Levy ED. J Mol Biol. 2019 Dec 6;431(24):4978-4992. doi: 10.1016/j.jmb.2019.08.008. Epub 2019 Aug 20. PMID: 31442477 Free PMC article.

References

1. Poglitsch CL, Meredith GD, Gnatt AL, Jensen GJ, Chang WH, et al. Electron crystal structure of an RNA polymerase II transcription elongation complex. Cell. 1999;98:791–798.
2. Jeffrey PD, Russo AA, Polyak K, Gibbs E, Hurwitz J, et al. Mechanism of CDK activation revealed by the structure of a cyclinA-CDK2 complex. Nature. 1995;376:313–320.
3. Pawson T, Scott JD. Signaling through scaffold, anchoring, and adaptor proteins. Science. 1997;278:2075–2080.
4. Sudol M. From Src Homology domains to other signaling modules: Proposal of the ‘protein recognition code’. Oncogene. 1998;17:1469–1474.
5. Puntervoll P, Linding R, Gemund C, Chabanis-Davidson S, Mattingsdal M, et al. ELM server: A new resource for investigating short functional sites in modular eukaryotic proteins. Nucleic Acids Res. 2003;31:3625–3630.
