Small-molecule binding sites to explore protein-protein interactions in the cancer proteome

Mol Biosyst. 2016 Oct 20;12(10):3067-87.

doi: 10.1039/c6mb00231e. Epub 2016 Jul 25.

David Xu et al. Mol Biosyst. 2016.

Abstract

The Cancer Genome Atlas (TCGA) offers an unprecedented opportunity to identify small-molecule binding sites on proteins with overexpressed mRNA levels that correlate with poor survival. Here, we analyze RNA-seq and clinical data for 10 tumor types to identify genes that are both overexpressed and correlated with patient survival. Protein products of these genes were scanned for druggable binding sites, that is, sites whose shape and physicochemical properties can accommodate small-molecule probes or therapeutic agents. These binding sites were classified as enzyme active sites (ENZ), protein-protein interaction sites (PPI), or other sites whose function is unknown (OTH). Interestingly, the overwhelming majority of binding sites were classified as OTH. We find that ENZ, PPI, and OTH binding sites often occur on the same structure, suggesting that many of these OTH cavities can be used for allosteric modulation of enzyme activity or protein-protein interactions with small molecules. We discovered several ENZ (PYCR1, QPRT, and HSPA6) and PPI (CASC5, ZBTB32, and CSAD) binding sites on proteins that have seldom been explored in cancer. We also found ENZ (PKMYT1, STEAP3, and NNMT) and PPI (HNF4A, MEF2B, and CBX2) binding sites on proteins that have been extensively studied in cancer but have not previously been explored with small molecules. All binding sites were classified, using KEGG, by the signaling pathways to which their host proteins belong. In addition, binding sites were mapped onto structural protein-protein interaction networks to identify promising sites for drug discovery. Finally, we identify pockets that harbor missense mutations previously identified from analysis of TCGA data. The occurrence of mutations in these binding sites provides new opportunities to develop small-molecule probes to explore their function in cancer.

Figures

Figure 1. Examples of proteins with both ENZ and PPI binding sites

Proteins are represented in cartoon format. The monomer structure with identified binding sites is in white. SiteMap binding sites are shown as spheres; bound ligands are shown as ball-and-sticks. A, The homodimeric structure of CDA (PDB: 1mq0.B) with a bound inhibitor at a binding site classified as both ENZ and PPI. B, The homodimeric structure of NAMPT (PDB: 4o0z.B) with an ENZ (peach, bound inhibitor) and a PPI (blue) binding site on the same domain. C, D, The protein kinase domain (PDB: 2vwy.A) and ligand-binding domain (PDB: 2hle.A) of EPHB4, featuring an ENZ and a PPI binding site on separate domains. The binding site on the protein kinase domain is not shown as spheres, but is occupied by the bound inhibitor (green).

Figure 2. Examples of proteins with potentially allosteric OTH binding sites

Proteins are represented in cartoon format. The monomer structure with identified binding sites is in white. SiteMap binding sites are shown as spheres; bound ligands are shown as ball-and-sticks. A, SULT2B1 (PDB: 1q1q.A) with an ENZ binding site occupied by a nucleotide and three additional OTH binding sites (green, blue, yellow). B, RET (PDB: 2iiv.A) with an ENZ binding site occupied by the bound inhibitor and an additional OTH binding site (green). C, CHP2 (PDB: 2bec.A) with two PPI binding sites (green, blue) at the interface with SLC9A1 (PDB: 2bec.B) and an additional OTH binding site (peach). D, The superimposed structure of PLAUR (PDB: 1ywh.M) with two PPI binding sites at the interfaces with VTN (PDB: 3bt1.B, green) and PLAU (PDB: 3bt1.A, yellow) and an additional OTH binding site (peach).

Figure 3. Binding sites in cancer-related signaling pathways

Proteins with binding sites were mapped to 27 cancer-related signaling pathways in KEGG. Identified binding sites were divided based on whether the protein was exclusive to one signaling pathway or occurred in multiple signaling pathways. A, Identified binding sites had DrugScore greater than 0.8 on proteins with log2 fold change greater than 1.5. B, Identified binding sites had DrugScore greater than 1.0 on proteins with log2 fold change greater than 2.
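The grouping described in this caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Site` record, the function names, and the shape of the protein-to-pathway mapping are all hypothetical, and only the two threshold pairs (0.8 with 1.5, and 1.0 with 2) come from the caption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """Hypothetical record for one binding site (field names are assumptions)."""
    protein: str
    log2_fc: float     # log2 fold change of the host protein's mRNA
    drug_score: float  # DrugScore of the binding site


def passes(site, min_log2_fc, min_drug_score):
    """Strict 'greater than' cutoffs, as worded in the caption."""
    return site.log2_fc > min_log2_fc and site.drug_score > min_drug_score


def split_by_pathway_membership(sites, pathways_by_protein,
                                min_log2_fc, min_drug_score):
    """Split passing sites by whether the host protein maps to exactly one
    pathway (exclusive) or to several (shared). Proteins with no pathway
    annotation are dropped."""
    exclusive, shared = [], []
    for site in sites:
        if not passes(site, min_log2_fc, min_drug_score):
            continue
        n_pathways = len(pathways_by_protein.get(site.protein, ()))
        if n_pathways == 1:
            exclusive.append(site)
        elif n_pathways > 1:
            shared.append(site)
    return exclusive, shared


# Panel A uses (1.5, 0.8); panel B uses (2.0, 1.0).
PANEL_A = {"min_log2_fc": 1.5, "min_drug_score": 0.8}
PANEL_B = {"min_log2_fc": 2.0, "min_drug_score": 1.0}
```

Calling `split_by_pathway_membership(sites, mapping, **PANEL_A)` and then again with `**PANEL_B` would give the two panels' groupings from the same input.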

Figure 4. Proteins with binding sites that are both overexpressed and correlate with patient outcome

A, Fold change versus hazard ratio across all cancer types on proteins with log2FC ≥ 1.5, HR > 1.0, and DrugScore > 0.8. B, SiteScore and DrugScore of binding sites by functional annotation for proteins in A. C, Degree versus betweenness centrality from PPI network for all proteins with log2FC ≥ 1.5 and HR > 1. Proteins are color-coded based on whether there was a high-quality crystal structure (blue), a crystal structure but no identifiable binding sites (orange), binding sites with DrugScore between 0.8 and 1.0 (gray), or druggable binding sites with DrugScore greater than 1.0 (yellow). D, Fold change versus hazard ratio across all cancer types on proteins with druggable binding sites with log2FC ≥ 2.0, HR > 1.0, and DrugScore > 1.0. E, SiteScore versus DrugScore of druggable binding sites with log2FC ≥ 2.0, HR > 1.0, and DrugScore > 1.0. F, Degree versus betweenness centrality from PPI network for all proteins with log2FC ≥ 2.0, HR > 1.0, and DrugScore > 1.0.
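Panels C and F plot two standard network measures, degree and betweenness centrality. The sketch below shows one common way to compute both for an unweighted, undirected network using Brandes' algorithm. It is an illustration only: the source does not say which software or normalization the authors used, the function name is hypothetical, and the betweenness values returned here are unnormalized pair counts.

```python
from collections import deque


def degree_and_betweenness(edges):
    """Return (degree, betweenness) dicts for an undirected, unweighted graph.

    `edges` is an iterable of (u, v) pairs; self-loops are ignored.
    Betweenness is unnormalized: the number of shortest paths between
    other node pairs that pass through a node, with ties split fractionally.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    betweenness = dict.fromkeys(adj, 0.0)
    for source in adj:
        # Breadth-first search that counts shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate dependencies from the farthest nodes inward.
        delta = dict.fromkeys(adj, 0.0)
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != source:
                betweenness[w] += delta[w]

    # Each unordered pair was visited from both endpoints.
    for v in betweenness:
        betweenness[v] /= 2.0
    degree = {v: len(neighbors) for v, neighbors in adj.items()}
    return degree, betweenness
```

A protein with high degree but low betweenness sits inside a dense cluster, whereas high betweenness marks a node that bridges otherwise separate parts of the network, which is why the two measures are plotted against each other.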

Figure 5. Proteins with missense mutations

A, Missense mutations were mapped to patients in 7 of 10 diseases (COAD, THCA, and UCEC not included). Individual mutations were mapped to the protein structure and classified as being adjacent to the binding site, elsewhere on the protein surface, or buried in the interior of the protein structure. B, Percentage of samples with missense mutations adjacent to a binding site in a given disease, showing the top 20 proteins rank-ordered by the sum of frequencies. C, The W167L (green stick) mutation on the PPI interface between MAD2L1 (white) and MAD1L1 (cyan) is shown in cartoon (PDB ID: 1GO4). The PPI binding site is shown as transparent spheres. D, The R121P (green stick) mutation adjacent to the DNA-binding OTH site (tan, transparent spheres) on EXO1 (white cartoon) (PDB ID: 3QEB). DNA in the binding site from the crystal structure is also shown as cartoon. E, The counts of missense mutations at the amino acid level, classified as being adjacent to the binding site, elsewhere on the surface of the protein, or buried in the protein interior. The original amino acid is listed row-wise and the subsequent mutation is listed column-wise.
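The tally in panel E amounts to a substitution count matrix kept separately for each location class. A minimal Python sketch of that bookkeeping follows, assuming mutations arrive as labels such as W167L already paired with a location class. The function names and the class labels (`adjacent`, `surface`, `buried`) are hypothetical, and the step that assigns a class from the 3D structure is not shown because the source does not describe it.

```python
import re
from collections import Counter

# One-letter original residue, position, one-letter replacement (e.g. W167L).
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_missense(label):
    """Split a label such as 'W167L' into (original, position, replacement)."""
    match = MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"not a missense label: {label!r}")
    original, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    if original == replacement:
        raise ValueError(f"no amino acid change in {label!r}")
    return original, position, replacement


def substitution_counts(classified_mutations):
    """Count (original, replacement) pairs per location class.

    `classified_mutations` is an iterable of (label, location_class) pairs.
    Returns {location_class: Counter({(original, replacement): count})}.
    """
    counts = {}
    for label, location_class in classified_mutations:
        original, _, replacement = parse_missense(label)
        counts.setdefault(location_class, Counter())[(original, replacement)] += 1
    return counts
```

Reading a row of the resulting matrix means fixing the first element of each key (the original residue); reading a column means fixing the second (the replacement).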
