pubmed.ncbi.nlm.nih.gov

A New Method for Identifying Essential Proteins Based on Network Topology Properties and Protein Complexes - PubMed

Fri Jan 01 2016

Chao Qin et al. PLoS One. 2016.

Abstract

Essential proteins are indispensable to the viability and reproduction of an organism. The identification of essential proteins is necessary not only for understanding the molecular mechanisms of cellular life but also for disease diagnosis, medical treatments and drug design. Many computational methods have been proposed for discovering essential proteins, but the precision of the prediction of essential proteins remains to be improved. In this paper, we propose a new method, LBCC, which is based on the combination of local density, betweenness centrality (BC) and in-degree centrality of complex (IDC). First, we introduce the common centrality measures; second, we propose the densities Den1(v) and Den2(v) of a node v to describe its local properties in the network; and finally, the combined strategy of Den1, Den2, BC and IDC is developed to improve the prediction precision. The experimental results demonstrate that LBCC outperforms traditional topological measures for predicting essential proteins, including degree centrality (DC), BC, subgraph centrality (SC), eigenvector centrality (EC), network centrality (NC), and the local average connectivity-based method (LAC). LBCC also improves the prediction precision by approximately 10 percent on the YMIPS and YMBD datasets compared to the most recently developed method, LIDC.
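The abstract names the ingredients of LBCC but not their formulas, so the following is only a rough sketch of how such a combined score could be assembled. Betweenness is computed with Brandes' algorithm. The in-degree of a protein in a complex is taken here as the number of its neighbours inside that complex, summed over the complexes containing it. `local_density` is a stand-in (edge density of a node's closed neighbourhood), not the paper's Den1 or Den2, and the equal-weight sum of min-max normalised scores is likewise an assumption. All function names are hypothetical.

```python
from collections import deque

def betweenness(adj):
    """Unnormalised betweenness (Brandes) for an unweighted, undirected graph.
    adj maps each node to the set of its neighbours."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                  # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                  # accumulate dependencies, farthest first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Every unordered pair was visited from both endpoints, so halve.
    return {v: b / 2 for v, b in bc.items()}

def in_degree_of_complex(adj, complexes):
    """Assumed IDC: neighbours of v inside each complex containing v, summed."""
    idc = {v: 0 for v in adj}
    for members in complexes:
        members = set(members)
        for v in members:
            if v in adj:
                idc[v] += len(adj[v] & members)
    return idc

def local_density(adj, v):
    """Stand-in local density (NOT the paper's Den1/Den2): edge density of
    the subgraph induced by v and its neighbours."""
    nodes = adj[v] | {v}
    n = len(nodes)
    if n < 2:
        return 0.0
    edges = sum(len(adj[u] & nodes) for u in nodes) / 2
    return edges / (n * (n - 1) / 2)

def minmax(scores):
    """Rescale scores to [0, 1]; constant inputs map to 0."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {v: (s - lo) / span if span else 0.0 for v, s in scores.items()}

def combined_score(adj, complexes):
    """Equal-weight sum of the three normalised measures (an assumption)."""
    parts = [
        minmax({v: local_density(adj, v) for v in adj}),
        minmax(betweenness(adj)),
        minmax(in_degree_of_complex(adj, complexes)),
    ]
    return {v: sum(p[v] for p in parts) for v in adj}
```

Proteins would then be ranked by descending combined score, and the top-ranked ones taken as candidate essential proteins.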

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. Graph G.
Fig 2. The number of true essential proteins predicted by LBCC and the other seven previously proposed methods for the YMIPS network.
Fig 3. The number of true essential proteins predicted by LBCC and the other seven previously proposed methods for the YMBD network.
Fig 4. The number of true essential proteins predicted by LBCC and the other seven previously proposed methods for the YHQ network.
Fig 5. The number of true essential proteins predicted by LBCC and the other seven previously proposed methods for the YDIP network.
Fig 6. PR curves of LBCC and the other seven previously proposed methods for the YMIPS network.
Fig 7. PR curves of LBCC and the other seven previously proposed methods for the YMBD network.
Fig 8. PR curves of LBCC and the other seven previously proposed methods for the YHQ network.
Fig 9. PR curves of LBCC and the other seven previously proposed methods for the YDIP network.
Fig 10. Jackknife curves of LBCC and the other seven previously proposed methods for the YMIPS network.
Fig 11. Jackknife curves of LBCC and the other seven previously proposed methods for the YMBD network.
Fig 12. Jackknife curves of LBCC and the other seven previously proposed methods for the YHQ network.
Fig 13. Jackknife curves of LBCC and the other seven previously proposed methods for the YDIP network.
Fig 14. The top 199 proteins in the YMIPS network identified by DC ∪ LBCC.

The green nodes and blue nodes are proteins identified by DC; the former are true essential proteins, and the latter are nonessential proteins. The red nodes and yellow nodes are proteins identified by LBCC; the former are true essential proteins, and the latter are nonessential proteins. The black nodes are the overlapping proteins.

Fig 15. The top 200 proteins in the YMIPS network identified by SC ∪ LBCC.

The green nodes and blue nodes are proteins identified by SC; the former are true essential proteins, and the latter are nonessential proteins. The red nodes and yellow nodes are proteins identified by LBCC; the former are true essential proteins, and the latter are nonessential proteins.

Fig 16. The top 189 proteins in the YMBD network identified by LAC ∪ LBCC.

The green nodes and blue nodes are proteins identified by LAC; the former are true essential proteins, and the latter are nonessential proteins. The red nodes and yellow nodes are proteins identified by LBCC; the former are true essential proteins, and the latter are nonessential proteins. The black nodes are the overlapping proteins.

Fig 17. The top 182 proteins in the YMBD network identified by LIDC ∪ LBCC.

The green nodes and blue nodes are proteins identified by LIDC; the former are true essential proteins, and the latter are nonessential proteins. The red nodes and yellow nodes are proteins identified by LBCC; the former are true essential proteins, and the latter are nonessential proteins. The black nodes are the overlapping proteins.

Fig 18. The top 195 proteins in the YHQ network identified by BC ∪ LBCC.

The green nodes and blue nodes are proteins identified by BC; the former are true essential proteins, and the latter are nonessential proteins. The red nodes and yellow nodes are proteins identified by LBCC; the former are true essential proteins, and the latter are nonessential proteins. The black nodes are the overlapping proteins.

Fig 19. The top 163 proteins in the YHQ network identified by NC ∪ LBCC.

The green nodes and blue nodes are proteins identified by NC; the former are true essential proteins, and the latter are nonessential proteins. The red nodes and yellow nodes are proteins identified by LBCC; the former are true essential proteins, and the latter are nonessential proteins. The black nodes are the overlapping proteins.

Fig 20. The top 200 proteins in the YDIP network identified by EC ∪ LBCC.

The green nodes and blue nodes are proteins identified by EC; the former are true essential proteins, and the latter are nonessential proteins. The red nodes and yellow nodes are proteins identified by LBCC; the former are true essential proteins, and the latter are nonessential proteins.

Fig 21. The top 196 proteins in the YDIP network identified by DC ∪ LBCC.

The green nodes and blue nodes are proteins identified by DC; the former are true essential proteins, and the latter are nonessential proteins. The red nodes and yellow nodes are proteins identified by LBCC; the former are true essential proteins, and the latter are nonessential proteins. The black nodes are the overlapping proteins.

Fig 22. The number of true essential proteins predicted by LBCC and the other seven previously proposed methods for the HDIP network.
Fig 23. PR curves of LBCC and the other seven previously proposed methods for the HDIP network.
Fig 24. Jackknife curves of LBCC and the other seven previously proposed methods for the HDIP network.
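The captions above refer to three kinds of evaluation: counts of true essential proteins among top-ranked candidates, PR curves, and jackknife curves. The page does not define them, so the sketch below uses the commonly used definitions as an assumption about what the figures plot: a jackknife curve is taken as the cumulative number of true essential proteins as the ranked list grows, and a PR curve as precision against recall at each cut-off. Function names and the toy identifiers in the usage note are hypothetical.

```python
def rank(scores):
    """Proteins ordered by descending score (ties broken by name)."""
    return sorted(scores, key=lambda p: (-scores[p], p))

def top_k_hits(ranking, essential, k):
    """Number of true essential proteins among the k top-ranked proteins."""
    return sum(1 for p in ranking[:k] if p in essential)

def jackknife_curve(ranking, essential):
    """Cumulative count of true essential proteins along the ranked list."""
    curve, hits = [], 0
    for p in ranking:
        hits += p in essential   # True counts as 1
        curve.append(hits)
    return curve

def precision_recall_points(ranking, essential):
    """(recall, precision) after each cut-off position in the ranking."""
    total = len(essential)
    if total == 0:
        return []
    return [
        (hits / total, hits / k)
        for k, hits in enumerate(jackknife_curve(ranking, essential), start=1)
    ]
```

For example, with the ranking `["p1", "p2", "p3", "p4"]` and the known essential set `{"p1", "p3"}`, the jackknife curve is `[1, 1, 2, 2]`, and the final PR point has recall 1.0 and precision 0.5.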

Grants and funding

This work was supported by the National Natural Science Foundation of China (www.nsfc.gov.cn) under grants No. 61572005 (YQS, CQ), No. 61562066 (YQS), and No. 61272004 (YQS). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.