Orthologous gene clusters and taxon signature genes for viruses of prokaryotes
David M Kristensen et al. J Bacteriol. 2013 Mar.
Abstract
Viruses are the most abundant biological entities on earth and encompass a vast amount of genetic diversity. The recent rapid increase in the number of sequenced viral genomes has created unprecedented opportunities for gaining new insight into the structure and evolution of the virosphere. Here, we present an update of the phage orthologous groups (POGs), a collection of 4,542 clusters of orthologous genes from bacteriophages that now also includes viruses infecting archaea and encompasses more than 1,000 distinct virus genomes. Analysis of this expanded data set shows that the number of POGs keeps growing without saturation and that a substantial majority of the POGs remain specific to viruses, lacking homologues in prokaryotic cells, outside known proviruses. Thus, the great majority of virus genes apparently remains to be discovered. A complementary observation is that numerous viral genomes remain poorly, if at all, covered by POGs. The genome coverage by POGs is expected to increase as more genomes are sequenced. Taxon-specific, single-copy signature genes that are not observed in prokaryotic genomes outside detected proviruses were identified for two-thirds of the 57 taxa (those with genomes available from at least 3 distinct viruses), with half of these present in all members of the respective taxon. These signatures can be used to specifically identify the presence and quantify the abundance of viruses from particular taxa in metagenomic samples and thus gain new insights into the ecology and evolution of viruses in relation to their hosts.
Figures
![Fig 1](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4977/3571318/c1861a6bf81b/zjb9990924260001.gif)
Proportions of prokaryotic virus types (dsDNA phages, dsDNA archaeal viruses, ssDNA, ssRNA, or dsRNA) in the data set and distribution of the number of protein-coding genes in virus genomes. The inset shows in more detail the part of the distribution that includes small virus genomes with <20 protein-coding genes.
![Fig 2](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4977/3571318/431028245829/zjb9990924260002.gif)
Functions and sizes of the 20 largest POGs. When the number of proteins (dark blue) is greater than the number of organisms (light blue), the excess is due to paralogy.
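The relationship between the two bar colors in this caption can be sketched in a few lines of Python. This is only an illustration: the `(pog_id, genome_id, protein_id)` tuple layout and the name `pog_sizes` are assumptions made here, not the format of the published POG files.

```python
from collections import defaultdict


def pog_sizes(members):
    """Summarize each POG from (pog_id, genome_id, protein_id) tuples.

    The tuple layout is a hypothetical input format chosen for this sketch.
    """
    proteins = defaultdict(set)
    genomes = defaultdict(set)
    for pog, genome, protein in members:
        proteins[pog].add(protein)
        genomes[pog].add(genome)
    return {
        pog: {
            "proteins": len(proteins[pog]),
            "organisms": len(genomes[pog]),
            # Proteins beyond one per genome are attributed to paralogy.
            "paralog_excess": len(proteins[pog]) - len(genomes[pog]),
        }
        for pog in proteins
    }
```

A POG with three proteins drawn from two genomes would therefore report a paralog excess of one.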
![Fig 3](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4977/3571318/4475e65d8759/zjb9990924260003.gif)
Distribution of the number of organisms in POGs, with the inset using a log scale on the y axis. The color scheme is the same as that for Fig. 1.
![Fig 4](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4977/3571318/4ba4b992a762/zjb9990924260004.gif)
Network of phage genomes. The genomes of each phage are represented as boxes, which are colored according to the indicated taxonomic affiliation (type of dsDNA and with bacteria as their host except where specified otherwise), with connections drawn between genomes that share at least one POG. The distances between genomes are inversely proportional to the number of genes shared between neighbors. The inset is a zoomed-in region of the tightly connected subnetwork among the tailed phages.
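As a rough illustration of the edge rule described in this caption, the following Python sketch connects two genomes whenever they share at least one POG and records how many they share. The input mapping and the function name `shared_pog_edges` are assumptions for illustration; the layout step that turns shared counts into drawn distances is not reproduced here.

```python
from itertools import combinations


def shared_pog_edges(genome_pogs):
    """Return {(genome_a, genome_b): shared_count} for pairs sharing >= 1 POG.

    genome_pogs maps a genome ID to the set of POG IDs it contains
    (a hypothetical input format for this sketch).
    """
    edges = {}
    for a, b in combinations(sorted(genome_pogs), 2):
        shared = len(genome_pogs[a] & genome_pogs[b])
        if shared >= 1:
            edges[(a, b)] = shared
    return edges
```

A layout tool could then place strongly connected genomes closer together, for example by using a target edge length that shrinks as the shared count grows; that scaling is a design choice left open in this sketch.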
![Fig 5](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4977/3571318/eab0ebe343c6/zjb9990924260005.gif)
Distribution of the frequency of POGs with the indicated range of VQ. The inset shows the y axis on a log scale. The color scheme is the same as that for Fig. 1.
![Fig 6](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4977/3571318/0b9cdcfe25dc/zjb9990924260006.gif)
Number and percentage of taxa that can be represented by at least one signature gene, with precision fixed at 100% and recall (x axis) allowed to vary. (a) The dependence of signatures on VQ value. (b) Breakdown of signatures into taxonomic levels.
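The precision and recall criteria in this caption can be sketched as follows, under stated assumptions. Here precision is taken as the fraction of genomes carrying a POG that belong to the taxon, and recall as the fraction of the taxon's genomes that carry it; the input dictionaries, the function name `signature_candidates`, and its parameters are hypothetical. The additional requirement that signatures be absent from prokaryotic genomes outside proviruses is not modeled.

```python
def signature_candidates(copies, taxon_of, min_genomes=3, min_recall=1.0):
    """Find single-copy POGs with 100% precision for each taxon.

    copies:   {genome_id: {pog_id: copy_count}}  (hypothetical layout)
    taxon_of: {genome_id: taxon_name}            (hypothetical layout)
    Returns {taxon: [(pog_id, recall), ...]} sorted by descending recall.
    """
    taxon_genomes = {}
    for genome, taxon in taxon_of.items():
        taxon_genomes.setdefault(taxon, set()).add(genome)

    pog_genomes = {}
    multi_copy = set()
    for genome, counts in copies.items():
        for pog, n in counts.items():
            if n < 1:
                continue
            pog_genomes.setdefault(pog, set()).add(genome)
            if n > 1:
                multi_copy.add(pog)  # not single-copy in some genome

    result = {}
    for taxon, members in taxon_genomes.items():
        if len(members) < min_genomes:
            continue  # too few genomes to evaluate this taxon
        hits = []
        for pog, carriers in pog_genomes.items():
            if pog in multi_copy:
                continue
            if not carriers <= members:
                continue  # found outside the taxon: precision < 100%
            recall = len(carriers) / len(members)
            if recall >= min_recall:
                hits.append((pog, recall))
        result[taxon] = sorted(hits, key=lambda h: (-h[1], h[0]))
    return result
```

Lowering `min_recall` below 1.0 admits signatures that are present in only part of a taxon, which corresponds to moving along the x axis of the figure.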