Patterns and implications of gene gain and loss in the evolution of Prochlorococcus
Comparative Study
doi: 10.1371/journal.pgen.0030231.
Gregory C Kettler, Adam C Martiny, Katherine Huang, Jeremy Zucker, Maureen L Coleman, Sebastien Rodrigue, Feng Chen, Alla Lapidus, Steven Ferriera, Justin Johnson, Claudia Steglich, George M Church, Paul Richardson, Sallie W Chisholm
- PMID: 18159947
- PMCID: PMC2151091
Gregory C Kettler et al. PLoS Genet. 2007 Dec.
Abstract
Prochlorococcus is a marine cyanobacterium that numerically dominates the mid-latitude oceans and is the smallest known oxygenic phototroph. Numerous isolates from diverse areas of the world's oceans have been studied and shown to be physiologically and genetically distinct. All isolates described thus far can be assigned to either a tightly clustered high-light (HL)-adapted clade, or a more divergent low-light (LL)-adapted group. The 16S rRNA sequences of the entire Prochlorococcus group differ by at most 3%, and the four initially published genomes revealed patterns of genetic differentiation that help explain physiological differences among the isolates. Here we describe the genomes of eight newly sequenced isolates and combine them with the first four genomes for a comprehensive analysis of the core (shared by all isolates) and flexible genes of the Prochlorococcus group, and the patterns of loss and gain of the flexible genes over the course of evolution. There are 1,273 genes that represent the core shared by all 12 genomes. They are apparently sufficient, according to metabolic reconstruction, to encode a functional cell. We describe a phylogeny for all 12 isolates by subjecting their complete proteomes to three different phylogenetic analyses. For each non-core gene, we used a maximum parsimony method to estimate which ancestor likely first acquired or lost each gene. Many of the genetic differences among isolates, especially for genes involved in outer membrane synthesis and nutrient transport, are found within the same clade. Nevertheless, we identified some genes defining HL and LL ecotypes, and clades within these broad ecotypes, helping to demonstrate the basis of HL and LL adaptations in Prochlorococcus. Furthermore, our estimates of gene gain events allow us to identify highly variable genomic islands that are not apparent through simple pairwise comparisons. 
These results emphasize the functional roles, especially those connected to outer membrane synthesis and transport, that dominate the flexible genome and set it apart from the core. Besides identifying islands and demonstrating their role throughout the history of Prochlorococcus, reconstruction of past gene gains and losses shows that much of the variability exists at the "leaves of the tree," between the most closely related strains. Finally, the identification of core and flexible genes from this 12-genome comparison is largely consistent with the relative frequency of Prochlorococcus genes found in global ocean metagenomic databases, further closing the gap between our understanding of these organisms in the lab and the wild.
Conflict of interest statement
Competing interests. The authors have declared that no competing interests exist.
Figures

The calculated sizes depend on the number of genomes used in the analysis. If k genomes are selected from 12, there are 12!/(k!(12 − k)!) possible selections from which to calculate the core and pan-genomes. Each possible selection is plotted as a grey point, and the line is drawn through the average. This analysis is based on a similar one in [15].
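The subset enumeration described in this caption can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `core_pan_sizes`, the toy genome names, and the gene identifiers are hypothetical, and each genome is assumed to have been reduced to a set of ortholog-group identifiers.

```python
from itertools import combinations
from math import comb
from statistics import mean


def core_pan_sizes(genomes, k):
    """Return (core_size, pan_size) for every selection of k genomes.

    genomes maps a genome name to the set of gene-family identifiers it holds.
    The core is the intersection of the selected sets; the pan-genome is their union.
    """
    sizes = []
    for subset in combinations(sorted(genomes), k):
        gene_sets = [genomes[name] for name in subset]
        core = set.intersection(*gene_sets)
        pan = set.union(*gene_sets)
        sizes.append((len(core), len(pan)))
    return sizes


# Hypothetical toy data: three genomes instead of the twelve in the study.
toy = {
    "A": {"g1", "g2", "g3"},
    "B": {"g1", "g2", "g4"},
    "C": {"g1", "g3", "g5"},
}

for k in range(1, len(toy) + 1):
    sizes = core_pan_sizes(toy, k)
    # The number of selections is n! / (k! (n - k)!).
    assert len(sizes) == comb(len(toy), k)
    mean_core = mean(core for core, _ in sizes)
    mean_pan = mean(pan for _, pan in sizes)
    print(k, mean_core, mean_pan)
```

Each tuple corresponds to one grey point in the plot, and the per-k means correspond to the line drawn through the average. With 12 genomes the largest number of selections occurs at k = 6, where `comb(12, 6)` gives 924.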

(A) 16S rRNA and (B) 16S-23S rRNA ITS region reconstructed with maximum parsimony, neighbor-joining, and maximum likelihood. Numbers represent bootstrap values (100 resamplings). (C) Maximum parsimony reconstruction of random concatenation of 100 protein sequences sampled from core genome. Values represent average bootstrap values (100 resamplings) from 100 random concatenation runs. (D) Consensus tree of all core genes using maximum parsimony on protein sequence alignments. Values represent fraction of genes supporting each node. (E) Genome phylogeny based on gene content using the approach of [34]. Values represent bootstrap values from 100 resamplings.

The ancestor node in which a gain or loss event took place was estimated by maximum parsimony. Four marine Synechococcus genomes (not shown) were included in the calculation, and the phylogenetic tree from Figure 2C was rooted between the Synechococcus and Prochlorococcus lineages. (A) The total number of genes gained and lost at each node. (B) The loss and gain of genes that could be assigned functional roles through homology. Note that (B) focuses on the small minority of genes that do have an assigned function. Genes were assigned to one of five categories on the basis of keyword matches against the gene name or COG description. “Other Putative Function” refers to genes with assigned function but not belonging to the four major categories. Note the difference in scale for (A and B).
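One common way to place gain and loss events on a rooted tree by maximum parsimony is Fitch's algorithm, sketched below for the presence or absence of a single gene. The caption does not say which parsimony variant or software the authors used, so this is an assumption made for illustration. The nested-tuple tree encoding, the function names, the strain labels, and the tie-breaking rule at an ambiguous root (assume absence) are likewise hypothetical.

```python
def leaves(node):
    """Return the leaf names under a node of a nested-tuple binary tree."""
    if isinstance(node, str):
        return (node,)
    left, right = node
    return leaves(left) + leaves(right)


def state_set(node, presence):
    """Bottom-up Fitch pass: candidate states (True = gene present) for a node."""
    if isinstance(node, str):
        return {presence[node]}
    left, right = (state_set(child, presence) for child in node)
    # Intersection if the children agree on something, otherwise union.
    return (left & right) or (left | right)


def gain_loss_events(node, presence, parent_state=None):
    """Top-down pass: choose one state per node and report changes on branches."""
    candidates = state_set(node, presence)
    if parent_state in candidates:
        state = parent_state  # keep the parent's state when that is parsimonious
    elif len(candidates) == 1:
        state = next(iter(candidates))
    else:
        state = False  # ambiguous root: assume absence (illustrative tie-break)
    events = []
    if parent_state is not None and state != parent_state:
        events.append(("gain" if state else "loss", leaves(node)))
    if not isinstance(node, str):
        for child in node:
            events += gain_loss_events(child, presence, state)
    return events


# Hypothetical four-strain tree and gene distribution.
tree = ((("A", "B"), "C"), "D")
print(gain_loss_events(tree, {"A": True, "B": True, "C": False, "D": False}))
print(gain_loss_events(tree, {"A": False, "B": True, "C": True, "D": True}))
```

The first call reports a single gain on the branch leading to the ancestor of A and B; the second reports a single loss on the branch leading to A. Repeating this for every non-core gene and tallying events per node would give totals of the kind plotted in panel (A).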

The dot plots indicate the location on the chromosome and the ancestor node in which the gene is estimated to be gained. The color indicates where the best match was found. In MIT9301, the shaded regions are islands as defined by [60]. Gained genes are defined for each node as in Figure 3. The lower plot is the number of genes gained in a sliding window (size 10,000 bp, interval 1,000 bp) along the chromosome.
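The sliding-window count in the lower plot can be sketched as follows, using the window size and interval given in the caption. The function name `sliding_window_counts` and the example coordinates are hypothetical. The sketch also assumes that each gene is represented by a single start coordinate and that the chromosome is treated as linear, with no wraparound at the origin; the caption does not specify either point.

```python
from bisect import bisect_left


def sliding_window_counts(positions, genome_length, window=10_000, step=1_000):
    """Count gained genes whose start falls in each window [start, start + window)."""
    ordered = sorted(positions)
    counts = []
    last_start = max(genome_length - window, 0)
    for start in range(0, last_start + 1, step):
        lo = bisect_left(ordered, start)           # first gene at or after window start
        hi = bisect_left(ordered, start + window)  # first gene at or after window end
        counts.append((start, hi - lo))
    return counts


# Hypothetical start coordinates of gained genes on a 30 kb stretch.
gained = [500, 1_500, 9_999, 10_000, 25_000]
for start, count in sliding_window_counts(gained, genome_length=30_000):
    print(start, count)
```

Peaks in the resulting counts mark stretches where many gained genes cluster, which is how islands would show up in such a plot.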

(A) Frequency distribution of GOS hits per gene, using genes in the Prochlorococcus MIT9301 genome as queries. Most core genes retrieve a similar number of GOS hits, as one would expect from single copy genes shared by all Prochlorococcus, resulting in a relatively tight frequency distribution. In contrast, flexible genes retrieve a broad range of GOS hits per gene, consistent with their scattered distribution among genomes. (B) The number of GOS hits per gene, again using MIT9301 genes as queries, plotted against position along the chromosome. Shaded regions represent genomic islands, after [60]. Flexible genes with low representation in the GOS dataset tend to be located in genomic islands. In both (A) and (B), the number of GOS hits per gene is normalized to gene length and plotted as hits per gene, per 1,000 bp.
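The length normalization mentioned at the end of this caption amounts to simple arithmetic, shown here as a small sketch. The function name `hits_per_kb` and the example numbers are hypothetical and are not values from the study.

```python
def hits_per_kb(hit_count, gene_length_bp):
    """Normalize a raw hit count to hits per 1,000 bp of gene length."""
    if gene_length_bp <= 0:
        raise ValueError("gene length must be positive")
    return hit_count * 1000 / gene_length_bp


# A 1,500 bp gene with 300 hits and a 500 bp gene with 100 hits
# have the same normalized coverage.
print(hits_per_kb(300, 1500))
print(hits_per_kb(100, 500))
```

Normalizing this way keeps long genes from appearing better represented merely because they offer more sequence to match.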
Similar articles
- Prabha R, Singh DP, Gupta SK, Rai A. Prabha R, et al. Interdiscip Sci. 2014 Jun;6(2):149-57. doi: 10.1007/s12539-013-0024-9. Epub 2014 Jun 17. Interdiscip Sci. 2014. PMID: 25172453
- Becker JW, Pollak S, Berta-Thompson JW, Becker KW, Braakman R, Dooley KD, Hackl T, Coe A, Arellano A, LeGault KN, Berube PM, Biller SJ, Cubillos-Ruiz A, Van Mooy BAS, Chisholm SW. Becker JW, et al. mBio. 2024 Nov 13;15(11):e0349723. doi: 10.1128/mbio.03497-23. Epub 2024 Oct 18. mBio. 2024. PMID: 39422514 Free PMC article.
- Genomic islands and the ecology and evolution of Prochlorococcus. Coleman ML, Sullivan MB, Martiny AC, Steglich C, Barry K, Delong EF, Chisholm SW. Coleman ML, et al. Science. 2006 Mar 24;311(5768):1768-70. doi: 10.1126/science.1122050. Science. 2006. PMID: 16556843
- A minimum set of regulators to thrive in the ocean. Lambrecht SJ, Steglich C, Hess WR. Lambrecht SJ, et al. FEMS Microbiol Rev. 2020 Mar 1;44(2):232-252. doi: 10.1093/femsre/fuaa005. FEMS Microbiol Rev. 2020. PMID: 32077939 Review.
- On the culture-independent assessment of the diversity and distribution of Prochlorococcus. Mühling M. Mühling M. Environ Microbiol. 2012 Mar;14(3):567-79. doi: 10.1111/j.1462-2920.2011.02589.x. Epub 2011 Sep 30. Environ Microbiol. 2012. PMID: 21957972 Review.
Cited by
- Fu X, Gong L, Liu Y, Lai Q, Li G, Shao Z. Fu X, et al. Front Microbiol. 2021 May 7;12:571212. doi: 10.3389/fmicb.2021.571212. eCollection 2021. Front Microbiol. 2021. PMID: 34025591 Free PMC article.
- Mining genomes of marine cyanobacteria for elements of zinc homeostasis. Barnett JP, Millard A, Ksibe AZ, Scanlan DJ, Schmid R, Blindauer CA. Barnett JP, et al. Front Microbiol. 2012 Apr 11;3:142. doi: 10.3389/fmicb.2012.00142. eCollection 2012. Front Microbiol. 2012. PMID: 22514551 Free PMC article.
- Daakour S, Nelson DR, Fu W, Jaiswal A, Dohai B, Alzahmi AS, Koussa J, Huang X, Shen Y, Twizere JC, Salehi-Ashtiani K. Daakour S, et al. Microorganisms. 2024 Aug 20;12(8):1720. doi: 10.3390/microorganisms12081720. Microorganisms. 2024. PMID: 39203562 Free PMC article.
- Paul S, Dutta A, Bag SK, Das S, Dutta C. Paul S, et al. BMC Genomics. 2010 Feb 10;11:103. doi: 10.1186/1471-2164-11-103. BMC Genomics. 2010. PMID: 20146791 Free PMC article.
- Richter AS, Schleberger C, Backofen R, Steglich C. Richter AS, et al. Bioinformatics. 2010 Jan 1;26(1):1-5. doi: 10.1093/bioinformatics/btp609. Epub 2009 Oct 22. Bioinformatics. 2010. PMID: 19850757 Free PMC article.
References
- Goericke RE, Welschmeyer NA. The marine prochlorophyte Prochlorococcus contributes significantly to phytoplankton biomass and primary production in the Sargasso Sea. Deep Sea Research (Part I, Oceanographic Research Papers) 1993;40:2283–2294.
- Waterbury JB, Watson SW, Valois FW, Franks DG. Biological and ecological characterization of the marine unicellular bacterium Synechococcus. Can Bull Fish Aquat Sci. 1986;214:71–120.
- Moore LR, Rocap G, Chisholm SW. Physiology and molecular phylogeny of coexisting Prochlorococcus ecotypes. Nature. 1998;393:464–467.