From gene trees to organismal phylogeny in prokaryotes: the case of the gamma-Proteobacteria
Emmanuelle Lerat et al. PLoS Biol. 2003 Oct.
Abstract
The rapid increase in published genomic sequences for bacteria presents the first opportunity to reconstruct evolutionary events on the scale of entire genomes. However, extensive lateral gene transfer (LGT) may thwart this goal by preventing the establishment of organismal relationships based on individual gene phylogenies. The group for which cases of LGT are most frequently documented and for which the greatest density of complete genome sequences is available is the gamma-Proteobacteria, an ecologically diverse and ancient group including free-living species as well as pathogens and intracellular symbionts of plants and animals. We propose an approach to multigene phylogeny using complete genomes and apply it to the case of the gamma-Proteobacteria. We first applied stringent criteria to identify a set of likely gene orthologs and then tested the compatibilities of the resulting protein alignments with several phylogenetic hypotheses. Our results demonstrate phylogenetic concordance among virtually all (203 of 205) of the selected gene families, with each of the exceptions consistent with a single LGT event. The concatenated sequences of the concordant families yield a fully resolved phylogeny. This topology also received strong support in analyses aimed at excluding effects of heterogeneity in nucleotide base composition across lineages. Our analysis indicates that single-copy orthologous genes are resistant to horizontal transfer, even in ancient bacterial groups subject to high rates of LGT. This gene set can be identified and used to yield robust hypotheses for organismal phylogenies, thus establishing a foundation for reconstructing the evolutionary transitions, such as gene transfer, that underlie diversity in genome content and organization.
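The ortholog-selection step described above can be illustrated with a minimal sketch. It assumes homolog families have already been built and stored as a mapping from species abbreviation to gene identifiers. The function name, the data layout, and the toy gene IDs are all hypothetical, and the filter shown (exactly one gene per family in every genome) is only one plausible reading of "single-copy orthologous genes", not necessarily the authors' exact criteria.

```python
from typing import Dict, List

# Hypothetical layout: species abbreviation -> gene IDs belonging to one family
Family = Dict[str, List[str]]


def single_copy_families(families: Dict[str, Family], species: List[str]) -> List[str]:
    """Return the names of families with exactly one gene in every listed species."""
    kept = []
    for name, members in families.items():
        # A family qualifies only if each species contributes exactly one gene.
        if all(len(members.get(sp, [])) == 1 for sp in species):
            kept.append(name)
    return kept


# Toy data with made-up gene IDs (assumption, for illustration only)
species = ["Ec", "St", "Vc"]
families = {
    "famA": {"Ec": ["ec1"], "St": ["st1"], "Vc": ["vc1"]},         # single copy everywhere
    "famB": {"Ec": ["ec2", "ec3"], "St": ["st2"], "Vc": ["vc2"]},  # duplicated in Ec
    "famC": {"Ec": ["ec4"], "St": ["st3"]},                        # missing from Vc
}
selected = single_copy_families(families, species)
print(selected)
```

Families with paralogs or with a missing species are discarded here because either situation makes it harder to argue that the remaining genes are true orthologs.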
Conflict of interest statement
The authors have declared that no conflicts of interest exist.
Figures

Figure 1. (A) Distribution of the number of genes contained in the homolog families. (B) Number of orphan genes in each species in parentheses. Abbreviations: Ba, Buchnera aphidicola; Ec, Escherichia coli; Hi, Haemophilus influenzae; Pa, Pseudomonas aeruginosa; Pm, Pasteurella multocida; St, Salmonella typhimurium; Vc, Vibrio cholerae; Wb, Wigglesworthia brevipalpis; Xa, Xanthomonas axonopodis; Xc, Xanthomonas campestris; Xf, Xylella fastidiosa; Yp CO92, Yersinia pestis CO92; Yp KIM, Yersinia pestis KIM. (C) Distribution of the number of species contained in the homolog families.

Figure 2. Topologies 1–4 correspond to tree reconstructions based on SSU rRNA. Topologies 5 and 6 correspond to the trees based on the concatenation of the proteins. Topologies 7–13 correspond to additional topologies constructed to test the sister relationship of the two symbiont species. Species abbreviations as in Figure 1. Abbreviations: ML, maximum likelihood; NJ, neighbor joining; K, Kimura distance; G&G, Galtier and Gouy distance; γ, gamma-based method for correcting the rate heterogeneity among sites. The position of the root corresponds to the one obtained repeatedly using SSU rRNA.

Figure 3. The graph shows the number of alignments accepting or rejecting each topology. The “Other Topologies” are those built to test the sister relationship of Wigglesworthia and Buchnera. The “Proteins” topologies are those obtained using both the protein concatenation and the consensus of trees from all 205 alignments. The “SSU rRNA” topologies were obtained using the SSU rRNA sequences with different methods.

Figure 4. (A) ML trees obtained for BioB (left) and MviN (right). (B) NJ trees obtained for BioB (left) and MviN (right). Abbreviations: Pf, Pseudomonas fluorescens; Pp, Pseudomonas putida; Ps, Pseudomonas syringae. Other species abbreviations as in Figure 1.

Figure 5. The topology shown agrees with almost all individual gene alignments (topology 5 of Figure 2). The same tree is obtained after removing the two genes showing evidence for LGT. The position of the root corresponds to the one obtained repeatedly using SSU rRNA.

Figure 6. Frequency distribution of the ratio (bit score/maximal bit score) in a BLASTP query of the proteins from E. coli on the proteins from the genomes of Salmonella enterica (solid line) and Vibrio cholerae (dashed line). The ratio of 0.3 allows identification of most homologs but excludes probable nonspecific matches (NS).
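The threshold in this caption can be sketched as a simple filter over BLASTP hits. The sketch below assumes that the "maximal bit score" is the score of each query protein aligned against itself, and that a hit is kept when its ratio is at or above 0.3. Both points are assumptions, as are the function name, the tuple layout, and the toy identifiers and scores; only the ratio definition and the 0.3 cutoff come from the caption.

```python
from typing import Dict, List, Tuple

THRESHOLD = 0.3  # ratio cutoff stated in the caption

Hit = Tuple[str, str, float]  # (query ID, subject ID, bit score), hypothetical layout


def filter_hits(hits: List[Hit], self_scores: Dict[str, float],
                threshold: float = THRESHOLD) -> List[Tuple[str, str, float]]:
    """Keep hits whose bit score / maximal bit score reaches the threshold.

    self_scores maps each query to its self-hit bit score, which is assumed
    here to be the maximal bit score for that query.
    """
    kept = []
    for query, subject, bits in hits:
        max_bits = self_scores[query]
        if max_bits <= 0:
            continue  # skip queries without a usable self-hit score
        ratio = bits / max_bits
        if ratio >= threshold:
            kept.append((query, subject, ratio))
    return kept


# Toy values (made up): one strong match and one probable nonspecific match
hits = [("q1", "s1", 800.0), ("q1", "s2", 160.0)]
self_scores = {"q1": 1600.0}
kept = filter_hits(hits, self_scores)
print(kept)
```

Normalizing by the self-hit score makes the cutoff comparable across proteins of different lengths, which a raw bit score threshold would not be.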