Experimental determination and system level analysis of essential genes in Escherichia coli MG1655
J Bacteriol. 2003 Oct;185(19):5673-84.
doi: 10.1128/JB.185.19.5673-5684.2003.
S Y Gerdes, M D Scholle, J W Campbell, G Balázsi, E Ravasz, M D Daugherty, A L Somera, N C Kyrpides, I Anderson, M S Gelfand, A Bhattacharya, V Kapatral, M D'Souza, M V Baev, Y Grechkin, F Mseeh, M Y Fonstein, R Overbeek, A-L Barabási, Z N Oltvai, A L Osterman
- PMID: 13129938
- PMCID: PMC193955
S Y Gerdes et al. J Bacteriol. 2003 Oct.
Abstract
Defining the gene products that play an essential role in an organism's functional repertoire is vital to understanding the system level organization of living cells. We used a genetic footprinting technique for a genome-wide assessment of genes required for robust aerobic growth of Escherichia coli in rich media. We identified 620 genes as essential and 3,126 genes as dispensable for growth under these conditions. Functional context analysis of these data allows individual functional assignments to be refined. Evolutionary context analysis demonstrates a significant tendency of essential E. coli genes to be preserved throughout the bacterial kingdom. Projection of these data over metabolic subsystems reveals topologic modules with essential and evolutionarily preserved enzymes with reduced capacity for error tolerance.
Figures

Distribution of transposon insertion densities, densities of essential genes, and ERIs along the E. coli chromosome. (A) Gray lines show the transposon insertion densities calculated as the number of transposition events per 100-kb sliding window over the entire E. coli MG1655 chromosome. Values indicated by the blue lines were computed in a similar manner, except that all chromosomal regions corresponding to essential and ambiguous genes were excluded from the calculations in order to reconstruct insert distribution prior to selective outgrowth (see also Materials and Methods). Gaps in the data (chromosomal regions where transposition events could not be detected due to technical reasons) are indicated by short vertical lines along the x axis. These regions were excluded from all analyses. Nucleotide positions of the E. coli genome sequence correspond to those in reference . The regions where the distributions of transposition events significantly deviate (P < 0.01) from a Poisson process are marked by horizontal green lines. oriC shows the origin of chromosomal replication, and dif denotes the dif locus within the replication termination area. (B) Distribution of essential genes along the E. coli chromosome, defined as a percentage of essential genes in the total number of genes within a 100-kb-long chromosomal region (calculated per sliding window as described above). The regions where the numbers of essential genes significantly deviate (P < 0.01) from values that could arise by chance are marked by horizontal green lines. (C) ERIs along the E. coli chromosome, defined as the average ERI for all genes within each 100-kb region. The ERI for a gene is defined as the fraction of organisms in a diverse set of 33 bacterial species which contain an ortholog of the gene in their genomes.
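The two quantities defined in this caption, the per-window insertion count and the ERI, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the window step size, and the toy inputs are assumptions, and the chromosome is treated as circular so that windows near the end wrap around to the start.

```python
from bisect import bisect_left

def window_counts(positions, genome_length, window=100_000, step=10_000):
    """Count insertion events in each sliding window of a circular chromosome.

    `step` is a hypothetical parameter; the caption only gives the window size.
    Assumes window <= genome_length.
    """
    pos = sorted(p % genome_length for p in positions)
    # Append a copy shifted by one genome length so windows can wrap past the end.
    extended = pos + [p + genome_length for p in pos]
    counts = []
    for start in range(0, genome_length, step):
        lo = bisect_left(extended, start)
        hi = bisect_left(extended, start + window)
        counts.append((start, hi - lo))
    return counts

def eri(ortholog_presence):
    """ERI of one gene: fraction of reference genomes that contain an ortholog."""
    return sum(ortholog_presence) / len(ortholog_presence)

def mean_eri(eri_values):
    """Average ERI over the genes that fall in one chromosomal region."""
    return sum(eri_values) / len(eri_values)

# Toy example: three insertions on a 100-bp "chromosome", 20-bp windows.
toy = window_counts([5, 15, 95], genome_length=100, window=20, step=10)
```

With 33 reference genomes, a gene whose ortholog is found in 11 of them has an ERI of 11/33, or about 0.33.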

Essentiality of genes controlling amino acid biosynthesis in E. coli. (A) Functional overview of amino acid biosynthesis. Each block represents one or more pathways leading to production of a particular amino acid or its key intermediates (shown in smaller boxes). Within each block, stacked bars represent the gene products involved in the pathway (according to SWISS-PROT release of June 2002). Bars are colored according to gene essentiality (green, nonessential; red, essential; gray, undefined). (B) Detailed representation of the lysine biosynthetic pathway. Genes predicted in the ERGO database to be paralogs in this pathway are shown, in addition to genes whose roles in the biosynthesis of lysine have been experimentally verified (in bold).

Distribution of E. coli genes as a function of ERIs. (A) Total number of genes with an ERI above the threshold plotted versus the ERI threshold. Color coding within bars represents fractions of essential (red), nonessential (green), ambiguous (yellow), and missing (gray) genes for each incremental increase of ERI threshold (with 33 diverse genomes in the reference set). (B) Fractions of essential genes at different ERI values. The data were fitted with the following function: $y = y_0 + a e^{bx}$, where $y_0$ is 12.0 ± 0.9, $a$ is 0.023 ± 0.019, and $b$ is 7.8 ± 0.8 (dashed red line). The dotted line represents the fractions of essential genes for the whole genome. (The fractions plotted are defined as the number of essential genes versus the number of essential (E) and nonessential (N) genes. Unknown or ambiguous genes are not taken into account.)
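The fitted curve and the E/(E+N) fraction from this caption can be evaluated directly. The sketch below uses only the central parameter values and ignores their stated uncertainties. It assumes that y is expressed in percent (the magnitude of the offset suggests this, but the caption does not say so) and that the whole-genome fraction can be computed from the gene counts given in the abstract. The function names are illustrative.

```python
import math

# Central values of the fit parameters from the caption (uncertainties ignored).
Y0, A, B = 12.0, 0.023, 7.8

def fitted_essential_percent(eri):
    """Evaluate y = y0 + a * exp(b * x) at a given ERI (x between 0 and 1)."""
    return Y0 + A * math.exp(B * eri)

def essential_fraction(n_essential, n_nonessential):
    """Fraction E / (E + N); ambiguous and unknown genes are left out."""
    return n_essential / (n_essential + n_nonessential)

# Gene counts taken from the abstract: 620 essential, 3,126 dispensable.
whole_genome = essential_fraction(620, 3126)
at_full_retention = fitted_essential_percent(1.0)
```

Under these assumptions the whole-genome fraction is about 0.17, while the fitted curve rises from about 12% at an ERI of 0 to roughly 68% at an ERI of 1.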

Distribution of essential genes among functional categories as a function of ERI thresholds. Functional categories are color coded and specified by three-letter designations as in Table 2. Within every threshold group, each bar represents the fraction (percent plotted on y axis) of all categorized essential genes corresponding to the number of essential genes in a given category (x axis) with ERI values above the set threshold (z axis).

E. coli genes found to be essential and preserved in over 80% of diverse bacterial genomes (ERI > 0.8). These universal essential genes are grouped by functional categories (described in Table 2). NTP, nucleotide triphosphate; FMN, flavin mononucleotide; FAD, flavin adenine dinucleotide; CoA, coenzyme A; TCA, tricarboxylic acid cycle; PRPP, phosphoribosyl pyrophosphate.

The evolutionary retention and essentiality ratio of enzymes in the topologic modules of E. coli metabolism. The hierarchical tree derived from the topologic overlap matrix of E. coli metabolism that quantifies the relation between the various modules is shown, as previously described (28). The branches of the tree are color coded according to the fraction of essential enzymes (top panel) and the average ERI score of enzymes (bottom panel) catalyzing the biochemical reactions within a given topologic module. Red indicates a 100% essentiality/conservation ratio within a module. Note that essentiality is not uniformly distributed across all modules (branches), but we observe a few small modules with very high fractions of essential enzymes, while the majority of modules contain no or only a few essential enzymes. A similar segregation of modules with high evolutionary conservation is observed in the second panel, with their locations often correlating with those of the high essentiality modules. The predominant biochemical classes of substrates used to group the metabolites are shown. Polysacch., polysaccharide; disacch., disaccharide; monosacch., monosaccharide; met. sugar alc., metabolic sugar alcohols.
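The two values used to color each branch, the fraction of essential enzymes and the average ERI within a module, amount to simple per-module aggregates. The sketch below is only an illustration of that bookkeeping: the module names and enzyme records are made-up toy data, and the data layout is an assumption rather than the authors' format.

```python
def module_summary(modules):
    """Return {module: (fraction_essential, mean_eri)} for each module.

    `modules` maps a module name to a list of (is_essential, eri) pairs,
    one pair per enzyme. Empty modules are skipped.
    """
    summary = {}
    for name, enzymes in modules.items():
        if not enzymes:
            continue
        n = len(enzymes)
        fraction_essential = sum(1 for essential, _ in enzymes if essential) / n
        mean_eri = sum(eri for _, eri in enzymes) / n
        summary[name] = (fraction_essential, mean_eri)
    return summary

# Hypothetical toy input, not data from the paper.
toy_modules = {
    "module_a": [(True, 1.0), (True, 0.5)],
    "module_b": [(False, 0.25), (False, 0.75), (True, 0.5), (False, 0.5)],
}
toy_summary = module_summary(toy_modules)
```

A module in which every enzyme is essential gets a fraction of 1.0, which corresponds to the red (100%) end of the color scale described in the caption.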
Similar articles
- Scholle MD, Gerdes SY. Methods Mol Biol. 2008;416:83-102. doi: 10.1007/978-1-59745-321-9_6. PMID: 18392962
- Genome-scale identification of conditionally essential genes in E. coli by DNA microarrays.
  Tong X, Campbell JW, Balázsi G, Kay KA, Wanner BL, Gerdes SY, Oltvai ZN. Biochem Biophys Res Commun. 2004 Sep 10;322(1):347-54. doi: 10.1016/j.bbrc.2004.07.110. PMID: 15313213
- Selection analyses of insertional mutants using subgenic-resolution arrays.
  Badarinarayana V, Estep PW 3rd, Shendure J, Edwards J, Tavazoie S, Lam F, Church GM. Nat Biotechnol. 2001 Nov;19(11):1060-5. doi: 10.1038/nbt1101-1060. PMID: 11689852
- Gene expression analysis of the response by Escherichia coli to seawater.
  Rozen Y, Larossa RA, Templeton LJ, Smulski DR, Belkin S. Antonie Van Leeuwenhoek. 2002 Aug;81(1-4):15-25. doi: 10.1023/a:1020500821856. PMID: 12448701 Review.
- [Systematic analysis of the functions of Escherichia coli genes].
  Kato J, Yamamoto Y, Miki T. Tanpakushitsu Kakusan Koso. 2001 Dec;46(16 Suppl):2386-92. PMID: 11802399 Review. Japanese. No abstract available.
Cited by
- A statistical framework for improving genomic annotations of prokaryotic essential genes.
  Deng J, Su S, Lin X, Hassett DJ, Lu LJ. PLoS One. 2013;8(3):e58178. doi: 10.1371/journal.pone.0058178. Epub 2013 Mar 8. PMID: 23520492 Free PMC article.
- Gray AN, Henderson-Frost JM, Boyd D, Sharafi S, Niki H, Goldberg MB. mBio. 2011 Nov 22;2(6):e00238-11. doi: 10.1128/mBio.00238-11. Print 2011. PMID: 22108384 Free PMC article.
- Genome-wide essential gene identification in Streptococcus sanguinis.
  Xu P, Ge X, Chen L, Wang X, Dou Y, Xu JZ, Patel JR, Stone V, Trinh M, Evans K, Kitten T, Bonchev D, Buck GA. Sci Rep. 2011;1:125. doi: 10.1038/srep00125. Epub 2011 Oct 20. PMID: 22355642 Free PMC article.
- Deutschbauer A, Price MN, Wetmore KM, Shao W, Baumohl JK, Xu Z, Nguyen M, Tamse R, Davis RW, Arkin AP. PLoS Genet. 2011 Nov;7(11):e1002385. doi: 10.1371/journal.pgen.1002385. Epub 2011 Nov 17. PMID: 22125499 Free PMC article.
- Superessential reactions in metabolic networks.
  Barve A, Rodrigues JF, Wagner A. Proc Natl Acad Sci U S A. 2012 May 1;109(18):E1121-30. doi: 10.1073/pnas.1113065109. Epub 2012 Apr 16. PMID: 22509034 Free PMC article.
References
- Anderson, R. P., and J. R. Roth. 1978. Tandem chromosomal duplications in Salmonella typhimurium: fusion of histidine genes to novel promoters. J. Mol. Biol. 119:147-166.
- Badarinarayana, V., P. W. Estep III, J. Shendure, J. Edwards, S. Tavazoie, F. Lam, and G. M. Church. 2001. Selection analyses of insertional mutants using subgenic-resolution arrays. Nat. Biotechnol. 19:1060-1065.
- Blattner, F. R., G. Plunkett III, C. A. Bloch, N. T. Perna, V. Burland, M. Riley, J. Collado-Vides, J. D. Glasner, C. K. Rode, G. F. Mayhew, J. Gregor, N. W. Davis, H. A. Kirkpatrick, M. A. Goeden, D. J. Rose, B. Mau, and Y. Shao. 1997. The complete genome sequence of Escherichia coli K-12. Science 277:1453-1474.
- Csete, M. E., and J. C. Doyle. 2002. Reverse engineering of biological complexity. Science 295:1664-1669.