The genome sequence of the ethanologenic bacterium Zymomonas mobilis ZM4 - Nature Biotechnology
- Kang, Hyen Sam
- Sat Jan 01 2005
General features
The complete genome of Z. mobilis ZM4 consists of a single circular chromosome of 2,056,416 bp with an average G+C content of 46.33% (Table 1 and Supplementary Table 1 online). The 1,998 predicted coding ORFs cover 87% of the genome, and each ORF has an average length of 898 bp. Among these, 1,346 (67.4%) could be assigned putative functions, 258 (12.9%) were matched to conserved hypothetical coding sequences of unknown function and the remaining 394 (19.7%) showed no similarities to known genes. The functions of the predicted ORFs were categorized by comparison with the COG database (Table 2).
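The summary figures above can be cross-checked with simple arithmetic. The following is a minimal sketch using only the numbers quoted in the text; the function names are illustrative, and the coding fraction is approximated as ORF count times mean ORF length, which ignores any overlap between ORFs.

```python
# Figures quoted in the text above.
GENOME_BP = 2_056_416
N_ORFS = 1_998
MEAN_ORF_BP = 898
CATEGORIES = {
    "putative function": 1_346,
    "conserved hypothetical": 258,
    "no similarity": 394,
}


def coding_fraction(n_orfs, mean_len_bp, genome_bp):
    """Approximate coding fraction, ignoring overlap between ORFs."""
    return n_orfs * mean_len_bp / genome_bp


def category_percentages(counts):
    """Percentage of all ORFs in each category, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


if __name__ == "__main__":
    fraction = coding_fraction(N_ORFS, MEAN_ORF_BP, GENOME_BP)
    print(f"approximate coding fraction: {fraction:.1%}")
    for name, pct in category_percentages(CATEGORIES).items():
        print(f"{name}: {pct}%")
```

The three categories sum to the 1,998 predicted ORFs, and the approximate coding fraction agrees with the reported 87%.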
Of the 0.84% of the genome that encodes stable RNA, 51 genes encode transfer RNAs, corresponding to 42 different isoacceptor-tRNA species. Of the three ribosomal RNA transcriptional units, rrnA is located at coordinate 140,000, rrnB at 360,000 and rrnC at 520,000; all three are transcribed in the predicted direction of replication.
The replication origin predicted by calculating GC-skew values, (G−C)/(G+C) (ref. 16; Fig. 1), closely coincided with a 656-bp region containing one copy of a likely site (5′-GATCTNTTNTTTT-3′) for initial DNA unwinding and eight copies of probable sites (5′-TTATNCANA-3′) for DnaA binding. Genes such as parA and parB, which are involved in chromosome partitioning, and gidA and gidB, the glucose-inhibited division genes, were also located near the origin, as has often been observed in other bacterial genomes (ref. 17).
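The two consensus sites quoted above contain the wildcard N (any base), so locating them in a sequence is a degenerate-motif search. The sketch below shows one way to do this with regular expressions. It is not the authors' method: the function names are illustrative, only the forward strand is scanned, and the example sequences are made-up toy strings rather than ZM4 sequence.

```python
import re

# Consensus sites quoted in the text; N stands for any base.
UNWINDING_SITE = "GATCTNTTNTTTT"
DNAA_BOX = "TTATNCANA"


def motif_to_regex(motif):
    """Translate a motif with N wildcards into a compiled regex.

    The pattern is wrapped in a lookahead so overlapping sites are found.
    """
    body = "".join("[ACGT]" if base == "N" else base for base in motif)
    return re.compile(f"(?=({body}))")


def find_sites(sequence, motif):
    """Return 0-based start positions of motif matches on the forward strand."""
    pattern = motif_to_regex(motif)
    return [match.start() for match in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    toy = "CC" + "TTATCCACA" + "GG" + "TTATACAGA" + "TT"  # toy sequence
    print(find_sites(toy, DNAA_BOX))
```

A fuller search would also scan the reverse complement, since binding sites can lie on either strand.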
Figure 1. The putative origin of replication is around 0 kb. The outer scale indicates the coordinates in base pairs. The distribution of genes is shown on the first two rings within the scale according to the direction of the reading frame. The locations of rRNA and tRNA genes are shown by green dots and black dots, respectively. Putative transposases are shown by red dots. The next circle shows GC content values. Cyan and green colors indicate positive and negative signs, respectively. The central circle shows GC-skew values, (G−C)/(G+C), of the third bases of codons measured over the genome. Yellow and purple colors denote positive and negative signs, respectively. The window size was 10,000 nucleotides and the step size was 1,000 nucleotides.
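The GC-skew statistic, (G−C)/(G+C), computed in sliding windows can be sketched as follows. The window and step defaults are the values stated in the legend. Everything else is an assumption made for illustration: the function names are hypothetical, the sequence is treated as circular, and all bases in a window are counted, whereas the figure uses only the third bases of codons.

```python
def gc_skew(window):
    """Return (G - C) / (G + C) for a sequence window, or 0.0 if no G or C."""
    g = window.count("G")
    c = window.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def sliding_gc_skew(sequence, window=10_000, step=1_000):
    """Yield (start, skew) pairs over a circular sequence.

    Windows that run past the end wrap around to the start.
    """
    seq = sequence.upper()
    n = len(seq)
    if window > n:
        raise ValueError("window must not exceed sequence length")
    extended = seq + seq[:window]  # allow windows to wrap around
    for start in range(0, n, step):
        yield start, gc_skew(extended[start:start + window])


if __name__ == "__main__":
    toy = "G" * 50 + "C" * 50  # toy sequence with one skew sign change
    for start, skew in sliding_gc_skew(toy, window=10, step=10):
        print(start, skew)
```

In a real chromosome, the points where the skew changes sign are commonly used as candidate positions for the replication origin and terminus.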
Comparison with other sequenced genomes
Comparison of the amino acid sequences of the Z. mobilis ZM4 ORFs with those of other organisms revealed that 768 of the 1,668 ORFs listed in the COG database are most similar to the corresponding ORFs of Novosphingobium aromaticivorans (Supplementary Table 2 online). This is in line with a previous phylogenetic study based on the 16S ribosomal RNA sequence, which placed Z. mobilis ZM4 in the Sphingomonas spp. group (ref. 15). In particular, the ORFs classified into COG category J (translation, ribosomal structure and biogenesis) and category D (cell division and chromosome partitioning) showed high similarity to those of N. aromaticivorans. In contrast, only 2 of the 40 ORFs classified into COG category N (cell motility) and 5 of the 25 in category V (defense mechanisms) matched ORFs of N. aromaticivorans.
General metabolism
Z. mobilis uses glucose, fructose and sucrose anaerobically through the Entner-Doudoroff (ED) pathway, leading to the production of ethanol and CO2 (ref. 1). Analysis of the Z. mobilis genome sequence revealed genes for hexose-metabolizing enzymes such as invertase (ZMO0375, ZMO0942), levansucrase (ZMO0374), glucokinase (ZMO0369), glucose-6-phosphate isomerase (ZMO1212) and glucose-fructose oxidoreductase (ZMO0689) that would enable Z. mobilis to use sucrose, fructose and glucose, and probably also mannose, raffinose and sorbitol. However, there are no obvious genes for using lactose, maltose or cellobiose.
In the ED pathway, glucose-6-phosphate dehydrogenase (zwf, ZMO0367) oxidizes glucose-6-phosphate to 6-phosphogluconolactone. The lactone is hydrolyzed to 6-phosphogluconate by lactonase (ZMO1478). 6-Phosphogluconate is dehydrated by 6-phosphogluconate dehydratase (edd, ZMO0368) to yield 2-keto-3-deoxy-6-phosphogluconate (KDPG). KDPG aldolase (eda, ZMO0997) cleaves KDPG to form pyruvate and glyceraldehyde-3-phosphate (Fig. 2). Glyceraldehyde-3-phosphate is then metabolized via the triose phosphate reactions shared with the Embden-Meyerhof-Parnas (EMP) pathway to yield ethanol and carbon dioxide. Genes for all enzymes of the EMP pathway except 6-phosphofructokinase are present in Z. mobilis (Fig. 2). The zwf and edd genes are clustered with glf (ZMO0366; encodes the glucose facilitated diffusion protein) and glk (ZMO0369; glucokinase), whereas eda is located separately. This contrasts with Escherichia coli, in which zwf, edd and eda are closely linked, although zwf and the edd-eda operon are regulated independently (ref. 17). By using the ED pathway instead of the EMP pathway, Z. mobilis yields only 1 mole of ATP per mole of fermented hexose and produces ethanol at a theoretical yield of 2 moles per mole of substrate. The rapid production and high yield of ethanol as the only sugar fermentation product can be attributed to the presence of pyruvate decarboxylase (ZMO1360), an enzyme not frequently observed in bacteria, and two highly specific alcohol dehydrogenases (ZMO1236, ZMO1596).
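The theoretical yield of 2 moles of ethanol per mole of hexose can be converted to a mass yield with a short calculation. The sketch below assumes the overall fermentation stoichiometry glucose → 2 ethanol + 2 CO2 and standard atomic masses. The EMP figure of 2 ATP per hexose is the usual textbook value and is not stated in the text, and all names are illustrative.

```python
# Standard atomic masses (g/mol), rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

GLUCOSE = {"C": 6, "H": 12, "O": 6}
ETHANOL = {"C": 2, "H": 6, "O": 1}
CO2 = {"C": 1, "O": 2}

ETHANOL_PER_HEXOSE = 2  # mol/mol, from the text
# ED value is from the text; EMP value is a textbook assumption.
ATP_PER_HEXOSE = {"ED": 1, "EMP": 2}


def molar_mass(formula):
    """Molar mass in g/mol for a formula given as {element: count}."""
    return sum(ATOMIC_MASS[element] * n for element, n in formula.items())


def theoretical_mass_yield():
    """Grams of ethanol per gram of glucose at the theoretical molar yield."""
    return ETHANOL_PER_HEXOSE * molar_mass(ETHANOL) / molar_mass(GLUCOSE)


if __name__ == "__main__":
    print(f"theoretical yield: {theoretical_mass_yield():.3f} g ethanol / g glucose")
    print(f"ATP per hexose, ED vs EMP: {ATP_PER_HEXOSE['ED']} vs {ATP_PER_HEXOSE['EMP']}")
```

The mass yield comes out at roughly 0.51 g ethanol per g glucose, with the remaining mass leaving as CO2.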
Genes encoding two enzymes of the tricarboxylic acid cycle, the 2-oxoglutarate dehydrogenase complex and malate dehydrogenase, were not found. However, key building blocks including oxaloacetate, malate, fumarate and succinate have been detected by high-performance liquid chromatography, and Z. mobilis is known to be able to synthesize all essential amino acids except for lysine and methionine. These results strongly indicate that other metabolic pathways are involved in producing oxaloacetate, malate, fumarate and succinate. Oxaloacetate can be produced from phosphoenolpyruvate and CO2 by phosphoenolpyruvate carboxylase (ZMO1496) or from citrate by citrate lyase (ZMO0487: citrate ↔ oxaloacetate + acetate). Malate can be synthesized by carboxylation of pyruvate by malic enzyme (ZMO1955). Fumarate can be produced by fumarate hydratase (ZMO1307). However, evidence for an alternative metabolic pathway for succinate production, such as the glyoxylate cycle, has not yet been found.
Although most genes for the pentose phosphate pathway are missing, all genes encoding enzymes necessary for the synthesis of phosphoribosyl-pyrophosphate, a precursor for purine/pyrimidine metabolism, are present. We also identified all genes required for the de novo biosynthesis of RNA and DNA. Z. mobilis possesses a complete set of genes for the sulfate reduction pathway as well as the genes required for the synthesis of all amino acids, except for one gene in the lysine pathway (yfdZ) and one gene in the methionine pathway (metB). For vitamins, all genes for riboflavin and folate synthesis and most genes for thiamin, ubiquinone, NAD+ and pyridoxal are present. The absence of genes for pantothenate and biotin biosynthesis is in accordance with the known nutritional requirement of Z. mobilis for these vitamins.
Transport systems and motility
We recognized 180 genes encoding transport-related membrane proteins, on the basis of a search of the Transport Protein Database (http://tcdb.ucsd.edu/index.php). The largest number (83) of these proteins were electrochemical potential-driven transporters (class 2), and included 20 involved in iron metabolism, 13 multi-drug resistance exporters, three members of the resistance nodulation cell-division family, eight permeases of the major facilitator superfamily, seven cation transporters, seven amino acid transporters, three nucleoside permeases and four sugar transporters. There are several ORFs for the sec-independent protein secretion pathway and others for the TonB-ExbB-ExbD/TolA-TolQ-TolR (TonB) family of auxiliary proteins for energization of outer membrane receptor–mediated active transport systems. The second most numerous class (55) contained primary active transporters (class 3), including 41 members of the ATP-binding cassette (ABC) transporter superfamily. There were five ORFs for the sec-dependent general secretory pathway, two for type III secretory pathway proteins and four for the type IV secretory pathway. The third largest class (14 members) was the channels/pores (class 1), consisting of five capsule polysaccharide export proteins and two carbohydrate (glucose)-facilitated diffusion proteins. The four remaining classes were group translocators (class 4; 4 ORFs), transport electron carriers (class 5; 3 ORFs), accessory factors involved in transport (class 8; 1 ORF) and incompletely characterized transport systems (class 9; 20 ORFs).
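The per-class counts above can be tallied to confirm that they account for all 180 transport-related genes. This is a minimal sketch using only the numbers in the text; the dictionary keys are shortened labels chosen for illustration.

```python
# Counts per transporter class, as quoted in the text.
TRANSPORTER_CLASSES = {
    "class 1: channels/pores": 14,
    "class 2: electrochemical potential-driven transporters": 83,
    "class 3: primary active transporters": 55,
    "class 4: group translocators": 4,
    "class 5: transport electron carriers": 3,
    "class 8: accessory factors": 1,
    "class 9: incompletely characterized": 20,
}
REPORTED_TOTAL = 180

# Subgroups of class 2 that the text lists explicitly.
CLASS2_LISTED = {
    "iron metabolism": 20,
    "multi-drug resistance exporters": 13,
    "resistance nodulation cell-division family": 3,
    "major facilitator superfamily permeases": 8,
    "cation transporters": 7,
    "amino acid transporters": 7,
    "nucleoside permeases": 3,
    "sugar transporters": 4,
}

if __name__ == "__main__":
    total = sum(TRANSPORTER_CLASSES.values())
    print(f"sum over classes: {total} (reported: {REPORTED_TOTAL})")
    listed = sum(CLASS2_LISTED.values())
    print(f"class 2 subgroups listed in the text: {listed} of 83")
```

The class totals sum to 180, while the class 2 subgroups named in the text cover only part of the 83 members, consistent with the word "included".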
The flagellar cluster consists of 32 ORFs (ZMO0602–ZMO0652: flgABCDEFGHIJKL, flhAB, fliDEFGHIKLMNPQRS, motAB) encoding flagellar structure proteins, motor proteins and biosynthesis proteins. Classical chemotaxis signal transduction genes (cheABDRWY) and methyl-accepting chemotaxis genes (mcpAJ), similar to those in E. coli, were present.
Oxidative stress and respiration
Z. mobilis is a facultative rather than an obligate anaerobe, implying that it must have a defense mechanism against oxidative stress. The best-known reduction-oxidation cycling machinery is the glutathione system. Both glutathione reductase (ZMO1211) and glutathione synthase (ZMO1913) are present, as well as a γ-glutamylcysteine synthetase (ZMO1556). Genes encoding a catalase (ZMO0928), an iron-dependent superoxide dismutase (Fe-SOD; ZMO1060) and two kinds of peroxidases (ZMO1136, ZMO1573), which are thought to be responsible for protection from the toxic effects of superoxide and hydrogen peroxide in most aerobic organisms, are also present.
In addition to the genes that respond to oxidative stress, the genome contains several genes related to the electron transport system, such as those for an Fe-S-cluster redox enzyme (ZMO1032), cytochrome b (ZMO0957), cytochrome c1 (ZMO0958), cytochrome c-type biogenesis proteins (ZMO1252–1256), electron transfer flavoprotein (ZMO1479, ZMO1480) and ubiquinone biosynthesis proteins (ZMO1189, ZMO1669). Genes for electron donor and acceptor modules such as NADH dehydrogenase (ZMO1113), NADH:flavin oxidoreductase (ZMO1885), the NADH:ubiquinone oxidoreductase complex (ZMO1809–1814), nitroreductase (ZMO0678) and fumarate reductase (ZMO0569) were also found. However, genes for cytochrome o and cytochrome d, which use oxygen as the final electron acceptor, appeared to be absent.
Z. mobilis has been reported to have a respiratory electron transport chain (ref. 19) and to show an elevated molar growth yield during exponential aerobic growth (ref. 20). Relative to anaerobic conditions, aerobic growth leads to a decrease in the yield of ethanol and an accumulation of other, less reduced metabolites such as acetaldehyde, acetone and acetate (refs. 21, 22). These results indicate that, in aerobic culture, some NADH is oxidized in the respiratory chain in parallel with the alcohol dehydrogenase reaction.
Stress adaptation
Protein denaturation and aggregation, resulting from exposure to heat or other stresses such as ethanol, are severe problems for cells and are combated by induction of highly conserved heat shock proteins, whose function is to remove or refold damaged cellular proteins (ref. 23). Z. mobilis, an efficient ethanol producer, exhibits very high ethanol tolerance (ref. 3). The Z. mobilis genome contains ORFs for complete sets of heat shock–responsive molecular chaperones, such as DnaK (ZMO0660), DnaJ (ZMO0661, ZMO1069, ZMO1545, ZMO1546, ZMO1690) and GrpE (ZMO0016) of the HSP-70 chaperone complex, GroES (ZMO1928; HSP-10), GroEL (ZMO1929; HSP-60) and HSP-33 (ZMO0410). ATP-dependent heat shock–responsive proteases, such as HslVU (ZMO0246, ZMO0247) and Clp (ZMO0948, ZMO0949, ZMO1424), were also found. As in the well-known E. coli system (ref. 23), genes for the alternative sigma factors sigma-32 (σ32; ZMO0749) and sigma-E (σE; ZMO1404), which direct responses to various stresses, are present. In E. coli, sigma-32 induces a 'classic' set of chaperones, proteases and other heat shock proteins, thereby playing a central role in the heat shock response, whereas sigma-E induces periplasmic proteases, chaperones and sigma-32 in response to specific extracytoplasmic stress. Induction of sigma-32 is turned on when E. coli cells grown at 30 °C are shifted to 42 °C, whereas proteins encoded by the sigma-E regulon are rapidly induced when E. coli cells are exposed to a more extreme temperature (e.g., 50 °C) or to 10% ethanol (ref. 23). We therefore suggest that sigma-E plays a key role in resisting high ethanol concentrations in Z. mobilis. We also found genes for a sigma-E positive regulator (ZMO1842) and a transcriptional regulator of heat shock genes (ZMO0015), two tight regulators of heat shock gene expression.
Appropriate control of gene expression is carried out by a combination of basic transcriptional machinery, including RNA polymerase and sigma factors. Genes for other sigma factors, σ70 (rpoD; ZMO1623), σ54 (rpoN; ZMO0274) and σ28 (fliA; ZMO0626), were also found in the genome of Z. mobilis. We also identified 54 transcriptional activators and repressors.
Higher G+C-content genes found only in strain ZM4
To compare the Z. mobilis ZM4 genome with that of the unsequenced type strain of Z. mobilis (ZM1; ATCC10988), labeled ZM1 and ZM4 genomic DNAs were cohybridized to DNA microarrays containing probes for all the ORFs of Z. mobilis ZM4. Most of the probes on the microarray hybridized equally with both labeled genomic DNAs (Fig. 3a). In addition, the two strains showed similar patterns of gene expression in microarray analysis of cultures at various growth stages (data not shown). The overall genome structures of ZM1 and ZM4 are therefore probably very similar.
Figure 3. (a) Cohybridization of Cy3-labeled ZM1 genomic DNA (green) and Cy5-labeled ZM4 genomic DNA (red) on a Z. mobilis microarray. Arrows indicate extra sequences in strain ZM4. (b) Cohybridization of Cy3-labeled ZM1 cDNA (green) and Cy5-labeled ZM4 cDNA (red) on a Z. mobilis microarray. Most ORFs in the extra sequences of strain ZM4 (the same locations as indicated by the arrows in panel a) were actively expressed (red spots). RNAs were isolated at exponential growth phase.
However, strain ZM4 contains sequences that are absent from ZM1. These sequences comprise 54 genes clustered in five separate regions. The products of the 54 ORFs include four kinds of membrane transport proteins, four kinds of proteins involved in a type IV secretory system, an oxidoreductase related to short-chain alcohol dehydrogenase and several transcriptional regulators (Table 3). Two genes, bcbG (ZMO1299) and bcbE (ZMO1300), encoding capsular polysaccharide biosynthesis proteins, were also peculiar to strain ZM4. One of the five clusters, spanning nucleotides 1,984,100 to 2,009,434 (25.3 kb), contains 25 ORFs and has a higher G+C content (61.0%; Fig. 1) than the average for the full ZM4 genome (46.3%). This 25.3-kb sequence contains some interesting ORFs: ZMO1930 for a phage-related integrase, ZMO1941 for the conjugal transfer protein TraF, ZMO1954 for the conjugal transfer protein TrbL, and ZMO1933 and ZMO1934 for the type I restriction-modification enzyme S and M subunits, respectively.
Most of the additional 54 ORFs in ZM4 were actively transcribed during the exponential growth phase, when ethanol is vigorously produced (Fig. 3b). Global expression profiles of the ZM1 and ZM4 strains were analyzed in samples taken when half of the glucose (50 g/l) in the medium had been consumed. A total of 294 ORFs were upregulated more than twofold in ZM4 compared to ZM1, whereas 153 ORFs were expressed more than twofold higher in ZM1 (Supplementary Tables 3 and 4 online).
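The twofold criterion used above can be illustrated with a small classification routine. This is a sketch only, not the authors' analysis pipeline: the ORF labels and intensity values are made-up placeholders, the function name is hypothetical, and real microarray data would first need background correction and normalization.

```python
def classify_by_fold_change(zm4, zm1, threshold=2.0):
    """Split ORFs into those higher in ZM4 and those higher in ZM1.

    zm4 and zm1 map ORF labels to positive expression intensities.
    An ORF counts as higher in one strain when the ratio exceeds the
    threshold in that strain's favour; other ORFs are left unclassified.
    """
    up_in_zm4, up_in_zm1 = [], []
    for orf in sorted(zm4.keys() & zm1.keys()):
        a, b = zm4[orf], zm1[orf]
        if a <= 0 or b <= 0:
            continue  # ratio undefined; skip the spot
        ratio = a / b
        if ratio > threshold:
            up_in_zm4.append(orf)
        elif ratio < 1 / threshold:
            up_in_zm1.append(orf)
    return up_in_zm4, up_in_zm1


if __name__ == "__main__":
    # Hypothetical intensities for four placeholder ORFs.
    zm4 = {"orfA": 400, "orfB": 100, "orfC": 50, "orfD": 120}
    zm1 = {"orfA": 100, "orfB": 100, "orfC": 200, "orfD": 100}
    print(classify_by_fold_change(zm4, zm1))
```

Applied to the full data set, the lengths of the two returned lists would correspond to the counts reported for each strain.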
It has been reported that strain ZM4 is more tolerant of high alcohol concentrations than the type strain ZM1 and that ZM4 shows higher specific rates of growth, ethanol production and glucose uptake (refs. 5, 24). Some of the genes peculiar to ZM4 and actively expressed at the higher glucose concentration may prove to be good targets for constructing recombinant strains that produce ethanol with higher productivity.