The Origin of Prebiotic Information System in the Peptide/RNA World: A Simulation Model of the Evolution of Translation and the Genetic Code - PubMed
Tue Jan 01 2019
Sankar Chatterjee et al. Life (Basel). 2019.
Abstract
Information is the currency of life, but the origin of prebiotic information remains a mystery. We propose transitional pathways from the cosmic building blocks of life to the complex prebiotic organic chemistry that led to the origin of information systems. The prebiotic information system, specifically the genetic code, is segregated, linear, and digital, and it appeared before the emergence of DNA. In the peptide/RNA world, lipid membranes randomly encapsulated amino acids, RNA, and peptide molecules, which are drawn from the prebiotic soup, to initiate a molecular symbiosis inside the protocells. This endosymbiosis led to the hierarchical emergence of several requisite components of the translation machine: transfer RNAs (tRNAs), aminoacyl-tRNA synthetase (aaRS), messenger RNAs (mRNAs), ribosomes, and various enzymes. When assembled in the right order, the translation machine created proteins, a process that transferred information from mRNAs to assemble amino acids into polypeptide chains. This was the beginning of the prebiotic information age. The origin of the genetic code is enigmatic; herein, we propose an evolutionary explanation: the demand for a wide range of protein enzymes over peptides in the prebiotic reactions was the main selective pressure for the origin of information-directed protein synthesis. The molecular basis of the genetic code manifests itself in the interaction of aaRS and their cognate tRNAs. In the beginning, aminoacylated ribozymes used amino acids as a cofactor with the help of bridge peptides as a process for selection between amino acids and their cognate codons/anticodons. This process selects amino acids and RNA species for the next steps. The ribozymes would give rise to pre-tRNA and the bridge peptides to pre-aaRS. Later, variants would appear and evolution would produce different but specific aaRS-tRNA-amino acid combinations. 
Pre-tRNA designed and built pre-mRNA for the storage of information regarding its cognate amino acid. Each pre-mRNA strand became the storage device for the genetic information that encoded the amino acid sequences in triplet nucleotides. As information appeared in the digital language of the codon within pre-mRNA and mRNA, and the genetic code for protein synthesis evolved, the prebiotic chemistry became more organized and directional with the emergence of translation and the genetic code. The genetic code developed in three stages that coincide with the refinement of the translation machines: the GNC code, developed by the pre-tRNA/pre-aaRS/pre-mRNA machine; the SNS code, by the tRNA/aaRS/mRNA machine; and finally the universal genetic code, by the tRNA/aaRS/mRNA/ribosome machine. We suggest the coevolution of translation machines and the genetic code. The emergence of the translation machines was the beginning of Darwinian evolution, an interplay between information and its supporting structure. Our hypothesis provides logical and incremental steps for the origin of programmed protein synthesis. In order to better understand the prebiotic information system, we converted letter codons into numerical codons in the Universal Genetic Code Table. We have developed software called CATI (Codon-Amino Acid-Translator-Imitator) to translate randomly chosen numerical codons into corresponding amino acids and vice versa. This conversion has granted us insight into how the genetic code might have evolved in the peptide/RNA world. There is great potential in the application of numerical codons to bioinformatics, such as barcoding, DNA mining, or DNA fingerprinting. Using the Model-View-Controller (MVC) software framework, we constructed, step by step, the likely biochemical pathways for the origin of translation and the genetic code, together with the translation machinery.
Using AnyLogic software, we were able to simulate and visualize the entire evolution of the translation machines, amino acids, and the genetic code.
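The abstract does not say how letter codons are numbered, so the following is only a minimal sketch of the idea, not CATI itself. It assumes a base-4 numbering (U=0, C=1, A=2, G=3, the row order of the standard codon table), which gives each codon a number from 0 to 63. The function names are hypothetical.

```python
import random

# Assumption: bases are numbered U=0, C=1, A=2, G=3 (standard table order).
# The numbering actually used in the paper may differ.
BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def codon_to_number(codon: str) -> int:
    """Read a letter codon as a three-digit base-4 number (0..63)."""
    first, second, third = (BASES.index(base) for base in codon)
    return 16 * first + 4 * second + third


def number_to_codon(number: int) -> str:
    """Inverse of codon_to_number."""
    return BASES[number // 16] + BASES[(number // 4) % 4] + BASES[number % 4]


def translate(numbers) -> str:
    """Numerical codons -> one-letter amino acid string."""
    return "".join(AMINO[n] for n in numbers)


def numbers_for(amino_acid: str):
    """The 'vice versa' direction: every numerical codon for an amino acid."""
    return [n for n, aa in enumerate(AMINO) if aa == amino_acid]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the random choice is repeatable
    chosen = [rng.randrange(64) for _ in range(5)]
    print(chosen, [number_to_codon(n) for n in chosen], translate(chosen))
```

Because the mapping is a plain positional number, both directions are table lookups, which is one reason a numerical form may be convenient for indexing tasks such as barcoding.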
Keywords: AnyLogic software for computer simulation of translation machine; MVC architecture pattern and biological information; bridge peptide and aaRS; coevolution of translation machine and the genetic code; numerical codons; peptide/RNA world; prebiotic information system; ribozyme and tRNA; tRNA and mRNA; translation and the genetic code.
Conflict of interest statement
The authors declare no conflict of interest.
Figures

A screen shot of the web page showing the button to be pushed.

A hierarchy of Information Systems and Nanobots. This diagram shows a unified definition of various terms used for molecular systems. It relates the idea of an information system to terms like ‘nanomachines’, etc.

Six main steps represent the early evolution of non-coding RNA in the hydrothermal crater vent environment. In the first stage, there is an assortment of different nucleotides (including some not found in RNA). In the second stage, these nucleotides randomly assemble into polynucleotides by polymerization with the removal of water molecules. In the third stage, the four nucleotides A, U, G, and C are selected during replication by Watson–Crick base pairing. In the fourth stage, the nucleotides undergo polymerization to create a mixture of polynucleotides that are random in length and sequence. In the fifth stage, a variety of biomolecules from the vent environment, such as amino acids, mononucleotides, oligonucleotides, and peptides, are randomly encapsulated, creating molecular crowding. Because of crowding, the single-stranded RNA begins to fold, forming the double-stranded stem and single-stranded loop that make the hairpin. In the sixth stage, this secondary structure of RNA is shown separately: it forms a ribozyme and begins to act as an enzyme. Stems are created by hydrogen bonding between complementary base pairs. The ribozyme acquires amino acids, at the CCA sequence of the stem, as “cofactors” that increase its catalytic efficiency. The opposite end of the loop consists of three unpaired bases facing outward, forming a binding site for attaching three corresponding mononucleotides. This is the beginning of the emergence of the proto-tRNA.

The origin of the hairpin ribozyme and its chemical bonding with the appropriate amino acid. A single-stranded RNA can develop secondary structure by folding into a double-stranded stem and a single-stranded loop, forming a hairpin ribozyme. The ribozyme acquired an amino acid as a cofactor to form a more efficient catalyst [46]. The amino acid is bound to an oligonucleotide (an RNA molecule containing only three nucleotides) by an activation enzyme such as a ‘bridge peptide’ [24], and the oligonucleotide is bound to the surface of the ribozyme by base pairing (ribozyme 1). Activating enzyme 2 would bind the next amino acid to its oligonucleotide, which attaches to ribozyme 2, and the peptide bond forms.

The double-hairpin model of transfer RNA (tRNA) formation, showing its evolutionary transitions [76,78,81]. (A,B), show a secondary hairpin structure of two RNA molecules (such as ribozymes), each with a stem and a loop: the CCA sequence at the 3’ end of the stem offers a binding site for an amino acid, whereas the 5’ end offers a binding site for phosphorus; (C), the direct duplication or ligation of the hairpin structure may have generated a double hairpin structure, creating a D-hairpin and a T-hairpin. An anticodon (ANT) site forms between the two stems. In this newly configured pre-tRNA molecule, the acceptor site and anticodon site are now closer together, enabling it to decode a pre-messenger RNA (mRNA) molecule for protein synthesis (see Figure 20); (D), a schematic diagram showing the salient features of the pre-tRNA molecule, with the anticodon site; (E), the contemporary full-length tRNA molecule could have been formed by the ligation of two half-sized pre-tRNA structures. Its acceptor stem bases and anticodon stem/loop bases, at the tRNA 5’-half and the 3’-half, fit the double–hairpin folding. This suggests that the primordial double–hairpin RNA molecules could have evolved into modern tRNA. This new secondary structure of tRNA resembles a cloverleaf; its anticodon end forms a complementary base pair with the mRNA codon; (F), a cloverleaf from nature illustrates the structural similarity with the new tRNA molecule; (G), a schematic diagram showing the salient features of the tRNA molecule, emphasizing the anticodon. The tRNA serves a crucial role in matching an amino acid with a specific codon. When tRNA is bound to an amino acid it is called an aminoacyl tRNA. There is now a corresponding tRNA, with an appropriate anticodon, for each amino acid; (H), the cloverleaf secondary structure of tRNA then folds into the L-shaped tertiary structure. 
At the CCA minihelix end, the aminoacylation site interacts with a large ribosomal unit for a peptide bond formation. The opposite end interacts with the small ribosomal subunit, to decode mRNA triplets through codon-anticodon interactions.

The primitive translation process began with interaction between pre-mRNA and pre-tRNA before the appearance of ribosomes. The pre-tRNA molecule plays a crucial role in matching a prebiotic amino acid to a specific codon. (A), a pre-tRNA molecule with two hairpin loops of 3’ and 5’ terminals and an anticodon (ANT); the acceptor stem at the 3’ end forms a covalent attachment to a specific amino acid that corresponds to the anticodon sequence; (B), schematic representation of pre-tRNA emphasizing the 3’ end and corresponding anticodon; (C), encapsulated pre-tRNA and pre-mRNA molecule with codon-anticodon interaction; the inner cell membrane acts as a substrate to hold the pre-mRNA molecule in place; (D), the anticodon of a pre-tRNA molecule began to hybridize with the corresponding nucleotides by base pairing; the triplet nucleotides were kinked to form a codon; in the abiotic stage, the primitive GNC code appeared, which codes for four amino acids: valine, alanine, aspartic acid, and glycine [87,88]; (E), codons thus produced by pre-tRNAs began to link in a strand to form a pre-mRNA with a coding sequence; (F), pre-tRNA and pre-mRNA interact to form rudimentary translation; the 3’ acceptor end of pre-tRNA gathers the appropriate amino acid from the pool and binds it by means of an activation enzyme; an aminoacyl pre-tRNA with the appropriate anticodon hybridizes with the codon, ejecting the pre-tRNA; the next aminoacyl pre-tRNA then moves down another codon and repeats the process; amino acids released from the old pre-tRNAs begin to join to form a protein chain.

The origin of the ribosome. The ribosome consists of two subunits each with specific roles in protein synthesis. The basic form of the ribosome has been conserved in evolution. Perhaps, the early ribosome was similar to that of modern prokaryotes, which is a large ribonucleoprotein complex of three rRNAs and 52 r-protein molecules. Although ribosomal proteins greatly outnumber ribosomal RNAs, the rRNAs pervade both subunits. There is now evidence that rRNA interacts with mRNA or tRNA at each stage of translation, and that ribosomal proteins are necessary to maintain rRNA in a structure in which it can perform the catalytic functions. Most likely, the symbiotic interactions of ribosomal RNAs and ribosomal proteins gave rise to ribosomes, which grew by accretion. However, there is some controversy whether the small or large subunit appeared first. In our view, both units coevolved by the accretion of ribosomal RNAs and ribosomal proteins.

The translation machinery of the ribosome, where the mRNA message is decoded. The ribosome provides the substrate for controlling the interaction between mRNA and aminoacyl-tRNA. (A), an aminoacyl tRNA, with appropriate anticodon. (B), each ribosome has a binding site for mRNA, and three binding sites for tRNA. The tRNA binding sites are designated E-, P-, and A-sites (for exit, peptidyl-tRNA, aminoacyl-tRNA, respectively). The small subunit contains the binding site for mRNA. Translation takes place in a four-step cycle (C–F) that is repeated over and over during the synthesis of protein. (C), in step 1, an aminoacyl-tRNA, with appropriate anticodon, enters the vacant A-site on the ribosome where it hybridizes with a codon. (D), in step 2, the carboxyl end of the protein chain is uncoupled from the tRNA at the P-site, then joined by a peptide bond to the free amino group of the amino acid linked to the tRNA at the A-site. This reaction is catalyzed by an enzymatic site in the large subunit, called peptidyl transferase (PT). (E), in step 3, a shift in the large subunit (shown by arrow) relative to the small subunit in the 3’ direction, moves the two tRNAs into the E- and P-sites of the large unit, and then ejects the empty tRNA from E-site. (F), in step 4, the small subunit moves exactly three nucleotides along the mRNA molecule, bringing it back to its original position relative to the large subunit. This movement resets the ribosome with an empty A-site so that the next aminoacyl-tRNA molecule can bind. The cycle repeats when the incoming aminoacyl-tRNA binds to the codon of the A-site; (G), summarizes the life cycle of the ribosome during its translation [77,99].
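The four-step cycle in this caption can be written as a small loop. The sketch below is illustrative only: it tracks which codon's tRNA occupies the E-, P-, and A-sites, uses the standard codon table, and assumes (beyond what the caption states) that translation starts at the first AUG and ends at a stop codon. The function name `elongate` and the trace format are hypothetical.

```python
BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def elongate(mrna: str):
    """Return (peptide, trace); trace lists (E-site, P-site) after each cycle."""
    pos = mrna.find("AUG")            # assumption: start at the first AUG
    if pos < 0:
        return "", []
    peptide = CODE["AUG"]             # initiator tRNA sits in the P-site
    p_site = "AUG"
    trace = []
    while pos + 6 <= len(mrna):
        next_codon = mrna[pos + 3:pos + 6]
        residue = CODE[next_codon]
        if residue == "*":            # assumption: a stop codon ends the cycle
            break
        a_site = next_codon           # step 1: aminoacyl-tRNA enters the A-site
        peptide += residue            # step 2: chain is joined to the A-site residue
        e_site, p_site, a_site = p_site, a_site, None  # step 3: tRNAs shift
        trace.append((e_site, p_site))                 # E-site tRNA is then ejected
        pos += 3                      # step 4: small subunit moves three nucleotides
    return peptide, trace


if __name__ == "__main__":
    print(elongate("AUGGGCGCCUAA"))
```

Each pass through the loop leaves the A-site empty, which is the reset condition the caption describes for the next aminoacyl-tRNA.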

The inferred biochemical pathways for the origin of translation and the genetic code in the RNA/peptide world. The hydrothermal crater vent was crowded with monomers such as amino acids and nucleotides, which were polymerized on the mineral substrate to form various peptides and RNAs. As ribozymes evolved into pre-tRNAs, each pre-tRNA molecule captured a specific amino acid, assisted by a pre-aaRS enzyme. Eventually, anticodons of pre-tRNAs created custom-made pre-mRNAs for the storage of genetic information. The interaction between pre-tRNA and pre-mRNA generated small protein chains by rudimentary translation, using the primitive GNC code with four codons and four amino acids. With the refinement of translation, pre-tRNA evolved into tRNA and pre-mRNA into mRNA, with the expansion to the SNS code with 16 codons and 10 amino acids. Finally, as the ribosome appeared by fusion of ribosomal proteins and RNA, it facilitated high-fidelity translation, leading to the universal genetic code with 64 codons and 20 amino acids.

The inferred temporal order of evolution of translation machinery systems, showing coevolution of translation machines and the genetic code. In our model, there are three stages of translation machinery systems: (A) the pre-tRNA/pre-aaRS/pre-mRNA stage, when the GNC code evolved with the beginning of the translation system; (B) the tRNA/aaRS/mRNA stage, when the SNS code appeared; and finally, (C) the tRNA/aaRS/mRNA/ribosome stage, when the universal code evolved.

Three stages of the evolution of the genetic code, corresponding to the evolution of the translation machines and the progressive addition of amino acids. In the beginning, the pre-tRNA molecule creates its custom-made pre-mRNA for storage of limited amino acid information. The primitive translation process began with interaction between pre-mRNA and pre-tRNA. The pre-tRNA molecule, in collaboration with the pre-aaRS enzyme, plays a crucial role in selecting and matching prebiotic amino acids from the prebiotic soup. At this stage, the translation machine is simple, consisting of pre-tRNA/pre-aaRS/pre-mRNA. In the abiotic stage, the primitive GNC code appeared, which codes for four amino acids: valine, alanine, aspartic acid, and glycine [87,88]. In the next stage, the translation machine becomes modified and efficient with the evolution of the tRNA/aaRS/mRNA translation machine, when six new amino acids (glutamic acid, leucine, proline, histidine, glutamine, and arginine) were recruited. The mRNA strand becomes more elongated, containing 16 codons and combinations thereof, with assignments of 10 amino acids with the emergence of the SNS code [87,88]. These 10 amino acids were readily available from the prebiotic environment [93]. Here, we see the beginning of degeneracy, where some of the amino acids have more than one codon assignment. With the appearance of the ribosome, the SNS code is modified to the universal genetic code with 64 codons and 20 amino acids. The translation machine containing tRNA/aaRS/mRNA/ribosome becomes more robust, with extensive degeneracy minimizing translation errors and mutation. Furthermore, amino acids with similar chemical properties seem to share similar codons. Ten new amino acids were recruited at this stage, after the SNS stage: isoleucine, methionine, threonine, asparagine, lysine, serine, tryptophan, phenylalanine, tyrosine, and cysteine, totaling 20 amino acids. These new amino acids are derivatives of the first set of 10 primitive amino acids [86]. 
mRNA becomes an independent storage device, and it can create its own strand by replication without the assistance of tRNA. The mRNA strand becomes more elongated, containing information for 20 amino acids using 64 codons or combinations thereof.
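The codon and amino acid counts quoted for the three stages can be checked by enumeration. The sketch below expands each pattern (S = G or C, N = any base) and looks the codons up in the modern standard table. It assumes that the early assignments agree with that table, as the GNC and SNS descriptions imply. The names `WILDCARDS` and `stage` are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}

# Pattern letters: a plain base stands for itself, S = G or C, N = any base.
WILDCARDS = {"G": "G", "C": "C", "S": "GC", "N": "UCAG"}


def stage(pattern: str):
    """Expand a three-letter pattern into (codons, amino acids encoded)."""
    codons = ["".join(p) for p in product(*(WILDCARDS[ch] for ch in pattern))]
    amino_acids = {CODE[c] for c in codons} - {"*"}   # stops are not amino acids
    return codons, amino_acids


if __name__ == "__main__":
    for name in ("GNC", "SNS", "NNN"):
        codons, amino_acids = stage(name)
        print(name, len(codons), "codons,", len(amino_acids), "amino acids:",
              "".join(sorted(amino_acids)))
```

Under that assumption, the SNS stage already shows degeneracy: 16 codons map onto 10 amino acids, so several amino acids have two codons.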

Evolution of Biological Information Systems. The basic biological system during the inception of the GNC code mainly processes the stereochemical properties of tRNA anticodons and primitive amino acids (alanine, aspartic acid, glycine, valine). The intermediate biological system during the origin of the SNS code is able to process matching signals and other signals. The advanced biological system during the origin of the universal genetic code is able to process rules, feedback, and instructions.

(A), a Model-View-Controller (MVC) Architecture pattern, and (B), an implementation Architecture of von Neumann’s Universal Constructor (UC). The solid arrows in the Figure show the flow of control among the components. For example, the solid arrow between the controller and model implies that the controller directs the actions of the model. A dotted arrow indicates a flow of data (information).

An overall architecture of CATI based on the MVC pattern. It shows the controller, model, and view aspects of the logic.

The overall Logic (Algorithm) of CATI. The logic combines the role of ribosomes, aaRS, tRNA, and mRNA into a single overall process.
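As a rough illustration of how an MVC split might look for a codon translator, the sketch below separates the codon table (model), the text output (view), and the coordinating logic (controller). The class names and output format are hypothetical and are not taken from CATI, whose internals are not described here.

```python
class CodonModel:
    """Model: owns the data, here the standard codon table."""
    BASES = "UCAG"
    AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

    def amino_acid(self, codon: str) -> str:
        i, j, k = (self.BASES.index(base) for base in codon)
        return self.AMINO[16 * i + 4 * j + k]


class TextView:
    """View: only formats results; it knows nothing about the code table."""

    def render(self, codons, residues) -> str:
        return " ".join(f"{c}->{r}" for c, r in zip(codons, residues))


class Controller:
    """Controller: splits the input, queries the model, hands data to the view."""

    def __init__(self, model: CodonModel, view: TextView):
        self.model = model
        self.view = view

    def handle(self, mrna: str) -> str:
        usable = len(mrna) - len(mrna) % 3          # ignore a trailing partial codon
        codons = [mrna[i:i + 3] for i in range(0, usable, 3)]
        residues = [self.model.amino_acid(c) for c in codons]
        return self.view.render(codons, residues)


if __name__ == "__main__":
    print(Controller(CodonModel(), TextView()).handle("AUGGGCUAA"))
```

Keeping the table in the model means a different code (for example a GNC-only table) could be swapped in without touching the controller or the view.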

The pre-aaRS/pre-tRNA/pre-mRNA translation machine. Pre-aaRS is the matchmaker between pre-tRNA and amino acid. Four primitive amino acids and their four cognate pre-tRNAs and pre-aaRS molecules were selected from the prebiotic soup. Each amino acid was joined to its specific pre-tRNA molecule by a pre-aaRS enzyme in the presence of ATP to create a charged pre-tRNA molecule. In this way, four charged molecules were available to decode the short pre-mRNA string one codon at a time. During the hybridization of the anticodon of pre-tRNA with the codon of pre-mRNA, each pre-tRNA delivers the appropriate amino acid, which is linked into a chain, forming for the first time a biosynthetic protein containing four amino acids. This is the first stage of translation, when the primitive GNC code evolves.

A class diagram showing the information structure during the first stage of the translation system. This diagram shows relationships among various parts of the primitive translation machine. Pre-tRNA attaches to a specific amino acid with the help of pre-aaRS molecules. The charged pre-tRNA molecule has an anticodon that hybridizes with the corresponding codon of pre-mRNA. As pre-tRNA begins to decode pre-mRNA molecules, a short protein chain is synthesized for the first time in a prebiotic environment. The linkage of an amino acid to a pre-tRNA established the primitive GNC genetic code.
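The relationships in this class diagram can be sketched with a few small classes. This is an illustrative reading of the caption, not the authors' diagram: the class and function names are hypothetical, and it assumes for simplicity that each pre-aaRS recognizes its pre-tRNA by the anticodon.

```python
from dataclasses import dataclass
from typing import List, Optional

# GNC assignments named in the captions (valine, alanine, aspartic acid, glycine).
GNC_CODE = {"GUC": "Val", "GCC": "Ala", "GAC": "Asp", "GGC": "Gly"}
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon: str) -> str:
    """Reverse complement, since codon and anticodon pair antiparallel."""
    return "".join(PAIR[base] for base in reversed(codon))


@dataclass
class PreTRNA:
    anticodon: str
    amino_acid: Optional[str] = None   # None until charged


@dataclass
class PreAaRS:
    anticodon: str      # assumption: recognition is by anticodon only
    amino_acid: str

    def charge(self, trna: PreTRNA) -> bool:
        """Attach the amino acid if this pre-tRNA is the cognate one."""
        if trna.anticodon != self.anticodon:
            return False
        trna.amino_acid = self.amino_acid
        return True


@dataclass
class PreMRNA:
    codons: List[str]


def build_gnc_machine() -> List[PreTRNA]:
    """Create and charge one pre-tRNA per GNC codon."""
    trnas = []
    for codon, amino_acid in GNC_CODE.items():
        trna = PreTRNA(anticodon_for(codon))
        PreAaRS(trna.anticodon, amino_acid).charge(trna)
        trnas.append(trna)
    return trnas


def decode(mrna: PreMRNA, trnas: List[PreTRNA]) -> List[str]:
    """Match each codon to the charged pre-tRNA with the pairing anticodon."""
    charged = {t.anticodon: t.amino_acid for t in trnas if t.amino_acid}
    return [charged[anticodon_for(codon)] for codon in mrna.codons]


if __name__ == "__main__":
    print(decode(PreMRNA(["GGC", "GCC", "GAC", "GUC"]), build_gnc_machine()))
```

In this reading the code itself lives in the charging step: once a pre-aaRS links an amino acid to an anticodon, decoding is only base pairing.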

An MVC architecture of a pre-aaRS/pre-tRNA/pre-mRNA machine. Pre-aaRS and pre-tRNA direct charged pre-tRNA that will decode pre-mRNA to a growing protein.

The aaRS/tRNA/mRNA translation machine. Ten primitive amino acids are joined with specific tRNA molecules by aaRS enzymes to form a pool of 10 charged tRNA molecules. These charged tRNA molecules begin to decode mRNA, creating a longer chain of biosynthesized protein. At this stage, the SNS code appears with 10 amino acids for 16 codons. The translation is moderately efficient, with the appearance of redundancy to minimize translation errors.

A class diagram of the interactions of the aaRS/tRNA/mRNA translation machine, showing the information structure during the origin of the SNS code. At this stage, 10 primitive amino acids are available to create 10 or more charged tRNAs for decoding mRNA. An amino acid in the charged tRNA will be incorporated into a growing protein chain, at a position that is dictated by the anticodon of the tRNA.

An MVC architecture of the aaRS/tRNA/mRNA translation machine. aaRS and tRNA facilitate the interaction between a charged tRNA and an mRNA. A charged tRNA is an adaptor that acts as a view and helps to release the amino acid to form a protein chain.

The aaRS/tRNA/mRNA/ribosome translation machine. tRNA delivers amino acids to the ribosome, which serves as the site of protein synthesis. Each ribosome has a large 50S subunit and a small 30S subunit that join together at the beginning of decoding of mRNA to synthesize a protein chain from amino acids carried by tRNAs. The correct tRNA enters the A site of the ribosome and the appropriate amino acid is incorporated into the growing peptide chain, which transfers from the tRNA in the P site to the tRNA in the A site. As the ribosome moves, both tRNAs and mRNA then shift to the E site. Each newly translated amino acid is then added to a growing protein chain until the ribosome completes the protein synthesis. At this stage, the universal genetic code is optimized with 20 amino acids for 64 codons, including start and stop codons. The translation is highly efficient with start and stop codons; redundancy minimizes translation errors and mutations.
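One way to make the redundancy claim concrete is to count how often a single-base change leaves the encoded residue unchanged. The sketch below does this per codon position for the standard table; treating stop codons as one class of their own is a simplifying assumption, and the function names are hypothetical.

```python
BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def amino(codon: str) -> str:
    i, j, k = (BASES.index(base) for base in codon)
    return AMINO[16 * i + 4 * j + k]


def synonymous_fraction(position: int) -> float:
    """Share of single-base substitutions at `position` (0, 1, or 2)
    that do not change the encoded residue (stops count as one class)."""
    same = total = 0
    for n in range(64):
        codon = BASES[n // 16] + BASES[(n // 4) % 4] + BASES[n % 4]
        for base in BASES:
            if base == codon[position]:
                continue                      # not a substitution
            mutant = codon[:position] + base + codon[position + 1:]
            total += 1
            same += amino(mutant) == amino(codon)
    return same / total


if __name__ == "__main__":
    for position in range(3):
        print(position + 1, round(synonymous_fraction(position), 3))
```

Most of the tolerance sits at the third codon position, which is consistent with the caption's point that degeneracy buffers translation errors and mutations.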

Class diagram showing the interactions of aaRS/tRNA/mRNA/ribosome translation machine. The diagram is similar to that of Figure 22. In addition, it shows the introduction of a ribosome, which decodes mRNA with the help of charged tRNA.

An MVC structure of aaRS/tRNA/mRNA/ribosome translation machine. A ribosome is a part of a bigger machine that uses the aaRS machine to decode the mRNA information into a corresponding sequence of amino acids to link a protein chain.
References
- Deamer D.W. First Life: Discovering the Connections between Stars, Cells, and How Life Began. University of California Press; Berkeley, CA, USA: 2012.