nature.com

Incipient de novo genes can evolve from frozen accidents that escaped rapid transcript turnover - Nature Ecology & Evolution

  • Bornberg-Bauer, Erich
  • Mon Sep 10 2018