The origins of polypeptide domains - PubMed

Review

The origins of polypeptide domains

Edward E Schmidt et al. Bioessays. 2007 Mar.

Abstract

Three decades ago Gilbert posited that novel proteins arise by re-shuffling genomic sequences encoding polypeptide domains. Today, with numerous genomes and countless genes sequenced, it is well established that recombination of sequences encoding polypeptide domains plays a major role in protein evolution. There is, however, less evidence to suggest how the novel polypeptide domains, themselves, arise. Recent comparisons of genomes from closely related species have revealed numerous species-specific exons, supporting models of domain origin based on "exonization" of intron sequences. Also, a mechanism for the origin of novel polypeptide domains has been proposed based on analyses of insertion-based polymorphisms between orthologous genes across broad phylogenetic spectra and between allelic variants of genes within species. This review discusses these processes and how each might participate in the evolutionary emergence of novel polypeptide domains.

Figures

**Figure 1**
Genesis of novel polypeptide domains by “exonization” of intron sequences. At top is a cartoon of a simple gene (exons green, introns uncolored, simplified splice donor and acceptor signals diagrammed in blue and red, respectively, promoter sequences in yellow, initiation site as a bent arrow, polyadenylation signals in orange, and splicing pattern shown by bent lines). Below, acquisition of splice donor and acceptor signals within an existing intron can generate a novel exon (labeled “N”); however, the phase of the exon–exon junctions must be preserved by this exon. Alternative splicing can produce either the original protein or the modified version with the novel polypeptide. This novel exon might subsequently shuffle to other parts of the genome.
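The phase constraint in this caption can be illustrated with a minimal sketch. It assumes the common convention that a junction's phase is the number of nucleotides (0, 1, or 2) of a split codon lying upstream of the junction; the function names are hypothetical and do not come from the article. Under that assumption, a novel exon leaves the downstream reading frame intact only when its length is a multiple of three.

```python
def exon_end_phase(start_phase: int, exon_length: int) -> int:
    """Phase of the junction at the 3' end of an exon.

    start_phase: nucleotides of a split codon lying upstream of the exon (0-2).
    exon_length: exon length in nucleotides.
    """
    if start_phase not in (0, 1, 2):
        raise ValueError("phase must be 0, 1, or 2")
    if exon_length <= 0:
        raise ValueError("exon length must be positive")
    return (start_phase + exon_length) % 3


def preserves_phase(intron_phase: int, novel_exon_length: int) -> bool:
    """True if an exon created inside an intron of the given phase hands the
    same phase on to the downstream exon, so the original frame is kept."""
    return exon_end_phase(intron_phase, novel_exon_length) == intron_phase


# Hypothetical lengths: a 21-nt exon adds seven codons in frame,
# while a 20-nt exon shifts the downstream frame.
print(preserves_phase(1, 21))
print(preserves_phase(1, 20))
```

This checks only the frame. A real exonization event would also need to avoid in-frame stop codons, which the sketch does not test.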

**Figure 2**
Exon- and primary amino acid-structures of vertebrate TBP. At top is depicted the protein-coding exon arrangement of higher vertebrate and cyclostome TBP mRNAs.(27) Heavy lines depict the relevant region of each mRNA, triangles indicate exon–exon junctions, numbers are exon designations. For cyclostomes, the exon structure has only been determined for the N-terminal region; the C-terminal region is labeled “???”. A cyclostome-specific intron divides the region homologous to higher vertebrate exon 3 into separate left and right exons (designated 3L and 3R).(24) Below is the primary amino acid structure of vertebrate TBP with subregions and repeat units emphasized. Below this is the linear distribution of phylogenetic histories of each region along the sequence. Note that boundaries in phylogenetic history do not correlate to exon boundaries. The TBP_CORE domain is encoded by an inverted repeat (arrows) in all phyla, resulting in the conserved symmetrical “saddle” structure of the protein;(21,54) other recognized repeat units are tandem.(24) At the bottom is a key of the various oligopeptide units recognizable in hagfish TBP, indicating the number of repeats of each unit seen in human and hagfish TBPs.(24) Due to mutation and genetic drift, some repeat units are less obvious in higher vertebrates.(24) The two regions having no recognizable pattern (*) are dissimilar to each other and are not obviously derived by duplication and deletion of adjacent sequences; however, like the rest of the N terminus, these regions might have arisen by insertion/deletion of sequences whose ancestry is not recognized. Q repeat lengths are given as number of Q-codons (**) because, even though this region can expand and contract as larger multi-codon units, this has only been documented once.(48) Normal human *tbp* genes have 36–38 Qs in the repeat region; however patients with SCA17 neuropathologies have severely expanded TBP Q-domains.(48,49)

**Figure 3**
Alignment of predicted amino acid sequences of the alpha-2 domain of two bison MHC-I alleles. Allele *Bibi-N*00501* is a novel allele with a seven amino acid duplication following position 163 of the parent allele, *Bibi-N*00502*. Numbering corresponds to amino acid positions of normal bovine or bison MHC-I proteins starting from the first amino acid of the mature protein. Dots in the *Bibi-N*00501* sequence indicate amino acid identity with *Bibi-N*00502*. The donor sequence is highlighted in green and the duplicated peptide in allele *Bibi-N*00501* is highlighted in blue. The symbols above the *Bibi-N*00502* sequence indicate amino acids of typical MHC-I proteins (e.g. *Bibi-N*00502*) that are predicted to contact either the bound oligopeptide or the T-cell receptor: * predicted to contact the peptide, + predicted to contact the T-cell receptor, ∼ predicted to contact both.(55) The seven amino acid duplication in *Bibi-N*00501* is in the middle of the α-helix that contacts both the peptide antigen and the T-cell receptor.
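The kind of insertion described in this caption can be sketched on a toy sequence. The sketch assumes the donor is the run of residues ending at the insertion point, which the caption does not state explicitly; the function name and the example string are hypothetical, and no real MHC-I sequence is used.

```python
def tandem_duplicate(parent: str, last_pos: int, length: int) -> str:
    """Return a variant in which the `length` residues ending at 1-based
    position `last_pos` are copied and inserted right after that position.

    Assumption: the donor immediately precedes the insertion point.
    """
    if length <= 0 or last_pos < length or last_pos > len(parent):
        raise ValueError("duplication window falls outside the sequence")
    donor = parent[last_pos - length:last_pos]
    return parent[:last_pos] + donor + parent[last_pos:]


# Toy parent sequence (placeholder letters, not a real protein).
parent = "ABCDEFGHIJKL"
variant = tandem_duplicate(parent, last_pos=9, length=7)
print(variant)                      # ABCDEFGHICDEFGHIJKL
print(len(variant) - len(parent))   # 7
```

With the numbering in the caption, the analogous call would use `last_pos=163` and `length=7` on the mature-protein sequence of the parent allele.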

Cited by

Testis-specific glyceraldehyde-3-phosphate dehydrogenase: origin and evolution.
Kuravsky ML, Aleshin VV, Frishman D, Muronetz VI. BMC Evol Biol. 2011 Jun 10;11:160. doi: 10.1186/1471-2148-11-160. PMID: 21663662.
Intron creation and DNA repair.
Ragg H. Cell Mol Life Sci. 2011 Jan;68(2):235-42. doi: 10.1007/s00018-010-0532-2. Epub 2010 Sep 19. PMID: 20853128. Review.
Novel and expanded roles for MAPK signaling in Arabidopsis stomatal cell fate revealed by cell type-specific manipulations.
Lampard GR, Lukowitz W, Ellis BE, Bergmann DC. Plant Cell. 2009 Nov;21(11):3506-17. doi: 10.1105/tpc.109.070110. Epub 2009 Nov 6. PMID: 19897669.
Small proteins: untapped area of potential biological importance.
Su M, Ling Y, Yu J, Wu J, Xiao J. Front Genet. 2013 Dec 16;4:286. doi: 10.3389/fgene.2013.00286. PMID: 24379829. Review.
Evolution of JAK-STAT pathway components: mechanisms and role in immune system development.
Liongue C, O'Sullivan LA, Trengove MC, Ward AC. PLoS One. 2012;7(3):e32777. doi: 10.1371/journal.pone.0032777. Epub 2012 Mar 7. PMID: 22412924.

References

1. Gilbert W. Why genes in pieces? Nature. 1978;271:501.
2. Greer JM, Puetz J, Thomas KR, Capecchi MR. Maintenance of functional equivalence during paralogous Hox gene evolution. Nature. 2000;403:661–665.
3. Li WH, Gu Z, Cavalcanti AR, Nekrutenko A. Detection of gene duplications and block duplications in eukaryotic genomes. J Struct Funct Genomics. 2003;3:27–34.
4. Hoegg S, Meyer A. Hox clusters as models for vertebrate genome evolution. Trends Genet. 2005;21:421–424.
5. de Souza SJ, Long M, Gilbert W. Introns and gene evolution. Genes Cells. 1996;1:493–505.
