Prp8, the pivotal protein of the spliceosomal catalytic center, evolved from a retroelement-encoded reverse transcriptase

Mensur Dlakić et al. RNA. 2011 May.

Abstract

Prp8 is the largest and most highly conserved protein of the spliceosome, encoded by all sequenced eukaryotic genomes but missing from prokaryotes and viruses. Despite all evidence that Prp8 is an integral part of the spliceosomal catalytic center, much remains to be learned about its molecular functions and evolutionary origin. By analyzing sequence and structure similarities between Prp8 and other protein domains, we show that its N-terminal region contains a putative bromodomain. The central conserved domain of Prp8 is related to the catalytic domain of reverse transcriptases (RTs) and is most similar to homologous enzymes encoded by prokaryotic retroelements. However, putative catalytic residues in this RT domain are only partially conserved and may not be sufficient for the nucleotidyltransferase activity. The RT domain is followed by an uncharacterized sequence region with relatives found in fungal RT-like proteins. This part of Prp8 is predicted to adopt an α-helical structure and may be functionally equivalent to diverse maturase/X domains of retroelements and to the thumb domain of retroviral RTs. Together with a previously identified C-terminal domain that has an RNaseH-like fold, our results suggest evolutionary connections between Prp8 and ancient mobile elements. Prp8 may have evolved by acquiring nucleic acid-binding domains from inactivated retroelements, and their present-day role may be in maintaining proper conformation of the bound RNA cofactors and substrates of the splicing reaction. This is only the second example, the other being telomerase, of RT recruitment from a genomic parasite to serve an essential cellular function.

Figures

FIGURE 1.

(A) Diagram of conserved structural and functional domains in yeast Prp8p. P indicates proline-rich region (amino acids 5–78); N, nuclear localization signal (amino acids 81–120) (Boon et al. 2007); P8NT, PRO8 N-terminal domain (Staub et al. 2004); Br, bromodomain (this work); PROCN, PRO8 central domain (Staub et al. 2004); R, RNA recognition motif (Grainger and Beggs 2005); RT, reverse transcriptase-like palm-and-fingers domain (this work); Th/X, conserved domain in Prp8 and a subset of fungal RT-like proteins, located at the same position as “maturase-specific” X/thumb domain (this work); U5i, U5-interacting domain (Turner et al. 2006); U6i, U6-interacting domain (Turner et al. 2006); RNase H, RNase H–like domain (Pena et al. 2008; Ritchie et al. 2008; Yang et al. 2008); and MPN, metalloprotease-like domain (Pena et al. 2007; Zhang et al. 2007). Approximate boundaries of each domain are indicated by numbers. Domains described in this work are shaded, and they overlap with previously known domains, which are raised above for clarity while preserving their relative positions and sizes. Three vertical lines indicate approximate positions where Prp8p can be split so that resulting pieces are able to complement in trans (Boon et al. 2006). (B) General domain organization of group II introns (data adapted from Lambowitz and Zimmerly 2004). RT indicates reverse transcriptase–like palm-and-fingers domain; X, maturase-specific X domain thought to be related to thumb domains (Blocker et al. 2005); D, DNA-binding domain; and E, DNA endonuclease domain. Not all members of this group have D and E domains (Lambowitz and Zimmerly 2004). (C) General domain organization of eukaryotic retroviruses (data adapted from Kohlstaedt et al. 1992). RT indicates reverse transcriptase-like palm-and-fingers domain; Th, thumb domain; C, connection domain; and RNH, RNase H domain.

FIGURE 2.

Putative bromodomain in Prp8. Prp8 proteins from 10 different species are grouped in the top part of the figure. Several different classes of chromatin-binding bromodomains are aligned in the bottom part of the figure. Residues in chromatin-binding bromodomains that are important for binding pocket formation and direct acetyl-lysine recognition are indicated by red circles and a red triangle, respectively. The function of the conserved glutamate marked by a question mark is unclear. Secondary structure of bromodomain 1 of mouse Brd4 (Vollmuth et al. 2009) is shown with H marking residues in α-helices. ZA and BC loops are indicated on the secondary structure line. Three stretches of charged residues within or around the ZA loop of Prp8 proteins are indicated by horizontal red lines above the alignment. Lowercase letters preceding Prp8 or bromodomain names stand for the following species: h, Homo sapiens; x, Xenopus laevis; d, Drosophila melanogaster; w, Caenorhabditis elegans; n, Nematostella vectensis; t, Trichoplax adhaerens; a, Arabidopsis thaliana; o, Ostreococcus tauri; p, Paramecium tetraurelia; and y, Saccharomyces cerevisiae. An underscore followed by a number at the end of a protein name indicates the numerical order of bromodomains for proteins that have multiple copies. Aligned ranges of sequences are shown on each line. Capital letters on the consensus line mean that a single residue is conserved in at least 90% of sequences. The meaning of lowercase letters on the consensus line is as follows: h, hydrophobic; b, big; and s, small.

FIGURE 3.

(A) The reverse transcriptase-like domain in Prp8. The alignment with selected prokaryotic and eukaryotic retroelements, as well as with retroviruses and telomerases, was made using the programs MACAW (Schuler et al. 1991) and MUSCLE (Edgar 2004). Positions of motifs A, B, and C are shown on the top line. Predicted secondary structures (SecStr Prp8 and SecStr GIIint) are shown when they could be predicted with confidence of 7 or higher (0–9 scale). 3KYL corresponds to Tribolium castaneum telomerase, for which the secondary structure is known. For both known and predicted secondary structures, α-helices are marked by H and β-strands by S. Names of Prp8 proteins are the same as in Figure 2. To conserve space in the legend and for figure clarity, the remaining sequences are grouped by similarity and are identified by their GI numbers. Aligned ranges of sequences are indicated on each line. Numbers in parentheses are in the regions of long insertions and deletions and show how many residues were omitted to make the alignment more compact. Columns are colored if at least 90% of sequences match the consensus, except for two aspartates identified by the first and third red circles on the line that reads Catalytic residues. Even though these residues are not conserved in Prp8 proteins and therefore do not qualify for 90% consensus, they are colored yellow for emphasis. Four acidic residues and one arginine that are well conserved only in Prp8 proteins and group II intron reverse transcriptases are shaded in green. Capital letters on the consensus line indicate single-residue conservation. The meaning of lowercase letters on the consensus line is as follows: h, hydrophobic; b, big; and s, small. (B) The putative Th/X domain in Prp8. Names of Prp8 proteins are the same as in A and Figure 2. The remaining sequences are identified by their GI numbers. The SecStr line shows secondary structure prediction for Prp8 proteins. Aligned ranges of sequences are indicated on each line. Capital letters on the consensus line indicate single-residue conservation for at least 90% of sequences. The meaning of lowercase letters on the consensus line is as follows: h, hydrophobic; +, positively charged; b, big; and s, small.
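
The consensus-line convention used in the legends (a capital letter when a single residue reaches the 90% threshold, otherwise a lowercase class code) can be illustrated with a minimal Python sketch. This is not the authors' procedure, which the legends do not describe. The function name `consensus_line`, the `.` placeholder for unconserved columns, the order in which classes are tried, and above all the membership of the h, +, b, and s residue classes are assumptions made for illustration; the legends name the classes but do not list their members.

```python
from collections import Counter

# Hypothetical residue classes: the legends name these classes but do not
# define their members, so every set below is an assumption.
CLASSES = {
    "h": set("AVLIMFWYC"),   # hydrophobic (assumed membership)
    "+": set("KRH"),         # positively charged (assumed membership)
    "b": set("FWYKRQEILM"),  # big (assumed membership)
    "s": set("GASCTPDNV"),   # small (assumed membership)
}


def consensus_line(alignment, threshold=0.9):
    """Build a consensus string from equal-length aligned sequences.

    A column yields a capital letter if one residue occurs in at least
    `threshold` of the sequences, else the first class code whose members
    reach the threshold, else '.' (placeholder chosen for this sketch).
    """
    if not alignment:
        return ""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    n = len(alignment)
    out = []
    for i in range(length):
        column = [seq[i].upper() for seq in alignment]
        residue, count = Counter(column).most_common(1)[0]
        # Single-residue conservation; gap characters never count.
        if residue != "-" and count / n >= threshold:
            out.append(residue)
            continue
        # Otherwise fall back to class-level conservation.
        symbol = "."
        for code, members in CLASSES.items():
            if sum(aa in members for aa in column) / n >= threshold:
                symbol = code
                break
        out.append(symbol)
    return "".join(out)
```

For a toy alignment such as `["ALG", "AIG", "AVG", "AMK"]`, the first column is fully conserved, the second contains only residues from the assumed hydrophobic set, and the third falls below the 90% threshold for every residue and class, so it receives the placeholder.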

References

    1. Abelson J 2008. Is the spliceosome a ribonucleoprotein enzyme? Nat Struct Mol Biol 15: 1235–1237
    2. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402
    3. Anantharaman V, Koonin EV, Aravind L 2002. Comparative genomics and evolution of proteins involved in RNA metabolism. Nucleic Acids Res 30: 1427–1464
    4. Anantharaman V, Iyer LM, Aravind L 2010. Presence of a classical RRM-fold palm domain in Thg1-type 3′-5′ nucleic acid polymerases and the origin of the GGDEF and CRISPR polymerase domains. Biol Direct 5: 43 doi: 10.1186/1745-6150-5-43
    5. Andreeva A, Howorth D, Chandonia JM, Brenner SE, Hubbard TJ, Chothia C, Murzin AG 2008. Data growth and its impact on the SCOP database: new developments. Nucleic Acids Res 36: D419–D425
