Comparative Study

Comparison of SIV and HIV-1 genomic RNA structures reveals impact of sequence evolution on conserved and non-conserved structural motifs

Elizabeth Pollom et al. PLoS Pathog. 2013.

Abstract

RNA secondary structure plays a central role in the replication and metabolism of all RNA viruses, including retroviruses like HIV-1. However, structures with known function represent only a fraction of the secondary structure reported for HIV-1NL4-3. One tool to assess the importance of RNA structures is to examine their conservation over evolutionary time. To this end, we used SHAPE to model the secondary structure of a second primate lentiviral genome, SIVmac239, which shares only 50% sequence identity at the nucleotide level with HIV-1NL4-3. Only about half of the paired nucleotides are paired in both genomic RNAs and, across the genome, just 71 base pairs form with the same pairing partner in both genomes. On average the RNA secondary structure is thus evolving at a much faster rate than the sequence. Structure at the Gag-Pro-Pol frameshift site is maintained but in a significantly altered form, while the impact of selection for maintaining a protein binding interaction can be seen in the conservation of pairing partners in the small RRE stems where Rev binds. Structures that are conserved between SIVmac239 and HIV-1NL4-3 also occur at the 5' polyadenylation sequence, in the plus strand primer sites, PPT and cPPT, and in the stem-loop structure that includes the first splice acceptor site. The two genomes are adenosine-rich and cytidine-poor. The structured regions are enriched in guanosines, while unpaired regions are enriched in adenosines, and functionally important structures have stronger base pairing than nonconserved structures. We conclude that much of the secondary structure is the result of fortuitous pairing in a metastable state that reforms during sequence evolution. However, secondary structure elements with important function are stabilized by higher guanosine content that allows regions of structure to persist as sequence evolution proceeds, and, within the confines of selective pressure, allows structures to evolve.

Conflict of interest statement

Dr. Gorelick is employed by a private company (SAIC-Frederick, Inc.), but he is employed to perform basic research, under contract to the National Cancer Institute, and his work has no connection with any product-development or other commercial activities. This has no bearing on employment, consultancy, patents, products in development or marketed products etc. In addition, this affiliation does not alter our adherence to all of the PLoS Pathogens policies on sharing data and materials.

Figures

Figure 1. Model for the structure of the SIVmac239 RNA genome as determined by SHAPE probing and directed RNA structure refinement.

The genome is divided into (A) 5′ and (B) 3′ halves. Colors of nucleotides indicate SHAPE reactivity on the scale shown on the left. Each sphere corresponds to a nucleotide, and side-by-side spheres indicate a base pair. Protein coding region boundaries are indicated by letters with the code shown at the bottom. Splice acceptor and donor sites are labeled SA and SD, respectively. tRNALys3 interaction is shown in gray. Heavy blue bars indicate base pairs in stems that are conserved between codon-aligned SIVmac239 and HIV-1NL4-3 RNA structures (71 total pairs). Areas of structure with a median reactivity below 0.3 over a 75 nucleotide window are numbered in green and correspond to motif numbers in Figure 2B. All positions are numbered in reference to the GenBank accession number M33262 for SIVmac239. A full structure, including nucleotide identity, is shown in Figure S1. Helix files for SIVmac239 and HIV-1NL4-3 are included in Datasets S1 and S3, respectively.
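The count of conserved base pairs mentioned above can be illustrated with a minimal sketch. It assumes each structure is available as a list of (i, j) position pairs and that the alignment has already been reduced to a dictionary mapping positions in one genome to positions in the other. The function name `conserved_pairs` and the toy coordinates are hypothetical; this is not the authors' pipeline, only one plausible way to express "same pairing partner in both genomes".

```python
def conserved_pairs(pairs_a, pairs_b, a_to_b):
    """Return pairs from genome A whose aligned positions are also paired in genome B.

    pairs_a, pairs_b: iterables of (i, j) base-pair positions (assumed format).
    a_to_b: dict mapping aligned positions in A to positions in B (assumed input;
            positions that fall in alignment gaps are simply absent).
    """
    # Normalize B pairs so (i, j) and (j, i) compare equal.
    normalized_b = {tuple(sorted(pair)) for pair in pairs_b}
    shared = []
    for i, j in pairs_a:
        # Both partners must have an aligned counterpart in genome B.
        if i in a_to_b and j in a_to_b:
            mapped = tuple(sorted((a_to_b[i], a_to_b[j])))
            if mapped in normalized_b:
                shared.append((i, j))
    return shared


# Toy example with made-up coordinates.
toy_a = [(1, 10), (2, 9), (3, 8)]
toy_b = [(11, 20), (12, 19), (14, 17)]
toy_map = {1: 11, 10: 20, 2: 12, 9: 19, 3: 13}
shared_toy = conserved_pairs(toy_a, toy_b, toy_map)
```

In this toy case only the first two pairs count as conserved, because position 8 has no aligned counterpart.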

Figure 2. Genomic organization and SHAPE reactivity of SIVmac239 and comparison with HIV-1NL4-3.

(A) Organization of the SIVmac239 genome. Gray boxes indicate protein coding regions; dark lines indicate the boundaries of the mature viral proteins. (B) SIVmac239 median SHAPE reactivity values calculated over a 75 nucleotide sliding window (red). Regions with SHAPE reactivities below 0.3 are numbered (and listed explicitly in Table S2). SHAPE reactivity values of HIV-1NL4-3 (gray) are shown as medians calculated over a 75 nucleotide sliding window and overlaid with those of SIVmac239. Viruses were codon-aligned based on the Los Alamos Database alignment (www.hiv.lanl.gov). Green dashed line indicates SHAPE reactivity of 0.4, and the gray bars below indicate regions where median reactivity values for both viruses are below 0.4. SHAPE reactivities for SIVmac239 and HIV-1NL4-3 are included in Datasets S2 and S4, respectively. (C) Percent concordance of SHAPE reactivity and pairing prediction of the SIVmac239 genome over a 76 nucleotide sliding window. A concordant nucleotide is defined as having low SHAPE reactivity (below 0.4) and being predicted to be paired, or having high reactivity (0.4 or above) and being predicted to be unpaired.
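The two quantities described in this caption, a sliding-window median of SHAPE reactivity and the percent concordance between reactivity and predicted pairing, can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the input is assumed to be a plain list of per-nucleotide reactivities plus a list of booleans for predicted pairing, and the handling of windows at the ends of the sequence (truncation) is an assumption, since the caption does not specify it.

```python
from statistics import median


def sliding_median(values, window=75):
    """Median reactivity in a window centered on each position.

    Windows are truncated at the sequence ends (an assumption; the source
    does not say how edges were treated).
    """
    half = window // 2
    medians = []
    for i in range(len(values)):
        start = max(0, i - half)
        end = min(len(values), i + half + 1)
        medians.append(median(values[start:end]))
    return medians


def percent_concordance(reactivities, paired_flags, threshold=0.4):
    """Percent of nucleotides where reactivity agrees with predicted pairing.

    Concordant means: reactivity below the threshold and predicted paired,
    or reactivity at or above the threshold and predicted unpaired.
    """
    if len(reactivities) != len(paired_flags) or not reactivities:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum((r < threshold) == p for r, p in zip(reactivities, paired_flags))
    return 100.0 * hits / len(reactivities)


# Toy data, not taken from the datasets mentioned in the caption.
toy_medians = sliding_median([1, 2, 3, 4, 5], window=3)
toy_concordance = percent_concordance([0.1, 0.5, 0.4, 0.2],
                                      [True, False, False, False])
```

To reproduce a windowed concordance profile, `percent_concordance` would be applied to each consecutive slice of the genome; the slice length is left to the caller.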

Figure 3. Structural similarity in the 5′ regions of the SIVmac239 and HIV-1NL4-3 genomes.

Predicted structures of SIVmac239 (left) and HIV-1NL4-3 (right) in the 5′ region; distinctive structures are coded by color. Conserved base pairs are indicated by dark blue connecting lines. The primer binding site (PBS) and 5′ poly(A) signal AAUAAA are emphasized with curved lines. The hatched lines indicate nucleotides within the MA coding domain that are not shown. The known pseudoknot is labeled in HIV-1NL4-3, and the predicted pseudoknot in SIVmac239 is labeled and shown with thick gray lines; protein residues encoded by the pseudoknot region are shown.

Figure 4. LNA oligo seclusion of half of the predicted 5′ poly(A) pseudoknot in SIVmac239.

(A) The SHAPE-derived structures of SIVmac239 5′ poly(A) hairpin and the predicted pseudoknot interaction without (top) and with (bottom) addition of an oligo containing locked nucleic acids (LNA). The 5′ half of the pseudoknot is indicated by the label 5′ pk. Gray dots indicate nucleotides outside of the area of interest. Nucleotides not shown are indicated by hatched lines. (B) Corresponding SHAPE reactivity values at the 5′ half of the predicted pseudoknot of SIVmac239 without (top) and with (bottom) the addition of an oligo containing LNAs, colored according to the given scale. The 5′ half of the pseudoknot is indicated by the label 5′ pk.

Figure 5. Codon alignment and predicted pairing partners in the Gag-Pro-Pol frameshift region of SIVmac239 and HIV-1NL4-3.

(A) RNA structures at the Gag-Pro-Pol frameshift stem for SIVmac239 (left) and HIV-1 (right). The main stem is emphasized by green brackets and the poly(U) slippery sequence is emphasized by a curved line. The dotted line indicates nucleotides between the poly(U) sequence and the beginning of the frameshift stem. (B) The sequences of HIV-1, SIVagm, and SIVmac239 were aligned horizontally. The poly(U) slippery sequence is boxed. Curved lines represent base pairs between the nucleotides within HIV-1 and SIVmac239; the curved lines connecting nucleotides downstream of the poly(U) sequence correspond to the frameshift stems (green), and the dotted line represents the extended frameshift stem in HIV-1. Gray boxes indicate regions of strong alignment, and spaces in the sequence indicate regions of poor alignment. The structure of the HIV-1 hairpin is modified from the previous model based on the updated folding parameters as described in the Supporting Material.

Figure 6. Codon alignment and predicted pairing partners in the RRE region of SIVmac239 and HIV-1NL4-3.

(A) Predicted structures within the RRE are shown for SIVmac239 (left) and HIV-1 (middle). Codons of stem I are in brackets with their corresponding amino acids labeled and numbered from the first codon of stem I. Blue brackets indicate an area of conserved pairing partners; green brackets indicate an area of shifted pairing partners in stem I. The HIV-1NL4-3 structure with forced SIVmac239 pairing is also shown (right); sequence variations that occur in SIVmac239 are indicated, and those that change the amino acid sequence are in green. Blue lines indicate base pairs that are exactly conserved between the two viruses. (B) The sequences of SIVmac239 (top) and HIV-1NL4-3 (bottom) aligned horizontally. Curved lines indicate base-pairing partners. Gray boxes indicate regions of amino acid alignment. Roman numerals indicate helices discussed in the text.

Figure 7. Codon alignment and predicted pairing partners in the stem-loop surrounding SA1 of SIVmac239 and HIV-1NL4-3.

(A) Structures for the conserved stem in SIVmac239 (left) and HIV-1NL4-3 (right). Blue lines indicate the base pairs that are exactly conserved between the two viruses. (B) The sequences of SIVmac239 (top) and HIV-1NL4-3 (bottom) are aligned horizontally. Curved lines indicate base-pairing partners. Gray boxes indicate regions of amino acid alignment. Bold letters represent the bases that are involved in the conserved pairing interactions.

Figure 8. Profiles of HIV-1wt and HIV-1SLSA1m transcripts.

(A) Diagram displaying reading frames (open boxes) of the HIV-1NL4-3 genome. Solid lines indicate different classes of mRNA including unspliced, 4 kb, and 1.8 kb, with their corresponding genes labeled on the right. Gray boxes represent exons 2 (between SA1 and SD2) and 3 (between SA2 and SD3). Splice donors (SD1-4) and acceptors (SA1-7) are labeled on the top of the unspliced length of RNA. The sites of NarI cleavage and primer-binding for the forward and reverse primers used to create the splicing profile are shown on the unspliced RNA. (B) SHAPE-derived structure for the SLSA1 hairpin in HIV-1NL4-3. The first splice acceptor (SA1) is labeled. The mutated nucleotides in HIV-1SLSA1m are boxed, and the mutations are identified next to the arrows. Exonic splicing enhancer sequences (ESEVif and ESEM1) are labeled. (C) Splicing profiles for HIV-1wt and HIV-1SLSA1m grown for three days in CEMx174 cells are shown. The cDNA from the 1.8 kb class (left) or the 4 kb class (right) of transcripts, amplified with the corresponding primers, was separated on a 6% polyacrylamide gel and labeled according to common nomenclature (solid lines) or left unidentified (dotted lines). Decreased band intensity between the WT and SLSA1m transcripts is marked by thicker lines.

Figure 9. SHAPE analysis of the polypurine tracts of SIVmac239 and HIV-1NL4-3 and base composition of both genomes in structured and unstructured regions.

(A) RNA structure models for the cPPT and PPT of SIVmac239 and HIV-1NL4-3. Nucleotides involved in the polypurine tracts are colored according to their SHAPE reactivity values as in Figure 1. Other nucleotides are light gray. Nucleotides not shown are indicated by hatched lines. (B) Histograms of SHAPE reactivity values, integrated and normalized, along the span of the polypurine tracts. HIV-1NL4-3 reactivity values are displayed in a lighter color scale. (C) Histogram of percentage of each individual nucleotide compared to the percentage of each in the entire genome. For each individual nucleotide, SIVmac239 (green) is on the left and HIV-1NL4-3 (blue) is on the right. The percent paired for each nucleotide is indicated by hatched lines. (D) Histogram of percentage of each nucleotide in the genome compared to the percentage in highly structured regions of known function (5′-UTR and RRE). SIVmac239 (green) is on the left and HIV-1NL4-3 (blue) is on the right.
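The base-composition comparisons in panels (C) and (D) amount to simple per-nucleotide tallies. The sketch below shows one way to compute the percentage of each nucleotide in a sequence and the percentage of each nucleotide that is paired. The function names, the RNA alphabet "ACGU", and the boolean pairing list are assumptions for illustration, not the authors' code.

```python
from collections import Counter

NUCLEOTIDES = "ACGU"  # assumed RNA alphabet; other characters are ignored in the output


def composition(sequence):
    """Percentage of each nucleotide in the sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {nt: 100.0 * counts[nt] / len(seq) for nt in NUCLEOTIDES}


def percent_paired(sequence, paired_flags):
    """For each nucleotide, the percentage of its occurrences that are paired."""
    seq = sequence.upper()
    if len(seq) != len(paired_flags):
        raise ValueError("sequence and paired_flags must have equal length")
    totals = Counter(seq)
    paired = Counter(nt for nt, is_paired in zip(seq, paired_flags) if is_paired)
    # Nucleotides absent from the sequence are reported as 0.0 rather than failing.
    return {nt: (100.0 * paired[nt] / totals[nt] if totals[nt] else 0.0)
            for nt in NUCLEOTIDES}


# Toy input, not real genome data.
toy_composition = composition("GGAU")
toy_paired = percent_paired("GGAU", [True, False, False, True])
```

Running `composition` on a structured region and on the whole genome, then comparing the two dictionaries, gives the kind of enrichment comparison the caption describes.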

Figure 10. G minus A comparison for genomic sequences of various primate lentiviruses.

(A) Organization of the SIVmac239 genome. Gray boxes indicate protein coding regions; dark lines indicate the boundaries of the mature viral proteins. (B) The number of adenosines was subtracted from the number of guanosines (G minus A) in a 75 nucleotide sliding window for selected primate lentiviruses. Dotted lines indicate the point where the difference between the number of G and A is zero for each individual virus. Values above the dotted line indicate higher guanosine concentration than adenosine. Viruses were aligned to SIVmac239 based on protein-coding regions. GenBank accession numbers for each virus are the following: SIVL'hoest, AF075269; SIVsyk173, L06042; SIVagm, M30931; HIV2BEN, M30502; HIV-1 MVP, L20571. Domain junctions on polyproteins are indicated by yellow bars. Other landmarks within the genome are indicated by gray bars.
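The G minus A profile in panel (B) can be sketched as a count difference over each full window. The function name `g_minus_a` is hypothetical, and the choice to emit one value per complete window (with no padding at the ends) is an assumption, since the caption does not describe edge handling.

```python
def g_minus_a(sequence, window=75):
    """Number of G minus number of A in each full-length sliding window.

    Returns one value per window start; sequences shorter than the window
    yield an empty list (an assumed convention).
    """
    seq = sequence.upper()
    differences = []
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        differences.append(chunk.count("G") - chunk.count("A"))
    return differences


# Toy sequence with a small window so the result is easy to check by hand.
toy_profile = g_minus_a("GGAAGA", window=3)
```

Positive values correspond to windows with more guanosines than adenosines, matching the region above the dotted zero line in the figure.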
