pubmed.ncbi.nlm.nih.gov

Simulation of DNA sequence evolution under models of recent directional selection - PubMed

Yuseob Kim et al. Brief Bioinform. 2009 Jan.

Abstract

Computer simulation is an essential tool in the analysis of DNA sequence variation for mapping events of recent adaptive evolution in the genome. Various simulation methods are employed to predict the signature of selection in sequence variation. The most informative and efficient method currently in use is coalescent simulation. However, this method is limited to simple models of directional selection. Whole-population forward-in-time simulations are the alternative to coalescent simulations for more complex models. The notorious problem of excessive computational cost in forward-in-time simulations can be overcome by various simplifying amendments. Overall, the success of simulations depends on the creative application of some population genetic theory to the simulation algorithm.
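As an illustration of the whole-population forward-in-time approach, the following is a minimal Python sketch rather than the authors' method. It assumes a haploid Wright-Fisher population with discrete generations in which carriers of the beneficial allele have relative fitness 1 + s, and it tracks only the count of that allele (no recombination, no linked neutral sites). The function name and parameters are hypothetical.

```python
import random


def wright_fisher_trajectory(pop_size, s, start_count=1, rng=None,
                             max_generations=100_000):
    """Return the per-generation count of a beneficial allele.

    Assumed model (illustrative only): haploid Wright-Fisher population,
    discrete generations, carriers have relative fitness 1 + s.
    The run stops at loss (0), fixation (pop_size), or max_generations.
    """
    if rng is None:
        rng = random.Random()
    count = start_count
    trajectory = [count]
    for _ in range(max_generations):
        if count == 0 or count == pop_size:
            break
        p = count / pop_size
        # Expected frequency after selection.
        p_sel = p * (1 + s) / (p * (1 + s) + (1 - p))
        # Genetic drift: each offspring picks a parent type at random.
        count = sum(1 for _ in range(pop_size) if rng.random() < p_sel)
        trajectory.append(count)
    return trajectory
```

Every individual is resampled in every generation, which is why the cost of this kind of simulation grows with population size. Repeating the call and keeping only the runs that end at `pop_size` would condition on fixation of the beneficial allele.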

Figures

Figure 1:

Gene genealogy under a model of strong directional selection and the corresponding pattern of sequence polymorphism, for a population of 20 haploid individuals (homologous sequences; shown as circles) which reproduce in discrete generations. (A) Each line in the graph represents one generation. Gray arrows indicate the inheritance of alleles at the locus under directional selection. Filled (empty) circles represent individuals carrying the beneficial (ancestral) allele at the selected locus. Due to strong directional selection, the beneficial allele quickly reaches fixation in the population, indicated by the presence of only filled circles in the lower third of the figure. Three sequences (q, r and s) were sampled in the current generation (bottom). The genealogy at the selected locus, traced backward-in-time from the sampled chromosomes, is shown by dotted lines. The genealogy at a neutral locus that is partially linked to the selected locus is shown by solid lines. Genealogies at the two loci are different due to recombination events (indicated by diamonds). Coalescent events are indicated by squares. The recombination event in the ancestral sequence (l) allows the neutral lineage of sequence s to escape the coalescence tree of the selected locus. Coalescent simulations create such genealogies by generating only the times of coalescence or recombination according to probabilistic models, without specifying the reproduction of the entire population at each generation. Genealogies at different loci merge to form an ancestral recombination graph. Therefore, squares and diamonds above correspond to nodes in the ancestral recombination graph. Lightning symbols along the lineages indicate mutation events that produce polymorphism in the sampled sequences. 
Empty and gray lightning symbols represent mutation events that happened in generations earlier than those shown; they are nevertheless visible as polymorphisms in the sampled sequences q, r and s, inherited along the indicated lineages. Black lightning symbols represent mutation events that happened during the time interval shown. (B) Sketch of a possible course of evolutionary events leading to the observable polymorphisms in a present-day sample (sequences q, r and s). The states of ancestral sequences at selected time points [sequences a to p indicated in (A)] are shown. Filled gray bars indicate the sequence on which the beneficial mutation first occurred (gray circle on sequence a) and its descendants. Empty bars represent DNA segments that recombined with the beneficial allele. A dashed outline of the empty bars indicates ancestral DNA segments, acquired by recombination, that do not leave descendants in the present-day sample.
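The idea that a coalescent simulation generates only the times of events, not the reproduction of the whole population, can be sketched in a few lines of Python. This is a simplified illustration under assumptions not taken from the paper: a neutral, non-recombining locus in a constant-size haploid population, with time measured in units of N generations, so that k lineages coalesce at rate k(k - 1)/2. Simulating a selective sweep would additionally require conditioning on the trajectory of the beneficial allele and modelling recombination, which this sketch omits. The function name is hypothetical.

```python
import random


def coalescence_times(sample_size, rng=None):
    """Return cumulative coalescence times for a neutral sample.

    Assumptions (illustrative): standard neutral coalescent, no
    recombination, no selection; time is in units of N generations
    for a haploid population of size N.
    """
    if rng is None:
        rng = random.Random()
    times = []
    t = 0.0
    # Go from sample_size lineages down to 2; each event merges two.
    for k in range(sample_size, 1, -1):
        rate = k * (k - 1) / 2  # number of lineage pairs
        t += rng.expovariate(rate)  # exponential waiting time
        times.append(t)
    return times
```

For the three sampled sequences q, r and s, `coalescence_times(3)` returns two times: the first coalescence and the time to the most recent common ancestor. Under these assumptions the cost depends on the sample size rather than on the population size.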

Figure 2:

Variability profile along a sequence of 20 kb produced by coalescent simulation under the demographic model of [55] (a population of N = 10^4 individuals undergoes a bottleneck of 3000 individuals that starts 600 generations ago and lasts for 200 generations. At the start of the bottleneck, directional selection acts on a beneficial mutation with s = 0.05 and a starting frequency of 0.001. Scaled recombination and mutation rates are 4Nr = 0.04 and 4Nμ = 0.005, respectively, per nucleotide). Variation is measured by π (solid curve) and θH (gray curve) in a sliding window of 1 kb, (A) without selection and (B) with selection. The filled triangle at 10 kb indicates the position of the site where a beneficial mutation triggered a selective sweep. At first sight, both measures show a quite irregular pattern in both scenarios. However, the two scenarios become distinguishable with the help of derived quantities, for instance the difference θH − π, which is much larger under selection than under neutrality. The distributions of such statistics can be obtained by coalescent simulations and then used to define a significance level for rejecting the null hypothesis of neutral evolution.
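The two measures can be computed from derived-allele counts at segregating sites, as in the hedged Python sketch below. It uses the standard estimators π = Σ 2i(n − i) / (n(n − 1)) and θH = Σ 2i² / (n(n − 1)), where i is the number of sampled sequences carrying the derived allele at a site and n is the sample size. The function names, the input format (a list of position and count pairs) and the default non-overlapping windows are assumptions for illustration, and values are window totals rather than per-nucleotide rates.

```python
def pi_and_theta_h(derived_counts, n):
    """Return (pi, theta_H) summed over the given segregating sites.

    derived_counts: derived-allele count i (1 <= i <= n - 1) per site.
    n: number of sampled sequences.
    """
    denom = n * (n - 1)
    pi = sum(2 * i * (n - i) for i in derived_counts) / denom
    theta_h = sum(2 * i * i for i in derived_counts) / denom
    return pi, theta_h


def windowed_stats(sites, n, seq_len, window=1000, step=1000):
    """Compute (window_start, pi, theta_H) along a sequence.

    sites: list of (position, derived_count) pairs (hypothetical format).
    A step smaller than the window gives overlapping sliding windows.
    """
    results = []
    for start in range(0, seq_len - window + 1, step):
        counts = [c for pos, c in sites if start <= pos < start + window]
        results.append((start, *pi_and_theta_h(counts, n)))
    return results
```

Sites with a high derived-allele count contribute more to θH than to π, so the difference θH − π for a window is obtained by subtracting the second returned value from the third.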

References

    1. Schlötterer C. Hitchhiking mapping–functional genomics from the population genetics perspective. Trends Genet. 2003;19:32–8.
    2. Voight BF, Kudaravalli S, Wen X, et al. A map of recent positive selection in the human genome. PLoS Biol. 2006;4:e72.
    3. Vigouroux Y, Matsuoka Y, Doebley J. Directional evolution for microsatellite size in maize. Mol Biol Evol. 2003;20:1480–3.
    4. Maynard Smith J, Haigh J. The hitch-hiking effect of a favorable gene. Genet Res. 1974;23:23–35.
    5. Kaplan NL, Hudson RR, Langley CH. The “hitchhiking effect” revisited. Genetics. 1989;123:887–99.