Simulation of DNA sequence evolution under models of recent directional selection
Yuseob Kim et al. Brief Bioinform. 2009 Jan.
Abstract
Computer simulation is an essential tool in the analysis of DNA sequence variation for mapping events of recent adaptive evolution in the genome. Various simulation methods are employed to predict the signature of selection in sequence variation. The most informative and efficient method currently in use is coalescent simulation. However, this method is limited to simple models of directional selection. Whole-population forward-in-time simulations are the alternative to coalescent simulations for more complex models. The notorious problem of excessive computational cost in forward-in-time simulations can be overcome by various simplifying amendments. Overall, the success of simulations depends on the creative application of some population genetic theory to the simulation algorithm.
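To make the forward-in-time approach concrete, here is a minimal sketch of a whole-population simulation, assuming a haploid Wright-Fisher population with a single beneficial allele of relative fitness 1 + s. The function name `simulate_sweep` and all parameter defaults are illustrative assumptions, not taken from the article. Every individual is resampled in every generation, which is why the cost of this kind of simulation grows with population size.

```python
import random


def simulate_sweep(pop_size=20, s=0.5, max_generations=1000, seed=1):
    """Forward-in-time haploid Wright-Fisher sketch (illustrative assumptions).

    Starts from one copy of a beneficial allele with relative fitness 1 + s
    and returns the allele count in each generation until loss or fixation.
    """
    rng = random.Random(seed)
    count = 1
    trajectory = [count]
    for _ in range(max_generations):
        if count == 0 or count == pop_size:
            break  # allele lost or fixed
        # Chance that one offspring descends from a beneficial parent,
        # with parents weighted by their fitness.
        weighted = count * (1.0 + s)
        p = weighted / (weighted + (pop_size - count))
        # Binomial sampling of the next generation, one offspring at a time.
        count = sum(1 for _ in range(pop_size) if rng.random() < p)
        trajectory.append(count)
    return trajectory


if __name__ == "__main__":
    print(simulate_sweep())
```

Most runs that start from a single copy lose the allele by drift, so studying sweeps this way typically means repeating the simulation and keeping only the replicates that reach fixation.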
Figures

Gene genealogy under a model of strong directional selection and the corresponding pattern of sequence polymorphism, for a population of 20 haploid individuals (homologous sequences; shown as circles) which reproduce in discrete generations. (A) Each line in the graph represents one generation. Gray arrows indicate the inheritance of alleles at the locus under directional selection. Filled (empty) circles represent individuals carrying the beneficial (ancestral) allele at the selected locus. Due to strong directional selection, the beneficial allele quickly reaches fixation in the population, indicated by the presence of only filled circles in the lower third of the figure. Three sequences (q, r and s) were sampled in the current generation (bottom). The genealogy at the selected locus, traced backward-in-time from the sampled chromosomes, is shown by dotted lines. The genealogy at a neutral locus that is partially linked to the selected locus is shown by solid lines. Genealogies at the two loci are different due to recombination events (indicated by diamonds). Coalescent events are indicated by squares. The recombination event in the ancestral sequence (l) allows the neutral lineage of sequence s to escape the coalescence tree of the selected locus. Coalescent simulations create such genealogies by generating only the times of coalescence or recombination according to probabilistic models, without specifying the reproduction of the entire population at each generation. Genealogies at different loci merge to form an ancestral recombination graph. Therefore, squares and diamonds above correspond to nodes in the ancestral recombination graph. Lightning symbols along the lineages indicate mutation events that produce polymorphism in the sampled sequences. 
Empty and gray lightning symbols represent mutation events that happened in generations earlier than those shown; they are nevertheless visible as polymorphisms in the sampled sequences q, r and s, which inherit them along the indicated lineages. Black lightning symbols represent mutation events that happened during the time interval shown. (B) Sketch of a possible course of evolutionary events leading to the observable polymorphisms in a present-day sample (sequences q, r and s). The states of ancestral sequences at selected time points [sequences a to p indicated in (A)] are shown. Filled gray bars indicate the sequence on which the beneficial mutation first occurred (gray circle on sequence a) and its descendants. Empty bars represent DNA segments that recombined with the beneficial allele. A dashed outline of the empty bars indicates ancestral DNA segments, acquired by recombination, that do not leave descendants in the present-day sample.
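The caption notes that coalescent simulations generate only the times of events rather than the reproduction of the whole population. As a minimal sketch of that idea, the snippet below draws coalescence times for a sample under the standard neutral coalescent, assuming a haploid population of constant size and no recombination. This is the neutral baseline only: it does not model the selected locus, where coalescence rates depend on the frequency of the beneficial allele. The function name `coalescence_times` is a hypothetical choice for illustration.

```python
import random


def coalescence_times(sample_size, pop_size, seed=None):
    """Draw cumulative coalescence times (in generations, backward in time).

    Neutral coalescent sketch: while k lineages remain, the waiting time to
    the next coalescence is exponential with rate k * (k - 1) / (2 * pop_size),
    the haploid Wright-Fisher approximation. No recombination or selection.
    """
    rng = random.Random(seed)
    times = []
    t = 0.0
    for k in range(sample_size, 1, -1):
        rate = k * (k - 1) / (2.0 * pop_size)
        t += rng.expovariate(rate)
        times.append(t)
    return times


if __name__ == "__main__":
    # Three sampled sequences (like q, r and s) from 20 haploid individuals.
    print(coalescence_times(sample_size=3, pop_size=20, seed=42))
```

Only `sample_size - 1` random draws are needed regardless of population size, which illustrates why the coalescent approach is efficient compared with simulating every individual.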

Variability profile along a 20-kb sequence produced by coalescent simulation under the demographic model of [55]: a population of N = 10^4 individuals undergoes a bottleneck to 3000 individuals that starts 600 generations ago and lasts for 200 generations. At the start of the bottleneck, directional selection acts on a beneficial mutation with s = 0.05 and a starting frequency of 0.001. The scaled recombination and mutation rates are 4Nr = 0.04 and 4Nμ = 0.005 per nucleotide, respectively. Variation is measured by π (solid curve) and θH (gray curve) in a sliding window of 1 kb, (A) without selection and (B) with selection. The filled triangle at 10 kb marks the site where a beneficial mutation triggered a selective sweep. At first sight, both measures show a rather irregular pattern in both scenarios. However, the two scenarios become distinguishable with the help of derived quantities, for instance the difference θH − π, which is much larger under selection than under neutrality. The distributions of such statistics can be obtained by coalescent simulations and then used to define a significance level for rejecting the null hypothesis of neutral evolution.
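The two measures in this caption can be computed from the derived-allele counts at the polymorphic sites of a sample. The sketch below assumes the usual estimators: for a site where i of n sequences carry the derived allele, π sums 2i(n − i)/(n(n − 1)) and θH sums 2i²/(n(n − 1)). The function names, the input layout (a list of position and derived-count pairs) and the non-overlapping window step are assumptions made for illustration; the caption specifies only the 1-kb window.

```python
def pi_and_theta_h(derived_counts, n):
    """Return (pi, theta_H) summed over sites for a sample of n sequences.

    derived_counts holds, for each polymorphic site, the number of sampled
    sequences carrying the derived allele (between 1 and n - 1).
    """
    denom = n * (n - 1)
    pi = sum(2 * i * (n - i) for i in derived_counts) / denom
    theta_h = sum(2 * i * i for i in derived_counts) / denom
    return pi, theta_h


def sliding_windows(sites, n, seq_len, window=1000, step=1000):
    """Per-nucleotide pi and theta_H in windows along the sequence.

    sites is a list of (position, derived_count) pairs. The step size is an
    assumption; set step < window for overlapping windows.
    """
    profile = []
    for start in range(0, seq_len - window + 1, step):
        counts = [i for pos, i in sites if start <= pos < start + window]
        pi, theta_h = pi_and_theta_h(counts, n)
        profile.append((start, pi / window, theta_h / window))
    return profile


if __name__ == "__main__":
    # Toy data: two polymorphic sites in a sample of four sequences.
    toy_sites = [(100, 1), (1500, 3)]
    for start, pi, theta_h in sliding_windows(toy_sites, n=4, seq_len=2000):
        print(start, pi, theta_h, theta_h - pi)
```

High-frequency derived alleles weigh heavily in θH because of the squared count, so an excess of them, as expected near a recent sweep, raises θH − π.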
References
- Schlötterer C. Hitchhiking mapping–functional genomics from the population genetics perspective. Trends Genet. 2003;19:32–8.
- Vigouroux Y, Matsuoka Y, Doebley J. Directional evolution for microsatellite size in maize. Mol Biol Evol. 2003;20:1480–3.
- Maynard Smith J, Haigh J. The hitch-hiking effect of a favorable gene. Genet Res. 1974;23:23–35.