Inferring the joint demographic history of multiple populations from multidimensional SNP frequency data - PubMed
Ryan N Gutenkunst et al. PLoS Genet. 2009 Oct.
Abstract
Demographic models built from genetic data play important roles in illuminating prehistorical events and serving as null models in genome scans for selection. We introduce an inference method based on the joint frequency spectrum of genetic variants within and between populations. For candidate models we numerically compute the expected spectrum using a diffusion approximation to the one-locus, two-allele Wright-Fisher process, involving up to three simultaneous populations. Our approach is a composite likelihood scheme, since linkage between neutral loci alters the variance but not the expectation of the frequency spectrum. We thus use bootstraps incorporating linkage to estimate uncertainties for parameters and significance values for hypothesis tests. Our method can also incorporate selection on single sites, predicting the joint distribution of selected alleles among populations experiencing a bevy of evolutionary forces, including expansions, contractions, migrations, and admixture. We model human expansion out of Africa and the settlement of the New World, using 5 Mb of noncoding DNA resequenced in 68 individuals from 4 populations (YRI, CHB, CEU, and MXL) by the Environmental Genome Project. We infer divergence between West African and Eurasian populations 140 thousand years ago (95% confidence interval: 40-270 kya). This is earlier than other genetic studies, in part because we incorporate migration. We estimate the European (CEU) and East Asian (CHB) divergence time to be 23 kya (95% c.i.: 17-43 kya), long after archeological evidence places modern humans in Europe. Finally, we estimate divergence between East Asians (CHB) and Mexican-Americans (MXL) of 22 kya (95% c.i.: 16.3-26.9 kya), and our analysis yields no evidence for subsequent migration. 
Furthermore, combining our demographic model with a previously estimated distribution of selective effects among newly arising amino acid mutations accurately predicts the frequency spectrum of nonsynonymous variants across three continental populations (YRI, CHB, CEU).
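The two central objects in the abstract, the joint frequency spectrum and the composite likelihood, can be illustrated with a minimal sketch. It assumes each spectrum entry is treated as an independent Poisson count, which is one common way to build such a composite likelihood; the function names and input layout are hypothetical and are not taken from the authors' software.

```python
import math

def joint_afs(counts_pop1, counts_pop2, n1, n2):
    """Tabulate a two-population joint allele frequency spectrum.

    counts_pop1[i] and counts_pop2[i] are derived-allele counts at SNP i
    in samples of n1 and n2 chromosomes. Entry [j][k] of the result is the
    number of SNPs seen j times in population 1 and k times in population 2.
    """
    afs = [[0] * (n2 + 1) for _ in range(n1 + 1)]
    for j, k in zip(counts_pop1, counts_pop2):
        afs[j][k] += 1
    return afs

def poisson_composite_loglik(data, model):
    """Sum Poisson log-probabilities over spectrum entries.

    Assumption: entries are independent Poisson counts with means given by
    `model` (all model entries must be positive). The two corner entries
    (absent from both samples, fixed in both samples) are skipped because
    they are not polymorphic.
    """
    n1 = len(data) - 1
    n2 = len(data[0]) - 1
    total = 0.0
    for j in range(n1 + 1):
        for k in range(n2 + 1):
            if (j, k) in ((0, 0), (n1, n2)):
                continue
            d, m = data[j][k], model[j][k]
            total += d * math.log(m) - m - math.lgamma(d + 1)
    return total
```

Because linked sites are not truly independent, a likelihood of this kind gives usable point estimates but understates uncertainty, which is why the abstract turns to bootstraps for confidence intervals.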
Conflict of interest statement
The authors have declared that no competing interests exist.
Figures

(A) Qualitative effects of modeled neutral genetic forces on the density of alleles at given relative frequencies in populations 1 and 2. (B) For the spectra shown, an equilibrium population diverges into two populations, 1 and 2, each with its own effective size, and migration between them is symmetric. (C) The AFS for this model. Each entry is colored by the logarithm of the number of sites in it, according to the scale shown. (D) The AFS at various times for various demographic parameters, on the same scale as (B). (E) Comparison between coalescent- and diffusion-based estimates of the likelihood of data generated under the model (A). Coalescent-based estimates of the likelihood, each of which took approximately 7.0 seconds, are represented in the histogram. The result from our diffusion approach, which took 2.0 seconds, is represented by the red line. For accuracy comparison, the yellow line indicates the likelihood inferred from coalescent simulations.

(A) AFS for the YRI, CEU, and CHB populations. The color scale is as in (C). (B) Illustration of the model we fit, with the 14 free parameters labeled. (C) Marginal spectra for each pair of populations. The top row is the data, and the second is the maximum-likelihood model. The third row shows the Anscombe residuals between model and data. Red or blue residuals indicate that the model predicts too many or too few alleles in a given cell, respectively. (D) The observed decay of linkage disequilibrium (black lines) is qualitatively well-matched by our simulated data sets (colored lines). (E) Goodness-of-fit tests based on the likelihood and Pearson's χ² statistic both indicate that our model is a reasonable, though incomplete, description of the data. In both plots, the red line results from fitting the real data and the histogram from fits to simulated data. Poorer fits lie to the right (lower likelihood and higher χ²). (F) The improvement in likelihood from including contemporary migration in the real data fit (red line) is much greater than expected from fits to simulated data generated without contemporary migration (histogram). This indicates that the data contain a strong signal of contemporary migration.
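The residuals and the goodness-of-fit statistic named in this caption can be sketched as follows. The code uses the textbook Anscombe transformation for Poisson counts and the usual Pearson sum; the sign convention (positive when the model predicts too many alleles) is an assumption chosen to match the caption's description, and the authors' implementation may include additional correction terms.

```python
def anscombe_residual(model, data):
    """Anscombe-type residual for one spectrum entry under a Poisson model.

    Assumed sign convention: positive when the model expectation exceeds
    the observed count. `model` must be positive.
    """
    return 1.5 * (model ** (2 / 3) - data ** (2 / 3)) / model ** (1 / 6)

def pearson_chi2(model_entries, data_entries):
    """Pearson's chi-squared statistic summed over spectrum entries."""
    return sum((d - m) ** 2 / m for m, d in zip(model_entries, data_entries))
```

The power transformation makes the residuals closer to normally distributed than raw differences would be, which helps when comparing cells with very different expected counts.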

As in Figure 2, (A) is the data, (B) is a schematic of the model we fit, (C) compares the data and model AFS, and (D) compares LD. (E) The fit of our model to the real data is not atypical of fits to simulated data. (F) The improvement in real data fit upon including CHB-MXL migration (red line) is very typical of the improvement in fits to simulated data without CHB-MXL migration. Thus we have no evidence for CHB-MXL migration after divergence.
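The comparison in panel (F), an observed improvement in fit set against the same quantity from data simulated without migration, amounts to an empirical significance test. A minimal sketch, assuming the statistic is the gain in log-likelihood and using a hypothetical helper name with the common add-one correction:

```python
def empirical_p_value(observed, simulated):
    """Share of null simulations with a statistic at least as large as observed.

    `observed` is the improvement in log-likelihood on the real data;
    `simulated` holds the same improvement computed on data sets simulated
    under the null model. The +1 terms keep the estimate away from zero.
    """
    exceed = sum(1 for s in simulated if s >= observed)
    return (exceed + 1) / (len(simulated) + 1)
```

A value near 0.5 would correspond to the "very typical" improvement described for CHB-MXL migration, while a value near the minimum possible would correspond to a strong signal like the one in Figure 2F.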

We simulated our maximum-likelihood Out of Africa demographic model with a distribution of selective effects previously inferred for nonsynonymous polymorphism. (A) To enable direct comparison with the neutral AFS (Figure 2C), the scaled mutation rate was set identically, as is the color scale. As expected, selection dramatically reduces the amount of segregating polymorphism. (B) Shown are the proportions of variation found in various frequency classes. As expected, nonsynonymous variants typically have lower frequency. They are also less likely to be shared between populations. Data error bars indicate 95% bootstrap confidence intervals.
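Bootstrap intervals of the kind shown in panel (B) can be approximated by resampling whole genomic regions, so that linked sites stay together. The sketch below is a generic percentile bootstrap under that assumption; the input layout (one pair of counts per region, with positive totals) and the function name are hypothetical, and the paper's exact resampling procedure may differ.

```python
import random

def region_bootstrap_ci(regions, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for a proportion of variants.

    `regions` is a list of (in_class, total) pairs, one per genomic region:
    the number of variants in the frequency class of interest and the total
    number of variants in that region. Regions are resampled with
    replacement as units, which preserves linkage within each region.
    """
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        sample = [rng.choice(regions) for _ in regions]
        in_class = sum(c for c, _ in sample)
        total = sum(t for _, t in sample)
        stats.append(in_class / total)
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

For example, `region_bootstrap_ci([(2, 10), (3, 10), (1, 10), (4, 10)])` returns a lower and upper bound around the pooled proportion of 0.25; with so few regions the interval is wide, which reflects how little independent information four regions carry.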