pubmed.ncbi.nlm.nih.gov

A haplotype method detects diverse scenarios of local adaptation from genomic sequence variation - PubMed

Mol Ecol. 2016 Jul;25(13):3081-100.

doi: 10.1111/mec.13671. Epub 2016 Jun 6.

PMID: 27135633
PMCID: PMC4931985
DOI: 10.1111/mec.13671

Jeremy D Lange et al. Mol Ecol. 2016 Jul.

Abstract

Identifying genomic targets of population-specific positive selection is a major goal in several areas of basic and applied biology. However, it is unclear how often such selection should act on new mutations versus standing genetic variation or recurrent mutation, and furthermore, favoured alleles may either become fixed or remain variable in the population. Very few population genetic statistics are sensitive to all of these modes of selection. Here, we introduce and evaluate the Comparative Haplotype Identity statistic (*χ_MD*), which assesses whether pairwise haplotype sharing at a locus in one population is unusually large compared with another population, relative to genomewide trends. Using simulations that emulate human and Drosophila genetic variation, we find that *χ_MD* is sensitive to a wide range of selection scenarios, and for some very challenging cases (e.g. partial soft sweeps), it outperforms other two-population statistics. We also find that, as with *F_ST*, our haplotype approach has the ability to detect surprisingly ancient selective sweeps. Particularly for the scenarios resembling human variation, we find that *χ_MD* outperforms other frequency- and haplotype-based statistics for soft and/or partial selective sweeps. Applying *χ_MD* and other between-population statistics to published population genomic data from D. melanogaster, we find both shared and unique genes and functional categories identified by each statistic. The broad utility and computational simplicity of *χ_MD* will make it an especially valuable tool in the search for genes targeted by local adaptation.
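The idea of comparing pairwise haplotype sharing between two populations, relative to a genomewide baseline, can be illustrated with a small sketch. This is not the published definition of *χ_MD*: the function names, the per-window identity rule, the pseudocount, and the scaling by the genomewide median ratio are all assumptions chosen for illustration, and each haplotype is assumed to be a non-empty, equal-length string of allele codes.

```python
from itertools import combinations
from statistics import median


def pairwise_identity(haplotypes, threshold=1.0):
    """Fraction of haplotype pairs in one window that count as 'shared'.

    A pair is shared when the proportion of matching sites is at least
    `threshold` (hypothetical rule; 1.0 means fully identical).
    """
    pairs = list(combinations(haplotypes, 2))
    if not pairs:
        return 0.0
    shared = 0
    for a, b in pairs:
        matches = sum(x == y for x, y in zip(a, b))
        if matches / len(a) >= threshold:
            shared += 1
    return shared / len(pairs)


def comparative_identity(windows_a, windows_b, pseudo=1e-3):
    """Per-window sharing ratio (A over B), scaled by the genomewide median.

    `pseudo` is an illustrative pseudocount that avoids division by zero.
    """
    ratios = [
        (pairwise_identity(a) + pseudo) / (pairwise_identity(b) + pseudo)
        for a, b in zip(windows_a, windows_b)
    ]
    genomewide = median(ratios)
    return [r / genomewide for r in ratios]


# Toy data: three windows, three haplotypes per population.
windows_a = [["0101", "0101", "0101"], ["0101", "0110", "1001"], ["0011", "0011", "1100"]]
windows_b = [["0101", "0110", "1001"], ["0101", "0110", "1001"], ["0011", "0011", "1100"]]
scores = comparative_identity(windows_a, windows_b)
```

In this toy input only the first window has excess sharing in population A, so it receives a score far above the other two, which sit at the genomewide baseline of 1.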

Keywords: haplotypes; natural selection; partial sweeps; selective sweeps; simulation; soft sweeps.

Figures

**Figure 1**
Power of each statistic for complete hard sweeps for high *N_e* (top) and low *N_e* cases (bottom). Note the difference in selection initiation times on the x axes.

**Figure 2**
Power of each statistic tested for partial hard sweeps, for high *N_e* (top) and low *N_e* cases (bottom).

**Figure 3**
For complete soft sweeps, the top two panels depict the power of each tested statistic. The bottom two panels depict the number of unique adaptations of the derived allele at the time of sampling to help distinguish the softness of the sweep (where a value close to 1 indicates mostly hard sweeps). Note the change in scale of the x axes between the two *N_e* cases simulated (left and right).

**Figure 4**
Heat map depicting power for each statistic for partial soft sweeps. The key refers to powers ranging from 0 to 1. The x axis represents the number of copies of the beneficial allele in the population when the populations split. Note the change in x axes between the two *N_e* cases (starting frequency per 10,000 or per 1,000). The y axis represents the ending allele frequency at sampling.

**Figure 5**
Depicted here is power for scenarios with simulated bottlenecks. The left panels depict hard sweeps, with varying strengths of bottlenecks indicated on the x axis. The right panels depict a single bottleneck strength (0.05) with varying starting allele frequencies. Additional cases are summarized in Table S5.

**Figure 6**
Migration was simulated for a subset of scenarios. The high levels of migration that affected statistical performance were sufficient to prevent fixed differences at the target site. Allele frequencies at sampling for both populations are shown below each migration rate.

**Figure 7**
This heat map depicts power of the *χ_MD* statistic as a function of allele frequency threshold (minimum frequency of allele to be included in analysis) and the time (in coalescent units) since the initiation of a complete hard selective sweep. The exclusion of all but intermediate frequency alleles yields surprising power to detect very ancient sweeps.

**Figure 8**
Sample size effects on each statistic. Bottleneck strength in the high *N_e* case is 0.01 while in the low *N_e* case it is 0.025. Ending frequency of partial hard sweeps is 0.3. Starting allele frequency for the complete soft sweep is 0.001 in the high *N_e* case and 0.02 in the low *N_e* case.

**Figure 9**
For selected sweep scenarios, this heat map shows *χ_MD* power for differing window lengths and threshold proportions (the fraction of a window that must be identical between two haplotypes).

**Figure 10**
For a subset of sweep scenarios, this figure illustrates the decay of all three statistics’ power by distance (kilobases on the x axis). In the non-bottleneck complete hard sweep, the high *N_e* populations split at 0.5 time units in the past and selection (s = 0.001) began at 0.2 time units in the past. In the low *N_e* case, the populations split at 0.2 time units in the past and selection (s = 0.01) began immediately. The bottleneck strength in the high *N_e* case is 0.05 and in the low *N_e* case is 0.1. In both cases of the partial hard sweep, the ending allele frequencies were 0.5. In the complete soft sweep cases for both populations, starting frequency was 0.001. In the partial soft sweep scenario for the high *N_e* case, starting allele frequency was 0.0001 and ending allele frequency was 0.5. For the low *N_e* case, the starting frequency of the beneficial allele was 0.001 and its ending frequency was 0.5.

**Figure 11**
The power of four single population statistics was calculated for an older complete hard sweep, a partial hard sweep, a complete soft sweep, and a partial soft sweep. Note that simulation parameters differ between the high *N_e* and low *N_e* cases (Materials and Methods).

**Figure 12**
The top outlier regions and flanking windows for the empirical analysis of *χ_MD*, *XP-EHH*, and *F_ST* are shown. Above, the *χ_MD* outlier resides within a transcript region of the insulin receptor gene (*InR* alternative transcripts are shown). Below, *XP-EHH* and *F_ST* reached their maxima in the same outlier region (at adjacent windows), within a cluster of cuticle-related genes.
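The captions above report the "power" of each statistic. This page does not state how power was computed; one common convention, assumed here, is the fraction of sweep simulations whose statistic exceeds a high quantile of the same statistic under neutral simulations. The function name and the 5% false-positive rate in the sketch below are hypothetical choices, not values taken from the paper.

```python
def empirical_power(neutral_stats, sweep_stats, false_positive_rate=0.05):
    """Fraction of sweep simulations scoring above a neutral cutoff.

    The cutoff is the neutral value that leaves at most
    `false_positive_rate` of the neutral simulations strictly above it.
    """
    ranked = sorted(neutral_stats)
    n = len(ranked)
    allowed_false_positives = int(false_positive_rate * n)
    cutoff = ranked[n - allowed_false_positives - 1]
    hits = sum(s > cutoff for s in sweep_stats)
    return hits / len(sweep_stats)


# Toy values: 100 neutral replicates and 4 sweep replicates.
neutral = list(range(100))
sweeps = [200, 300, 10, 20]
power = empirical_power(neutral, sweeps)
```

Under this convention, two of the four toy sweep replicates exceed the neutral cutoff, giving an estimated power of 0.5.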

Cited by

A Variable Genetic Architecture of Melanic Evolution in Drosophila melanogaster.
Bastide H, Lange JD, Lack JB, Yassin A, Pool JE. Genetics. 2016 Nov;204(3):1307-1319. doi: 10.1534/genetics.116.192492. Epub 2016 Sep 16. PMID: 27638419. Free PMC article.
Ethanol resistance in Drosophila melanogaster has increased in parallel cold-adapted populations and shows a variable genetic architecture within and between populations.
Sprengelmeyer QD, Pool JE. Ecol Evol. 2021 Oct 20;11(21):15364-15376. doi: 10.1002/ece3.8228. eCollection 2021 Nov. PMID: 34765183. Free PMC article.
Inferring Signatures of Positive Selection in Whole-Genome Sequencing Data: An Overview of Haplotype-Based Methods.
Abondio P, Cilli E, Luiselli D. Genes (Basel). 2022 May 22;13(5):926. doi: 10.3390/genes13050926. PMID: 35627311. Free PMC article. Review.
Parallel Evolution of Cold Tolerance within Drosophila melanogaster.
Pool JE, Braun DT, Lack JB. Mol Biol Evol. 2017 Feb 1;34(2):349-360. doi: 10.1093/molbev/msw232. PMID: 27777283. Free PMC article.
The Worldwide Invasion of Drosophila suzukii Is Accompanied by a Large Increase of Transposable Element Load and a Small Number of Putatively Adaptive Insertions.
Mérel V, Gibert P, Buch I, Rodriguez Rada V, Estoup A, Gautier M, Fablet M, Boulesteix M, Vieira C. Mol Biol Evol. 2021 Sep 27;38(10):4252-4267. doi: 10.1093/molbev/msab155. PMID: 34021759. Free PMC article.
