Recombination affects accumulation of damaging and disease-associated mutations in human populations - Nature Genetics
- Awadalla, Philip
- Mon Feb 16 2015
Acknowledgements
We thank G. Gibson, E. Hatton, G. McVean, J. Novembre, C. Spencer, E. Stone and the anonymous reviewers for insightful comments on the study, and we thank the CARTaGENE participants and team for data collection. We confirm that informed consent was obtained from all subjects. We acknowledge financial support from Fonds de la Recherche en Santé du Québec (FRSQ), Génome Québec, Fonds Québécois de la Recherche sur la Nature et les Technologies (FQRNT) and the Canadian Partnership Against Cancer. J.G.H. is a Human Frontiers Postdoctoral Fellow, A.H. is an FRSQ Research Fellow, and Y.I. is a Banting Postdoctoral Fellow.
Author information
Author notes
Julie G Hussin & Philip Awadalla
Present addresses: Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK (J.G.H.) and Ontario Institute of Cancer Research, Toronto, Ontario, Canada (P.A.).
Authors and Affiliations
Department of Pediatrics, Sainte-Justine University Hospital Research Centre, Faculty of Medicine, University of Montreal, Montreal, Quebec, Canada
Julie G Hussin, Alan Hodgkinson, Youssef Idaghdour, Jean-Christophe Grenier, Elias Gbeha, Elodie Hip-Ki & Philip Awadalla
CARTaGENE Project, Sainte-Justine University Hospital, Montreal, Quebec, Canada
Julie G Hussin, Alan Hodgkinson, Youssef Idaghdour, Jean-Philippe Goulet & Philip Awadalla
Authors
- Julie G Hussin
- Alan Hodgkinson
- Youssef Idaghdour
- Jean-Christophe Grenier
- Jean-Philippe Goulet
- Elias Gbeha
- Elodie Hip-Ki
- Philip Awadalla
Contributions
J.G.H. designed the study, performed quality control on genotyping and sequencing data, performed bioinformatics and statistical analyses and wrote the manuscript. A.H. performed quality control on genotyping and sequencing data and wrote the manuscript. E.G. and E.H.-K. processed samples for sequencing and genotyping. Y.I., J.-C.G. and J.-P.G. preprocessed the genomic data and performed quality control and bioinformatics analyses. P.A. provided samples, designed the study and wrote the manuscript.
Corresponding author
Correspondence to Philip Awadalla.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Integrated supplementary information
Supplementary Figure 1 Comparison of the levels of diversity between coldspots (CS) and highly recombining regions (HRRs) for SNPs in the FCQ data set.
Odds ratios (ORs) are computed to compare SNP density between coldspots and HRRs for all SNPs (red) and for SNPs divided into different allele frequency classes (black). OR < 1 means that diversity is greater in HRRs than in coldspots. We confirm the lack of diversity in coldspots relative to HRRs, in line with previous evidence that diversity is reduced in regions with low recombination rates owing to background selection. The effect is seen for all frequency classes and does not differ significantly between classes of SNPs with MAF > 0.05. The class of variants with MAF < 0.05 shows a smaller effect than the other frequency classes.
Supplementary Figure 2 Differential mutational burden between coldspots (CS) and highly recombining regions (HRRs) in a genomic subset of the data.
Differential burden is computed using odds ratios (ORs), representing the relative enrichment of a category of variants compared to all variants in coldspots versus HRRs for (a) RNA and exome sequencing of French Canadians (FC) and for exome sequencing of (b) Europeans (EUR), (c) Asians (ASN) and (d) Africans (AFR) from the 1000 Genomes Project. Variants are categorized as rare (MAF < 0.01 in a population), nonsynonymous (missense and nonsense) and damaging (as predicted by both SIFT and PolyPhen-2). Highly covered exons (HC exons) have coverage above 20× for each position within the exons in all data sets. The set of exons analyzed does not affect the results, and the exome data set in French Canadians (FC) replicates the results found in RNA sequencing.
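As a rough illustration of the odds-ratio comparison described in this caption, the sketch below computes an OR and an approximate 95% confidence interval from a 2×2 table of variant counts. It assumes the table contrasts variants inside a category (for example, rare or damaging) with variants outside it, in coldspots versus HRRs, and it uses the standard log-based (Woolf) interval. The function name, argument names and counts are hypothetical, and the exact construction used in the study may differ.

```python
import math


def odds_ratio(cat_cs, other_cs, cat_hrr, other_hrr, z=1.96):
    """Odds ratio for a variant category in coldspots (CS) versus HRRs.

    cat_cs / cat_hrr:     variants in the category in CS / HRRs
    other_cs / other_hrr: variants outside the category in CS / HRRs
    Returns (OR, lower, upper) using the log-based (Woolf) interval.
    All names and the table layout are illustrative assumptions.
    """
    counts = (cat_cs, other_cs, cat_hrr, other_hrr)
    if min(counts) <= 0:
        raise ValueError("all four counts must be positive")
    # OR > 1 means the category is enriched in coldspots relative to HRRs.
    estimate = (cat_cs * other_hrr) / (other_cs * cat_hrr)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(sum(1.0 / c for c in counts))
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper


# Made-up counts, purely for illustration.
estimate, lower, upper = odds_ratio(30, 70, 20, 80)
print(f"OR = {estimate:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Under this convention, an interval that excludes 1 would suggest a significant difference in burden between the two classes of regions.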
Supplementary Figure 3 Impact of minor allele frequency (MAF) on odds ratios between coldspots (CS) and highly recombining regions (HRRs).
Impact of MAF on the effects for functional mutations in the French-Canadian (FCQ) RNA sequencing data set (a,b) and for private and shared variants in (c) Europeans (EUR) and (d) Africans (AFR). (a) The enrichment of nonsynonymous and damaging mutations in coldspots remains significant for MAF < 0.05, indicating that the excess of rare variants in coldspots does not drive the effect for nonsynonymous and damaging variants. (b) Neutral variants with MAF < 0.05 are enriched in coldspots in comparison to more frequent variants, indicating that neutral diversity contributes to the excess of rare variants in coldspots. (c,d) The enrichment of private mutations in coldspots and of shared mutations in HRRs remains significant for MAF < 0.1 in both EUR and AFR, indicating that these effects are not driven only by differences in allele frequency between shared and private variants.
Supplementary Figure 4 Distribution of conservation across exons measured by GERP scores in coldspots (CS) and highly recombining regions (HRRs).
(a) Mean GERP score per exon. (b) Proportion of constrained positions (GERP > 3) per exon. (c) Scatter plot of mean GERP by the proportion of constrained positions for all exons. (d) For each measure of conservation per exon, exons were grouped into four categories of equal size. Only exons that were concordant between the two classifications were kept in analyses within conservation categories, to minimize the effect of outliers for one of the two measures. Characteristics of exons in these four conservation categories in terms of average GERP score per base pair and number of constrained sites per base pair (GERP > 3) are reported in Supplementary Table 7.
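The grouping in panel (d) can be sketched as follows: each exon is ranked into one of four equal-sized groups separately for each conservation measure, and only exons that fall in the same group under both measures are retained. This is a minimal sketch under assumed inputs (two parallel lists of per-exon values). The helper names are hypothetical, and ties are broken by input order, which may not match the handling used in the study.

```python
def quartile_labels(values):
    """Assign each value a group index 0..3 by rank (four near-equal groups)."""
    n = len(values)
    # Indices sorted by value; Python's sort is stable, so ties keep input order.
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = rank * 4 // n
    return labels


def concordant_exons(mean_gerp, prop_constrained):
    """Return {exon_index: group} for exons placed in the same group by both measures."""
    by_mean = quartile_labels(mean_gerp)
    by_prop = quartile_labels(prop_constrained)
    return {i: a for i, (a, b) in enumerate(zip(by_mean, by_prop)) if a == b}


# Made-up per-exon values, purely for illustration.
mean_gerp = [0.1, 1.0, 2.0, 4.0, 0.5, 1.5, 3.0, 5.0]
prop_constrained = [0.0, 0.2, 0.9, 0.8, 0.1, 0.3, 0.5, 0.4]
print(concordant_exons(mean_gerp, prop_constrained))
```

In this toy input, exons 2 and 7 are dropped because the two measures place them in different groups, which is the filtering the caption describes for limiting the influence of outliers.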
Supplementary Figure 5 Differential mutational burden in conservation categories.
Differential mutational burden between coldspots (CS) and highly recombining regions (HRRs) for rare (MAF < 0.01), nonsynonymous (nonsyn), damaging and constrained variants in (a) French Canadians (FCQ) and (b) Europeans (EUR) for highly covered (HC) exons and in (c) Asians (ASN) and (d) Africans (AFR) for the whole exome. Results for EUR in the whole exome are presented in Figure 3a. Conservation categories are described in Supplementary Table 7. Results for ASN and AFR in HC exons (data not shown) are similar to EUR results. For all populations and exon data sets, the medium high and high conservation categories always show a significant enrichment for potentially deleterious mutations in coldspots.
Supplementary Figure 7 Additional simulations testing the effect of recombination rates and phasing.
(a) Distribution of effects for initial and modified coldspot (CS) and highly recombining region (HRR) rates in simulations, with CS/HRR recombination rates matching the rates in the CEU and YRI maps, respectively (Supplementary Note, section 4). The distributions are significantly different, but the shift in the mean is very weak and unlikely to cause the large differences observed between populations in Figure 5. (b,c) Effect of phasing on the distribution of the number of haplotypes with two or more rare mutations (MAF < 0.01) in real haplotypes and phased haplotypes on chunks of the same length (25 kb) in simulated coldspots and HRRs. (b) The number of haplotypes with two mutations is reduced by statistical phasing with SHAPEIT2, but no significant difference between coldspots and HRRs was found in this phasing bias.
Supplementary Figure 8 Effects for private and shared variants between African subpopulations.
Comparison of closely related populations of African ancestry. Odds ratios comparing coldspots (CS) and highly recombining regions (HRRs) are computed on the basis of private and shared variants called in 88 Yoruba in Ibadan from Nigeria (YRI), 97 Luhya in Webuye from Kenya (LWK) and 61 Americans of African ancestry (ASW).
Supplementary Figure 9 Per-individual differential mutational burden across populations.
Comparison of proportions of (a) rare and (b) nonsynonymous mutations between coldspots (CS) and highly recombining regions (HRRs) in French Canadians (FCQ), Europeans (EUR), Asians (ASN) and Africans (AFR). For each individual (ordered by their OR values), the relative proportions of rare or nonsynonymous mutations in coldspots and HRRs are shown, computed by dividing coldspot and HRR proportions by genome-wide proportions of rare or nonsynonymous variants within each individual, to adjust for differences across individuals. The larger symbols represent individuals with the minimum and maximum OR values in each population. Ticks at the bottom of the plots show individual OR values significantly different from 1 (two-tailed P < 0.05). The French-Canadian data used are the RNA sequencing data set (Supplementary Note, section 2); replication with exome data of 96 French Canadians is presented in Supplementary Figure 11.
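The per-individual adjustment described in this caption amounts to dividing the proportion of category variants in each class of region by the same individual's genome-wide proportion. The sketch below illustrates that arithmetic under assumed per-individual counts. The function and argument names are hypothetical and the numbers are invented.

```python
def relative_proportions(cat_cs, all_cs, cat_hrr, all_hrr, cat_gw, all_gw):
    """Relative proportions of a variant category in coldspots (CS) and HRRs.

    Each proportion (category variants / all variants in the region class) is
    divided by the individual's genome-wide proportion, so that individuals
    with different overall burdens are placed on a comparable scale.
    All names are illustrative assumptions.
    """
    if min(all_cs, all_hrr, all_gw) <= 0 or cat_gw <= 0:
        raise ValueError("totals and the genome-wide category count must be positive")
    genome_wide = cat_gw / all_gw
    rel_cs = (cat_cs / all_cs) / genome_wide
    rel_hrr = (cat_hrr / all_hrr) / genome_wide
    return rel_cs, rel_hrr


# Invented counts for one individual: 12 of 100 coldspot variants are rare,
# 8 of 100 HRR variants are rare, and 100 of 1,000 variants genome-wide are rare.
rel_cs, rel_hrr = relative_proportions(12, 100, 8, 100, 100, 1000)
print(f"relative proportion in CS: {rel_cs:.2f}, in HRRs: {rel_hrr:.2f}")
```

A value above 1 indicates that the category is over-represented in that class of region relative to the individual's genome-wide level, and a value below 1 indicates under-representation.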
Supplementary Figure 10 Per-individual differential mutational burden across European populations for private variants.
Distribution of odds ratios (ORs) per individual comparing proportions of private variants between coldspots (CS) and highly recombining regions (HRRs) in closely related populations of western European ancestry. ORs are computed on the basis of private variants called in the exome sequencing data set of 96 French Canadians (FCX), 89 British individuals (GBR), 93 Finns (FIN), 98 Italians from Tuscany (TSI) and 85 European Americans (CEU). The left panel shows the frequencies of individual ORs in each population. The right panel shows, for each individual (ordered by their OR values), the relative proportions of private mutations in coldspots and HRRs, computed by dividing coldspot and HRR proportions by genome-wide proportions of private variants within each individual, to adjust for differences across individuals.
Supplementary Figure 11 Per-individual differential mutational burden across populations with FCQ exome sequencing data.
Distribution of odds ratios (ORs) per individual comparing proportions of rare (a,b) and nonsynonymous (c,d) mutations between coldspots (CS) and highly recombining regions (HRRs). For Europeans (EUR), Asians (ASN) and Africans (AFR), the results are the same as shown in Figure 4 and Supplementary Figure 9, whereas French-Canadian (FCQ) results are computed using the exome sequencing data set from 96 individuals. Further descriptions of the plots are found in Figure 4 and Supplementary Figure 9.
Supplementary Figure 12 Quality checks on per-individual differential mutational burden across populations.
Distribution of odds ratios (ORs) per individual in French Canadians (FCQ), Europeans (EUR), Asians (ASN) and Africans (AFR), comparing proportions of (a) nonsynonymous variants after modifying annotations in the 1000 Genomes Project populations (see the Supplementary Note, section 4.1) and (b) nonsynonymous and (c) rare variants, after excluding mutations that are fixed in one population but still segregating in others, between coldspots (CS) and highly recombining regions (HRRs). The differences between populations observed in Figure 5 remain the same after correcting for these potential technical differences.
Supplementary Figure 13 Population structure in regional populations of Quebec.
Sampling from the CARTaGENE Project includes individuals from the Montreal area (MTL), Quebec City (QCC) and the Saguenay region (SAG). The regional origin of individuals was confirmed by a principal-component analysis of genetic diversity in FCQ individuals compared with genetic diversity within the Reference Panel of Quebec (RPQ) and in the CEU population from HapMap 3. Other populations included in the RPQ are GAS (Gaspesia region), ACA (Acadians), LOY (Loyalists) and CNO (North Shore region).
About this article
Cite this article
Hussin, J., Hodgkinson, A., Idaghdour, Y. et al. Recombination affects accumulation of damaging and disease-associated mutations in human populations. Nat Genet 47, 400–404 (2015). https://doi.org/10.1038/ng.3216
Received: 09 May 2014
Accepted: 14 January 2015
Published: 16 February 2015
Issue Date: April 2015
DOI: https://doi.org/10.1038/ng.3216