Maximum-likelihood methods for detecting recent positive selection and localizing the selected site in the genome
Comparative Study
Haipeng Li et al. Genetics. 2005 Sep.
Abstract
Two maximum-likelihood methods are proposed for detecting recent, strongly positive selection and for localizing the target of selection along a recombining chromosome. The methods utilize the compact mutation frequency spectrum at multiple neutral loci that are partially linked to the selected site. Using simulated data, we show that the power of the tests lies between 80 and 98% in most cases, and the false positive rate could be as low as approximately 10% when the number of sampled marker loci is sufficiently large (≥ 20). The confidence interval around the estimated position of selection is reasonably narrow. The methods are applied to X chromosome data of Drosophila melanogaster from a European and an African population. Evidence of selection was found for both populations (including a selective sweep that was shared between both populations).
Figures

High rejection rate of genealogies when a full mutation frequency spectrum is considered. One mutation (A → T) of size 3 was observed in four sequences. There are two types of rooted topologies for four sequences. Of the two types, only the second one can explain the data because it has a branch of size 3. Therefore, 33.3% (Tajima 1983) of the simulated random genealogies are inconsistent with the observed data.
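
The 33.3% figure in this caption can be checked by brute-force enumeration. The sketch below is not taken from the paper: it assumes the standard neutral coalescent, in which every pair of lineages is equally likely to coalesce at each step, so all coalescence orders ("labeled histories") are equally probable. The function names `labeled_histories` and `fraction_without_branch` are illustrative only.

```python
from fractions import Fraction
from itertools import combinations


def labeled_histories(lineages):
    """Yield, for each coalescence order, the list of clades it creates."""
    if len(lineages) == 1:
        yield []
        return
    # Assumption: every pair of current lineages is equally likely to merge.
    for a, b in combinations(lineages, 2):
        merged = a | b
        rest = [x for x in lineages if x != a and x != b]
        for later in labeled_histories(rest + [merged]):
            yield [merged] + later


def fraction_without_branch(n, size):
    """Fraction of histories for n sequences with no clade of the given size."""
    start = [frozenset([i]) for i in range(n)]
    histories = list(labeled_histories(start))
    # A mutation of size k needs a branch that subtends exactly k sequences.
    missing = sum(1 for h in histories if all(len(c) != size for c in h))
    return Fraction(missing, len(histories))


print(fraction_without_branch(4, 3))  # 1/3
```

For four sequences there are 18 equally likely histories. Six of them join the two remaining single lineages in the second step, giving the symmetric topology that has no branch of size 3, which matches the one-third rejection rate quoted in the caption.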

The effect of positive selection with different strength. The average level of nucleotide variation is plotted as a function of the distance (in kilobases) from the selected site. N = 100,000 and n = 10.

Illustration of the log-likelihood curve for one simulated data set with a positive selection event occurring at 100 kb. The estimated position of the selected site is at 105 kb. N = 100,000, n = 10, m = 10, θk = 5, K = 1000, , and . The positions of the neutral loci are shown at the bottom.

The distribution of the estimated position of selection (from 1000 simulated data sets). Parameter values are the same as in Figure 3, and the positions of the m neutral loci in 1000 simulated data sets have been chosen according to method LPS1.

Standard deviation of the estimated position of the selected site for LPS1 and LPS2 and different numbers of loci. Parameter values are the same as in Figure 3.

Standard deviation of the estimated position of the selected site for different sequencing strategies (m vs. length of marker locus). The solid bars represent the cases with more loci and a shorter sequence per locus. The open bars represent the alternative strategy (such that the sequencing load in both cases is identical). Parameter values are the same as in Figure 3. LPS2 is used.

Standard deviation of the estimated position of the selected site for different sequencing strategies (m vs. n). The solid bars represent the cases with fewer sampled chromosomes and more loci and the open bars the cases with more sampled chromosomes and fewer loci. Parameter values are the same as in Figure 3. LPS2 is used.

Genetic diversity of Drosophila melanogaster between African and European populations. The dashed lines are the expected θ-values for each population. (Top) Fragments are denoted according to their identification numbers (Glinka et al. 2003). (Bottom) Their positions on the X chromosome are shown. The positions of selected sites estimated by L1 and L2 and their 95% confidence intervals are also presented. For the African population, only the L2 method suggests the occurrence of a sweep.
Similar articles
- Schöfl G, Catania F, Nolte V, Schlötterer C. Genetics. 2005 Aug;170(4):1701-9. doi: 10.1534/genetics.104.037507. Epub 2005 Jun 3. PMID: 15937137. Free PMC article.
- Evidence of gene conversion associated with a selective sweep in Drosophila melanogaster. Glinka S, De Lorenzo D, Stephan W. Mol Biol Evol. 2006 Oct;23(10):1869-78. doi: 10.1093/molbev/msl069. Epub 2006 Jul 25. PMID: 16868022.
- A population genomic approach to map recent positive selection in model species. Pavlidis P, Hutter S, Stephan W. Mol Ecol. 2008 Aug;17(16):3585-98. doi: 10.1111/j.1365-294X.2008.03852.x. PMID: 18627454. Review.
- The recent demographic and adaptive history of Drosophila melanogaster. Stephan W, Li H. Heredity (Edinb). 2007 Feb;98(2):65-8. doi: 10.1038/sj.hdy.6800901. Epub 2006 Sep 27. PMID: 17006533. Review.
- Ometto L, Glinka S, De Lorenzo D, Stephan W. Mol Biol Evol. 2005 Oct;22(10):2119-30. doi: 10.1093/molbev/msi207. Epub 2005 Jun 29. PMID: 15987874.
Cited by
- Inferences of demography and selection in an African population of Drosophila melanogaster. Singh ND, Jensen JD, Clark AG, Aquadro CF. Genetics. 2013 Jan;193(1):215-28. doi: 10.1534/genetics.112.145318. Epub 2012 Oct 26. PMID: 23105013. Free PMC article.
- Balancing selection and its effects on sequences in nearby genome regions. Charlesworth D. PLoS Genet. 2006 Apr;2(4):e64. doi: 10.1371/journal.pgen.0020064. PMID: 16683038. Free PMC article. Review.
- Approximating genealogies for partially linked neutral loci under a selective sweep. Pfaffelhuber P, Studeny A. J Math Biol. 2007 Sep;55(3):299-330. doi: 10.1007/s00285-007-0085-7. Epub 2007 Mar 30. PMID: 17396267.
- A framework for evolutionary systems biology. Loewe L. BMC Syst Biol. 2009 Feb 24;3:27. doi: 10.1186/1752-0509-3-27. PMID: 19239699. Free PMC article.
- Non-neutral processes drive the nucleotide composition of non-coding sequences in Drosophila. Haddrill PR, Charlesworth B. Biol Lett. 2008 Aug 23;4(4):438-41. doi: 10.1098/rsbl.2008.0174. PMID: 18505714. Free PMC article.