Comparative Study

Maximum-likelihood methods for detecting recent positive selection and localizing the selected site in the genome

Haipeng Li et al. Genetics. 2005 Sep.

Abstract

Two maximum-likelihood methods are proposed for detecting recent, strongly positive selection and for localizing the target of selection along a recombining chromosome. The methods utilize the compact mutation frequency spectrum at multiple neutral loci that are partially linked to the selected site. Using simulated data, we show that the power of the tests lies between 80 and 98% in most cases, and the false positive rate could be as low as approximately 10% when the number of sampled marker loci is sufficiently large (≥ 20). The confidence interval around the estimated position of selection is reasonably narrow. The methods are applied to X chromosome data of Drosophila melanogaster from a European and an African population. Evidence of selection was found for both populations (including a selective sweep that was shared between both populations).
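As a rough illustration of the kind of summary the methods build on, the sketch below tabulates an ordinary (unfolded) mutation frequency spectrum for one locus. It is not the authors' implementation: the abstract does not define the "compact" spectrum, so the full spectrum is shown instead. The function name `site_frequency_spectrum` and the 0/1 coding (0 = ancestral, 1 = derived) are assumptions made for this example.

```python
def site_frequency_spectrum(haplotypes):
    """Count segregating sites by mutation size (number of derived copies).

    haplotypes: list of equal-length lists of 0/1 values, one per sampled
    chromosome (hypothetical coding: 0 = ancestral, 1 = derived).
    Returns a list whose entry i-1 is the number of mutations of size i.
    """
    n = len(haplotypes)
    spectrum = [0] * (n - 1)
    # zip(*haplotypes) iterates over sites (columns) instead of chromosomes.
    for site in zip(*haplotypes):
        size = sum(site)
        if 0 < size < n:  # skip sites that are not polymorphic in the sample
            spectrum[size - 1] += 1
    return spectrum


# Made-up sample: four chromosomes, four sites.
sample = [
    [0, 1, 0, 1],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]
print(site_frequency_spectrum(sample))  # [1, 1, 1]
```

In this toy sample there is one mutation each of size 1, 2, and 3; the third site is monomorphic and is ignored.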

Figures

Figure 1.

High rejection rate of genealogies when a full mutation frequency spectrum is considered. One mutation (A → T) of size 3 was observed in four sequences. There are two types of rooted topologies for four sequences. Of the two types, only the second one can explain the data because it has a branch of size 3. Therefore, 33.3% (Tajima 1983) of the simulated random genealogies are inconsistent with the observed data.
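The 33.3% figure can be checked with a short exact enumeration. The sketch below assumes the standard neutral coalescent, in which every pair of lineages is equally likely to merge at each step, and computes the probability that a random topology contains a branch with a given number of descendant sequences. The function name `prob_branch_of_size` is hypothetical and chosen only for this example.

```python
from fractions import Fraction
from itertools import combinations


def prob_branch_of_size(n, target):
    """Probability that a random coalescent topology on n sequences
    contains a branch subtending exactly `target` sequences (target < n)."""

    def recurse(sizes, seen):
        # sizes: number of descendant sequences under each current lineage.
        if len(sizes) == 1:
            return Fraction(1) if seen else Fraction(0)
        pairs = list(combinations(range(len(sizes)), 2))
        total = Fraction(0)
        for i, j in pairs:
            merged = sizes[i] + sizes[j]
            rest = [s for k, s in enumerate(sizes) if k not in (i, j)]
            # Each pair is chosen with equal probability 1 / len(pairs).
            total += Fraction(1, len(pairs)) * recurse(
                rest + [merged], seen or merged == target
            )
        return total

    # Branches of size 1 (the external branches) are always present.
    return recurse([1] * n, target == 1)


consistent = prob_branch_of_size(4, 3)
print(consistent)      # 2/3
print(1 - consistent)  # 1/3
```

For four sequences, two-thirds of random topologies have a branch of size 3, so the remaining one-third (33.3%) cannot explain a mutation of size 3.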

Figure 2.

The effect of positive selection of different strengths. The average level of nucleotide variation is plotted as a function of the distance (in kilobases) from the selected site. N = 100,000 and n = 10.

Figure 3.

Illustration of the log-likelihood curve for one simulated data set with a positive selection event occurring at 100 kb. The estimated position of the selected site is at 105 kb. N = 100,000, n = 10, m = 10, θk = 5, K = 1000, formula image, and formula image. The positions of the neutral loci are shown at the bottom.

Figure 4.

The distribution of the estimated position of selection (from 1000 simulated data sets). Parameter values are the same as in Figure 3, and the positions of the m neutral loci in 1000 simulated data sets have been chosen according to method LPS1.

Figure 5.

Standard deviation of the estimated position of the selected site for LPS1 and LPS2 and different numbers of loci. Parameter values are the same as in Figure 3.

Figure 6.

Standard deviation of the estimated position of the selected site for different sequencing strategies (m vs. length of marker locus). The solid bars represent the cases with more loci and a shorter sequence per locus. The open bars represent the alternative strategy (such that the sequencing load in both cases is identical). Parameter values are the same as in Figure 3. LPS2 is used.

Figure 7.

Standard deviation of the estimated position of the selected site for different sequencing strategies (m vs. n). The solid bars represent the cases with fewer sampled chromosomes and more loci and the open bars the cases with more sampled chromosomes and fewer loci. Parameter values are the same as in Figure 3. LPS2 is used.

Figure 8.

Genetic diversity of Drosophila melanogaster between African and European populations. The dashed lines are the expected θ-values for each population. (Top) Fragments are denoted according to their identification numbers (Glinka et al. 2003). (Bottom) Their positions on the X chromosome are shown. The positions of selected sites estimated by L1 and L2 and their 95% confidence intervals are also presented. For the African population, only the L2 method suggests the occurrence of a sweep.

References

    1. Braverman, J. M., R. R. Hudson, N. L. Kaplan, C. H. Langley and W. Stephan, 1995. The hitchhiking effect on the site frequency spectrum of DNA polymorphisms. Genetics 140: 783–796.
    2. Fay, J. C., and C.-I Wu, 2000. Hitchhiking under positive Darwinian selection. Genetics 155: 1405–1413.
    3. Felsenstein, J., 1992. Estimating effective population size from samples of sequences: a bootstrap Monte Carlo integration method. Genet. Res. 60: 209–220.
    4. Fu, Y.-X., and W.-H. Li, 1993. Statistical tests of neutrality of mutations. Genetics 133: 693–709.
    5. Glinka, S., L. Ometto, S. Mousset, W. Stephan and D. D. Lorenzo, 2003. Demography and natural selection have shaped genetic variation in Drosophila melanogaster: a multilocus approach. Genetics 165: 1269–1278.