sampling distribution: Definition from Answers.com

Wikipedia: sampling distribution

In statistics, a sampling distribution is the probability distribution, under repeated sampling of the population, of a given statistic (a numerical quantity calculated from the data values in a sample).

The formula for the sampling distribution depends on the distribution of the population, the statistic being considered, and the sample size used. More precisely, the sampling distribution is the distribution of the statistic over all possible samples of a given size, not just "under repeated sampling".
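The "all possible samples" view can be made concrete for a very small population. The following is a minimal sketch, assuming a hypothetical five-value population and samples of size 2 drawn without replacement; the function name `exact_sampling_distribution` is introduced here purely for illustration.

```python
from itertools import combinations
from statistics import mean, pvariance


def exact_sampling_distribution(population, n):
    """Return the sample mean of every possible size-n sample
    drawn without replacement from the population."""
    return [mean(sample) for sample in combinations(population, n)]


# Hypothetical toy population, chosen only to keep the enumeration small.
population = [2, 4, 6, 8, 10]
sample_means = exact_sampling_distribution(population, 2)

print(len(sample_means))        # number of possible samples, C(5, 2) = 10
print(mean(sample_means))       # centre of the sampling distribution
print(pvariance(sample_means))  # spread of the sampling distribution
```

For this toy population the ten sample means average to the population mean, 6, and their variance is 3, which agrees with the finite-population expression (N - n)/(N - 1) × σ²/n = (3/4) × (8/2).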

For example, consider a very large normal population (one that follows the so-called bell curve). Assume we repeatedly take samples of a given size from the population and calculate the sample mean (\bar x, the arithmetic mean of the data values) for each sample. Different samples will lead to different sample means. The distribution of these means is the "sampling distribution of the sample mean" (for the given sample size). This distribution will be normal since the population was normal. (According to the central limit theorem, if the population is not normal but "sufficiently well behaved", the sampling distribution of the sample mean will still be approximately normal provided the sample size is sufficiently large.)
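The repeated-sampling experiment described above can be approximated by simulation. This is a rough sketch rather than a definitive implementation: the population parameters (mean 50, standard deviation 10), the sample size of 25, and the helper name `simulate_sample_means` are all assumptions made for illustration.

```python
import random
from statistics import fmean, pstdev


def simulate_sample_means(mu, sigma, n, repeats, seed=0):
    """Draw `repeats` samples of size n from N(mu, sigma^2)
    and return the mean of each sample."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(repeats)
    ]


# Assumed illustrative values, not taken from any real data set.
means = simulate_sample_means(mu=50.0, sigma=10.0, n=25, repeats=5000)

print(fmean(means))   # should be close to the population mean, 50
print(pstdev(means))  # should be close to 10 / sqrt(25) = 2
```

A histogram of `means` would approximate the sampling distribution of the sample mean for this sample size.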

The mean of the sampling distribution of a statistic is the expected value of that statistic. For the case where the statistic is the sample mean, it equals the population mean μ:

\mu_{\bar x} = \mu

The standard deviation of the sampling distribution of the statistic is referred to as the standard error of that quantity. For the case where the statistic is the sample mean, the standard error is:

\sigma_{\bar x} = \frac{\sigma}{\sqrt{n}}

where σ is the standard deviation of the population distribution of that quantity and n is the sample size (the number of items in the sample).
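Both results for the sample mean follow in a few lines if one assumes that the observations X_1, ..., X_n are independent draws, each with mean μ and variance σ². The independence assumption is stated here for the derivation; it is not spelled out in the text above.

```latex
\operatorname{E}[\bar x]
  = \operatorname{E}\!\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right]
  = \frac{1}{n}\sum_{i=1}^{n} \operatorname{E}[X_i]
  = \frac{n\mu}{n}
  = \mu

\operatorname{Var}(\bar x)
  = \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(X_i)
  = \frac{n\sigma^2}{n^2}
  = \frac{\sigma^2}{n}
\quad\Longrightarrow\quad
\sigma_{\bar x} = \sqrt{\operatorname{Var}(\bar x)} = \frac{\sigma}{\sqrt{n}}
```

The variance step uses independence: the variance of a sum of independent variables is the sum of their variances.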

A very important implication of this formula is that the sample size must be quadrupled (4×) to halve the standard error. When designing statistical studies where cost is a factor, this can play a role in weighing cost-benefit tradeoffs.
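The square-root relationship can be checked with a short calculation. The sketch below assumes a known population standard deviation; the function names `standard_error` and `required_sample_size` are hypothetical helpers, not part of any library.

```python
import math


def standard_error(sigma, n):
    """Standard error of the sample mean for population sd sigma and sample size n."""
    return sigma / math.sqrt(n)


def required_sample_size(sigma, target_se):
    """Smallest n whose standard error does not exceed target_se.

    Obtained by solving sigma / sqrt(n) <= target_se for n.
    """
    return math.ceil((sigma / target_se) ** 2)


sigma = 10.0  # assumed population standard deviation, for illustration only

print(standard_error(sigma, 25))         # standard error with n = 25
print(standard_error(sigma, 100))        # n quadrupled: standard error halves
print(required_sample_size(sigma, 2.0))  # n needed for a standard error of 2
print(required_sample_size(sigma, 1.0))  # halving the target quadruples n
```

Because n enters through its square root, each further halving of the standard error costs another factor of four in sample size.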

Alternatively, consider the sample median from the same population. It has a different sampling distribution, which is generally not exactly normal (although it is close to normal when the sample size is large).
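The difference between the two statistics can be seen by simulating both on the same samples. This is an illustrative sketch assuming a standard normal population and samples of size 101; `simulate_statistic` is a hypothetical helper. For large samples from a normal population, the standard error of the median is known to be roughly sqrt(π/2) ≈ 1.25 times that of the mean, so the ratio computed below should land near that value.

```python
import random
from statistics import fmean, median, pstdev


def simulate_statistic(statistic, n, repeats, seed=0):
    """Apply `statistic` to `repeats` samples of size n from N(0, 1)."""
    rng = random.Random(seed)  # same seed gives the same samples for each statistic
    return [
        statistic([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(repeats)
    ]


means = simulate_statistic(fmean, n=101, repeats=4000)
medians = simulate_statistic(median, n=101, repeats=4000)

# Ratio of the two empirical standard errors.
ratio = pstdev(medians) / pstdev(means)
print(ratio)
```

A ratio above 1 indicates that, for a normal population, the median varies more from sample to sample than the mean does.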

Examples

Population | Sample statistic | Sampling distribution
Infinite, X \sim N(\mu, \sigma^2) | Sample mean, \bar X | \bar X \sim N \left(\mu, \frac{\sigma^2}{n}\right)
Finite (size N), X \sim N(\mu, \sigma^2) | Sample mean, \bar X | \bar X \sim N \left(\mu, \frac{N - n}{N - 1} \times \frac{\sigma^2}{n}\right)
Infinite, X \sim \operatorname{Bernoulli}(p) | Sample proportion, \bar p | n \bar p \sim \operatorname{Binomial}(n, p)
Infinite, X_1 \sim N(\mu_1, \sigma_1^2), X_2 \sim N(\mu_2, \sigma_2^2) | Sample difference between means, \bar X_1 - \bar X_2 | \bar X_1 - \bar X_2 \sim N \left(\mu_1 - \mu_2, \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)
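The last example, the difference between two sample means, can be checked by simulation. The sketch below assumes two independent normal populations with made-up parameters; `simulate_mean_differences` is a hypothetical helper introduced for illustration.

```python
import random
from statistics import fmean, pvariance


def simulate_mean_differences(mu1, sigma1, n1, mu2, sigma2, n2, repeats, seed=0):
    """Simulate the difference between two independent sample means."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(repeats):
        xbar1 = fmean([rng.gauss(mu1, sigma1) for _ in range(n1)])
        xbar2 = fmean([rng.gauss(mu2, sigma2) for _ in range(n2)])
        diffs.append(xbar1 - xbar2)
    return diffs


# Assumed illustrative parameters for the two populations.
diffs = simulate_mean_differences(10.0, 3.0, 30, 7.0, 4.0, 40, repeats=5000)

# Variance from the formula sigma_1^2 / n_1 + sigma_2^2 / n_2.
theoretical_variance = 3.0**2 / 30 + 4.0**2 / 40

print(fmean(diffs))          # should be close to mu1 - mu2 = 3
print(pvariance(diffs))      # should be close to the theoretical variance
print(theoretical_variance)
```

The empirical mean and variance of `diffs` should sit close to μ_1 - μ_2 and σ_1²/n_1 + σ_2²/n_2, respectively, with the gap shrinking as `repeats` grows.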

This entry is from Wikipedia, the leading user-contributed encyclopedia. It may not have been reviewed by professional editors.