
chi-square distribution: Definition and Much More from Answers.com

Wed Jul 01 2015

Wikipedia: chi-square distribution

This article is about the mathematics of the chi-square distribution. For its uses in statistics, see chi-square test.

chi-square

Probability density function
Cumulative distribution function
Parameters	$k > 0\,$ degrees of freedom
Support	$x \in [0; +\infty)\,$
Probability density function (pdf)	$\frac{(1/2)^{k/2}}{\Gamma(k/2)} x^{k/2 - 1} e^{-x/2}\,$
Cumulative distribution function (cdf)	$\frac{\gamma(k/2,x/2)}{\Gamma(k/2)}\,$
Mean	$k\,$
Median	approximately $k-2/3\,$
Mode	$k-2\,$ if $k\geq 2\,$
Variance	$2\,k\,$
Skewness	$\sqrt{8/k}\,$
Excess kurtosis	$12/k\,$
Entropy	$\frac{k}{2}\!+\!\ln(2\Gamma(k/2))\!+\!(1\!-\!k/2)\psi(k/2)$
Moment-generating function (mgf)	$(1-2\,t)^{-k/2}$ for $2\,t<1\,$
Characteristic function	$(1-2\,i\,t)^{-k/2}\,$

In probability theory and statistics, the chi-square distribution (also chi-squared or χ² distribution) is one of the most widely used theoretical probability distributions in inferential statistics, e.g. in statistical significance tests. It is useful because, under reasonable assumptions, easily calculated quantities can be proven to have distributions that approximate the chi-square distribution if the null hypothesis is true.

If X_i are k independent, normally distributed random variables with mean 0 and variance 1, then the random variable

$Q = \sum_{i=1}^k X_i^2$

is distributed according to the chi-square distribution. This is usually written

$Q\sim\chi^2_k.\,$

The chi-square distribution has one parameter: k, a positive integer that specifies the number of degrees of freedom (i.e. the number of X_i).
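The definition can be checked by simulation. The following is a minimal sketch, not a reference implementation: the function names, the seed, and the sample size are illustrative choices. It draws Q as a sum of k squared standard normal variates and compares the sample mean and variance with the theoretical values k and 2k.

```python
import random


def sample_chi_square(k, rng):
    """Draw one chi-square(k) variate as a sum of k squared standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


def sample_mean_and_variance(k, n, seed=0):
    """Return the sample mean and unbiased sample variance of n draws."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    draws = [sample_chi_square(k, rng) for _ in range(n)]
    mean = sum(draws) / n
    variance = sum((q - mean) ** 2 for q in draws) / (n - 1)
    return mean, variance


# For k = 5 the theoretical mean is 5 and the theoretical variance is 10.
mean, variance = sample_mean_and_variance(k=5, n=20000)
print(mean, variance)
```

With 20000 draws the two printed numbers should land close to 5 and 10; the exact digits depend on the seed.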

The chi-square distribution is a special case of the gamma distribution.
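To make the connection explicit, write the gamma density with shape α and scale θ (these two symbols are introduced here for illustration and are not used elsewhere in this article):

$g(x;\alpha,\theta) = \frac{x^{\alpha-1} e^{-x/\theta}}{\Gamma(\alpha)\,\theta^{\alpha}}, \quad x > 0.$

Setting α = k/2 and θ = 2 gives

$g(x;k/2,2) = \frac{x^{k/2-1} e^{-x/2}}{\Gamma(k/2)\,2^{k/2}},$

which is the chi-square density with k degrees of freedom.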

The best-known situations in which the chi-square distribution is used are the common chi-square tests for goodness of fit of an observed distribution to a theoretical one, and of the independence of two criteria of classification of qualitative data. However, many other statistical tests lead to a use of this distribution. One example is Friedman's analysis of variance by ranks.

Characteristics

Probability density function

A probability density function of the chi-square distribution is

$f(x;k)= \begin{cases} \frac{1}{2^{k/2}\Gamma(k/2)}\,x^{k/2 - 1} e^{-x/2}&\text{for }x>0,\\ 0&\text{for }x\le0, \end{cases}$

where Γ denotes the Gamma function, which takes particular values at the half-integers.
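The density can be evaluated directly from this formula. The sketch below is one possible way to do it, using only the Python standard library; the function names and the integration settings are illustrative assumptions. It works on the log scale with `math.lgamma` so that large k does not overflow, and it checks numerically that the density integrates to approximately 1.

```python
import math


def chi2_pdf(x, k):
    """Chi-square density f(x; k); zero for x <= 0."""
    if x <= 0:
        return 0.0
    half_k = k / 2.0
    # log f = (k/2 - 1) ln x - x/2 - (k/2) ln 2 - ln Gamma(k/2)
    log_density = ((half_k - 1.0) * math.log(x) - x / 2.0
                   - half_k * math.log(2.0) - math.lgamma(half_k))
    return math.exp(log_density)


def integrate_pdf(k, upper=200.0, steps=200000):
    """Midpoint-rule integral of the density over (0, upper)."""
    h = upper / steps
    return sum(chi2_pdf((i + 0.5) * h, k) for i in range(steps)) * h


print(chi2_pdf(2.0, 2))   # equals 0.5 * exp(-1) for k = 2
print(integrate_pdf(4))   # should be very close to 1
```

For k = 1 the density is unbounded near zero, so a plain midpoint rule converges slowly there; the integration check is meant for k of 2 or more.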

Cumulative distribution function

Its cumulative distribution function is:

$F(x;k)=\frac{\gamma(k/2,x/2)}{\Gamma(k/2)} = P(k/2, x/2)$

where γ(s,z) is the lower incomplete Gamma function and P(s,z) is the regularized lower incomplete Gamma function.

Tables of this distribution, usually in its cumulative form, are widely available, and the function is included in many spreadsheets and statistical packages.
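Where no table or package is at hand, the cumulative distribution function can be approximated from the regularized Gamma function. The following is a rough sketch under stated assumptions: it uses the standard power series for the lower incomplete Gamma function, which is adequate for moderate x but is not how production libraries necessarily compute it, and the function names and tolerances are illustrative.

```python
import math


def regularized_lower_gamma(s, z, tol=1e-15, max_terms=10000):
    """P(s, z) = gamma(s, z) / Gamma(s), computed from the series
    gamma(s, z) = z^s e^{-z} * sum_{n>=0} z^n / (s (s+1) ... (s+n))."""
    if z <= 0:
        return 0.0
    term = 1.0 / s          # n = 0 term of the series
    total = term
    for n in range(1, max_terms):
        term *= z / (s + n)  # next term: multiply by z / (s + n)
        total += term
        if term < tol * total:
            break
    # Multiply by z^s e^{-z} / Gamma(s) on the log scale.
    return total * math.exp(s * math.log(z) - z - math.lgamma(s))


def chi2_cdf(x, k):
    """Chi-square CDF F(x; k) = P(k/2, x/2)."""
    return regularized_lower_gamma(k / 2.0, x / 2.0)


print(chi2_cdf(3.0, 2))  # k = 2 reduces to 1 - exp(-x/2)
print(chi2_cdf(1.0, 1))  # k = 1 reduces to erf(sqrt(x/2))
```

The two printed cases have simple closed forms, which makes them convenient sanity checks for the series.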

Characteristic function

The characteristic function of the chi-square distribution is

$\phi(t;k)=(1-2it)^{-k/2}.\,$
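The closed form can be compared against a direct numerical evaluation of the expectation of e^{itX}. The sketch below is only an illustration: the function names, the integration range, and the step count are assumptions, and a simple midpoint rule stands in for a proper quadrature routine.

```python
import cmath
import math


def chi2_pdf(x, k):
    """Chi-square density f(x; k) for x > 0."""
    half_k = k / 2.0
    return math.exp((half_k - 1.0) * math.log(x) - x / 2.0
                    - half_k * math.log(2.0) - math.lgamma(half_k))


def characteristic_function_numeric(t, k, upper=200.0, steps=200000):
    """Midpoint-rule approximation of the integral of exp(i t x) f(x; k)."""
    h = upper / steps
    total = 0j
    for i in range(steps):
        x = (i + 0.5) * h
        total += cmath.exp(1j * t * x) * chi2_pdf(x, k)
    return total * h


def characteristic_function_closed(t, k):
    """Closed form (1 - 2 i t)^(-k/2), principal branch."""
    return (1 - 2j * t) ** (-k / 2.0)


print(characteristic_function_numeric(0.3, 4))
print(characteristic_function_closed(0.3, 4))
```

The two printed complex numbers should agree to several decimal places for moderate t and k of 2 or more.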

Properties

The chi-square distribution has numerous applications in inferential statistics, for instance in chi-square tests and in estimating variances. It enters the problem of estimating the mean of a normally distributed population and the problem of estimating the slope of a regression line via its role in Student's t-distribution. It enters all analysis of variance problems via its role in the F-distribution, which is the distribution of the ratio of two independent chi-squared random variables divided by their respective degrees of freedom.

Normal approximation

If $X\sim\chi^2_k$, then as k tends to infinity, the distribution of X tends to normality. However, the tendency is slow (the skewness is $\sqrt{8/k}$ and the excess kurtosis is $12/k$), and two transformations are commonly considered, each of which approaches normality faster than X itself:

Fisher empirically showed that $\sqrt{2X}$ is approximately normally distributed with mean $\sqrt{2k-1}$ and unit variance. The same normal approximation can be obtained by moment matching. To see this, consider the mean and the variance of the chi-distributed random variable $z=\sqrt{X}$, which are given by $\mu_z= \sqrt{2}\,\frac{\Gamma\left(k/2+1/2\right)}{\Gamma\left(k/2 \right)}$ and $\sigma_z^2= k-\mu_z^2$, where $\Gamma(\cdot)$ is the Gamma function. The ratio of Gamma functions in $\mu_z$ has the following series expansion [1]: $\frac{\Gamma\left(N+1/2\right)}{\Gamma\left(N \right)}=\sqrt{N}\left(1-\frac{1}{8N}+ \frac{1}{128N^2}+\frac{5}{1024N^3}-\frac{21}{32768N^4}+\ldots\right).$ When $N\gg 1$, this ratio can be approximated as follows: $\frac{\Gamma\left(N+1/2\right)}{\Gamma\left(N \right)}\approx\sqrt{N}\left(1-\frac{1}{8N}\right)\approx\sqrt{N}\left(1-\frac{1}{4N}\right)^{0.5}=\sqrt{N-1/4}.$ Substituting $N=k/2$ gives $\mu_z\approx\sqrt{2}\sqrt{k/2-1/4}=\sqrt{k-1/2}$ and hence $\sigma_z^2=k-\mu_z^2\approx 1/2$. Simple moment matching then results in the following approximation of z: $z\sim{\mathcal N}\left(\sqrt{k-1/2}, \frac{1}{2}\right)$, from which it follows that $\sqrt{2X}=\sqrt{2}\,z\sim{\mathcal N}\left(\sqrt{2k-1}, 1\right)$.

Wilson and Hilferty showed in 1931 that $\sqrt[3]{X/k}$ is approximately normally distributed with mean $1-2/(9k)$ and variance $2/(9k)$.
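The two transformations can be compared numerically against the exact distribution function. The sketch below makes some assumptions that are not in the text above: it restricts the exact calculation to even k, where the distribution function has the well-known finite-sum form 1 - e^{-x/2} Σ_{j<k/2} (x/2)^j / j!, and the function names are illustrative.

```python
import math


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def chi2_cdf_even(x, k):
    """Exact chi-square CDF for even k (finite-sum form; an assumption
    of this sketch, not a formula quoted in the article)."""
    if k % 2 != 0:
        raise ValueError("this closed form requires even k")
    if x <= 0:
        return 0.0
    half_x = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, k // 2):
        term *= half_x / j   # (x/2)^j / j!
        total += term
    return 1.0 - math.exp(-half_x) * total


def fisher_cdf(x, k):
    """Fisher: sqrt(2X) is roughly N(sqrt(2k - 1), 1)."""
    return normal_cdf(math.sqrt(2.0 * x) - math.sqrt(2.0 * k - 1.0))


def wilson_hilferty_cdf(x, k):
    """Wilson-Hilferty: (X/k)^(1/3) is roughly N(1 - 2/(9k), 2/(9k))."""
    mean = 1.0 - 2.0 / (9.0 * k)
    sd = math.sqrt(2.0 / (9.0 * k))
    return normal_cdf(((x / k) ** (1.0 / 3.0) - mean) / sd)


x, k = 12.0, 10
print(chi2_cdf_even(x, k), fisher_cdf(x, k), wilson_hilferty_cdf(x, k))
```

At this test point the cube-root transformation lands noticeably closer to the exact value than the square-root one; a single point does not establish that ordering in general.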

The expected value of a random variable having chi-square distribution with k degrees of freedom is k and the variance is 2k. The median is given approximately by

$k-\frac{2}{3}+\frac{4}{27k}-\frac{8}{729k^2}.$
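The four terms above are exactly the expansion of $k\left(1-\frac{2}{9k}\right)^3$, the value at which the Wilson and Hilferty cube-root variable equals its approximate mean. The following sketch compares the formula with a median found by bisection; it assumes even k so that the exact distribution function has a finite-sum form, and the function names and bracketing interval are illustrative.

```python
import math


def median_approx(k):
    """Approximate median k - 2/3 + 4/(27k) - 8/(729k^2)."""
    return k - 2.0 / 3.0 + 4.0 / (27.0 * k) - 8.0 / (729.0 * k ** 2)


def chi2_cdf_even(x, k):
    """Exact CDF for even k: 1 - exp(-x/2) * sum_{j<k/2} (x/2)^j / j!."""
    if x <= 0:
        return 0.0
    half_x = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, k // 2):
        term *= half_x / j
        total += term
    return 1.0 - math.exp(-half_x) * total


def median_bisect_even(k, tol=1e-12):
    """Solve F(x; k) = 1/2 by bisection (even k only)."""
    lo, hi = 0.0, 10.0 * k + 10.0  # generous bracket around the median
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if chi2_cdf_even(mid, k) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for k in (2, 10, 50):
    print(k, median_approx(k), median_bisect_even(k))
```

For k = 2 the exact median is 2 ln 2, which gives an independent check on the bisection.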

Note that the case of k = 2 degrees of freedom is an exponential distribution.
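As a quick check, substituting k = 2 into the probability density function and the cumulative distribution function, and using Γ(1) = 1, gives

$f(x;2) = \frac{1}{2}e^{-x/2}, \qquad F(x;2) = 1 - e^{-x/2}, \qquad x > 0.$

This is the exponential distribution with rate 1/2. Its mean 2 and variance 4 agree with the general values k and 2k.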

Information entropy

The information entropy is given by

$H = -\int_{0}^\infty f(x;k)\ln f(x;k)\, dx = \frac{k}{2} + \ln \left( 2 \Gamma \left( \frac{k}{2} \right) \right) + \left(1 - \frac{k}{2}\right) \psi(k/2),$

where ψ(x) is the Digamma function.
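The closed form can be checked against a direct numerical evaluation of the negative integral of f ln f. The sketch below is illustrative only. The Python standard library has no digamma function, so one is approximated here with the recurrence ψ(x) = ψ(x+1) - 1/x followed by the standard asymptotic series; the function names, the shift threshold, and the integration settings are assumptions of this sketch.

```python
import math


def digamma(x):
    """Approximate digamma psi(x) for x > 0."""
    result = 0.0
    while x < 6.0:           # shift upward with psi(x) = psi(x+1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += (math.log(x) - 0.5 / x
               - inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0)))
    return result


def chi2_entropy(k):
    """Closed form k/2 + ln(2 Gamma(k/2)) + (1 - k/2) psi(k/2)."""
    half_k = k / 2.0
    return (half_k + math.log(2.0) + math.lgamma(half_k)
            + (1.0 - half_k) * digamma(half_k))


def chi2_entropy_numeric(k, upper=200.0, steps=200000):
    """Midpoint-rule value of -integral f ln f over (0, upper)."""
    half_k = k / 2.0
    log_norm = -half_k * math.log(2.0) - math.lgamma(half_k)
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        log_f = log_norm + (half_k - 1.0) * math.log(x) - x / 2.0
        total -= math.exp(log_f) * log_f
    return total * h


print(chi2_entropy(4), chi2_entropy_numeric(4))
```

For k = 2 the digamma term drops out and the entropy is 1 + ln 2, the entropy of an exponential distribution with mean 2.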

Related distributions

Various chi and chi-square distributions

Name	Statistic
chi-square distribution	$\sum_{i=1}^k \left(\frac{X_i-\mu_i}{\sigma_i}\right)^2$
noncentral chi-square distribution	$\sum_{i=1}^k \left(\frac{X_i}{\sigma_i}\right)^2$
chi distribution	$\sqrt{\sum_{i=1}^k \left(\frac{X_i-\mu_i}{\sigma_i}\right)^2}$
noncentral chi distribution	$\sqrt{\sum_{i=1}^k \left(\frac{X_i}{\sigma_i}\right)^2}$



This entry is from Wikipedia, the leading user-contributed encyclopedia. It may not have been reviewed by professional editors.