Cochran's theorem

In statistics, Cochran's theorem, devised by William G. Cochran,^[1] is a theorem used to justify results relating to the probability distributions of statistics that are used in the analysis of variance.^[2]

Statement

Suppose U₁, ..., U_n are independent standard normally distributed random variables, and an identity of the form

$\sum_{i=1}^n U_i^2=Q_1+\cdots + Q_k$

can be written, where each Q_i is a sum of squares of linear combinations of the Us. Further suppose that

$r_1+\cdots +r_k=n$

where r_i is the rank of Q_i. Cochran's theorem states that the Q_i are independent, and each Q_i has a chi-square distribution with r_i degrees of freedom.

Examples

Sample mean and sample variance

If X₁, ..., X_n are independent normally distributed random variables with mean μ and standard deviation σ then

$U_i = \frac{X_i-\mu}{\sigma}$

is standard normal for each i.

It is possible to write

$\sum U_i^2=\sum\left(\frac{X_i-\overline{X}}{\sigma}\right)^2 + n\left(\frac{\overline{X}-\mu}{\sigma}\right)^2$

(here, summation is from 1 to n, that is over the observations). To see this identity, multiply throughout by σ² and note that

$\sum(X_i-\mu)^2= \sum(X_i-\overline{X}+\overline{X}-\mu)^2$

and expand to give

$\sum(X_i-\overline{X})^2+\sum(\overline{X}-\mu)^2+ 2\sum(X_i-\overline{X})(\overline{X}-\mu).$

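The third term is zero because the factor $\overline{X}-\mu$ does not depend on i, so the term equals a constant multiple of

$\sum(\overline{X}-X_i),$

which vanishes by the definition of the sample mean. The second term is just n identical terms added together, giving $n(\overline{X}-\mu)^2$.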

Combining the above results (and dividing by σ²), we have:

$\sum\left(\frac{X_i-\mu}{\sigma}\right)^2= \sum\left(\frac{X_i-\overline{X}}{\sigma}\right)^2 +n\left(\frac{\overline{X}-\mu}{\sigma}\right)^2 =Q_1+Q_2.$
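The identity can be checked numerically. The following is a minimal sketch in Python using only the standard library; the function name `decompose`, the seed, and the parameter values are illustrative assumptions, not part of the theorem.

```python
import random

def decompose(xs, mu, sigma):
    """Return (total, q1, q2) for the sum-of-squares identity."""
    n = len(xs)
    xbar = sum(xs) / n
    # Left-hand side: sum of squared standardized observations.
    total = sum(((x - mu) / sigma) ** 2 for x in xs)
    # Q1: squared deviations from the sample mean, scaled by sigma.
    q1 = sum(((x - xbar) / sigma) ** 2 for x in xs)
    # Q2: n times the squared standardized sample mean.
    q2 = n * ((xbar - mu) / sigma) ** 2
    return total, q1, q2

rng = random.Random(0)   # fixed seed, arbitrary choice
mu, sigma = 3.0, 2.0     # illustrative parameter values
xs = [rng.gauss(mu, sigma) for _ in range(10)]
total, q1, q2 = decompose(xs, mu, sigma)
print(total, q1 + q2)    # the two numbers agree up to rounding error
```

The identity is purely algebraic, so it holds for any data set, not only for normally distributed samples; normality matters only for the distributional conclusions.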

Now the rank of Q₂ is just 1 (it is the square of just one linear combination of the standard normal variables). The rank of Q₁ can be shown to be n − 1, and thus the conditions for Cochran's theorem are met.
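One way to see the two ranks is to write each quadratic form as U^T A U. Under that assumption, Q₂ corresponds to the matrix J/n (with J the n-by-n matrix of ones) and Q₁ to I − J/n; this matrix notation is introduced here for illustration only. The sketch below computes both ranks exactly with rational arithmetic; the helper names `rank` and `quadratic_form_matrices` are hypothetical.

```python
from fractions import Fraction

def rank(matrix):
    """Rank by Gaussian elimination over exact rationals."""
    rows = [list(r) for r in matrix]
    rk = 0
    for col in range(len(rows[0])):
        # Find a row at or below position rk with a nonzero entry in this column.
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def quadratic_form_matrices(n):
    """Matrices A1 = I - J/n and A2 = J/n, so that Q_i = U^T A_i U."""
    a2 = [[Fraction(1, n)] * n for _ in range(n)]
    a1 = [[(1 if i == j else 0) - a2[i][j] for j in range(n)] for i in range(n)]
    return a1, a2

n = 6  # illustrative sample size
a1, a2 = quadratic_form_matrices(n)
print(rank(a1), rank(a2))  # ranks n - 1 and 1, which sum to n
```

Exact fractions are used instead of floating point so that the rank is not affected by rounding.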

Cochran's theorem then states that Q₁ and Q₂ are independent, with chi-square distributions with n − 1 and 1 degrees of freedom respectively.

This shows that the sample mean and sample variance are independent. This can also be shown by Basu's theorem. The property characterizes the normal distribution: for no other distribution are the sample mean and sample variance independent.
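A small simulation can illustrate the contrast between normal and non-normal samples. The sketch below estimates the correlation between the sample mean and the sample variance over many repeated samples; the function name `mean_var_correlation`, the sample size, the number of replications, and the choice of the exponential distribution as a comparison are all illustrative assumptions. Zero correlation does not prove independence, so this is a plausibility check only.

```python
import random

def mean_var_correlation(draw, n, reps, seed=0):
    """Pearson correlation between sample mean and sample variance over reps samples."""
    rng = random.Random(seed)
    means, variances = [], []
    for _ in range(reps):
        xs = [draw(rng) for _ in range(n)]
        m = sum(xs) / n
        means.append(m)
        variances.append(sum((x - m) ** 2 for x in xs) / (n - 1))
    mm = sum(means) / reps
    mv = sum(variances) / reps
    cov = sum((a - mm) * (b - mv) for a, b in zip(means, variances))
    sa = sum((a - mm) ** 2 for a in means) ** 0.5
    sb = sum((b - mv) ** 2 for b in variances) ** 0.5
    return cov / (sa * sb)

# Normal samples: correlation should be close to zero.
normal_corr = mean_var_correlation(lambda r: r.gauss(0.0, 1.0), n=5, reps=20000)
# Exponential samples (skewed): mean and variance are clearly correlated.
expo_corr = mean_var_correlation(lambda r: r.expovariate(1.0), n=5, reps=20000)
print(normal_corr, expo_corr)
```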

Further,

$(\overline{X}-\mu)^2\sim \frac{\sigma^2}{n}\chi^2_1.$

Estimation of variance

To estimate the variance σ², one estimator that is often used is

$\widehat{\sigma}^2= \frac{1}{n}\sum\left( X_i-\overline{X}\right)^2.$

Cochran's theorem shows that

$\frac{n\widehat{\sigma}^2}{\sigma^2}\sim\chi^2_{n-1}$

which shows that the expected value of $\widehat{\sigma}^2$ is σ²(n − 1)/n.
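The expected value follows because a chi-square variable with n − 1 degrees of freedom has mean n − 1. A Monte Carlo sketch of this bias is given below; the function names `biased_variance` and `average_estimate`, the seed, and the parameter values are illustrative assumptions.

```python
import random

def biased_variance(xs):
    """The estimator (1/n) * sum((x - xbar)^2)."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / n

def average_estimate(n, mu, sigma, reps, seed=0):
    """Average of the estimator over reps independent normal samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        total += biased_variance([rng.gauss(mu, sigma) for _ in range(n)])
    return total / reps

n, mu, sigma = 4, 1.0, 2.0                # illustrative values
simulated = average_estimate(n, mu, sigma, reps=20000)
predicted = sigma ** 2 * (n - 1) / n      # sigma^2 (n - 1) / n from the text
print(simulated, predicted)               # the simulated mean should be near the prediction
```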

The two quantities $n(\overline{X}-\mu)^2$ and $\sum(X_i-\overline{X})^2$ both have distributions proportional to the true but unknown variance σ², so their ratio does not depend on σ²; and because they are independent, we have

$\frac{n\left(\overline{X}-\mu\right)^2} {\frac{1}{n-1}\sum\left(X_i-\overline{X}\right)^2}\sim F_{1,n-1}$

where $F_{1,n-1}$ is the F-distribution with 1 and n − 1 degrees of freedom (see also Student's t-distribution).
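The connection to Student's t-distribution is that this ratio is the square of the usual one-sample t-statistic. The sketch below checks that algebraic relation on a small data set; the function names `f_ratio` and `t_statistic` and the data values are illustrative assumptions.

```python
import math

def f_ratio(xs, mu):
    """n (xbar - mu)^2 divided by the unbiased sample variance."""
    n = len(xs)
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)
    return n * (xbar - mu) ** 2 / s2

def t_statistic(xs, mu):
    """One-sample t-statistic (xbar - mu) / (s / sqrt(n))."""
    n = len(xs)
    xbar = sum(xs) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
    return (xbar - mu) / (s / math.sqrt(n))

xs = [1.0, 2.0, 3.0, 4.0]   # illustrative data
mu = 2.0                    # hypothesized mean, illustrative
print(f_ratio(xs, mu), t_statistic(xs, mu) ** 2)  # the two values agree up to rounding
```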

Alternative formulation

The following version is often seen when considering linear regression.

Suppose Y ~ N_n(0, σ²I_n) is a multivariate Gaussian random vector (here I_n denotes the n-by-n identity matrix), and $A_1,\ldots,A_k$ are n-by-n symmetric matrices with $\sum_{i=1}^kA_i=I_n$. Then, writing Rank(A_i) = r_i, any one of the following conditions implies the other two:

References

[1] Cochran, W. G. (April 1934). "The distribution of quadratic forms in a normal system, with applications to the analysis of covariance". Mathematical Proceedings of the Cambridge Philosophical Society 30 (2): 178–191. doi:10.1017/S0305004100016595.
[2] Bapat, R. B. (2000). Linear Algebra and Linear Models (Second ed.). Springer. ISBN 9780387988719.

This entry is from Wikipedia, the leading user-contributed encyclopedia. It may not have been reviewed by professional editors (see full disclaimer)