Mean squared error

In statistics, the mean squared error or MSE of an estimator is the expected value of the square of the "error": the amount by which the estimator differs from the quantity to be estimated. The difference arises either from randomness or because the estimator does not account for information that could produce a more accurate estimate.

Definition and basic properties

The MSE of an estimator \hat{\theta} with respect to the estimated parameter θ is defined as

\operatorname{MSE}(\hat{\theta})=\operatorname{E}((\hat{\theta}-\theta)^2).

It can be shown that the MSE is the sum of the variance of the estimator and the square of its bias:

\operatorname{MSE}(\hat{\theta})=\operatorname{Var}\left(\hat{\theta}\right)+ \left(\operatorname{Bias}(\hat{\theta},\theta)\right)^2.
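
To see this, expand the square around \operatorname{E}(\hat{\theta}); the cross term vanishes because \operatorname{E}(\hat{\theta}-\operatorname{E}(\hat{\theta}))=0:

\operatorname{MSE}(\hat{\theta})=\operatorname{E}\left(\left(\hat{\theta}-\operatorname{E}(\hat{\theta})\right)^2\right)+\left(\operatorname{E}(\hat{\theta})-\theta\right)^2=\operatorname{Var}\left(\hat{\theta}\right)+\left(\operatorname{Bias}(\hat{\theta},\theta)\right)^2.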

The MSE thus assesses the quality of an estimator in terms of both its variance and its bias. Note that the MSE is not equivalent to the expected value of the absolute error.

The root mean squared error (RMSE), also called the root mean squared deviation (RMSD), is simply the square root of the MSE:

\operatorname{RMSE}(\hat{\theta}) = \sqrt{\operatorname{MSE}(\hat{\theta})}.

The MSE (and likewise the RMSE) as defined above is an expectation, which is generally unknown and must itself be estimated. This is usually done with the sample mean of the squared errors,

\operatorname{\widehat{MSE}}(\hat{\theta}) = \frac{1}{n} \sum_{j=1}^n \left(\hat{\theta}_j-\theta\right)^2,

with \hat{\theta}_1,\dots,\hat{\theta}_n being n realizations of the estimator \hat{\theta}, as in a simulation study where the true value θ is known.
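
A minimal Python sketch of this procedure (illustrative only; it assumes NumPy, and the parameter values are arbitrary) estimates the MSE and RMSE of the sample mean and compares the result with the theoretical value \sigma^2/n derived in the Examples section below:

  import numpy as np

  rng = np.random.default_rng(0)

  mu, sigma = 5.0, 2.0        # true parameters of the normal population
  sample_size = 30            # size of each sample seen by the estimator
  n_realizations = 10_000     # number of independent realizations of \hat{\theta}

  # Each row is one sample; each row mean is one realization of \overline{X}.
  samples = rng.normal(mu, sigma, size=(n_realizations, sample_size))
  theta_hat = samples.mean(axis=1)

  # Empirical MSE: average squared deviation of the realizations from the truth.
  mse_hat = ((theta_hat - mu) ** 2).mean()
  rmse_hat = mse_hat ** 0.5

  print("estimated MSE:  ", mse_hat)
  print("theoretical MSE:", sigma**2 / sample_size)   # sigma^2 / n
  print("estimated RMSE: ", rmse_hat)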

Examples

Suppose we have a random sample of size n from a normally distributed population, X_1,\dots,X_n\sim\operatorname{N}(\mu,\sigma^2).

Some commonly used estimators of the true parameters of the population, μ and σ², are:

True value: θ = μ
Estimator: the sample mean \overline{X}=\frac{1}{n}\sum_{i=1}^n X_i, an unbiased estimator of μ
Mean squared error: \operatorname{MSE}(\overline{X})=\operatorname{E}((\overline{X}-\mu)^2)=\frac{\sigma^2}{n}

True value: θ = σ²
Estimator: the sample variance S^2 = \frac{1}{n-1}\sum_{i=1}^n\left(X_i-\overline{X}\,\right)^2, an unbiased estimator of σ²
Mean squared error: \operatorname{MSE}(S^2)=\operatorname{E}((S^2-\sigma^2)^2)=\operatorname{Var}(S^2)=\frac{2\sigma^4}{n-1}

Notice how these examples illustrate the bias-variance decomposition: the MSE of an unbiased estimator is just its variance, while the MSE of a biased estimator adds a non-zero squared-bias term to the variance term. Note also that the estimator that minimizes the MSE is not necessarily unbiased; a smaller variance can more than compensate for the bias. In the example above, the biased estimator of the variance, \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n\left(X_i-\overline{X}\,\right)^2, actually has a smaller mean squared error than the unbiased S^2, despite having bias -\frac{1}{n}\sigma^2, as the simulation sketch below suggests.
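
A minimal Monte Carlo sketch of this comparison (illustrative only; it assumes NumPy, and the sample size and parameter values are arbitrary):

  import numpy as np

  rng = np.random.default_rng(1)

  mu, sigma = 0.0, 3.0
  n = 10                      # sample size
  n_realizations = 200_000    # independent samples to average over

  samples = rng.normal(mu, sigma, size=(n_realizations, n))

  # Unbiased sample variance (divides by n - 1) ...
  s2_unbiased = samples.var(axis=1, ddof=1)
  # ... and the biased version that divides by n.
  s2_biased = samples.var(axis=1, ddof=0)

  true_var = sigma ** 2
  mse_unbiased = ((s2_unbiased - true_var) ** 2).mean()
  mse_biased = ((s2_biased - true_var) ** 2).mean()

  print("MSE, unbiased estimator:", mse_unbiased)  # about 2*sigma^4/(n-1) = 18
  print("MSE, biased estimator:  ", mse_biased)    # about (2n-1)*sigma^4/n^2 = 15.39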

Applications

  • In statistical modelling, the MSE is computed from the differences between the actual observations and the responses predicted by the model; it is used to judge whether the model fits the data adequately or whether some terms can be removed to simplify it.
  • In Bioinformatics, the RMSD is a measure of the average distance between the backbone atoms of superimposed proteins.
  • In GIS, the RMSE is one measure used to assess the accuracy of spatial analysis and remote sensing.
  • In Imaging Science, the RMSD is one measure used to assess how well an image-reconstruction method performs relative to the original image (see the sketch after this list).
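
As an illustration of the imaging use (a hypothetical sketch assuming NumPy; the arrays and noise level are invented), the two images are simply treated as equal-shape arrays of pixel intensities:

  import numpy as np

  def rmsd(original, reconstruction):
      # Root mean squared deviation between two equal-shape pixel arrays.
      return float(np.sqrt(((original - reconstruction) ** 2).mean()))

  # Hypothetical example: a random grayscale "image" and a noisy reconstruction.
  rng = np.random.default_rng(2)
  original = rng.uniform(0.0, 1.0, size=(64, 64))
  reconstruction = original + rng.normal(0.0, 0.05, size=original.shape)

  print("RMSD:", rmsd(original, reconstruction))   # close to 0.05 by construction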
