Truncated mean
A truncated mean or trimmed mean is a statistical measure of central tendency, much like the mean and the median. It is the mean calculated after discarding given parts of a probability distribution or sample at the high and low ends, typically an equal amount from each end.
For most statistical applications, 5 to 25 percent of the observations are discarded from each end. In some regions of Central Europe it is also known as a Windsor mean, but this name should not be confused with the Winsorized mean: in the latter, the observations that the trimmed mean would discard are instead replaced by the largest or smallest of the remaining values.
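The difference between trimming and Winsorizing can be illustrated with a minimal Python sketch. The function names, the choice to specify the cut as a count k per end, and the sample values are assumptions made for this illustration only.

```python
from statistics import mean


def trimmed_mean_by_count(values, k):
    """Mean after dropping the k smallest and the k largest values."""
    ordered = sorted(values)
    if k < 0 or 2 * k >= len(ordered):
        raise ValueError("k must leave at least one value")
    return mean(ordered[k:len(ordered) - k])


def winsorized_mean_by_count(values, k):
    """Mean after replacing the k smallest and the k largest values
    by the nearest remaining value instead of dropping them."""
    ordered = sorted(values)
    if k < 0 or 2 * k >= len(ordered):
        raise ValueError("k must leave at least one value")
    low, high = ordered[k], ordered[-k - 1]
    # Clamp every value into the range [low, high].
    clamped = [min(max(v, low), high) for v in ordered]
    return mean(clamped)


sample = [2, 4, 5, 6, 7, 8, 12, 100]  # illustrative data with one outlier
print(mean(sample))
print(trimmed_mean_by_count(sample, 1))
print(winsorized_mean_by_count(sample, 1))
```

For this sample the plain mean is 18, the trimmed mean (one entry dropped per end) is 7, and the Winsorized mean (2 replaced by 4, 100 replaced by 12) is 7.25.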
Notation
The index of the mean indicates the percentage of entries removed from each end of the ordered sample. For example, truncating a sample of 8 entries by 12.5% discards the smallest and the largest entry (one entry from each end) before the mean is calculated.
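The 12.5% example can be reproduced with a short Python sketch. The function name `truncated_mean`, the small tolerance used when converting the fraction to a count, and the sample values are assumptions of this illustration, not part of any standard definition.

```python
from statistics import mean


def truncated_mean(values, fraction):
    """Mean after discarding `fraction` of the entries from each end.

    Assumes fraction * len(values) is (close to) a whole number;
    any fractional part is simply rounded down in this sketch.
    """
    if not 0 <= fraction < 0.5:
        raise ValueError("fraction must be in [0, 0.5)")
    ordered = sorted(values)
    # The small tolerance guards against floating-point products such as
    # 28.999999999999996 being floored to 28 instead of 29.
    k = int(fraction * len(ordered) + 1e-9)
    if 2 * k >= len(ordered):
        raise ValueError("fraction leaves no entries")
    return mean(ordered[k:len(ordered) - k])


sample = [1, 2, 3, 4, 5, 6, 7, 100]  # 8 entries, illustrative data
print(truncated_mean(sample, 0.125))  # drops 1 and 100
print(mean(sample))                   # untrimmed mean, for comparison
```

With these values the 12.5% truncated mean is 4.5, while the untrimmed mean is 16.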
Interpolation
When the requested percentage does not correspond to a whole number of entries, the trimmed mean cannot be computed directly. The usual approach is to calculate the two nearest trimmed means that can be computed and interpolate between them (usually linearly). For example, the 15% trimmed mean of a sample of 10 entries would require removing 1.5 entries from each end, so you would calculate the 10% trimmed mean (removing 1 entry from each end) and the 20% trimmed mean (removing 2 entries from each end), then interpolate between the two to obtain the 15% trimmed mean.
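The interpolation described above might be sketched in Python as follows. The function names and sample values are hypothetical, and the sketch assumes plain linear interpolation between the two neighbouring whole-count trimmed means; other conventions exist and may give slightly different results.

```python
import math
from statistics import mean


def _trimmed_by_count(ordered, k):
    """Mean of an already sorted list after dropping k entries per end."""
    return mean(ordered[k:len(ordered) - k])


def interpolated_truncated_mean(values, fraction):
    """Truncated mean with linear interpolation between the two nearest
    trimmed means that remove a whole number of entries per end."""
    if not 0 <= fraction < 0.5:
        raise ValueError("fraction must be in [0, 0.5)")
    ordered = sorted(values)
    exact = fraction * len(ordered)   # e.g. 1.5 entries per end
    k_low = math.floor(exact)         # e.g. 1
    weight = exact - k_low            # e.g. 0.5
    lower = _trimmed_by_count(ordered, k_low)
    # Simplification for this sketch: if the next trim level would leave
    # no entries, fall back to the lower trimmed mean.
    if weight == 0 or 2 * (k_low + 1) >= len(ordered):
        return lower
    upper = _trimmed_by_count(ordered, k_low + 1)
    return (1 - weight) * lower + weight * upper


sample = [0, 1, 2, 3, 4, 5, 6, 7, 12, 50]  # 10 entries, illustrative data
print(interpolated_truncated_mean(sample, 0.15))
```

For this sample the 10% trimmed mean is 5.0 and the 20% trimmed mean is 4.5, so the interpolated 15% value lies halfway between them at 4.75.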
Advantages
The truncated mean is a useful estimator because it is less sensitive to outliers than the mean but will still give a reasonable estimate of central tendency or mean for many statistical models. In this regard it is referred to as a robust estimator.
One situation in which it can be advantageous to use a truncated mean is when estimating the location parameter of a Cauchy distribution, a bell-shaped probability distribution with much fatter tails than a normal distribution. It can be shown that the truncated mean of the middle 24% of the sample order statistics (i.e., truncating the sample by 38% at each end) produces an estimate of the population location parameter that is more efficient than either the sample median or the full sample mean.[1][2] However, due to the fat tails of the Cauchy distribution, the efficiency of the estimator decreases as more of the sample is used in the estimate.[1][2] Note that for the Cauchy distribution, none of the truncated mean, the full sample mean and the sample median is a maximum likelihood estimator, and none is as asymptotically efficient as the maximum likelihood estimator; however, the maximum likelihood estimate is more difficult to compute, leaving the truncated mean as a useful alternative.[2][3]
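A small Monte Carlo experiment can give a feel for this behaviour. The sketch below is only an illustration under stated assumptions: the sample size, number of trials, seed, function names, and the use of the median absolute error as a summary are all arbitrary choices made here, and a simulation of this size is too noisy to reproduce the efficiency results in the cited papers. Standard Cauchy variates are generated by the inverse-CDF method, tan(π(u − 1/2)) for uniform u.

```python
import math
import random
from statistics import median


def truncated_mean(values, fraction):
    """Mean after discarding `fraction` of the entries from each end."""
    ordered = sorted(values)
    k = int(fraction * len(ordered) + 1e-9)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def cauchy_sample(rng, n, location=0.0):
    """Draw n Cauchy variates (scale 1) by the inverse-CDF method."""
    return [location + math.tan(math.pi * (rng.random() - 0.5))
            for _ in range(n)]


def typical_errors(n=101, trials=2000, seed=0):
    """Median absolute error of three location estimators when the
    true location is 0."""
    rng = random.Random(seed)
    errors = {"mean": [], "median": [], "truncated 38%": []}
    for _ in range(trials):
        sample = cauchy_sample(rng, n)
        errors["mean"].append(abs(sum(sample) / n))
        errors["median"].append(abs(median(sample)))
        errors["truncated 38%"].append(abs(truncated_mean(sample, 0.38)))
    return {name: median(errs) for name, errs in errors.items()}


results = typical_errors()
for name, err in results.items():
    print(f"{name}: {err:.3f}")
```

The full sample mean should show a far larger typical error than the other two estimators, since the mean of a Cauchy sample has the same distribution as a single observation. The gap between the median and the 38% truncated mean is small and may not be resolved reliably at this number of trials.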
Drawbacks
The truncated mean uses more information from the distribution or sample than the median, but unless the underlying distribution is symmetric, the truncated mean of a sample is unlikely to be an unbiased estimator of either the mean or the median.
Examples
The scoring method used in many sports that are evaluated by a panel of judges is a truncated mean: the lowest and the highest scores are discarded, and the mean of the remaining scores is calculated. The interquartile mean is another example, in which the lowest 25% and the highest 25% of the values are discarded and the mean of the remaining values is calculated.
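Both examples can be sketched in a few lines of Python. The function names and the score values are made up for illustration, and the interquartile mean shown here assumes a sample size divisible by 4; other sample sizes need a weighted or interpolated variant that this sketch does not cover.

```python
from statistics import mean


def judged_score(scores):
    """Drop the single lowest and single highest score, average the rest."""
    if len(scores) < 3:
        raise ValueError("need at least three scores")
    ordered = sorted(scores)
    return mean(ordered[1:-1])


def interquartile_mean(values):
    """Mean of the middle 50% of the values.

    Assumes len(values) is divisible by 4 so that exactly one quarter
    of the entries can be dropped from each end.
    """
    n = len(values)
    if n == 0 or n % 4 != 0:
        raise ValueError("this sketch requires a sample size divisible by 4")
    ordered = sorted(values)
    quarter = n // 4
    return mean(ordered[quarter:n - quarter])


panel = [8.5, 9.0, 9.0, 9.5, 6.0, 10.0, 9.0]  # seven hypothetical judges
print(judged_score(panel))        # 6.0 and 10.0 are discarded

data = [1, 3, 4, 5, 6, 7, 8, 100]  # 8 entries, illustrative data
print(interquartile_mean(data))    # mean of 4, 5, 6, 7
```

For these values the judged score is 9.0 and the interquartile mean is 5.5.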
References
- [1] Rothenberg, Thomas J.; Fisher, Franklin M.; Tilanus, C. B. (1964). "A note on estimation from a Cauchy sample". Journal of the American Statistical Association 59: 460–463.
- [2] Bloch, Daniel (1966). "A note on the estimation of the location parameters of the Cauchy distribution". Journal of the American Statistical Association 61 (316): 852–855. JSTOR 2282794.
- [3] Ferguson, Thomas S. (1978). "Maximum Likelihood Estimates of the Parameters of the Cauchy Distribution for Samples of Size 3 and 4". Journal of the American Statistical Association 73 (361): 211. JSTOR 2286549.
This article is licensed under the Creative Commons Attribution/Share-Alike License. It uses material from the Wikipedia article Truncated mean.