Encyclopedia > Unbiased estimator

In statistics, a biased estimator is one that, on average, systematically overestimates or underestimates the quantity being estimated. The word bias has at least two different senses in statistics: one refers to something considered very bad, the other to something that can at times produce results more useful and closer to the truth than an insistence on being "unbiased."

One meaning arises in what is called a biased sample: if some elements of the population are more likely to be chosen for the sample than others, and those elements tend to have higher or lower values of the quantity being estimated, then the estimate will tend to be higher or lower than the true value.

A famous case of what can go wrong when using a biased sample is found in the 1936 US presidential election polls. The Literary Digest held a poll that forecast that Alfred M. Landon would defeat Franklin Delano Roosevelt by 57% to 43%. George Gallup, using a much smaller sample (300,000 rather than 2,000,000), predicted that Roosevelt would win, and he was right. What went wrong with the Literary Digest poll? The magazine had drawn its sample from lists of telephone and automobile owners. In those days these were luxuries, so the sample consisted mainly of middle- and upper-class citizens. A majority of these voters favored Landon, whereas lower-income voters favored Roosevelt. Because the sample was biased towards wealthier citizens, the result was incorrect.

This kind of bias is usually regarded as a worse problem than statistical noise: the effect of statistical noise can be lessened by enlarging the sample, but the error caused by a biased sample does not shrink as the sample grows. In particular, a meta-analysis can distill good data from studies that individually suffer from statistical noise, but a meta-analysis of biased studies will itself be biased.
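The contrast between noise and sampling bias can be illustrated with a small simulation. This is only a sketch: the electorate, the group shares, and the support rates below are invented for illustration and are not the 1936 figures. The hypothetical `poll` function draws respondents from two groups, and the over-represented case samples wealthy voters far more often than their share of the population.

```python
import random

# Hypothetical electorate (all numbers are illustrative assumptions):
# 30% of voters are "wealthy" and 60% of them back candidate A;
# the remaining 70% back candidate A at a rate of 35%.
SHARE_WEALTHY = 0.30
SUPPORT_WEALTHY = 0.60
SUPPORT_OTHER = 0.35
TRUE_SUPPORT = SHARE_WEALTHY * SUPPORT_WEALTHY + (1 - SHARE_WEALTHY) * SUPPORT_OTHER


def poll(n, p_wealthy_in_sample, rng):
    """Simulate n respondents and return the share supporting candidate A.

    p_wealthy_in_sample is the chance that a respondent is wealthy; a value
    above SHARE_WEALTHY over-represents wealthy voters (a biased sample).
    """
    votes_for_a = 0
    for _ in range(n):
        wealthy = rng.random() < p_wealthy_in_sample
        p_support = SUPPORT_WEALTHY if wealthy else SUPPORT_OTHER
        if rng.random() < p_support:
            votes_for_a += 1
    return votes_for_a / n


if __name__ == "__main__":
    rng = random.Random(0)
    print("true support:", round(TRUE_SUPPORT, 3))
    for n in (1_000, 10_000, 100_000):
        biased = poll(n, 0.80, rng)            # wealthy voters over-sampled
        fair = poll(n, SHARE_WEALTHY, rng)     # representative sample
        print(n, "biased:", round(biased, 3), "representative:", round(fair, 3))
```

Under these assumptions the representative poll settles near the true support of 0.425 as n grows, while the biased poll settles near 0.8 × 0.60 + 0.2 × 0.35 = 0.55 no matter how large n becomes.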

## The sometimes-good kind

Another kind of bias in statistics does not involve biased samples, but does involve the use of a statistic whose average value differs from the value of the quantity being estimated. Suppose we are trying to estimate the parameter $\theta$ using an estimator $\hat{\theta}$ (that is, some function of the observed data). Then the bias of $\hat{\theta}$ is defined to be

$$\operatorname{E}(\hat{\theta}) - \theta.$$

In words, this would be "the expected value of the estimator minus the true value $\theta$". This may be rewritten as

$$\operatorname{E}(\hat{\theta} - \theta),$$

which would read "the expected value of the difference between the estimator and the true value" (the expected value of $\theta$ is $\theta$, because $\theta$ is a constant).

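For example, suppose $X_1, \ldots, X_n$ are independent and identically distributed random variables with expectation $\mu$ and variance $\sigma^2$. Let

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

be the "sample average", and let

$$S^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2$$

be a "sample variance". Then $S^2$ is a "biased estimator" of $\sigma^2$ because

$$\operatorname{E}(S^2) = \frac{n-1}{n}\,\sigma^2 \neq \sigma^2.$$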
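The expectation of this sample variance can be checked with a short derivation. The sketch below assumes only what is stated above: the observations are independent with mean $\mu$ and variance $\sigma^2$, and $S^2$ uses $n$ in the denominator.

```latex
\begin{align*}
\sum_{i=1}^{n} (X_i - \bar{X})^2
  &= \sum_{i=1}^{n} (X_i - \mu)^2 - n(\bar{X} - \mu)^2, \\
\operatorname{E}\Bigl[\sum_{i=1}^{n} (X_i - \mu)^2\Bigr]
  &= n\sigma^2, \qquad
\operatorname{E}\bigl[n(\bar{X} - \mu)^2\bigr]
  = n \operatorname{Var}(\bar{X}) = n \cdot \frac{\sigma^2}{n} = \sigma^2, \\
\operatorname{E}(S^2)
  &= \frac{1}{n}\bigl(n\sigma^2 - \sigma^2\bigr)
  = \frac{n-1}{n}\,\sigma^2 .
\end{align*}
```

The bias is therefore $\operatorname{E}(S^2) - \sigma^2 = -\sigma^2/n$, which vanishes as $n$ grows.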

But if the sample comes from a normally distributed population, then this biased estimator is, by the commonly used criterion of "mean squared error", actually slightly better than the unbiased estimator obtained by putting $n - 1$ in place of $n$ in the denominator of $S^2$ above. Even then, the square root of the unbiased estimator of the population variance is not an unbiased estimator of the population standard deviation: for a non-linear function $f$ and an unbiased estimator $U$ of a parameter $p$, $f(U)$ is usually not an unbiased estimator of $f(p)$.
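Both claims can be checked by Monte Carlo simulation. The following is a rough sketch, not a definitive experiment: the function names, the sample size, the number of trials, and the seed are arbitrary choices made for illustration. It draws normal samples, computes the variance estimates with divisors n and n − 1, and averages their squared errors along with the square root of the unbiased estimate.

```python
import math
import random


def variance_estimates(sample):
    """Return (biased, unbiased) variance estimates: divisors n and n - 1."""
    n = len(sample)
    mean = sum(sample) / n
    ss = sum((x - mean) ** 2 for x in sample)
    return ss / n, ss / (n - 1)


def simulate(n, trials, sigma=1.0, seed=0):
    """Monte Carlo averages for normal samples of size n (illustrative only)."""
    rng = random.Random(seed)
    true_var = sigma ** 2
    totals = {"mean_biased": 0.0, "mean_unbiased": 0.0,
              "mse_biased": 0.0, "mse_unbiased": 0.0, "mean_sd": 0.0}
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        biased, unbiased = variance_estimates(sample)
        totals["mean_biased"] += biased
        totals["mean_unbiased"] += unbiased
        totals["mse_biased"] += (biased - true_var) ** 2
        totals["mse_unbiased"] += (unbiased - true_var) ** 2
        # Square root of the unbiased variance estimate (a non-linear function).
        totals["mean_sd"] += math.sqrt(unbiased)
    return {name: total / trials for name, total in totals.items()}


if __name__ == "__main__":
    for name, value in simulate(n=10, trials=40_000).items():
        print(f"{name}: {value:.4f}")
```

For normal data with $\sigma^2 = 1$ and $n = 10$, the exact mean squared errors are $(2n-1)/n^2 = 0.19$ for the divisor-$n$ estimator and $2/(n-1) \approx 0.222$ for the divisor-$(n-1)$ estimator, so the simulated values should land close to those. The average of the square roots should come out below 1, illustrating that the square root of an unbiased variance estimate underestimates the standard deviation on average.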

A far more extreme case of a biased estimator being better than any unbiased estimator is well known. Suppose $X$ has a Poisson distribution with expectation $\lambda$. It is desired to estimate

$$e^{-2\lambda}.$$

The only function of the data constituting an unbiased estimator is

$$(-1)^X.$$

If the observed value of $X$ is 100, then the estimate is 1, although the true value of the quantity being estimated is obviously very likely to be near 0, which is the opposite extreme. And if $X$ is observed to be 101, then the estimate is even more absurd: it is −1, although the quantity being estimated obviously must be positive. The (biased) maximum-likelihood estimator

$$e^{-2X}$$

is better than this unbiased estimator in the sense that the mean squared error

$$e^{-4\lambda} - 2e^{\lambda(1/e^2 - 3)} + e^{\lambda(1/e^4 - 1)}$$

is smaller. Compare the unbiased estimator's MSE of

$$1 - e^{-4\lambda}.$$

The MSE is a function of the true value $\lambda$. The bias of the maximum-likelihood estimator is

$$e^{\lambda(1/e^2 - 1)} - e^{-2\lambda}.$$
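The Poisson comparison can be checked numerically. The sketch below assumes the target quantity is $e^{-2\lambda}$, the unbiased estimator is $(-1)^X$, and the maximum-likelihood estimator is $e^{-2X}$, which is consistent with the estimates of 1 and −1 at $X = 100$ and $X = 101$ and with the stated MSE of $1 - e^{-4\lambda}$. The helper names and the truncation point `terms` are illustrative choices; the truncated sum is adequate only for moderate $\lambda$.

```python
import math


def poisson_expectation(g, lam, terms=400):
    """E[g(X)] for X ~ Poisson(lam), summing the first `terms` pmf values."""
    p = math.exp(-lam)  # P(X = 0)
    total = 0.0
    for k in range(terms):
        total += g(k) * p
        p *= lam / (k + 1)  # P(X = k + 1) obtained from P(X = k)
    return total


def unbiased(k):
    """The unbiased estimator (-1)^X."""
    return (-1) ** k


def mle(k):
    """The maximum-likelihood estimator e^(-2X)."""
    return math.exp(-2 * k)


def mse(estimator, lam):
    """Mean squared error of `estimator` for the target e^(-2*lam)."""
    target = math.exp(-2 * lam)
    return poisson_expectation(lambda k: (estimator(k) - target) ** 2, lam)


def mse_unbiased_closed(lam):
    return 1 - math.exp(-4 * lam)


def mse_mle_closed(lam):
    return (math.exp(-4 * lam)
            - 2 * math.exp(lam * (math.exp(-2) - 3))
            + math.exp(lam * (math.exp(-4) - 1)))


if __name__ == "__main__":
    for lam in (0.5, 1.0, 2.0, 5.0):
        print(lam, mse(unbiased, lam), mse(mle, lam))
```

The numerically summed values can be compared against `mse_unbiased_closed` and `mse_mle_closed` to confirm the closed forms, and the bias of the maximum-likelihood estimator is `poisson_expectation(mle, lam) - math.exp(-2 * lam)`.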

The bias of maximum-likelihood estimators can be substantial. Consider a case where n tickets numbered from 1 to n are placed in a box and one is selected at random, giving a value X. If n is unknown, then the maximum-likelihood estimator of n is X, even though the expectation of X is only (n + 1)/2; we can be certain only that n is at least X, and it is probably more. In this case, the natural unbiased estimator is 2X − 1.
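The ticket example is small enough to verify exactly by enumerating the n equally likely outcomes. In this sketch the function names are hypothetical and exact rational arithmetic avoids any rounding. It shows that X has expectation (n + 1)/2, so that 2X − 1 has expectation exactly n.

```python
from fractions import Fraction


def expectation(estimator, n):
    """Exact E[estimator(X)] when X is uniform on the tickets 1, ..., n."""
    return sum(Fraction(estimator(x), n) for x in range(1, n + 1))


def mle(x):
    """Maximum-likelihood estimate of n: the observed ticket number itself."""
    return x


def unbiased(x):
    """Estimator 2X - 1, whose expectation equals n."""
    return 2 * x - 1


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, expectation(mle, n), expectation(unbiased, n))
```

For every n the bias of the maximum-likelihood estimator is (n + 1)/2 − n = −(n − 1)/2, which grows in magnitude with n.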
