Encyclopedia > Resampling (statistics)

In statistics, resampling is any of a variety of methods for doing one of the following:

1. Estimating the precision of sample statistics (medians, variances, percentiles) by using subsets of available data (jackknife) or drawing randomly with replacement from a set of data points (bootstrapping)
2. Exchanging labels on data points when performing significance tests (permutation test, also called exact test, randomization test, or re-randomization test)
3. Validating models by using random subsets (bootstrap, decision trees)


Bootstrapping is a statistical method for estimating the sampling distribution of an estimator by sampling with replacement from the original sample, most often with the purpose of deriving robust estimates of standard errors and confidence intervals of a population parameter like a mean, median, proportion, odds ratio, correlation coefficient or regression coefficient. It may also be used for constructing hypothesis tests. It is often used as a robust alternative to inference based on parametric assumptions when those assumptions are in doubt, or where parametric inference is impossible or requires very complicated formulas for the calculation of standard errors.
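The bootstrap idea described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `bootstrap`, its parameters, the choice of 2000 replicates, and the use of a simple percentile interval are all assumptions made for the example.

```python
import random
import statistics

def bootstrap(data, stat=statistics.median, n_boot=2000, alpha=0.05, seed=0):
    """Return (estimate, bootstrap standard error, percentile interval).

    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    n = len(data)
    # Each replicate resamples n points WITH replacement from the original
    # sample and recomputes the statistic on the resample.
    reps = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))
    # The spread of the replicates estimates the standard error.
    se = statistics.stdev(reps)
    # Simple percentile interval read off the sorted replicates.
    lo = reps[int((alpha / 2) * n_boot)]
    hi = reps[int((1 - alpha / 2) * n_boot) - 1]
    return stat(data), se, (lo, hi)

sample = [2.1, 3.4, 1.9, 5.6, 4.4, 3.8, 2.7, 6.1, 3.3, 4.9]
estimate, std_err, interval = bootstrap(sample)
print(estimate, std_err, interval)
```

A different statistic can be passed as `stat` (for example `statistics.mean`) without changing the resampling loop, which is the main practical appeal of the method.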

See also particle filter for the general theory of Sequential Monte Carlo methods, as well as details on some common implementations.

## Jackknife

Like bootstrapping, jackknifing is a statistical method for estimating and compensating for bias and for deriving robust estimates of standard errors and confidence intervals. Jackknifed statistics are created by systematically dropping out subsets of data one at a time and assessing the resulting variation in the studied parameter (Mooney & Duval).

Both methods estimate the variability of a statistic from the variability of that statistic between subsamples, rather than from parametric assumptions. The jackknife is a less general technique than the bootstrap, and explores the sample variation differently. However, the jackknife is easier to apply than the bootstrap to complex sampling schemes, such as multi-stage sampling with varying sampling weights.

The jackknife and bootstrap may in many situations yield similar results. But when used to estimate the standard error of a statistic, bootstrap gives slightly different results when repeated on the same data, whereas the jackknife gives exactly the same result each time (assuming the subsets to be removed are the same).
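As a rough sketch of the leave-one-out jackknife, the following Python function drops each observation in turn, recomputes the statistic, and applies the usual jackknife formulas for bias and standard error. The name `jackknife` and its signature are assumptions for illustration. Because no random numbers are involved, repeated calls on the same data return the same result.

```python
import math
import statistics

def jackknife(data, stat=statistics.mean):
    """Return (bias estimate, standard error) from leave-one-out replicates.

    Function name and signature are illustrative assumptions.
    """
    data = list(data)
    n = len(data)
    # Recompute the statistic n times, each time with observation i removed.
    loo = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    loo_mean = sum(loo) / n
    # Jackknife bias estimate: (n - 1) * (mean of replicates - full estimate).
    bias = (n - 1) * (loo_mean - stat(data))
    # Jackknife variance: (n - 1) / n * sum of squared deviations.
    var = (n - 1) / n * sum((v - loo_mean) ** 2 for v in loo)
    return bias, math.sqrt(var)

sample = [2.1, 3.4, 1.9, 5.6, 4.4, 3.8, 2.7, 6.1, 3.3, 4.9]
print(jackknife(sample))
```

For the sample mean, the jackknife standard error computed this way coincides with the familiar s/√n, which makes a convenient sanity check.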

Richard von Mises was the first to conceive and apply the jackknife, which has some similarity to k-fold and leave-one-out cross-validation techniques.

## Permutation tests

A permutation test (also called a randomization test, re-randomization test, or an exact test) is a type of statistical significance test in which a reference distribution is obtained by calculating all possible values of the test statistic under rearrangements of the labels on the observed data points. In other words, the method by which treatments are allocated to subjects in an experimental design is mirrored in the analysis of that design. If the labels are exchangeable under the null hypothesis, then the resulting tests yield exact significance levels. Confidence intervals can then be derived from the tests. The theory has evolved from the works of R.A. Fisher and E.J.G. Pitman in the 1930s.

To illustrate the basic idea of a permutation test, suppose we have two groups A and B whose sample means are $\bar{x}_{A}$ and $\bar{x}_{B}$, and that we want to test, at the 5% significance level, whether they come from the same distribution. Let $n_A$ and $n_B$ be the sample sizes of the two groups. The permutation test is designed to determine whether the observed difference between the sample means is large enough to reject the null hypothesis $H_0$ that the two groups have identical probability distributions.

The test proceeds as follows. First, the observations of groups A and B are pooled. From these pooled values, $n_A$ observations are sampled without replacement. The sample mean for these $n_A$ observations is computed, then the sample mean for the remaining $n_B$ observations is computed, and the difference between the resulting sample means is recorded. This process is repeated many times (e.g. 999 times), and if the middle 95% of the resulting differences does not contain the actual difference $\bar{x}_{A} - \bar{x}_{B}$, the hypothesis of identical probability distributions is rejected at the 5% significance level.

Finally, note that if the number of possible permutations is small enough to allow it, the algorithm can be applied using all the possible cases instead of generating random permutations (some authors speak of permutation tests in this last case only, using the term randomization test for the previous situation).
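The two-group procedure above might be sketched in Python as follows. Instead of checking whether the observed difference falls outside the middle 95% of the resampled differences, this variant reports a two-sided p-value based on absolute differences, which leads to a closely related decision. The function name `permutation_test`, the default of 999 rearrangements, and the convention of counting the observed labelling as one extra rearrangement are assumptions of this sketch.

```python
import random
from statistics import mean

def permutation_test(a, b, n_perm=999, seed=0):
    """Return (observed mean difference, two-sided Monte Carlo p-value).

    Name, defaults and the +1 convention are illustrative assumptions.
    """
    rng = random.Random(seed)
    observed = mean(a) - mean(b)
    pooled = list(a) + list(b)
    n_a = len(a)
    extreme = 0
    for _ in range(n_perm):
        # Shuffling and splitting is equivalent to drawing n_a pooled
        # observations without replacement for the relabelled group A.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Count the observed labelling itself as one of the rearrangements.
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value

group_a = [1, 2, 3, 4, 5, 6, 7, 8]
group_b = [11, 12, 13, 14, 15, 16, 17, 18]
print(permutation_test(group_a, group_b))
```

With the default of 999 rearrangements, the smallest p-value this sketch can report is 1/1000; the null hypothesis would be rejected at the 5% level when the returned p-value is at most 0.05.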

### Relation to parametric tests

Permutation tests are a subset of non-parametric statistics. The basic premise is to use only the assumption that it is possible that all of the treatment groups are equivalent, and that every member of them is the same before sampling began (i.e. the slot that they fill is not differentiable from other slots before the slots are filled). From this, one can calculate a statistic and then see to what extent this statistic is special by seeing how likely it would be if the treatment assignments had been jumbled.

In contrast to permutation tests, the reference distributions for many popular "classical" statistical tests, such as the t-test, z-test and chi-squared test, are obtained from theoretical probability distributions. The Student's t test is exactly a permutation test under normality and is thus relatively robust. The F-test, z-test and chi-squared test are far from exact except in large samples (n > 5, or 20).

Fisher's exact test is a commonly used test that is exactly equivalent to a permutation test for evaluating the association between two dichotomous variables. When sample sizes are small, the chi-squared test statistic can no longer be accurately compared against the chi-square reference distribution and the use of Fisher’s exact test becomes most appropriate. A rule of thumb is that the expected count in each cell of the table should be greater than 5 before Pearson's chi-squared test is used.
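For a 2 x 2 table, the permutation distribution behind Fisher's exact test can be enumerated directly, because with the margins fixed the top-left cell follows a hypergeometric distribution. The sketch below is an illustration under stated assumptions: the function name `fisher_exact_2x2` is hypothetical, and the two-sided p-value is defined as the total probability of all tables no more probable than the observed one, which is one common convention among several.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided exact p-value for the table [[a, b], [c, d]].

    Function name and the two-sided convention are illustrative assumptions.
    """
    r1, r2, c1 = a + b, c + d, a + c

    def ways(k):
        # Number of label arrangements that put k in the top-left cell
        # while keeping both row totals and the first column total fixed.
        return comb(r1, k) * comb(r2, c1 - k)

    k_min, k_max = max(0, c1 - r2), min(r1, c1)
    observed = ways(a)
    # Sum over every table that is no more probable than the observed one.
    extreme = sum(ways(k) for k in range(k_min, k_max + 1) if ways(k) <= observed)
    return extreme / comb(r1 + r2, c1)

print(fisher_exact_2x2(3, 1, 1, 3))
```

Working with integer counts and dividing only once at the end avoids floating-point ties when comparing table probabilities.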

Permutation tests exist in many situations where parametric tests do not, for example when deriving an optimal test when losses are proportional to the size of an error rather than its square. All simple and many relatively complex parametric tests have a corresponding permutation test version that is defined by using the same test statistic as the parametric test, but obtains the p-value from the sample-specific permutation distribution of that statistic, rather than from the theoretical distribution derived from the parametric assumption. For example, it is possible in this manner to construct a permutation t-test, a permutation chi-squared test of association, a permutation version of Aly's test for comparing variances and so on.

The major downsides of permutation tests are that

• They can be computationally intensive, and may require "custom" code for difficult-to-calculate statistics. This must be rewritten for every case.
• They provide only a p-value and nothing else.

### Examples

Permutation tests exist for any test statistic, regardless of whether or not its distribution is known. Thus one is always free to choose the statistic which best discriminates between hypothesis and alternative and which minimizes losses.

Permutation tests can be used for analyzing unbalanced designs (http://tbf.coe.wayne.edu/jmasm/vol1_no2.pdf) and for combining dependent tests on mixtures of categorical, ordinal, and metric data (Pesarin, 2001).

Before the 1980s, the burden of creating the reference distribution was overwhelming except for data sets with small sample sizes. Since the 1980s, the confluence of cheap, fast computers and the development of new, sophisticated path algorithms applicable in special situations has made permutation test methods practical for a wide range of problems. It also initiated the addition of exact-test options in the main statistical software packages and the appearance of specialized software for performing a wide range of uni- and multi-variable exact tests and computing test-based "exact" confidence intervals.

### Limitations

An important assumption behind a permutation test is that the observations are exchangeable under the null hypothesis. An important consequence of this assumption is that tests of difference in location (like a permutation t-test) require equal variance. In this respect, the permutation t-test shares the same weakness as the classical Student’s t-test. A third alternative in this situation is to use a bootstrap-based test. Good (2000) explains the difference between permutation tests and bootstrap tests the following way: "Permutations test hypotheses concerning distributions; bootstraps test hypotheses concerning parameters. As a result, the bootstrap entails less-stringent assumptions." Of course, bootstrap tests are not exact.

### Monte Carlo testing

An asymptotically equivalent permutation test can be created when there are too many possible orderings of the data to conveniently allow complete enumeration. This is done by generating the reference distribution by Monte Carlo sampling, which takes a small (relative to the total number of permutations) random sample of the possible replicates. The realization that this could be applied to any permutation test on any dataset was an important breakthrough in the area of applied statistics. The earliest known reference to this approach is Dwass (1957)[1]. This type of permutation test is known under various names: approximate permutation test, Monte Carlo permutation test or random permutation test[2].

The necessary size of the Monte Carlo sample depends on the need for accuracy of the test. If one merely wants to know if the p-value is significant, as few as 400 rearrangements may generate an answer. (For observed p=0.05, the accuracy from 10,000 random permutations is around 0.005, and for observed p=0.10, the accuracy is around 0.008. Accuracy is defined from the binomial 99% confidence interval: p +/- accuracy).
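The accuracy figures quoted above can be approximately reproduced with a short calculation. The sketch below assumes the normal approximation to the binomial 99% confidence interval, so the half-width is z times the binomial standard error; the function name `mc_accuracy` and the constant `Z_99` are introduced here for illustration only.

```python
import math

# Approximate two-sided 99% quantile of the standard normal distribution.
Z_99 = 2.5758

def mc_accuracy(p, n_perm, z=Z_99):
    """Half-width of the approximate binomial interval around an estimated p.

    Assumes the normal approximation; name and signature are illustrative.
    """
    # Binomial standard error of a proportion estimated from n_perm draws.
    return z * math.sqrt(p * (1 - p) / n_perm)

for p in (0.05, 0.10):
    print(p, round(mc_accuracy(p, 10_000), 4))
```

Because the half-width shrinks with the square root of the number of rearrangements, halving it requires roughly four times as many random permutations.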


## Bibliography

### Introductory statistics

• Good, P. (2005) Introduction to Statistics Through Resampling Methods and R/S-PLUS. Wiley. ISBN 0-471-71575-1
• Good, P. (2005) Introduction to Statistics Through Resampling Methods and Microsoft Office Excel. Wiley. ISBN 0-471-73191-9

### Resampling methods

• Good, P. (2006) Resampling Methods. 3rd Ed. Birkhauser.

#### Bootstrapping

• Bradley Efron (1979). "Bootstrap methods: Another look at the jackknife", The Annals of Statistics, 7, 1-26.
• Bradley Efron (1981). "Nonparametric estimates of standard error: The jackknife, the bootstrap and other methods", Biometrika, 68, 589-599.
• Bradley Efron (1982). The jackknife, the bootstrap, and other resampling plans, In Society of Industrial and Applied Mathematics CBMS-NSF Monographs, 38.
• P. Diaconis, Bradley Efron (1983), "Computer-intensive methods in statistics," Scientific American, May, 116-130.
• Bradley Efron, Robert J. Tibshirani, (1993). An introduction to the bootstrap, New York: Chapman & Hall, software.
• Mooney, C Z & Duval, R D (1993). Bootstrapping. A Nonparametric Approach to Statistical Inference. Sage University Paper series on Quantitative Applications in the Social Sciences, 07-095. Newbury Park, CA: Sage.
• E. S. Edgington, (1995). Randomization tests. New York: Marcel Dekker.
• Davison, A. C. and Hinkley, D. V. (1997): Bootstrap Methods and their Applications, software.
• Simon, J. L. (1997): Resampling: The New Statistics.
• Moore, D. S., G. McCabe, W. Duckworth, and S. Sclove (2003): Bootstrap Methods and Permutation Tests
• Hesterberg, T. C., D. S. Moore, S. Monaghan, A. Clipson, and R. Epstein (2005): Bootstrap Methods and Permutation Tests, software.


#### Permutation test

Original references:

• R. A. Fisher, The Design of Experiments, New York: Hafner, 1935.
• Pitman, E. J. G., "Significance tests which may be applied to samples from any population", Royal Statistical Society Supplement, 1937; 4: 119-130 and 225-32 (parts I and II).
• Pitman, E. J. G., "Significance tests which may be applied to samples from any population. Part III. The analysis of variance test", Biometrika, 1938; 29: 322-335.

Modern references:

• E. S. Edgington, Randomization tests, 3rd ed. New York: Marcel-Dekker, 1995.
• Phillip I. Good, Permutation, Parametric and Bootstrap Tests of Hypotheses, 3rd ed., Springer, 2005. ISBN 0-387-98898-X
• Good, P. (2002) Extensions of the concept of exchangeability and their applications, J. Modern Appl. Statist. Methods, 1:243-247.
• Lunneborg, Cliff. Data Analysis by Resampling, Duxbury Press, 1999. ISBN 0-534-22110-6.
• Pesarin, F. 2001. Multivariate Permutation Tests, Wiley.
• Welch, W. J., Construction of permutation tests, Journal of American Statistical Association, 85:693-698, 1990.

Computational methods:

• Mehta, C. R. and Patel, N. R. (1983). "A network algorithm for performing Fisher’s exact test in r x c contingency tables", J. Amer. Statist. Assoc., 78(382):427–434.
• Mehta, C. R., Patel, N. R. and Senchaudhuri, P. (1988). "Importance sampling for estimating exact probabilities in permutational inference", J. Amer. Statist. Assoc., 83(404):999–1005.

### References

1. ^ Meyer Dwass, "Modified Randomization Tests for Nonparametric Hypotheses", The Annals of Mathematical Statistics, 28:181-187, 1957.
2. ^ Thomas E. Nichols, Andrew P. Holmes, "Nonparametric Permutation Tests For Functional Neuroimaging: A Primer with Examples", Human Brain Mapping, 15:1-25, 2001.
