# Analysis of variance

In statistics, analysis of variance (ANOVA) is a collection of statistical models, and their associated procedures, which compare means by splitting the overall observed variance into different parts. The initial techniques of the analysis of variance were pioneered by the statistician and geneticist R. A. Fisher in the 1920s and 1930s; the method is sometimes known as Fisher's ANOVA or Fisher's analysis of variance because it uses Fisher's F-distribution as part of the test of statistical significance.

## Overview

There are three conceptual classes of such models:

• The fixed-effects model (Model 1) assumes that the data come from normal populations which may differ only in their means.
• Random-effects models (Model 2) assume that the data describe a hierarchy of different populations whose differences are constrained by the hierarchy.
• Mixed-effects models (Model 3) describe situations where both fixed and random effects are present.

In practice, there are several types of ANOVA depending on the number of treatments and the way they are applied to the subjects in the experiment:

• One-way ANOVA is used to test for differences among three or more independent groups.
• One-way ANOVA for repeated measures is used when the same subjects are used for each treatment (a repeated-measures design). Note that this method can be subject to carryover effects.
• Factorial ANOVA is used when the experimenter wants to study the effects of two or more treatment variables. The most commonly used type of factorial ANOVA is the 2×2 (read: two by two) design, where there are two independent variables and each variable has two levels or distinct values. Factorial ANOVA can also be multi-level, such as 3×3, or higher-order, such as 2×2×2, but analyses with higher numbers of factors are rarely done because the calculations are lengthy and the results are hard to interpret.
• Multivariate analysis of variance (MANOVA) is used when there is more than one dependent variable.


## Models

### Fixed-effects model

The fixed-effects model of analysis of variance applies to situations in which the experimenter has applied several treatments to the experimental material, each of which affects only the mean of the underlying normal distribution of the response variable.

### Random-effects model

Random effects models are used to describe situations in which incomparable differences in experimental material occur. The simplest example is that of estimating the unknown mean of a population whose individuals differ from each other. In this case, the variation between individuals is confounded with that of the observing instrument.
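For a balanced one-way random-effects layout, the two variance components can be estimated by the method of moments from the between-group and within-group mean squares. A minimal Python sketch, with invented batch data (the batch values and sizes are illustrative assumptions, not from the text):

```python
# Method-of-moments sketch for a balanced one-way random-effects model:
# k batches sampled from a population of batches, n measurements per batch.
# The batch-to-batch variance component is estimated from the difference
# between the two mean squares. All numbers are invented.

batches = [
    [10.1, 10.3, 9.8, 10.0],
    [11.2, 11.0, 11.5, 11.1],
    [9.1, 9.4, 9.0, 9.3],
]
k, n = len(batches), len(batches[0])
grand = sum(x for b in batches for x in b) / (k * n)
means = [sum(b) / n for b in batches]

# Between-batch and within-batch mean squares.
ms_between = n * sum((m - grand) ** 2 for m in means) / (k - 1)
ms_within = sum((x - m) ** 2 for b, m in zip(batches, means) for x in b) / (k * (n - 1))

# E[MS_between] = sigma_error^2 + n * sigma_batch^2, hence:
sigma2_error = ms_within
sigma2_batch = max(0.0, (ms_between - ms_within) / n)  # truncated at zero
```

Negative moment estimates of the batch component are conventionally truncated at zero, as in the last line.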

## Assumptions

• Independence of cases - this is a requirement of the design.
• Scale of measurement - the dependent variable is interval or ratio.
• Normality - the distributions in each of the groups are normal (the Kolmogorov-Smirnov and Shapiro-Wilk normality tests can be used to check this). Note, however, that the F-test is quite robust to deviations from normality (Lindman, 1974).
• Homogeneity of variances - the variance of data in groups should be the same (use Levene's test for homogeneity of variances).
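Levene's test for homogeneity of variances is itself just a one-way ANOVA carried out on the absolute deviations of each observation from its group mean, so it can be sketched in a few lines of plain Python. The data below are invented; in practice a statistics package would also supply the p-value:

```python
# Sketch of Levene's test: run a one-way ANOVA on |x - group mean|.
# A large statistic suggests the group variances differ.

def f_oneway(groups):
    """One-way ANOVA F statistic for a list of samples."""
    all_obs = [x for g in groups for x in g]
    grand_mean = sum(all_obs) / len(all_obs)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (len(groups) - 1)) / (ss_within / (len(all_obs) - len(groups)))

def levene_statistic(groups):
    """Levene's W: the ANOVA F statistic computed on |x - group mean|."""
    deviations = [[abs(x - sum(g) / len(g)) for x in g] for g in groups]
    return f_oneway(deviations)

a = [5.1, 4.9, 5.0, 5.2, 4.8]   # small spread
b = [5.0, 7.0, 3.0, 6.5, 3.5]   # same mean, much larger spread
w = levene_statistic([a, b])
```

Because the two invented groups share a mean but differ sharply in spread, the statistic here comes out large, which is exactly the case the homogeneity assumption is meant to rule out.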


## Logic of ANOVA

The fundamental technique is a partitioning of the total sum of squares into components related to the effects in the model used. For example, we show the model for a simplified ANOVA with one type of treatment at different levels. (If the treatment levels are quantitative and the effects are linear, a linear regression analysis may be appropriate.)

SS_Total = SS_Error + SS_Treatments

The number of degrees of freedom (abbreviated df) can be partitioned in a similar way and specifies the chi-square distribution which describes the associated sums of squares.

df_Total = df_Error + df_Treatments
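Both partitions can be checked numerically. A minimal Python sketch with three invented treatment groups:

```python
# Numerical check of the sum-of-squares and degrees-of-freedom partitions
# for a one-way ANOVA. Group data are invented for illustration.

groups = [
    [6.0, 8.0, 4.0, 5.0, 3.0, 4.0],     # treatment level 1
    [8.0, 12.0, 9.0, 11.0, 6.0, 8.0],   # treatment level 2
    [13.0, 9.0, 11.0, 8.0, 7.0, 12.0],  # treatment level 3
]

all_obs = [x for g in groups for x in g]
grand_mean = sum(all_obs) / len(all_obs)

# Total sum of squares: every observation against the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_obs)

# Treatment (between-group) sum of squares: group means against the grand mean.
ss_treat = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

# Error (within-group) sum of squares: observations against their own group mean.
ss_error = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)

# The partition holds up to floating-point rounding.
assert abs(ss_total - (ss_treat + ss_error)) < 1e-9

# Degrees of freedom partition the same way.
n, k = len(all_obs), len(groups)
df_total, df_treat, df_error = n - 1, k - 1, n - k
assert df_total == df_treat + df_error
```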

### Degrees of freedom

Degrees of freedom indicate the effective number of independent observations that contribute to a sum of squares in an ANOVA: the total number of observations minus the number of linear constraints on the data. For a one-way ANOVA with k groups and N observations in total, the treatment sum of squares has k − 1 degrees of freedom, the error sum of squares has N − k, and the total sum of squares has N − 1, so the partition of the sums of squares is mirrored by the partition of the degrees of freedom. Dividing each sum of squares by its degrees of freedom yields a mean square, correcting for the fact that the group means are themselves estimated from the data.

### ANOVA on Ranks

As first suggested by Conover and Iman in 1981, in many cases when the data do not meet the assumptions of ANOVA, one can replace each original data value by its rank from 1 for the smallest to N for the largest, then run a standard ANOVA calculation on the rank-transformed data. "Where no equivalent nonparametric methods have yet been developed such as for the two-way design, rank transformation results in tests which are more robust to non-normality, and resistant to outliers and non-constant variance, than is ANOVA without the transformation. (Helsel & Hirsch, 2002, Page 177)."
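A minimal sketch of the rank transformation in Python (the data are invented; tied values receive the average of the ranks they span, as is standard):

```python
# Rank transformation in the spirit of Conover and Iman (1981): replace
# each pooled value by its rank, then run the usual ANOVA on the ranks.

def average_ranks(values):
    """Rank values 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1                       # extend over a run of ties
        avg = (i + j) / 2 + 1            # ranks are 1-based
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        i = j + 1
    return ranks

groups = [[2.0, 3.1, 2.7], [9.0, 8.2, 150.0], [5.5, 5.5, 6.1]]  # note the outlier
pooled = [x for g in groups for x in g]
r = average_ranks(pooled)

# Split the pooled ranks back into their groups; a standard ANOVA on
# ranked_groups then proceeds exactly as on the raw data.
ranked_groups, start = [], 0
for g in groups:
    ranked_groups.append(r[start:start + len(g)])
    start += len(g)
```

Note how the outlier 150.0 simply becomes rank 9, which is what makes the rank-transformed analysis resistant to outliers.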

Conover, W. J., & Iman, R. L. (1981). Rank transformations as a bridge between parametric and nonparametric statistics. The American Statistician, 35, 124-129.

Helsel, D. R., & Hirsch, R. M. (2002). Statistical Methods in Water Resources: Techniques of Water Resources Investigations, Book 4, Chapter A3. U.S. Geological Survey. 522 pages.

## Examples

Group A is given vodka, Group B is given gin, and Group C is given a placebo. All groups are then tested with a memory task. A one-way ANOVA can be used to assess the effect of the various treatments (that is, the vodka, gin, and placebo).
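A dependency-free sketch of this one-way analysis in Python, with invented memory scores (with SciPy available, scipy.stats.f_oneway(vodka, gin, placebo) computes the same statistic along with a p-value):

```python
# One-way ANOVA for the three-group memory experiment above.
# The memory scores are invented purely for illustration.

def f_oneway(groups):
    """Return the one-way ANOVA F statistic for a list of samples."""
    all_obs = [x for g in groups for x in g]
    grand_mean = sum(all_obs) / len(all_obs)
    # Between-group (treatment) and within-group (error) sums of squares.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    df_between = len(groups) - 1
    df_within = len(all_obs) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

vodka   = [4, 5, 3, 4, 2, 5]
gin     = [5, 4, 4, 6, 3, 4]
placebo = [8, 7, 9, 6, 8, 7]

F = f_oneway([vodka, gin, placebo])
# F is then compared against the F distribution on
# (k - 1, N - k) = (2, 15) degrees of freedom to judge significance.
```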

Group A is given vodka and tested on a memory task. The same group is allowed a rest period of five days and then the experiment is repeated with gin. The procedure is repeated once more using a placebo. A one-way ANOVA with repeated measures can be used to assess the effect of the three treatments on the same subjects.

In an experiment testing the effects of expectations, subjects are randomly assigned to four groups:

1. expect vodka-receive vodka
2. expect vodka-receive placebo
3. expect placebo-receive vodka
4. expect placebo-receive placebo (the last group is used as the control group)

Each group is then tested on a memory task. The advantage of this design is that multiple variables can be tested at the same time instead of running two different experiments. Also, the experiment can determine whether one variable affects the other variable (known as interaction effects). A factorial ANOVA (2×2) can be used to assess the effect of expecting vodka or the placebo and the actual reception of either.
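For this balanced 2×2 design, the cell means determine the two main-effect sums of squares and the interaction. A Python sketch with invented scores (the cell values are illustrative assumptions, not data from the text):

```python
# Sum-of-squares decomposition for the balanced 2x2 factorial design
# above (expectation x actual drink). All memory scores are invented.

# cells[(expect_vodka, receive_vodka)] -> memory scores for that group
cells = {
    (1, 1): [4.0, 5.0, 3.0, 4.0],   # expect vodka, receive vodka
    (1, 0): [6.0, 5.0, 6.0, 7.0],   # expect vodka, receive placebo
    (0, 1): [5.0, 4.0, 4.0, 5.0],   # expect placebo, receive vodka
    (0, 0): [8.0, 7.0, 9.0, 8.0],   # expect placebo, receive placebo (control)
}

n = 4                                    # observations per cell (balanced)
all_obs = [x for v in cells.values() for x in v]
grand = sum(all_obs) / len(all_obs)
mean = {c: sum(v) / n for c, v in cells.items()}

# Marginal means for each level of each factor.
mean_a = {a: (mean[(a, 0)] + mean[(a, 1)]) / 2 for a in (0, 1)}   # expectation
mean_b = {b: (mean[(0, b)] + mean[(1, b)]) / 2 for b in (0, 1)}   # actual drink

# Each main effect and the interaction carries 1 degree of freedom here.
ss_a = 2 * n * sum((mean_a[a] - grand) ** 2 for a in (0, 1))
ss_b = 2 * n * sum((mean_b[b] - grand) ** 2 for b in (0, 1))
ss_cells = n * sum((mean[c] - grand) ** 2 for c in cells)
ss_ab = ss_cells - ss_a - ss_b           # interaction sum of squares
ss_error = sum(sum((x - mean[c]) ** 2 for x in cells[c]) for c in cells)
```

A nonzero ss_ab is what signals an interaction: the effect of the drink actually received differs depending on what the subject expected.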

## See also

• Analysis of covariance (ANCOVA)
• Multivariate analysis of variance (MANOVA)
• Multiple comparisons
• t-test
• Kruskal-Wallis one-way analysis of variance
• Friedman test
• Duncan's new Multiple Range Test
• Explained variance and residual variance

## References

• King, Bruce M., & Minium, Edward W. (2003). Statistical Reasoning in Psychology and Education, Fourth Edition. Hoboken, New Jersey: John Wiley & Sons, Inc. ISBN 0-471-21187-7
• Lindman, H. R. (1974). Analysis of Variance in Complex Experimental Designs. San Francisco: W. H. Freeman & Co.
