Please add any Wikipedia articles related to statistics that are not already on this list. ...
To see the most recent changes to this page, see the page history. See also the list of probability topics, and the list of statisticians. ...
Statisticians or people who made notable contributions to the theories of statistics, or related aspects of probability, or machine learning: Odd Olai Aalen (1947–), Gottfried Achenwall (1719–1772), Abraham Manie Adelstein (1916–1992), John Aitchison (1926–), Alexander Aitken (1895–1967), Aleyamma George, Hirotsugu Akaike (1927–), Oskar Anderson (1887–1960), Peter...
Contents: Top  0–9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
A
The absolute deviation of an element of a data set is the absolute difference between that element and a given point. ...
"Accuracy" redirects here. ...
In classical (frequentist) decision theory, an admissible decision rule is a rule for making a decision that is better in some sense than any other rule that may compete with it. ...
The Akaike information criterion (AIC) (pronounced "ah-kah-ee-keh"), developed by Hirotsugu Akaike in 1971 and proposed in Akaike (1974), is a measure of the goodness of fit of an estimated statistical model. ...
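For a model with k estimated parameters and maximised log-likelihood ln L, the criterion is AIC = 2k − 2 ln L, with lower values preferred when comparing models. A minimal sketch in Python; the Gaussian example and its parameter count are illustrative assumptions, not from the source:

```python
import math

def aic(log_likelihood, k):
    """Akaike information criterion: AIC = 2k - 2*ln(L)."""
    return 2 * k - 2 * log_likelihood

# Illustrative: a Gaussian model fitted to a small sample by maximum likelihood.
data = [2.1, 1.9, 2.4, 2.0, 2.6]
n = len(data)
mu = sum(data) / n
var = sum((x - mu) ** 2 for x in data) / n  # MLE of the variance
ll = -0.5 * n * (math.log(2 * math.pi * var) + 1)  # log-likelihood at the MLE
print(aic(ll, k=2))  # two fitted parameters: mean and variance
```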
Algorithms for calculating variance play a major role in statistical computing. ...
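The naive textbook formula (sum of squares minus square of sums) can suffer catastrophic cancellation in floating point; Welford's one-pass algorithm is the standard numerically stable alternative. A sketch, with the function name an illustrative choice:

```python
def online_variance(data):
    """Welford's one-pass algorithm: numerically stable mean and sample variance."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in data:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return mean, (m2 / (n - 1) if n > 1 else float("nan"))

print(online_variance([2, 4, 4, 4, 5, 5, 7, 9]))
```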
The Allan variance, named after David W. Allan, also known as two-sample variance, is a measure of the frequency stability of clocks. ...
80 four-point near-alignments of 137 random points (figure caption). Statistics shows that if you put a large number of random points on a bounded flat surface you can find many alignments of random points. ...
These are statistical procedures which can be used to analyse categorical data: regression, analysis of variance, linear modeling, log-linear modeling, logistic regression, repeated measures analysis, simple correspondence analysis, multiple correspondence analysis, contingency table, Burt table, binary table, frequency table, chi-square statistics, odds ratios, correlation statistics, Fisher's exact...
In statistics, analysis of rhythmic variance (ANORVA) is a simple new method for detecting rhythms in biological time series, published by Peter Celec (Biol Res. ...
In statistics, analysis of variance (ANOVA) is a collection of statistical models and their associated procedures which compare means by splitting the overall observed variance into different parts. ...
In statistics, an ancillary statistic is a statistic whose probability distribution does not depend on which of the probability distributions among those being considered is the distribution of the statistical population from which the data were taken. ...
ANCOVA, or analysis of covariance, is an old-fashioned name for a linear regression model with one continuous explanatory variable and one or more factors. ...
ASCA, ANOVA–SCA, or analysis of variance – simultaneous component analysis, is a method that partitions variation and enables interpretation of these partitions by SCA, a method that is similar to PCA. This method is a multi- or even megavariate extension of ANOVA. The variation partitioning is similar to Analysis of...
An anomaly time series is the time series of deviations of a quantity from some mean. ...
Approximate Bayesian computation (ABC) is a family of computational techniques in Bayesian statistics. ...
In Survival analysis, the Area compatibility factor, F, is used in Indirect Standardisation of population mortality rates. ...
In mathematics and statistics, the arithmetic mean (or simply the mean) of a list of numbers is the sum of all the members of the list divided by the number of items in the list. ...
A plot showing 100 random numbers with a hidden sine function, and an autocorrelation of the series on the bottom. ...
Autocorrelation is a mathematical tool used frequently in signal processing for analysing functions or series of values, such as time domain signals. ...
In econometrics, an autoregressive conditional heteroskedasticity (ARCH, Engle (1982)) model considers the variance of the current error term to be a function of the variances of the previous time periods' error terms. ...
In statistics, an autoregressive integrated moving average (ARIMA) model is a generalisation of an autoregressive moving average (ARMA) model. ...
In statistics, autoregressive moving average (ARMA) models, sometimes called Box–Jenkins models after George Box and G. M. Jenkins, are typically applied to time series data. ...
B
The Balding–Nichols model is a statistical description of the allele frequencies in the components of a subdivided population. ...
Statistics are very important to baseball, perhaps as much as they are for cricket, and more than almost any other sport. ...
In statistics, Basu's theorem states that any complete sufficient statistic is independent of any ancillary statistic. ...
Bayes' theorem (also known as Bayes' rule or Bayes' law) is a result in probability theory, which relates the conditional and marginal probability distributions of random variables. ...
Thomas Bayes (c. ...
In statistics, the use of Bayes factors is a Bayesian alternative to classical hypothesis testing. ...
Bayesian inference is statistical inference in which evidence or observations are used to update or to newly infer the probability that a hypothesis may be true. ...
In statistics, Bayesian linear regression is a Bayesian alternative to the more well-known ordinary least-squares linear regression. ...
The posterior probability of a model given data, P(H|D), is given by Bayes' theorem: P(H|D) = P(D|H)P(H)/P(D). The key data-dependent term P(D|H) is a likelihood, and is sometimes called the evidence for model H; evaluating it correctly is the...
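For a finite set of competing hypotheses, the evidence P(D) is just the prior-weighted sum of the likelihoods, so the update can be sketched numerically; the two-model numbers below are illustrative assumptions:

```python
def posterior(prior, likelihood):
    """Posterior P(H_i|D) = P(D|H_i)P(H_i) / P(D), with P(D) the normalising evidence."""
    evidence = sum(p * l for p, l in zip(prior, likelihood))
    return [p * l / evidence for p, l in zip(prior, likelihood)]

# Illustrative: two models with equal priors; the data are 4x as likely under H1.
print(posterior([0.5, 0.5], [0.8, 0.2]))
```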
A Bayesian network (or a belief network) is a probabilistic graphical model that represents a set of variables and their probabilistic independencies. ...
Bayesian search theory is the application of Bayesian statistics to the search for lost objects. ...
In statistics, the Behrens–Fisher problem is the problem of interval estimation and hypothesis testing concerning the difference between the means of two normally distributed populations when the variances of the two populations are not assumed to be equal, based on two independent samples. ...
Belief propagation is an iterative algorithm for computing marginals of functions on a graphical model most commonly used in artificial intelligence and information theory. ...
In statistics, Bessel's correction, named after Friedrich Bessel, is the use of n − 1 instead of n when estimating variance, where n is the number of observations in a sample. ...
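Dividing the sum of squared deviations by n − 1 rather than n makes the estimator of the population variance unbiased. A minimal sketch of the corrected and uncorrected estimators; the function and its flag are illustrative:

```python
def variance(sample, bessel=True):
    """Sample variance, dividing by n - 1 (Bessel's correction) or by n."""
    n = len(sample)
    mean = sum(sample) / n
    ss = sum((x - mean) ** 2 for x in sample)  # sum of squared deviations
    return ss / (n - 1) if bessel else ss / n

print(variance([1, 2, 3, 4]), variance([1, 2, 3, 4], bessel=False))
```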
In empirical Bayes methods, the beta-binomial model is an analytic model where the likelihood function is specified by a binomial distribution and the conjugate prior is a beta distribution. It is convenient to reparameterize the distributions so that the expected mean of the prior is a single parameter: Let...
In probability theory and statistics, the beta distribution is a continuous probability distribution with the probability density function (pdf) defined on the interval [0, 1]: f(x) = x^(α−1)(1 − x)^(β−1)/B(α, β), where α and β are parameters that must be greater than zero and B is the beta function. ...
The Bhattacharyya coefficient is an approximate measurement of the amount of overlap between two statistical samples. ...
In statistics, the term bias is used for two different concepts. ...
A biased sample is one that is falsely taken to be typical of a population from which it is drawn. ...
Allan Birnbaum (May 27, 1923 – July 1, 1976) was an American statistician who contributed to statistical inference, foundations of statistics, statistical genetics, statistical psychology, and history of statistics. ...
In probability theory, Chebyshev's inequality (also known as Tchebysheff's inequality, Chebyshev's theorem, or the Bienaymé–Chebyshev inequality), named after Pafnuty Chebyshev, who first proved it, states that in any data sample or probability distribution, nearly all the values are close to the mean value, and provides a...
Binary classification is the task of classifying the members of a given set of objects into two groups on the basis of whether they have some property or not. ...
In probability theory and statistics, the binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p. ...
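The probability mass function is P(X = k) = C(n, k) p^k (1 − p)^(n − k). A sketch using only the standard library; the function name is an illustrative choice:

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): C(n, k) * p**k * (1-p)**(n-k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Illustrative: probability of exactly 2 heads in 4 fair coin flips.
print(binom_pmf(2, 4, 0.5))
```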
In statistics, the binomial test is an exact test of the statistical significance of deviations from a theoretically expected distribution of observations into two categories. ...
In combinatorial mathematics, a block design (more fully, a balanced incomplete block design) is a particular kind of set system, which has longstanding applications to experimental design (an area of statistics) as well as purely combinatorial aspects. ...
In the statistical theory of the design of experiments, blocking is the arranging of experimental units in groups (blocks) which are similar to one another. ...
In statistics, the Bonferroni correction states that if an experimenter is testing n independent hypotheses on a set of data, then the statistical significance level that should be used for each hypothesis separately is 1/n times what it would be if only one hypothesis were tested. ...
Bootstrap aggregating (bagging) is a meta-algorithm to improve classification and regression models in terms of stability and classification accuracy. ...
In statistics, bootstrapping is a modern, computer-intensive, general-purpose approach to statistical inference, falling within a broader class of resampling methods. ...
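A percentile bootstrap confidence interval for a statistic can be sketched with the standard library alone: resample the data with replacement many times, recompute the statistic, and read off quantiles. The resample count, seed, and data below are illustrative assumptions:

```python
import random

def bootstrap_ci(sample, stat, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap (1 - alpha) confidence interval for stat(sample)."""
    rng = random.Random(seed)
    n = len(sample)
    stats = sorted(stat([rng.choice(sample) for _ in range(n)])
                   for _ in range(n_resamples))
    lo = stats[int(alpha / 2 * n_resamples)]
    hi = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

def mean(xs):
    return sum(xs) / len(xs)

data = [4.1, 5.0, 3.8, 4.4, 5.2, 4.9, 4.0, 4.6]
print(bootstrap_ci(data, mean))
```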
Data is taken to be either a scalar number, a vector or a matrix. ...
In statistics, the Box–Cox transformation of the variable Y given the Box–Cox parameter λ ≥ 0 is defined as This transformation has proved popular in regression analysis, including econometrics. ...
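For positive y the transform is (y^λ − 1)/λ, with the λ = 0 case defined by continuity as log y. A minimal sketch:

```python
import math

def box_cox(y, lam):
    """Box-Cox transform: (y**lam - 1)/lam for lam != 0, log(y) when lam == 0."""
    if y <= 0:
        raise ValueError("Box-Cox requires positive data")
    return math.log(y) if lam == 0 else (y ** lam - 1) / lam

print(box_cox(4.0, 0.5))
```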
Leo Breiman (January 27, 1928 – July 7, 2005) was a distinguished statistician at the University of California, Berkeley. ...
In statistics, the Breusch–Pagan test is used to test for heteroskedasticity in a linear regression model. ...
Ladislaus Josephovich Bortkiewicz (August 7, 1868 – July 15, 1931) was a Russian economist and statistician of Polish descent. ...
Business statistics is the science of good decision making in the face of uncertainty and is used in many disciplines such as financial analysis, econometrics, auditing, production and operations including services improvement, and marketing research. ...
C
Calibrated probability assessments are subjective probabilities assigned by individuals who have been trained to assess probabilities in a way that historically represents their uncertainty[1][2]. In other words, when a calibrated person says they are 80% confident in each of 100 predictions they made, they will get about 80...
Calibration in statistics is a reverse process to regression. ...
Canonical analysis. ...
In statistics, canonical correlation analysis, introduced by Harold Hotelling, is a way of making sense of cross-covariance matrices. ...
The level of measurement of a variable in mathematics and statistics describes how much information the numbers associated with the variable contain. ...
In mathematics, the Cauchy–Schwarz inequality, also known as the Schwarz inequality, the Cauchy inequality, or the Cauchy–Bunyakovsky–Schwarz inequality, named after Augustin Louis Cauchy, Viktor Yakovlevich Bunyakovsky and Hermann Amandus Schwarz, is a useful inequality encountered in many different settings, such as linear algebra applied to vectors, in...
In statistics, censoring occurs when the value of an observation is only partially known. ...
A central limit theorem is any of a set of weakconvergence results in probability theory. ...
In statistics, the Chapman–Robbins bound or Hammersley–Chapman–Robbins bound is a lower bound on the variance of estimators of a deterministic parameter. ...
In probability theory, the characteristic function of any random variable completely defines its probability distribution. ...
Chauvenet's criterion is a means of assessing whether one piece of experimental data (an outlier) from a set of observations is spurious. ...
In probability theory, Chebyshev's inequality (also known as Tchebysheff's inequality, Chebyshev's theorem, or the Bienaymé–Chebyshev inequality), named after Pafnuty Chebyshev, who first proved it, states that in any data sample or probability distribution, nearly all the values are close to the mean value, and provides a...
In probability theory, the Chernoff bound, named after Herman Chernoff, gives a lower bound for the success of majority agreement for n independent, equally likely events. ...
In probability theory, Chernoff's inequality, named after Herman Chernoff, states the following. ...
In probability theory and statistics, the chi distribution is a continuous probability distribution. ...
This article is about the mathematics of the chi-square distribution. ...
A chi-square test is any statistical hypothesis test in which the test statistic has a chi-square distribution when the null hypothesis is true, or any in which the probability distribution of the test statistic (assuming the null hypothesis is true) can be made to approximate a chi-square...
The Chow test is an econometric test of whether the coefficients in two linear regressions on different data are equal. ...
Circular or directional statistics is the subdiscipline of statistics that deals with circular data. ...
Several classic data sets have been used extensively in the statistical literature. ...
In health care, including medicine, a clinical trial (synonyms: clinical studies, research protocols, medical research) is a process in which a medicine or other medical treatment is tested for its safety and effectiveness, often in comparison to existing treatments. ...
In statistics, the closed testing procedure [1] is a general method for performing more than one hypothesis test simultaneously. ...
Cochrane–Orcutt estimation is a procedure in econometrics, which adjusts a linear model for serial correlation in the error term. ...
In statistics, Cochran's theorem is used in the analysis of variance. ...
In statistics, the coefficient of determination R2 is the proportion of variability in a data set that is accounted for by a statistical model. ...
In probability theory and statistics, the coefficient of variation (CV) is a measure of dispersion of a probability distribution. ...
Cohen's kappa coefficient is a statistical measure of inter-rater reliability. ...
Common and special causes are the two distinct origins of variation in a process, a distinction that features in the statistical thinking and methods of Walter A. Shewhart and W. Edwards Deming. ...
The following tables compare general and technical information for a number of statistical analysis packages. ...
In probability theory, two events are called complementary if and only if precisely one of the possibilities must occur. ...
Suppose a random variable X (which may be a sequence of scalar-valued random variables) has a probability distribution belonging to a known family of probability distributions, parametrized by θ, which may be either vector- or scalar-valued, and let T be any statistic based on X. ...
In statistics, compositional data is data in which each data point is an n-tuple of nonnegative numbers whose sum is 1. ...
In statistics, computational learning theory is a mathematical field related to the analysis of machine learning algorithms. ...
In statistics, the concordance correlation coefficient measures the agreement between two variables, e. ...
In statistics, a concordant pair is a pair of two-variable (bivariate) observations {X1, Y1} and {X2, Y2} such that sgn(X2 − X1) = sgn(Y2 − Y1). Correspondingly, a discordant pair is a pair, as defined above, where sgn(X2 − X1) = −sgn(Y2 − Y1). The sign function, often represented as sgn, returns −1, 0, or +1 according to the sign of its argument. See also: Kendall tau distance, Spearman's rank...
This article illustrates the central limit theorem via an example for which the computation can be done quickly by hand on paper, unlike the more computing-intensive example in the article titled illustration of the central limit theorem. ...
In statistics, the conditional change model is the analytic procedure in which change scores are regressed on baseline values, together with the explanatory variables of interest (often including an indicator of a treatment group). ...
Given two jointly distributed random variables X and Y, the conditional probability distribution of Y given X (written Y | X) is the probability distribution of Y when X is known to be a particular value. ...
In probability theory, two events A and B are conditionally independent given a third event C precisely if the occurrence or non-occurrence of A and B are independent events in their conditional probability distribution given C. In other words, two random variables X and Y are conditionally independent given...
This article defines some terms which characterize probability distributions of two or more variables. ...
In this diagram, the bars represent observation means and the red lines represent the confidence intervals surrounding them. ...
In statistics, a confounding factor is a factor which is the common cause of two things that may falsely appear to be in a causal relationship. ...
Conjoint analysis, also called multiattribute compositional models, is a statistical technique that originated in mathematical psychology. ...
In statistics, a consistent estimator is an estimator that converges in probability to the quantity being estimated as the sample size grows. ...
In statistics, contingency tables are used to record and analyse the relationship between two or more variables, most usually categorical variables. ...
In mathematics, a probability distribution assigns to every interval of the real numbers a probability, so that the probability axioms are satisfied. ...
In statistical process control, the control chart, also known as the Shewhart chart or process-behaviour chart, is a tool to determine whether a manufacturing or business process is in a state of statistical control or not. ...
Control limits are horizontal lines drawn on an SPC control chart, usually at a distance of ±3 standard deviations from the mean of the plotted statistic. ...
In Monte Carlo methods, one or more control variates may be employed to achieve variance reduction by exploiting the correlation between statistics. ...
Controlling for a variable means to deliberately vary the experimental conditions in order to take that variable into account in the prediction of the response variable. ...
In statistics, a copula is a multivariate cumulative distribution function defined on the n-dimensional unit cube [0, 1]^n such that every marginal distribution is uniform on the interval [0, 1]. Sklar's theorem is as follows. ...
Positive linear correlations between 1000 pairs of numbers (figure caption). ...
In statistics, the correlation ratio is a measure of the relationship between the statistical dispersion within individual categories and the dispersion across the whole population or sample. ...
In statistics, and especially in the statistical analysis of psychological data, the counternull is a statistic used to aid the understanding and presentation of research results. ...
In probability theory and statistics, the covariance between two realvalued random variables X and Y, with expected values and is defined as: where E is the expected value. ...
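With expected values μX = E[X] and μY = E[Y], the definition reads Cov(X, Y) = E[(X − μX)(Y − μY)]. A sample-based sketch; the n − 1 divisor (matching the usual unbiased estimator) and function name are illustrative choices:

```python
def covariance(xs, ys):
    """Sample covariance: average of (x - mean_x)*(y - mean_y) with an n - 1 divisor."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

print(covariance([1, 2, 3], [2, 4, 6]))
```

Note that covariance(x, x) reduces to the sample variance of x.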
In statistics and probability theory, the covariance matrix is a matrix of covariances between elements of a vector. ...
Cricket is a sport that generates a large number of statistics. ...
Cronbach's α (alpha) has an important use as a measure of the reliability of a psychometric instrument. ...
Cross tabs (or cross tabulations) display the joint distribution of two or more variables. ...
In statistics, cross-validation is the practice of partitioning a sample of data into subsamples so that the analysis is initially performed on a single subsample, while further subsamples are retained blind for subsequent use in confirming and validating the initial analysis. ...
Cumulants of probability distributions: In probability theory and statistics, the cumulants κn of the probability distribution of a random variable X are given by the cumulant-generating function; in other words, κn/n! is the nth coefficient in the power series representation of the logarithm of the moment-generating function. ...
In probability theory, the cumulative distribution function (abbreviated cdf) completely describes the probability distribution of a real-valued random variable, X. For every real number x, the cdf is given by F(x) = P(X ≤ x), where the right-hand side represents the probability that the random variable X takes on a value less than...
Curve fitting is finding a curve which matches a series of data points and possibly other constraints. ...
Harald Cramér (September 25, 1893 – October 5, 1985) was a Swedish mathematician and statistician, specialised in mathematical statistics. ...
In statistics, the Cramér–Rao bound (CRB) or Cramér–Rao lower bound (CRLB), named in honor of Harald Cramér and Calyampudi Radhakrishna Rao, expresses a lower bound on the variance of estimators of a deterministic parameter. ...
In statistics the Cramér–von Mises criterion for judging the goodness of fit of a probability distribution compared to a given distribution is given by W² = ∫ [Fn(x) − F*(x)]² dF*(x). In applications F* is the theoretical distribution and Fn is the empirically observed distribution. ...
D
Clustering is the classification of objects into different groups, or more precisely, the partitioning of a data set into subsets (clusters), so that the data in each subset (ideally) share some common trait  often proximity according to some defined distance measure. ...
In statistics, a data point is a single typed measurement. ...
A data set (or dataset) is a collection of data, usually presented in tabular form. ...
In statistics, data transformation is carried out in order to transform the data so that it more nearly follows a normal distribution (a remedy for outliers, and for failures of normality, linearity, and homoscedasticity). ...
In probability theory, de Finetti's theorem explains why exchangeable observations are conditionally independent given some (usually) unobservable quantity to which an epistemic probability distribution would then be assigned. ...
Decision theory is an area of study of discrete mathematics that models human decision-making in science, engineering and indeed all human social activities. ...
In statistics, the delta method is a method for deriving an approximate probability distribution for a function of an asymptotically normal statistical estimator from knowledge of the limiting variance of that estimator. ...
In statistics, Deming regression, named after W. Edwards Deming, is a method of linear regression that finds a line of best fit for a set of related data. ...
Demographics refers to selected population characteristics as used in government, marketing or opinion research, or the demographic profiles used in such research. ...
Map of countries by population Population growth showing projections for later this century Demography is the statistical study of human populations. ...
Among the kinds of data that national leaders need are the demographic statistics of their population. ...
In probability and statistics, density estimation is the construction of an estimate, based on observed data, of an unobservable underlying probability density function. ...
A design matrix is a matrix that is used in certain statistical models, e. ...
Descriptive statistics are used to describe the basic features of the data in a study. ...
The first statistician to consider a methodology for the design of experiments was Sir Ronald A. Fisher. ...
Detection theory, or signal detection theory, is a means to quantify the ability to discern between signal and noise. ...
In statistics, deviance is a quantity whose expected values can be used for statistical hypothesis testing. ...
The DIC (deviance information criterion) is a hierarchical modeling generalization of the AIC (Akaike information criterion). ...
In statistics, the Dickey–Fuller test tests whether a unit root is present in an autoregressive model. ...
In statistics, dimension reduction is the process of reducing the number of random variables under consideration, and can be divided into feature selection and feature extraction. ...
Circular or directional statistics is the subdiscipline of statistics that deals with circular or directional data. ...
Discrete choice analysis is a statistical technique. ...
In mathematics, a probability distribution assigns to every interval of the real numbers a probability, so that the probability axioms are satisfied. ...
In regression analysis, a dummy variable is one that takes the values 0 or 1 to indicate the absence or presence of some categorical effect that may be expected to shift the outcome. ...
In statistics, Duncan's new multiple range test (MRT) is a multiple comparison procedure developed by David B. Duncan in 1955. ...
In gambling a Dutch book or lock is a set of odds and bets which guarantees a profit, no matter what the outcome of the gamble. ...
E
An ecological correlation is a correlation between two variables that are group means, in contrast to a correlation between two variables that describe individuals. ...
The ecological fallacy is a widely recognised error in the interpretation of statistical data, whereby inferences about the nature of individuals are based solely upon aggregate statistics collected for the group to which those individuals belong. ...
Econometrics literally means economic measurement. It is the branch of economics that applies statistical methods to the empirical study of economic theories and relationships. ...
The Edgeworth series or Gram–Charlier A series, named in honor of Francis Ysidro Edgeworth, are series that approximate a probability distribution in terms of its cumulants. ...
In statistics, effect size is a measure of the strength of the relationship between two variables. ...
In statistics, efficiency is one measure of desirability of an estimator. ...
In statistics, empirical Bayes methods involve: An underlying probability distribution of some unobservable quantity assigned to each member of a statistical population. ...
In statistics, an empirical distribution function is a cumulative probability distribution function that concentrates probability 1/n at each of the n numbers in a sample. ...
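The empirical distribution function puts mass 1/n at each observation, so Fn(x) is the fraction of observations less than or equal to x. A sketch returning Fn as a step function; the closure-based interface is an illustrative choice:

```python
from bisect import bisect_right

def ecdf(sample):
    """Empirical distribution function: F_n(x) = (# observations <= x) / n."""
    xs = sorted(sample)
    n = len(xs)
    return lambda x: bisect_right(xs, x) / n

F = ecdf([3, 1, 4, 1, 5])
print(F(1), F(3.5))
```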
Suppose is a sample space of observations. ...
Energy statistics refers to collecting, compiling, analyzing and disseminating data on commodities such as coal, crude oil, natural gas, electricity, or renewable energy sources (biomass, geothermal, wind or solar energy), when they are used for the energy they contain. ...
Tore Olaus Engset (1865–1943). ...
Agner Krarup Erlang (January 1, 1878–February 3, 1929) was a Danish mathematician, statistician, and engineer who invented the fields of queueing theory and traffic engineering. ...
In statistics and optimization, the concepts of error and residual are easily confused with each other. ...
Errors-in-variables is a robust modeling technique in statistics, which assumes that every variable can have error or noise. ...
Estimation is the calculated approximation of a result which is usable even if input data may be incomplete, uncertain, or noisy. ...
Estimation theory is a branch of statistics and signal processing that deals with estimating the values of parameters based on measured/empirical data. ...
In multivariate statistics, the importance of the Wishart distribution stems in part from the fact that it is the probability distribution of the maximum likelihood estimator of the covariance matrix of a multivariate normal distribution. ...
In statistics, an estimator is a function of the observable sample data that is used to estimate an unknown population parameter; an estimate is the result from the actual application of the function to a particular set of data. ...
In population genetics, Ewens's sampling formula, introduced by Warren Ewens, states that under certain conditions (specified below), if a random sample of n gametes is taken from a population and classified according to the gene at a particular locus then the probability that there are a1 alleles represented once...
An exact (significance) test is a test where all assumptions that the derivation of the distribution of the test statistic is based on are met. ...
In probability theory the expected value (or mathematical expectation) of a random variable is the sum of the probability of each possible outcome of the experiment multiplied by its payoff (value). Thus, it represents the average amount one expects as the outcome of the random trial when identical odds are...
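For a discrete random variable given as (value, probability) pairs, the expectation is the probability-weighted sum of the values. A minimal sketch; the fair-die example is an illustrative assumption:

```python
def expected_value(outcomes):
    """E[X] = sum of value * probability over all possible outcomes."""
    return sum(value * prob for value, prob in outcomes)

# Illustrative: a fair six-sided die, each face with probability 1/6.
die = [(face, 1 / 6) for face in range(1, 7)]
print(expected_value(die))  # close to 3.5
```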
In statistical computing, an expectation-maximization (EM) algorithm is an algorithm for finding maximum likelihood estimates of parameters in probabilistic models, where the model depends on unobserved latent variables. ...
Experimental research designs are used for the controlled testing of causal processes. ...
In statistics, an explained sum of squares (ESS) is the sum of squared predicted values in a standard regression model (for example yi = a + bxi + εi), where yi is the response variable, xi is the explanatory variable, a and b are coefficients, i indexes the observations from 1 to n, and εi is the error term. ...
In statistics, an explanatory variable (also regressor or independent variable) is a variable in a regression model which appears on the right hand side of the equation. ...
Exploratory data analysis (EDA) is that part of statistical practice concerned with reviewing, communicating and using data where there is a low level of knowledge about its cause system. ...
In probability theory and statistics, the exponential distributions are a class of continuous probability distribution. ...
In probability and statistics, an exponential family is any class of probability distributions having a certain form. ...
In statistics, exponential smoothing refers to a particular type of moving average technique applied to time series data, either to produce smoothed data for presentation, or to make forecasts. ...
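With smoothing factor 0 < α < 1, simple exponential smoothing obeys s_t = α·x_t + (1 − α)·s_{t−1}. A sketch, initialised at the first observation (an illustrative convention):

```python
def exp_smooth(series, alpha):
    """Simple exponential smoothing: s_t = alpha*x_t + (1 - alpha)*s_{t-1}."""
    s = series[0]  # initialise at the first observation
    out = [s]
    for x in series[1:]:
        s = alpha * x + (1 - alpha) * s
        out.append(s)
    return out

print(exp_smooth([3, 10, 12, 13, 12, 10, 12], 0.5))
```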
Extreme value theory is a branch of statistics dealing with the extreme deviations from the median of probability distributions. ...
F
Failure rate is the frequency with which an engineered system or component fails, expressed for example in failures per hour. ...
In statistics and probability, the F-distribution is a continuous probability distribution. ...
An F-test is any statistical test in which the test statistic has an F-distribution if the null hypothesis is true. ...
Factor analysis is a statistical data reduction technique used to explain variability among observed random variables in terms of fewer unobserved random variables called factors. ...
In statistics, a factorial experiment is an experiment whose design consists of two or more factors, each with discrete possible values or levels, and whose experimental units take on all possible combinations of these levels across all such factors. ...
Coin flipping or coin tossing is the practice of throwing a coin in the air to resolve a dispute between two parties. ...
False discovery rate (FDR) control is a statistical method used in multiple hypothesis testing to correct for multiple comparisons. ...
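The classic FDR-controlling procedure is the Benjamini–Hochberg step-up method: sort the m p-values, find the largest rank k with p(k) ≤ k·q/m, and reject the k most significant hypotheses. A minimal sketch (function name illustrative):

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return indices of hypotheses rejected at false discovery rate q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * q / m:
            k = rank  # largest rank satisfying the step-up condition
    return sorted(order[:k])

print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]))
```

Note that the step-up search keeps scanning past ranks that fail the condition, since only the largest satisfying rank matters.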
Type I errors (or α errors, or false positives) and type II errors (β errors, or false negatives) are two terms used to describe statistical errors. ...
In statistics, familywise error rate (FWER) is the probability of making one or more false discoveries, or type I errors, among all the hypotheses when performing multiple pairwise tests[1][2]. The m specific hypotheses of interest are assumed to be known in advance, but the numbers of true null...
In applied statistics, the file drawer problem results from the fact that academics tend not to publish results that indicate the null hypothesis could not be rejected. ...
In statistics and information theory, the Fisher information (denoted I(θ)) is the variance of the score. ...
Sir Ronald Aylmer Fisher, FRS (17 February 1890 – 29 July 1962) was an English statistician, evolutionary biologist, and geneticist. ...
Fisher's exact test is a statistical significance test used in the analysis of categorical data where sample sizes are small. ...
Linear discriminant analysis (LDA) is sometimes known as Fisher's linear discriminant, after its inventor, Ronald A. Fisher, who published it in The Use of Multiple Measurements in Taxonomic Problems (1936). ...
In statistics, Fisher's method is a data fusion or meta-analysis (analysis after analysis) technique for combining the results from a variety of independent tests bearing upon the same overall hypothesis (H0) as if in a single large test. ...
In statistics, hypotheses about the value of r, the correlation coefficient between variables x and y of the underlying population, can be tested using the Fisher transformation applied to r. ...
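A sketch of the resulting test: z = artanh(r) is approximately normal with mean artanh(ρ0) and standard error 1/√(n − 3) for n bivariate-normal pairs (function name illustrative):

```python
import math

def fisher_z_statistic(r, rho0, n):
    """Approximate standard-normal statistic for testing H0: rho = rho0."""
    return (math.atanh(r) - math.atanh(rho0)) * math.sqrt(n - 3)

# e.g. a sample correlation r = 0.6 from n = 30 pairs, testing rho0 = 0
print(fisher_z_statistic(0.6, 0.0, 30))
```

The statistic is then compared against standard normal quantiles in the usual way.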
Fleiss' kappa is a generalisation of Scott's pi statistic, a statistical measure of inter-rater reliability. ...
In statistics a forecast error is the difference between the actual/real and the predicted/forecast value of a time series. ...
A forest plot is a graph displaying the results of multiple studies in a meta-analysis. ...
In statistics, fractional factorial designs are experimental designs consisting of a carefully chosen subset (fraction) of the experimental runs of a full factorial design. ...
The Freedman–Diaconis rule is used to specify the number of bins to be used in a histogram. ...
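The rule sets the bin width to h = 2·IQR·n^(−1/3) and divides the data range by h. A sketch; the linear-interpolation quartile convention in the helper is one of several in common use, so exact bin counts can differ between implementations:

```python
import math

def iqr(data):
    """Interquartile range using linear interpolation between order statistics."""
    s = sorted(data)
    n = len(s)
    def quantile(p):
        idx = p * (n - 1)
        lo, hi = int(math.floor(idx)), int(math.ceil(idx))
        frac = idx - lo
        return s[lo] * (1 - frac) + s[hi] * frac
    return quantile(0.75) - quantile(0.25)

def freedman_diaconis_bins(data):
    """Number of histogram bins suggested by the Freedman-Diaconis rule."""
    h = 2 * iqr(data) / len(data) ** (1 / 3)
    return max(1, math.ceil((max(data) - min(data)) / h))

print(freedman_diaconis_bins([float(x) for x in range(1, 101)]))
```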
In statistics, a frequency distribution is a list of the values that a variable takes in a sample. ...
Statistical regularity has motivated the development of the relative frequency concept of probability. ...
Functional data analysis is a series of techniques in statistics for characterizing a series of data points as a single piece of data. ...
G
In statistics, G-tests are likelihood-ratio or maximum likelihood statistical significance tests that are increasingly being used in situations where chi-square tests were previously recommended. ...
The Galton–Watson process is a stochastic process arising from Francis Galton's statistical investigation of the extinction of surnames. ...
Galton's problem, named after Sir Francis Galton, is the problem of drawing inferences from cross-cultural data, due to the statistical phenomenon now called autocorrelation. ...
In probability theory and statistics, the gamma distribution is a two-parameter family of continuous probability distributions. ...
In statistics, the generalized canonical correlation analysis (gCCA) is a way of making sense of cross-correlation matrices between the sets of random variables when there are more than two sets. ...
In statistics, the generalized linear model (GLM) is a useful generalization of ordinary least squares regression. ...
The generalized method of moments is a very general statistical method for obtaining estimates of parameters of statistical models. ...
In mathematics and physics, Gibbs sampling is an algorithm to generate a sequence of samples from the joint probability distribution of two or more random variables. ...
[Figure: graphical representation of the Gini coefficient.] The Gini coefficient is a measure of inequality of income distribution or inequality of wealth distribution. ...
Good–Turing frequency estimation is a statistical technique for predicting the probability of occurrence of objects belonging to an unknown number of species, given past observations of such objects and their species. ...
Goodness of fit describes how well a statistical model fits a set of observations. ...
William Sealy Gosset (June 13, 1876 – October 16, 1937) was a chemist and statistician, better known by his pen name Student. ...
An n×n Graeco-Latin square is a table, each cell of which contains a pair of symbols, composed of a symbol from each of two sets of n elements. ...
In probability theory and statistics, a graphical model (GM) represents dependencies among random variables by a graph in which each random variable is a node. ...
H
In economics, the Herfindahl index is a measure of the size of firms in relationship to the industry and an indicator of the amount of competition among them. ...
In statistics, Halton sequences are well-known quasi-random sequences, first introduced in 1960 as an alternative to pseudorandom number sequences. ...
In statistics, the Hannan–Quinn information criterion (HQC) is an alternative to the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). ...
The Hausman specification test provides an easy way for scientists to evaluate whether their statistical models correspond to the data. ...
The hazard ratio in survival analysis is a summary of the difference between two survival curves, representing the reduction in the risk of death on treatment compared to control, over the period of follow-up. ...
In statistics, a sequence or a vector of random variables is heteroskedastic if the random variables in the sequence or vector may have different variances. ...
In statistics, a frequent assumption in linear regression is that the disturbances u_i have the same variance. ...
[Figure: state transitions in a hidden Markov model (example); x = hidden states, y = observable outputs, a = transition probabilities, b = output probabilities.] A hidden Markov model (HMM) is a statistical model in which the system being modeled is assumed to be a Markov process with unknown parameters, and the challenge is to...
Hierarchical linear modeling (HLM), also known as multilevel analysis, is a more advanced form of simple linear regression and multiple linear regression. ...
In statistics, the Holm–Bonferroni method[1] is a procedure for performing more than one hypothesis test simultaneously while controlling the familywise error rate. ...
In statistics, a sequence or a vector of random variables is homoscedastic if all random variables in the sequence or vector have the same finite variance. ...
In statistics, Hotelling's T-square statistic, named for Harold Hotelling, is a generalization of Student's t statistic that is used in multivariate hypothesis testing. ...
The Howland will forgery trial was a US court case in 1868 to decide Henrietta Howland Robinson's contest of the will of Sylvia Ann Howland. ...
In econometrics, Huber–White standard errors are standard errors that are adjusted for correlations of error terms across observations, especially in panel and survey data as well as data with cluster structure. ...
The Hubbert curve, named after the geophysicist M. King Hubbert, is the derivative of the logistic curve. ...
I
Here is an illustration of the central limit theorem. ...
Independent component analysis (ICA) is a computational method for separating a multivariate signal into additive subcomponents supposing the mutual statistical independence of the non-Gaussian source signals. ...
In probability theory, a sequence or other collection of random variables is independent and identically distributed (i.i.d.). ...
For probability distributions having an expected value and a median, the mean (i. ...
The information bottleneck method is a technique for finding the best tradeoff between accuracy and compression when summarizing (e. ...
In statistics, an instrumental variable (IV, or instrument) can be used in regression analysis to produce a consistent estimator when the explanatory variables (covariates) are correlated with the error terms. ...
A graphical depiction of a statistical interaction in which the extent to which experience impacts cost depends on decision time. ...
In statistics, the interclass correlation (or interclass correlation coefficient) measures a bivariate relation among variables. ...
In statistics, interclass dependence (or class interdependence) means that the occurrence of one class is probabilistically dependent on other classes that may occur in the same space. ...
In descriptive statistics, the interquartile range (IQR), also called the midspread or middle fifty, is the range between the third and first quartiles and is a measure of statistical dispersion. ...
Inter-rater reliability or inter-rater agreement is the measurement of agreement between raters. ...
In statistics, interval estimation is the use of sample data to calculate an interval of possible (or probable) values of an unknown population parameter. ...
An intervening variable is a hypothetical construct that attempts to explain relationships between variables, and especially the relationships between independent variables and dependent variables. ...
In statistics, the intraclass correlation (or the intraclass correlation coefficient[1]) is a measure of correlation, consistency or conformity for a data set when it has multiple groups. ...
In statistics, the inverse-Wishart distribution, also called the inverted Wishart distribution, is a probability density function defined on matrices. ...
Inverse transform sampling , also known as the probability integral transform, is a method of sampling a number at random from any probability distribution given its cumulative distribution function (cdf). ...
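A minimal sketch for the exponential distribution, whose cdf F(x) = 1 − exp(−λx) inverts in closed form to F⁻¹(u) = −ln(1 − u)/λ (function name illustrative):

```python
import math
import random

def sample_exponential(lam, rng=random.random):
    """Draw one Exponential(lam) variate by inverting the cdf at U ~ Uniform(0, 1)."""
    u = rng()
    return -math.log(1.0 - u) / lam

random.seed(0)
draws = [sample_exponential(2.0) for _ in range(100_000)]
print(sum(draws) / len(draws))  # sample mean should be near 1/lam = 0.5
```

The same recipe works for any distribution whose cdf can be inverted, analytically or numerically.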
Item response theory (IRT) is a body of related psychometric theory that provides a foundation for scaling persons and items based on responses to assessment items. ...
The method of iteratively reweighted least squares (IRLS) is a numerical algorithm for minimizing certain objective functions by solving a sequence of weighted least squares problems. ...
J
The James–Stein estimator is a nonlinear estimator which can be shown to dominate, or outperform, the ordinary (least squares) estimator. ...
In statistics, the Jarque–Bera test is a goodness-of-fit measure of departure from normality, based on the sample kurtosis and skewness. ...
In Bayesian probability, the Jeffreys prior is a noninformative prior distribution proportional to the square root of the determinant of the Fisher information, p(θ) ∝ √(det I(θ)), and is invariant under reparameterization of the parameter vector. ...
K
The Kaplan–Meier estimator (also known as the product-limit estimator) estimates the survival function from lifetime data. ...
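A minimal sketch of the estimator: at each distinct event time t with d_t deaths among n_t subjects still at risk, the survival estimate is multiplied by (1 − d_t/n_t); censored subjects leave the risk set without contributing a factor. The function name and the encoding (1 = event, 0 = censored) are illustrative:

```python
def kaplan_meier(times, events):
    """Return (time, survival estimate) pairs at each observed event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        leaving = [e for tt, e in data if tt == t]  # everyone who exits at time t
        deaths = sum(leaving)
        if deaths:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= len(leaving)
        i += len(leaving)
    return curve

print(kaplan_meier([1, 2, 3, 4, 5], [1, 1, 0, 1, 1]))
```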
Cohen's kappa coefficient is a statistical measure of inter-rater reliability. ...
A kappa statistic is a measure of degree of non-random agreement between observers and/or measurements of a specific categorical variable. ...
The Kendall tau distance is a metric that counts the number of pairwise disagreements between two lists. ...
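A brute-force O(n²) sketch: a pair (i, j) is discordant when the two rankings order items i and j differently (names illustrative; rankings given as equal-length sequences of ranks):

```python
def kendall_tau_distance(a, b):
    """Count the pairwise disagreements between two rankings."""
    n = len(a)
    distance = 0
    for i in range(n):
        for j in range(i + 1, n):
            if (a[i] - a[j]) * (b[i] - b[j]) < 0:  # opposite orderings
                distance += 1
    return distance

print(kendall_tau_distance([1, 2, 3, 4, 5], [3, 4, 1, 2, 5]))
```

The distance is 0 for identical rankings and n(n − 1)/2 when one ranking is the reverse of the other.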
The Kendall tau rank correlation coefficient (or simply the Kendall tau coefficient, Kendall's τ, or tau test(s)) is used to measure the degree of correspondence between two rankings and to assess the significance of this correspondence. ...
The 5-parameter Fisher–Bingham distribution or Kent distribution is a probability distribution on the three-dimensional sphere. ...
A kernel is a weighting function used in nonparametric estimation techniques. ...
In statistics, the Kolmogorov–Smirnov test (often called the K–S test) is used to determine whether two underlying probability distributions differ, or whether an underlying probability distribution differs from a hypothesized distribution, in either case based on finite samples. ...
Kriging is a group of geostatistical techniques to interpolate the value of a random field (e. ...
In statistics, the Kruskal–Wallis one-way analysis of variance by ranks (named after William Kruskal and Allen Wallis) is a nonparametric method. ...
In statistics, Kuiper's test is closely related to the better-known Kolmogorov–Smirnov test (or K–S test, as it is often called). ...
In probability theory and information theory, the Kullback–Leibler divergence (or information divergence, or information gain, or relative entropy) is a natural distance measure from a true probability distribution P to an arbitrary probability distribution Q. Typically P represents data, observations, or a precisely calculated probability distribution. ...
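For discrete distributions given as probability vectors, D_KL(P‖Q) = Σ P(i)·log(P(i)/Q(i)); a minimal sketch in nats, assuming Q(i) > 0 wherever P(i) > 0:

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence from Q to P for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

print(kl_divergence([0.5, 0.5], [0.9, 0.1]))  # note: not symmetric in P and Q
```

The asymmetry is why it is called a divergence rather than a metric.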
[Figure caption: far-red light has no effect on the average speed of the gravitropic reaction in wheat coleoptiles, but it changes kurtosis from platykurtic to leptokurtic.] ...
L
Latent variables, as opposed to observable variables, are those variables that cannot be directly observed but are rather inferred from other variables that can be observed and directly measured. ...
In statistics the latent class model (LCM) relates a set of discrete multivariate variables to a set of latent variables. ...
A Latin square is an n × n table filled with n different symbols in such a way that each symbol occurs exactly once in each row and exactly once in each column. ...
The statistical method of Latin hypercube sampling (LHS) was developed by Ronald L. Iman, J. C. Helton, J. E. Campbell, and others to generate a distribution of plausible collections of parameter values from a multidimensional distribution. ...
The law of large numbers (LLN) is any of several theorems in probability. ...
In probability theory and mathematical statistics, the law of total cumulance is a generalization to cumulants of the law of total probability, the law of total expectation, and the law of total variance. ...
The proposition in probability theory known as the law of total expectation, or the law of iterated expectations, or perhaps by any of a variety of other names, states that if X is an integrable random variable (i. ...
Nomenclature in probability theory is not wholly standard. ...
In probability theory, the law of total variance states that if X and Y are random variables on the same probability space, and the variance of X is finite, then Var(X) = E[Var(X | Y)] + Var(E[X | Y]). In language perhaps better known to statisticians than to probabilists, the first term is the unexplained component of the variance...
In regression analysis, least squares, also known as ordinary least squares analysis, is a method for linear regression that determines the values of unknown quantities in a statistical model by minimizing the sum of squared residuals (the differences between the predicted and observed values). ...
In statistics, computational learning theory is a mathematical field related to the analysis of machine learning algorithms. ...
In statistics, the Lehmann–Scheffé theorem states that any estimator that is complete, sufficient, and unbiased is the unique best unbiased estimator of its expectation. ...
In statistics, Levene's test compares the variances of samples. ...
The level of measurement of a variable in mathematics and statistics is a classification that was proposed in order to describe the nature of information contained within numbers assigned to objects and, therefore, within the variable. ...
This wellknown saying is part of a phrase attributed to Benjamin Disraeli and popularized in the U.S. by Mark Twain: There are three kinds of lies: lies, damned lies, and statistics. ...
This article is about the measure of remaining life. ...
In statistics, the likelihood principle is a controversial principle of statistical inference which asserts that all of the information in a sample is contained in the likelihood function. ...
A likelihood-ratio test is a statistical test relying on a test statistic computed by taking the ratio of the maximum value of the likelihood function under the constraint of the null hypothesis to the maximum with that constraint relaxed. ...
In statistics, the Lilliefors test, named after Hubert Lilliefors, professor of statistics at George Washington University, is an adaptation of the Kolmogorov–Smirnov test. ...
Linear discriminant analysis (LDA) and the related Fisher's linear discriminant are methods used in statistics and machine learning to find the linear combination of features that best separates two or more classes of object or event. ...
In statistics the linear model is given by Y = Xβ + ε, where Y is an n×1 column vector of random variables, X is an n×p matrix of known (i. ...
Linear prediction is a mathematical operation where future values of a discretetime signal are estimated as a linear function of previous samples. ...
In statistics, linear regression is a regression method that models the relationship between a dependent variable Y, independent variables X_i, i = 1, ..., p, and a random term ε. The model can be written as Y = β_0 + β_1 X_1 + ... + β_p X_p + ε. [Figure: example of linear regression with one dependent and one independent variable.] ...
This is a list of probability topics, by Wikipedia page. ...
This is an incomplete list of software that is designed for the explicit purpose of performing statistical analyses. ...
Statisticians or people who made notable contributions to the theories of statistics, or related aspects of probability, or machine learning: Odd Olai Aalen (1947–), Gottfried Achenwall (1719–1772), Abraham Manie Adelstein (1916–1992), John Aitchison (1926–), Alexander Aitken (1895–1967), Aleyamma George, Hirotsugu Akaike (1927–), Oskar Anderson (1887–1960), Peter...
LOESS is one of many modern modeling methods that build on classical methods, such as linear and nonlinear least squares regression. ...
In statistics, if a family of probability densities parametrized by a scalar- or vector-valued parameter μ is of the form f_μ(x) = f(x − μ), then μ is called a location parameter, since its value determines the location of the probability distribution. ...
In probability theory, especially as that field is used in statistics, a location-scale family is a set of probability distributions on the real line parametrized by a location parameter μ and a scale parameter σ ≥ 0; if X is any random variable whose probability distribution belongs to such a family, then...
In mathematics, especially as applied in statistics, the logit (pronounced with a long o and a soft g) of a number p between 0 and 1 is logit(p) = log(p / (1 − p)). This function is used in logistic regression. ...
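The logit and its inverse, the logistic function, as a small sketch:

```python
import math

def logit(p):
    """Log-odds of a probability p strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def logistic(x):
    """Inverse of the logit: maps any real x back to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

print(logit(0.5), logistic(logit(0.8)))
```

Applying `logistic` after `logit` recovers the original probability.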
In probability and statistics, the lognormal distribution is the probability distribution of any random variable whose logarithm is normally distributed (the base of the logarithmic function is immaterial in that log_a X is normally distributed if and only if log_b X is normally distributed). ...
The Lorenz curve is a graphical representation of the cumulative distribution function of a probability distribution; it is a graph showing the proportion of the distribution assumed by the bottom y% of the values. ...
In statistics, decision theory and economics, a loss function is a function that maps an event (technically an element of a sample space) onto a real number representing the economic cost or regret associated with the event. ...
M
In statistics, M-estimators are a type of estimator whose properties are quite well-known. ...
As a broad subfield of artificial intelligence, machine learning is concerned with the design and development of algorithms and techniques that allow computers to learn. At a general level, there are two types of learning: inductive, and deductive. ...
In statistics, Mahalanobis distance is a distance measure introduced by P. C. Mahalanobis in 1936. ...
Main effect is the term used in research methods for the effect of one independent variable averaged across the levels of another independent variable. ...
Majorization is a mathematical relation. ...
In statistics, the Mann–Whitney U test (also called the Mann–Whitney–Wilcoxon (MWW), Wilcoxon rank-sum test, or Wilcoxon–Mann–Whitney test) is a nonparametric test for assessing whether two samples of observations come from the same distribution. ...
Multivariate analysis of variance (MANOVA) is an extension of analysis of variance (ANOVA) methods to cover cases where there is more than one dependent variable and where the dependent variables cannot simply be combined. ...
The Mantel test is a statistical test of the correlation between two matrices. ...
In statistics, maximum a posteriori (MAP) estimates come from maximizing the likelihood function multiplied by an a priori distribution. ...
[Figure: the top portion depicts probability densities (for a binomial distribution) showing the relative likelihood that the true percentage is in a particular area given a reported percentage of 50%; the bottom portion shows the margin of error, the corresponding zone of 95% confidence.] ...
In probability theory, given two jointly distributed random variables X and Y, the marginal distribution of X is simply the probability distribution of X ignoring information about Y, typically calculated by summing or integrating the joint probability distribution over Y. For discrete random variables, the marginal probability mass function can...
In Bayesian probability theory, a marginal likelihood function is a likelihood function integrated over some variables, typically model parameters. ...
In statistics, marginal models (Heagerty & Zeger, 2000) are a technique for obtaining regression estimates in multilevel modeling, also called hierarchical linear models. ...
Markov chain geostatistics applies Markov chains in geostatistics for conditional simulation on sparse observed data; see Li et al. ...
Markov chain Monte Carlo (MCMC) methods (which include random walk Monte Carlo methods) are a class of algorithms for sampling from probability distributions based on constructing a Markov chain that has the desired distribution as its stationary distribution. ...
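A hedged sketch of the simplest such algorithm, random-walk Metropolis, targeting a standard normal density; the proposal step size `scale` is a tuning choice, not part of the method's definition:

```python
import math
import random

def metropolis(log_density, x0, n_samples, scale=1.0, rng=random):
    """Random-walk Metropolis sampling from an unnormalised log density."""
    x = x0
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, scale)
        log_alpha = log_density(proposal) - log_density(x)
        # accept with probability min(1, exp(log_alpha))
        if log_alpha >= 0 or rng.random() < math.exp(log_alpha):
            x = proposal
        samples.append(x)
    return samples

random.seed(0)
draws = metropolis(lambda x: -0.5 * x * x, x0=0.0, n_samples=50_000)
print(sum(draws) / len(draws))  # near 0 for the standard normal target
```

Only density ratios enter the acceptance step, which is why the target need not be normalised.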
It is possible to model mathematically the progress of most infectious diseases to discover the likely outcome of an epidemic or to help manage them by vaccination. ...
Mathematical statistics uses probability theory and other branches of mathematics to study statistics from a purely mathematical standpoint. ...
Maximum likelihood estimation (MLE) is a popular statistical method used to make inferences about parameters of the underlying probability distribution from a given data set. ...
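For i.i.d. normal data the MLE has a closed form: the sample mean, and the variance with denominator n rather than n − 1. A minimal sketch (function name illustrative):

```python
def normal_mle(data):
    """Maximum likelihood estimates (mu, sigma^2) for i.i.d. normal data."""
    n = len(data)
    mu = sum(data) / n
    sigma2 = sum((x - mu) ** 2 for x in data) / n  # note denominator n
    return mu, sigma2

print(normal_mle([2.0, 4.0, 6.0]))  # mean 4.0, variance 8/3
```

For models without closed-form solutions, the likelihood is instead maximised numerically.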
Maximum parsimony, often simply referred to as parsimony, is a nonparametric statistical method commonly used in computational phylogenetics for estimating phylogenies. ...
In statistics, McNemar's test is a nonparametric method used on nominal data to determine whether the row and column marginal frequencies are equal. ...
In probability theory the expected value (or mathematical expectation) of a random variable is the sum of the probability of each possible outcome of the experiment multiplied by its payoff (value). Thus, it represents the average amount one expects as the outcome of the random trial when identical odds are...
The absolute deviation of an element of a data set is the absolute difference between that element and a given point. ...
The mean difference is a measure of statistical dispersion equal to the average absolute difference of two independent values drawn from a probability distribution. ...
In mathematics, a mean of circular quantities is a mean suited to quantities such as angles, times of day, and fractional parts of real numbers. ...
Mean reciprocal rank is a statistic for evaluating any process that produces a list of possible responses to a query, ordered by probability of correctness. ...
In statistics the mean squared error of an estimator T of an unobservable parameter θ is MSE(T) = E[(T − θ)²], i.e. ...
In statistics the mean squared prediction error of a smoothing procedure is the expected sum of squared deviations of the fitted values from the (unobservable) function being estimated. ...
The level of measurement of a variable in mathematics and statistics is a classification that was proposed in order to describe the nature of information contained within numbers assigned to objects and, therefore, within the variable. ...
In probability theory and statistics, a median is a type of average that is described as the number dividing the higher half of a sample, a population, or a probability distribution, from the lower half. ...
In statistics, the median test is a special case of Pearson's chi-square test. ...
Mean time between failures (MTBF) is the mean (average) time between failures of a system, the reciprocal of the failure rate in the special case when the failure rate is constant. ...
In probability theory, memorylessness is a property of certain probability distributions: the exponential distributions and the geometric distributions. ...
A meta-analysis is a statistical practice of combining the results of a number of studies. ...
In statistics, the method of moments is a method of estimation of population parameters such as mean, variance, median, etc. ...
The proposal distribution Q proposes the next point that the random walk might move to. ...
In statistics, the midhinge is the mean of the first and third quartiles; the H-spread, or interquartile range, is their difference. ...
The midrange of a set of statistical data values is the arithmetic mean of the smallest and largest values in the set. ...
In statistics, and more specifically in estimation theory, a minimum-variance unbiased estimator (MVUE or MVU estimator) is an unbiased estimator of parameters, whose variance is minimized for all values of the parameters. ...
In statistics, the theory of minimum norm quadratic unbiased estimation (MINQUE) was developed by C.R. Rao. ...
A misuse of statistics occurs when a statistical argument asserts a falsehood. ...
In mathematics, the term mixture model is a model in which independent variables are fractions of a total. ...
Model selection is the task of selecting a mathematical model from a set of potential models, given evidence. ...
The Modifiable Areal Unit Problem (MAUP) is a potential source of error that can affect spatial studies which utilise aggregate data sources (Unwin, 1996). ...
In probability theory and statistics, the moment-generating function of a random variable X is M_X(t) = E[e^(tX)], wherever this expectation exists. ...
The term moving average is used in different contexts. ...
Multicollinearity is a statistical term for the existence of a high degree of linear correlation amongst two or more explanatory variables in a regression model. ...
Multidimensional scaling (MDS) is a set of related statistical techniques often used in data visualisation for exploring similarities or dissimilarities in a given data set. The technique is also used in marketing; see Multidimensional scaling in marketing. ...
In statistics, multilevel models are used when some variable under study varies at more than one level. ...
In statistics, the multiple comparisons problem arises when one tests several null hypotheses stating that the averages of several disjoint populations are equal to each other (homogeneous). ...
In statistics, regression analysis is a method for explanation of phenomena and prediction of future events. ...
Multiple Testing Correction refers to recalculating probabilities obtained from a statistical test which was repeated multiple times. ...
Multivariate statistics or multivariate statistical analysis in statistics describes a collection of procedures which involve observation and analysis of more than one statistical variable at a time. ...
In probability theory and statistics, a multivariate normal distribution, also sometimes called a multivariate Gaussian distribution, is a specific probability distribution, which can be thought of as a generalization to higher dimensions of the onedimensional normal distribution (also called a Gaussian distribution). ...
In statistics, a multivariate Student distribution is a multivariate generalization of Student's t-distribution. ...
N
National statistical services: Australia: Australian Bureau of Statistics; Brazil: Brazilian Institute of Geography and Statistics (IBGE); Belgium: Statistics Belgium; Canada: Statistics Canada; Colombia: Departamento Administrativo Nacional de Estadistica (DANE); Denmark: Danmarks statistik http://www. ...
In probability and statistics the negative binomial distribution is a discrete probability distribution. ...
In statistics, a relationship between two variables is negative if the slope in a corresponding graph is negative, or, what is in some contexts equivalent, if the correlation between them is negative. ...
In statistics, the Neyman–Pearson lemma states that when doing a hypothesis test between two point hypotheses H0: θ = θ0 and H1: θ = θ1, the likelihood-ratio test which rejects H0 in favour of H1 when the likelihood ratio L(θ1 | x) / L(θ0 | x) exceeds a suitable threshold is the most powerful test of size α. ...
In probability theory and statistics, the noncentral chi distribution is a generalization of the chi distribution. ...
In probability theory and statistics, the noncentral chi-square or noncentral χ² distribution is a generalization of the chi-square distribution. ...
In probability theory and statistics, the noncentral F-distribution is a continuous probability distribution that is a generalization of the (ordinary) F-distribution. ...
In statistics, the hypergeometric distribution is the discrete probability distribution generated by picking colored balls at random from an urn without replacement. ...
High dimensional data can be difficult to interpret. ...
[Figure: dataset with approximating polynomials.] Nonlinear regression in statistics is the problem of fitting a model y = f(x, θ) to multidimensional x, y data, where f is a nonlinear function of x with parameters θ. In general, there is no algebraic expression for the best-fitting parameters, as there is in linear regression. ...
Non-parametric statistics are statistics where it is not assumed that the population fits any parametrized distributions. ...
Sampling is the use of a subset of the population to represent the whole population. ...
The normal distribution, also called the Gaussian distribution, is an important family of continuous probability distributions, applicable in many fields. ...
In statistics, the rankits of the data points in a data set consisting simply of a list of scalars are expected values of order statistics of the standard normal distribution corresponding to data points in a manner determined by the order in which the data points appear. ...
In statistics, normality tests are concerned with determining whether or not a random variable is normally distributed. ...
In probability theory, it is almost a cliché to say that uncorrelatedness of two random variables does not entail independence. ...
In industrial statistics, the np-chart is a type of control chart very similar to the p-chart, except that the statistic being plotted is a number count rather than a sample proportion of items. ...
In statistics, a null hypothesis is a hypothesis set up to be nullified or refuted in order to support an alternative hypothesis. ...
O Observational error is the difference between a measured value of quantity and its true value. ...
In probability theory and statistics the odds in favor of an event or a proposition are the quantity p / (1 − p), where p is the probability of the event or proposition. ...
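The conversion above is a one-liner; a minimal sketch in Python (the function name is ours, for illustration):

```python
def odds(p):
    """Odds in favor of an event with probability p, i.e. p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must be in [0, 1)")
    return p / (1.0 - p)

# An event with probability 0.75 has odds of 3 to 1 in its favor.
print(odds(0.75))
```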
The odds ratio is a statistical measure, particularly important in Bayesian statistics and logistic regression. ...
Omitted-variable bias is the bias that appears in an estimate of a parameter when a regression omits an independent variable that belongs in the model. ...
An opinion poll is a survey of opinion from a particular sample. ...
Probability distributions for the n = 5 order statistics of an exponential distribution with θ = 3. ...
In statistics, ordered logit is a variant of logit analysis used for ordinal dependent variables. ...
In statistics, ordered probit is a variant of probit analysis used for ordinal dependent variables. ...
In community ecology, ordination is a method of multivariate analysis complementary to data clustering, and used mainly in exploratory data analysis (rather than in hypothesis testing). ...
Noisy (roughly linear) data is fit to both linear and polynomial functions. ...
P The p-chart is very similar to the X-bar chart, except that the statistic being plotted is the sample proportion rather than the sample mean. ...
In statistics, the Page test for multiple comparisons between ordered alternatives is a generalisation of the test of the statistical significance of a correlation performed using Spearman's rank correlation coefficient. ...
Paleontology often faces phenomena so vast and complex they can be described only through statistics. ...
In statistics, parallel factor analysis (PARAFAC) also named canonical decomposition (candecomp) is a multiway method originating from psychometrics. ...
Parametric inferential statistical methods are mathematical procedures for statistical hypothesis testing which assume that the distributions of the variables being assessed belong to known parametrized families of probability distributions. ...
In statistics, the method of partial least squares bears some relation to principal component analysis; instead of finding the hyperplanes of maximum variance, it finds a linear model describing some predicted variables in terms of other observable variables. ...
In probability theory and statistics, partial correlation measures the degree of association between two random variables, with the effect of a set of controlling random variables removed. ...
In statistics, the Parzen window method (or kernel density estimation), named after Emanuel Parzen, is a way of estimating the probability density function of a random variable. ...
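As a hedged sketch of the idea (a Gaussian kernel with fixed bandwidth h; the function name is ours, not from any particular library):

```python
import math

def parzen_density(x, data, h):
    """Kernel density estimate at x: the average of Gaussian bumps of
    width h centered on each observation."""
    norm = 1.0 / (len(data) * h * math.sqrt(2 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

# The estimate is highest near the bulk of the data.
sample = [-1.0, -0.2, 0.0, 0.3, 1.1]
print(parzen_density(0.0, sample, 0.5) > parzen_density(3.0, sample, 0.5))
```

The bandwidth h governs smoothness: small h produces a spiky estimate, large h an oversmoothed one.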
In project management, path analysis (also known as critical path analysis) is a technique to analyse events. ...
Path coefficients are linear regression weights expressing the causal linkage between statistical variables in the structural equation modeling approach. ...
Karl Pearson FRS (March 27, 1857 – April 27, 1936) established the discipline of mathematical statistics. ...
Pearson's chi-square test (χ²) is one of a variety of chi-square tests – statistical procedures whose results are evaluated by reference to the chi-square distribution. ...
A chi-square test is any statistical hypothesis test in which the test statistic has a chi-square distribution if the null hypothesis is true. ...
In statistics, the Pearson product-moment correlation coefficient (sometimes known as the PMCC) (r) is a measure of the correlation of two variables X and Y measured on the same object or organism, that is, a measure of the tendency of the variables to increase or decrease together. ...
The percentile rank of a score is the percentage of scores in its frequency distribution which are lower. ...
In statistics, the periodic variation of a time series is its cyclic variation, either regular or semiregular. ...
In statistics, the exponential family of probability density functions or probability mass functions comprises those that have the following form: f(x; η) = h(x) exp(η^T T(x) − A(η)), where: h(x) is the reference density, η is the natural parameter, a column vector, so that η^T, its transpose, is a row vector, T(x) is called the...
In statistics, a pivotal quantity is a function of Y1,...,Yn whose distribution does not depend on unknown parameters. ...
In statistics, point estimation involves the use of sample data to calculate a single value (known as a statistic) which is to serve as a best guess for an unknown (fixed or random) population parameter. ...
In probability theory and statistics, the Poisson distribution is a discrete probability distribution that expresses the probability of a number of events occurring in a fixed period of time if these events occur with a known average rate, and are independent of the time since the last event. ...
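The mass function is P(X = k) = λᵏ e^(−λ) / k!; a quick sketch using only the standard library (the helper name is ours):

```python
import math

def poisson_pmf(k, lam):
    """Probability that a Poisson(lam) random variable equals k."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

# Probabilities over all k sum to 1 (here checked numerically on a long prefix).
print(sum(poisson_pmf(k, 3.0) for k in range(60)))
```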
In statistics, the Poisson regression model attributes to a response variable Y a Poisson distribution whose expected value depends on a predictor variable x (written in lower case because the model treats x as non-random) in the following way: log E(Y) = a + bx (where log means natural logarithm). ...
In statistics, polychoric correlation is a technique for estimating the correlation between two theorised normally distributed continuous latent variables, from two ordinal variables. ...
Population dynamics is the study of marginal and long-term changes in the numbers, individual weights and age composition of individuals in one or several populations, and biological and environmental processes influencing those changes. ...
Population modeling is an application of statistical models to the study of changes in populations. ...
Population statistics is the use of statistics to analyze characteristics or changes to a population. ...
Population viability analysis is a branch of conservation biology dealing with techniques for determining the genetic diversity, spatial and temporal features of a population so as to evaluate the risk of extinction for that population. ...
The posterior probability can be calculated by Bayes' theorem from the prior probability and the likelihood function. ...
In statistics, a prediction interval bears the same relationship to a future observation that a confidence interval bears to an unobservable population parameter. ...
Principal components analysis (PCA) is a technique used to reduce multidimensional data sets to lower dimensions for analysis. ...
A prior probability is a marginal probability, interpreted as a description of what is known about a variable in the absence of some evidence. ...
Probability is the likelihood that something is the case or will happen. ...
In mathematics, a probability density function (pdf) is a function that represents a probability distribution in terms of integrals. ...
In mathematics and statistics, a probability distribution is a function of the probabilities of a mutually exclusive and exhaustive set of events. ...
In probability theory, a probability mass function (abbreviated pmf) gives the probability that a discrete random variable is exactly equal to some value. ...
In hypothesis testing in statistics, two types of error are distinguished: type I and type II. ...
Probability theory is the branch of mathematics concerned with analysis of random phenomena. ...
In probability theory and statistics the probit function is the inverse cumulative distribution function, or quantile function of the normal distribution. ...
In mathematics, a proper linear model is a linear model in which the weights given to the predictor variables are chosen in such a way as to optimize the relationship between the prediction and the criterion. ...
Proportional hazards models are a subclass of survival models in statistics. ...
Psephology is a term for the statistical study of elections. ...
A pseudocount is a count added to observed data in order to change the probability in a model of those data, which is known not to be zero, to being negligible rather than being zero. ...
Psychological statistics is the application of statistics to psychology. ...
In statistical hypothesis testing, the p-value of a random variable T used as a test statistic is the probability that T will assume a value at least as extreme as the observed value t_observed, given that a null hypothesis being considered is true. ...
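For a concrete case, assume a one-sided z-test: the p-value is then the upper tail of the standard normal distribution at the observed statistic. A sketch using only the standard library:

```python
import math

def p_value_upper(z_observed):
    """P(Z >= z_observed) for Z ~ N(0, 1), via the complementary error function."""
    return 0.5 * math.erfc(z_observed / math.sqrt(2))

# z = 1.96 gives a one-sided p-value of about 0.025.
print(round(p_value_upper(1.96), 4))
```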
Pythagorean expectation is a formula invented by Bill James to estimate how many games a baseball team should have won based on the number of runs they scored and allowed. ...
Q In statistics, the Q test is used for identification and rejection of outliers. ...
In statistics, a Q–Q plot (Q stands for quantile) is a tool for diagnosing differences in distributions (such as non-normality) of a population from which a random sample has been taken. ...
If x is a vector of n random variables, and A is an n × n square matrix, then the scalar quantity x^T A x is known as a quadratic form in x. ...
Quantitative marketing research is a social research method that utilizes statistical techniques. ...
Quantitative psychological research is psychological research which performs statistical estimation or statistical inference. ...
Quantitative research is the systematic scientific investigation of quantitative properties and phenomena and their relationships. ...
In descriptive statistics, a quartile is any of the three values which divide the sorted data set into four equal parts, so that each part represents 1/4th of the sample or population. ...
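Quartile conventions differ between texts and software packages; the sketch below uses linear interpolation between adjacent order statistics (one common choice), with names of our own:

```python
def quantile(data, q):
    """Quantile by linear interpolation between adjacent order statistics."""
    s = sorted(data)
    pos = q * (len(s) - 1)
    lo = int(pos)
    frac = pos - lo
    if lo + 1 < len(s):
        return s[lo] * (1 - frac) + s[lo + 1] * frac
    return s[lo]

data = [1, 2, 3, 4, 5, 6, 7, 8]
q1, q2, q3 = (quantile(data, p) for p in (0.25, 0.5, 0.75))
print(q1, q2, q3)  # 2.75 4.5 6.25
```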
In statistics, the quartile coefficient of dispersion is a descriptive statistic used to make comparisons within and between data sets. ...
Lambert Adolphe Jacques Quételet (February 22, 1796 – February 17, 1874) was a Belgian astronomer, mathematician, statistician and sociologist. ...
R "Random" redirects here. ...
Sampling is that part of statistical practice concerned with the selection of individual observations intended to yield some knowledge about a population of concern, especially for the purposes of statistical inference. ...
A random sequence is a kind of stochastic process. ...
Randomization is the process of making something random; this can mean: Generating a random permutation of a sequence (such as when shuffling cards). ...
A randomized controlled trial (RCT) is a form of clinical trial, or scientific procedure used in the testing of the efficacy of medicines or medical procedures. ...
There are many practical measures of randomness for a binary sequence. ...
In descriptive statistics, the range is the length of the smallest interval which contains all the data. ...
Rank-size distribution, or the rank-size rule or law, describes the remarkable regularity in many phenomena including the distribution of city sizes around the world, size of businesses, particle sizes (such as sand), lengths of rivers, frequency of word usage, wealth among individuals, etc. ...
In statistics, the rankits of the data points in a data set are the expected values of the order statistics of a standard normal sample of the same size, matched to the data points by rank. ...
In statistics, the Rao–Blackwell theorem describes a technique that can transform a crude estimator into an estimator that is optimal by the mean-squared-error criterion or any of a variety of similar criteria. ...
Rasch models are probabilistic measurement models which currently find their application primarily in psychological and attainment assessment, and are being increasingly used in other areas, including the health profession and market research. ...
A ratio distribution (or quotient distribution) is a statistical distribution constructed as the distribution of the ratio of random variables having two other distributions. ...
The polytomous Rasch model is a measurement model that has potential application in any context in which the objective is to measure a trait or ability through a process in which responses to items are scored with successive integers. ...
In statistics and data analysis, a raw score is an original datum that has not been transformed â€“ for example, the original result obtained by a student on a test (i. ...
ROC curve of three epitope predictors. ...
In descriptive statistics and chaos theory, a recurrence plot (RP) is a plot showing, for a given moment in time, the times at which a phase space trajectory visits roughly the same area in the phase space. ...
The recursive least squares (RLS) algorithm is used in adaptive filters to find the filter coefficients that recursively minimize the sum of squared errors between the desired and the actual signal. It converges faster than the LMS algorithm. ...
Recursive partitioning is a statistical method for the multivariable analysis of medical diagnostic tests. ...
In statistics, regression analysis examines the relation of a dependent variable (response variable) to specified independent variables (explanatory variables). ...
In statistics, linear regression is a regression method that models the relationship between a dependent variable Y, independent variables Xi, i = 1, ..., p, and a random term ε. The model can be written as Y = β0 + β1X1 + ... + βpXp + ε. Example of linear regression with one dependent and one independent variable. ...
Regression dilution, also known as attenuation, is a statistical phenomenon. Consider fitting a straight line (linear regression) for the relationship of an outcome variable y to a predictor variable x, and estimating the gradient (slope) of the line. ...
The regression (or regressive) fallacy is a logical fallacy. ...
Regression toward the mean refers to the fact that those with extreme scores on any measure at one point in time will, for purely statistical reasons, probably have less extreme scores the next time they are tested. ...
In mathematics, rejection sampling is a technique used to generate observations from a distribution. ...
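A minimal sketch of the technique: drawing from the Beta(2, 2) density f(x) = 6x(1 − x) on [0, 1] with a uniform proposal and envelope constant M = 1.5 (all names are ours, for illustration):

```python
import random

def rejection_sample(target_pdf, m, rng):
    """Draw one value from target_pdf on [0, 1] using a Uniform(0, 1) proposal.
    m must bound target_pdf from above; a proposal x is accepted with
    probability target_pdf(x) / m."""
    while True:
        x = rng.random()
        if rng.random() * m <= target_pdf(x):
            return x

rng = random.Random(0)
beta22 = lambda x: 6 * x * (1 - x)
draws = [rejection_sample(beta22, 1.5, rng) for _ in range(20000)]
print(sum(draws) / len(draws))  # close to the Beta(2, 2) mean of 0.5
```

The acceptance rate is 1/M, so a tight envelope matters for efficiency.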
In statistics and mathematical epidemiology, the relative risk (RR) of an event associated with an exposure is the ratio of the probability of the outcome of interest in the exposed group to that in the unexposed (control) group. ...
In statistics, reliability is the consistency of a set of measurements or measuring instrument. ...
Reliability theory developed apart from the mainstream of probability and statistics, and was used originally as a tool to help nineteenth century maritime insurance and life insurance companies compute profitable rates to charge their customers. ...
Reliability Theory of Aging and Longevity is a scientific approach aimed to gain theoretical insights into mechanisms of biological aging and species survival patterns by applying a general theory of systems failure, known as reliability theory. ...
In statistics, resampling is any of a variety of methods for doing one of the following: Estimating the precision of sample statistics (medians, variances, percentiles) by using subsets of available data (jackknife) or drawing randomly with replacement from a set of data points (bootstrapping) Exchanging labels on data points when...
In statistics and optimization, the concepts of error and residual are easily confused with each other. ...
In statistics, the residual sum of squares (RSS) is the sum of squares of residuals. ...
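In symbols, RSS = Σ (yᵢ − ŷᵢ)²; a one-line sketch (function name ours):

```python
def rss(observed, fitted):
    """Residual sum of squares between observed and fitted values."""
    return sum((y - f) ** 2 for y, f in zip(observed, fitted))

print(rss([1.0, 2.0, 4.0], [1.0, 2.5, 3.5]))  # 0.0 + 0.25 + 0.25 = 0.5
```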
In statistics, a response variable (or response) is what one measures in an experiment. ...
Tikhonov regularization is the most commonly used method of regularization of ill-posed problems. ...
Brian D. Ripley is a distinguished statistician, Professor of Applied Statistics at the University of Oxford and a Professorial Fellow at St Peter's College. ...
In statistics, the Robbins lemma, named after Herbert Robbins, states that if X is a random variable with a Poisson distribution with mean λ, and f is any function for which the expected value E(f(X)) exists, then E(X f(X)) = λ E(f(X + 1)). Robbins introduced this proposition while developing empirical Bayes methods. ...
In robust statistics, robust regression is a form of regression analysis designed to circumvent the limitations of traditional parametric and nonparametric methods. ...
Robust statistics provides an alternative approach to classical statistical methods. ...
Buildings near the manor house The Rothamsted Experimental Station, one of the oldest agricultural research institutions in the world, is located at Harpenden in Hertfordshire, England. ...
The R programming language, sometimes described as GNU S, is a programming language and software environment for statistical computing and graphics. ...
The Rubin Causal Model (RCM) is an approach to the statistical analysis of cause and effect based on the framework of potential outcomes. RCM is named after its originator, Donald Rubin, Professor of Statistics at Harvard University. ...
In probability theory, the rule of succession is a formula introduced in the 18th century by PierreSimon Laplace in the course of treating the sunrise problem. ...
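Laplace's formula estimates the probability of success on the next trial as (s + 1) / (n + 2) after s successes in n trials; a sketch using exact rationals:

```python
from fractions import Fraction

def rule_of_succession(successes, trials):
    """Laplace's estimate (s + 1) / (n + 2) of the next-trial success probability."""
    return Fraction(successes + 1, trials + 2)

# With no data at all the estimate is 1/2; after 5 successes in 5 trials, 6/7.
print(rule_of_succession(0, 0), rule_of_succession(5, 5))
```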
S The sacramental index is a statistic sometimes used by Roman Catholic bishops as a rough approximation of how active a parish is, based on the occurrence of sacraments or rites of passage. ...
Sample mean and covariance are statistics computed from a collection of data, thought of as being random. ...
The sample size of a statistical sample is the number of repeated measurements that constitute it. ...
Sampling is that part of statistical practice concerned with the selection of individual observations intended to yield some knowledge about a population of concern, especially for the purposes of statistical inference. ...
In statistics, when analyzing collected data, the samples observed differ in such things as means and standard deviations from the population from which the sample is taken. ...
In statistics, a simple random sample from a population is a sample chosen randomly, in which each member of the population has the same probability of being chosen. ...
Systematic sampling is a statistical method involving the selection of every kth element from a sampling frame, where k, the sampling interval, is calculated as: k = population size (N) / sample size (n) Using this procedure each element in the population has a known and equal probability of selection. ...
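The selection rule above can be sketched directly (names are ours; for simplicity this assumes N is a multiple of n):

```python
def systematic_sample(frame, n, start=0):
    """Select every k-th element of the frame, k = N // n, beginning at `start`."""
    k = len(frame) // n
    return [frame[(start + i * k) % len(frame)] for i in range(n)]

print(systematic_sample(list(range(100)), 10))
```

In practice `start` is chosen uniformly at random from 0 to k − 1 so that every element has equal probability of selection.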
In statistics, stratified sampling is a method of sampling from a population. ...
Cluster sampling is a sampling technique used when natural groupings are evident in a statistical population. ...
Multistage sampling is a complex form of cluster sampling. ...
Sampling is the use of a subset of the population to represent the whole population. ...
In mathematics and physics, Slice sampling is a type of Markov chain Monte Carlo sampling algorithm based on the observation that to sample a random variable one can sample uniformly from the region under the graph of its density function. ...
In statistics, a sampling distribution is the probability distribution, under repeated sampling of the population, of a given statistic (a numerical quantity calculated from the data values in a sample). ...
In statistics, if a family of probability densities parametrized by a parameter s is of the form fs(x) = f(x/s)/s, then s is called a scale parameter, since its value determines the scale of the probability distribution. ...
Waiting time between eruptions and the duration of the eruption for the Old Faithful geyser in Yellowstone National Park, Wyoming, USA. A scatterplot or scatter graph is a graph used in statistics to visually display and compare two or more sets of related quantitative, or numerical, data by displaying only...
In statistics, the Schwarz criterion (short for Schwarz information criterion, abbreviated SIC) is a statistical information criterion. ...
The score test is a statistical test of a simple null hypothesis that a parameter of interest θ is equal to some particular value θ0. Its test statistic is based on the score, the derivative of the log-likelihood ∂ log L(θ | x)/∂θ, evaluated at θ0; the null hypothesis is rejected when the statistic exceeds a constant set depending on the size of...
In decision theory a score function, or scoring rule, is a measure of someone's performance when they are repeatedly making decisions under uncertainty. ...
In numerical descriptions, such as of a time series of numbers, a secular trend is the long-term upward or downward trend in the numbers, as opposed to a smaller cyclical variation with a periodic and short-term duration. ...
In econometrics, seemingly unrelated regression (SUR) is a technique for analyzing a model with multiple equations and correlated error terms. ...
Selection bias is the error of distorting a statistical analysis by pre or postselecting the samples. ...
Selective recruitment is a term introduced to explain an observed effect in traffic safety. ...
In statistics a semiparametric model is a model that has parametric and nonparametric components. ...
In spatial statistics, semivariance can be described by γ(h) = (1 / (2 n(h))) Σ (z(x) − z(x + h))², where z is a data value at a particular location, h is the distance between data values, and n(h) counts the number of pairs of data values we are given, spaced a distance of h apart. ...
The sensitivity of a binary classification test or algorithm, such as a blood test to determine if a person has a certain disease, or an automated system to detect faulty products in a factory, is a parameter that expresses something about the test's performance. ...
A separation test is a statistical procedure for earlyphase research, to decide whether or not to pursue further research. ...
Although the subject of sexual dimorphism is not in itself controversial, the measures by which it is assessed differ widely. ...
In statistics, the Shapiro–Wilk test tests the null hypothesis that a sample x1, ..., xn came from a normally distributed population. ...
In statistics, the Siegel–Tukey test is a non-parametric test which applies to data measured at least on an ordinal scale and tests for differences in scale between two groups. ...
In statistics, sieve estimators are a class of nonparametric estimator which use progressively more complex models to estimate an unknown highdimensional function as more data becomes available, with the aim of asymptotically reducing error towards zero as the amount of data increases. ...
Aiming to account for the wide range of empirical distributions following a power law, Herbert Simon[1] proposed a class of stochastic models that results in a power-law distribution function. ...
An illustration of Simpson's paradox for continuous data: while a positive trend is seen for the two separate groups (blue and red), a negative trend (black, dashed) appears when the data is combined. ...
Example of experimental data with nonzero skewness (gravitropic response of wheat coleoptiles, 1,790) In probability theory and statistics, skewness is a measure of the asymmetry of the probability distribution of a realvalued random variable. ...
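A sketch of the (biased) moment estimator g1 = m3 / m2^(3/2), built from central moments about the mean (names ours):

```python
def skewness(xs):
    """Sample skewness g1 = m3 / m2**1.5 from central moments about the mean."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

# Symmetric data has zero skewness; a long right tail gives positive skewness.
print(skewness([1, 2, 3]), skewness([1, 1, 1, 10]) > 0)
```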
Small area estimation is any of several statistical techniques involving the estimation of parameters for small subpopulations, generally used when the subpopulation of interest is included in a larger survey. ...
Social statistics is the use of statistical measurement systems to study human behavior in a social environment. ...
In statistics, spatial analysis or spatial statistics includes any of the formal techniques used in various fields of research which study entities using their topological, geometric, or geographic properties. ...
In mathematical statistics, spatial dependence is a measure for the degree of associative dependence between independently measured values in a temporally or in situ ordered set, determined at different locations in a sample space or a sampling unit. ...
In statistics, Spearman's rank correlation coefficient, named after Charles Spearman and often denoted by the Greek letter ρ (rho), is a non-parametric measure of correlation – that is, it assesses how well an arbitrary monotonic function could describe the relationship between two variables, without making any assumptions about the frequency...
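For tie-free data, ρ reduces to 1 − 6 Σ dᵢ² / (n(n² − 1)), where dᵢ is the rank difference for the i-th pair; a sketch under that no-ties assumption (names ours):

```python
def spearman_rho(x, y):
    """Spearman's rank correlation for tie-free paired data."""
    def ranks(values):
        order = sorted(range(len(values)), key=values.__getitem__)
        r = [0] * len(values)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# A strictly monotonic relationship has rho = 1 even when it is nonlinear.
print(spearman_rho([1, 2, 3, 4], [1, 4, 9, 16]))
```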
The Spearman–Brown prediction formula (also known as the Spearman–Brown prophecy formula) is a formula relating psychometric reliability to test length: ρ* = Nρ / (1 + (N − 1)ρ), where ρ* is the predicted reliability, N is the number of tests combined (see below), and ρ is the reliability of the current test. The formula predicts the reliability of...
In ecology, the Species Discovery Curve is a graph recording the cumulative number of species of living things recorded in a particular environment as a function of the cumulative effort expended searching for them (usually measured in personhours). ...
The specificity is a statistical measure of how well a binary classification test correctly identifies the negative cases, or those cases that do not meet the condition under study. ...
Spectrum continuation analysis (SCA) is a generalization of the concept of Fourier series to nonperiodic functions of which only a fragment has been sampled in the time domain. ...
S is a statistical programming language developed by John Chambers of Bell Laboratories. ...
SPSS is a computer program used for statistical analysis and is also the name of the company (SPSS Inc. ...
In statistics, a spurious relationship (or, sometimes, spurious correlation) is a mathematical relationship in which two occurrences have no causal connection, yet it may be inferred that they do, due to a certain third, unseen factor (referred to as a confounding factor or lurking variable). The spurious relationship gives an...
The definition of variance is either the expected value (when considering a theoretical distribution) or the average (for actual experimental data) of squared deviations from the mean. ...
In probability theory and decision theory the St. ...
In probability and statistics, the standard deviation of a probability distribution, random variable, or population or multiset of values is a measure of the spread of its values. ...
The standard error of a method of measurement or estimate is the estimated standard deviation of the error in that method. ...
Compares the various grading methods in a normal distribution. ...
Stanine (STAndard NINE) is a method of scaling test scores on a nine point standard scale with a mean of 5 and a standard deviation of two. ...
A statistic (singular) is the result of applying a statistical algorithm to a set of data. ...
Statistical arbitrage, or StatArb, as opposed to (deterministic) arbitrage, is related to the statistical mispricing of one or more assets based on the expected value of these assets. ...
A statistical assembly is a study of the relationships among the components in a statistical unit that is made of discrete components like organs or machine parts. ...
Statistical assumptions are general assumptions about statistical populations. ...
Statistical classification is a type of supervised learning problem in which labeled training data is used to create a function that will correctly predict the label of future data. ...
In statistics, deviance is a quantity whose expected values can be used for statistical hypothesis testing. ...
In descriptive statistics, statistical dispersion (also called statistical variability) is quantifiable variation of measurements of differing members of a population within the scale on which they are measured. ...
In statistics, efficiency is one measure of desirability of an estimator. ...
Statistical epidemiology is an emerging branch of the disciplines of epidemiology and biostatistics that aims to: Bring more statistical rigour to bear in the field of epidemiology Recognise the importance of applied statistics, especially with respect to the context in which statistical methods are appropriate and inappropriate Aid and improve...
There are many different ways of discussing statistical estimation. ...
Statistical geography is the study and practice of collecting, analysing and presenting data that has a geographic or areal dimension, such as census or demographics data. ...
One may be faced with the problem of making a definite decision with respect to an uncertain hypothesis which is known only through its observable consequences. ...
In probability theory, to say that two events are independent intuitively means that the occurrence of one event makes it neither more nor less probable that the other occurs. ...
Statistical learning theory is an ambiguous term. ...
Statistical Methods for Research Workers (ISBN 0050021702) is a classic 1925 book on statistics by the statistician Ronald Fisher. ...
Sir Ronald Aylmer Fisher, FRS (17 February 1890 – 29 July 1962) was an English statistician, evolutionary biologist, and geneticist. ...
A statistical model is used in applied statistics. ...
Statistical noise is the colloquial term for recognized amounts of variation in a sample. ...
This is an incomplete list of software that is designed for the explicit purpose of performing statistical analyses. ...
A statistical parameter is a parameter that indexes a family of probability distributions. ...
Statistical parametric mapping or SPM is a statistical technique for examining differences in brain activity recorded during functional neuroimaging experiments using neuroimaging technologies such as fMRI or PET. It may also refer to a specific piece of software created by the Wellcome Department of Imaging Neuroscience (part of University College...
This is not an attempt at a comprehensive list of statistical topics; see that article. ...
In statistics, a statistical population is a set of entities concerning which statistical inferences are to be drawn, often based on a random sample taken from the population. ...
The power of a statistical test is the probability that the test will reject a false null hypothesis (that it will not make a Type II error). ...
Statistical process control (SPC) is a method for achieving quality control in manufacturing processes. ...
In descriptive statistics, the range is the length of the smallest interval which contains all the data. ...
Statistical regularity is the notion in statistics that, while a single random outcome (such as whether a thumbtack thrown onto a table lands point up) is hard to predict, relative frequencies over many repetitions settle toward stable values. ...
A sample is that part of a population which is actually observed. ...
In statistics, a result is significant if it is unlikely to have occurred by chance, given that a presumed null hypothesis is true. ...
Statistical surveys are used to collect quantitative information about items in a population. ...
The theory of statistics includes a number of topics:
Statistical models of the sources of data and typical problem formulation:
Sampling from a finite population
Measuring observational error and refining procedures
Studying statistical relations
Planning statistical research to measure and control observational error:
Design of experiments to determine treatment effects ...
In different statistical disciplines, the statistical unit is the source of a random variable. ...
Statistics Belgium is the main official statistical institution in Belgium, offering a wide range of figures. ...
Statistics New Zealand (Te Tari Tatau) is a New Zealand government department, and the source of the country's official statistics. ...
The Statistics Online Computational Resource (SOCR) is a suite of online tools and interactive aids for hands-on learning and teaching concepts in statistical analyses and probability theory. ...
Stein's example, also known as Stein's paradox (after Charles Stein), is a very important example in decision theory, much celebrated because it contradicts a mathematician's natural intuition. ...
Stein's lemma, named in honor of Charles Stein, is a theorem of probability theory that is of interest primarily because of its application to statistical inference, in particular its application to James-Stein estimation and empirical Bayes methods. ...
In statistics, stepwise regression includes regression models in which the choice of predictive variables is carried out by an automatic procedure. ...
A stochastic kernel is the transition function of a (usually discrete) stochastic process. ...
In statistics, the studentized range computed from a list x1, ..., xn of numbers is q = (max{x1, ..., xn} − min{x1, ..., xn}) / s, where s² = Σ(xi − x̄)² / (n − 1) is the sample variance and x̄ is the sample mean. ...
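The definition above translates directly into code; the data below are a hypothetical sample used only for illustration:

```python
from statistics import stdev

def studentized_range(xs):
    """q = (max(xs) - min(xs)) / s, with s the sample standard deviation."""
    return (max(xs) - min(xs)) / stdev(xs)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # hypothetical sample
q = studentized_range(data)
```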
In statistics, a Studentized residual, named in honor of William Sealey Gosset, who wrote under the pseudonym Student, is a residual adjusted by dividing it by an estimate of its standard deviation. ...
In probability and statistics, the t-distribution or Student's t-distribution is a probability distribution that arises in the problem of estimating the mean of a normally distributed population when the sample size is small. ...
A t-test is any statistical hypothesis test in which the test statistic has a Student's t-distribution if the null hypothesis is true. ...
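For example, the one-sample t statistic can be computed with the standard library alone (the measurements below are hypothetical; a full test would also compare the statistic against a t critical value):

```python
from statistics import mean, stdev

def one_sample_t(xs, mu0):
    """t = (xbar - mu0) / (s / sqrt(n)); under H0 it follows Student's t with n - 1 df."""
    n = len(xs)
    t = (mean(xs) - mu0) / (stdev(xs) / n ** 0.5)
    return t, n - 1

sample = [5.1, 4.9, 5.3, 5.2, 4.8, 5.0, 5.4]   # hypothetical measurements
t_stat, df = one_sample_t(sample, 5.0)
```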
In statistics, a statistic is sufficient for the parameter θ, which indexes the distribution family of the data, precisely when the data's conditional probability distribution, given the statistic's value, no longer depends on θ. Intuitively, a sufficient statistic for θ captures all the possible information about θ that is in the data. ...
Sum of squares is a concept that permeates much of inferential statistics and descriptive statistics. ...
In descriptive statistics, summary statistics are used to summarize a set of observations, in order to communicate as much as possible as simply as possible. ...
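A minimal sketch of a set of summary statistics, computed with the standard library over hypothetical observations:

```python
from statistics import mean, median, stdev

data = [12, 15, 11, 19, 14, 15, 13, 22]   # hypothetical observations
summary = {
    "n": len(data),
    "mean": mean(data),
    "median": median(data),
    "sd": stdev(data),
    "min": min(data),
    "max": max(data),
}
```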
In statistics, survey sampling is random selection of a sample from a finite population. ...
Survival analysis is a branch of statistics which deals with death in biological organisms and failure in mechanical systems. ...
In biostatistics, survival rate is a part of survival analysis, indicating the percentage of people in a study or treatment group who are alive for a given period of time after diagnosis. ...
The survival function, also known as a survivor function or reliability function, is a property of any random variable that maps a set of events, usually associated with mortality or failure of some system, onto time. ...
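As a concrete case, for an exponential lifetime with rate lam the survival function is S(t) = exp(-lam*t), which is 1 - F(t) by definition (the rate value below is hypothetical):

```python
import math

def exp_cdf(t, lam):
    """CDF of the exponential distribution with rate lam."""
    return 1.0 - math.exp(-lam * t) if t >= 0 else 0.0

def exp_survival(t, lam):
    """Survival function S(t) = P(T > t) = 1 - F(t) = exp(-lam * t)."""
    return math.exp(-lam * t) if t >= 0 else 1.0

lam = 0.5
s = exp_survival(2.0, lam)   # probability of surviving past t = 2
```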
System dynamics is an approach to understanding the behaviour of complex systems over time. ...
Systematic errors are biases in measurement which lead to measured values being systematically too high or too low. ...
In statistics, the term bias is used for two different concepts. ...
In statistics and optimization, the concepts of error and residual are easily confused with each other. ...
T
In probability and statistics, the t-distribution or Student's t-distribution is a probability distribution that arises in the problem of estimating the mean of a normally distributed population when the sample size is small. ...
Taguchi methods are statistical methods developed by Genichi Taguchi to improve the quality of manufactured goods. ...
Test-retest is a statistical method used to examine how reliable a test is: A test is performed twice, e. ...
In statistics, signal processing, and econometrics, a time series is a sequence of data points, measured typically at successive times, spaced at (often uniform) time intervals. ...
In statistics, hypotheses suggested by the data must be tested differently from hypotheses formed independently of the data. ...
Also known as tolerance limits. ...
A transect is a path along which one records and counts occurrences of the phenomenon of study (e. ...
Treatment learning is a process by which an ordered classified data set can be evaluated as part of a data mining session to produce a representative data model. ...
A series of measurements of a process may be treated as a time series, and then trend estimation is the application of statistical techniques to make and justify statements about trends in the data. ...
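The simplest trend estimate is the least-squares line fitted against the time index; a minimal sketch over a hypothetical series:

```python
from statistics import mean

def linear_trend(ys):
    """Least-squares slope and intercept of ys regressed on the time index 0..n-1."""
    n = len(ys)
    tbar, ybar = (n - 1) / 2, mean(ys)
    sxy = sum((t - tbar) * (y - ybar) for t, y in enumerate(ys))
    sxx = sum((t - tbar) ** 2 for t in range(n))
    slope = sxy / sxx
    return slope, ybar - slope * tbar

series = [1.0, 1.9, 3.1, 3.9, 5.2]   # hypothetical series, roughly 1 + t
slope, intercept = linear_trend(series)
```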
A truncated mean or trimmed mean is a statistical measure of central tendency, much like the mean and median. ...
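A sketch of a truncated (trimmed) mean, using hypothetical data with one outlier; note how discarding the extremes removes the outlier's influence:

```python
from statistics import mean

def truncated_mean(xs, proportion):
    """Discard floor(n * proportion) points from each end of the sorted data, then average."""
    s = sorted(xs)
    k = int(len(s) * proportion)
    return mean(s[k:len(s) - k])

data = [1, 2, 3, 4, 5, 6, 7, 100]      # 100 is an outlier
tm = truncated_mean(data, 0.125)       # k = 1: drops the 1 and the 100
```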
Type I errors (or α error, or false positive) and type II errors (β error, or a false negative) are two terms used to describe statistical errors. ...
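Both error rates can be estimated by simulation; the sketch below (hypothetical means and sample size, one-sided z-test with known sigma = 1) counts false positives under the null and false negatives under a specific alternative:

```python
import random
from statistics import NormalDist

random.seed(0)
z_crit = NormalDist().inv_cdf(0.95)   # one-sided test at alpha = 0.05
n, trials = 25, 2000

def rejects(true_mean):
    """Reject H0: mean = 0 when sqrt(n) * xbar exceeds the critical value (sigma = 1)."""
    xbar = sum(random.gauss(true_mean, 1.0) for _ in range(n)) / n
    return xbar * n ** 0.5 > z_crit

type_i = sum(rejects(0.0) for _ in range(trials)) / trials        # false positive rate
type_ii = sum(not rejects(0.7) for _ in range(trials)) / trials   # false negative rate
```

The estimated type I rate should land near alpha = 0.05, while the type II rate depends on how far the alternative mean sits from the null.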
U
The Mann-Whitney U test is one of the best-known nonparametric statistical significance tests. ...
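The U statistic itself has a simple counting definition, sketched here over tiny hypothetical samples (a real test would compare U against its null distribution):

```python
def mann_whitney_u(xs, ys):
    """U statistic for xs: the number of pairs (x, y) with x > y, counting ties as 1/2."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

u1 = mann_whitney_u([1, 4, 6], [2, 3, 5])
u2 = mann_whitney_u([2, 3, 5], [1, 4, 6])   # the two U values sum to len(xs) * len(ys)
```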
In statistics, the term bias is used for two different concepts. ...
The standard deviation is often estimated from a random sample drawn from the population. ...
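The usual sample estimator divides by n - 1 (Bessel's correction) rather than n; the simulation below (hypothetical setup: many small standard-normal samples, true variance 1) shows why the n divisor is biased low:

```python
import random

random.seed(1)
n, trials = 5, 20000   # many small samples from a standard normal (true variance 1)

biased_avg = 0.0
unbiased_avg = 0.0
for _ in range(trials):
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]
    m = sum(xs) / n
    ss = sum((x - m) ** 2 for x in xs)
    biased_avg += ss / n            # divisor n: systematically underestimates
    unbiased_avg += ss / (n - 1)    # divisor n - 1 (Bessel's correction)
biased_avg /= trials
unbiased_avg /= trials
```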
Uncomfortable science is the term coined by statistician John Tukey for cases in which there is a need to draw an inference from a limited sample of data, where further samples influenced by the same cause system will not be available. ...
Unit-weighted regression is perhaps the easiest form of multiple regression analysis, a method in which two or more variables are used to predict the value of an outcome. ...
An urn problem is an idealized thought experiment in which some objects of real interest (such as atoms, people, cars, etc. ...
V
In psychology, validity has two distinct fields of application. ...
In probability theory and statistics, the variance-to-mean ratio (VMR), like the coefficient of variation, is a measure of the dispersion of a probability distribution. ...
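The sample version is a one-liner; the count data below are hypothetical (a VMR near 1 is consistent with a Poisson model, below 1 suggests under-dispersion):

```python
from statistics import mean, pvariance

def vmr(xs):
    """Variance-to-mean ratio (index of dispersion), using the population variance."""
    return pvariance(xs) / mean(xs)

counts = [1, 5, 2, 6, 3, 4, 2, 5]   # hypothetical count data
d = vmr(counts)
```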
Three functions are used in geostatistics for describing the spatial or the temporal correlation of observations: these are the correlogram, the covariance and the semivariogram. ...
The VC dimension (for Vapnik-Chervonenkis dimension) is a measure of the capacity of a classification algorithm. ...
Vapnik-Chervonenkis theory (also known as VC theory) was developed during 1960-1990 by Vladimir Vapnik and Alexey Chervonenkis. ...
Figure caption: points sampled from three von Mises-Fisher distributions on the sphere (blue, green, red); the mean directions are shown with arrows. ...
In probability theory, the Vysochanskij-Petunin inequality gives a lower bound for the probability that a random variable with finite variance lies within a certain number of standard deviations of the variable's mean. ...
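For unimodal distributions the inequality bounds the tail P(|X - mean| >= lam * sd) by 4/(9 * lam^2) for lam above sqrt(8/3); a Monte Carlo sketch checks this against the (unimodal) standard normal:

```python
import random

random.seed(2)
lam = 2.0                       # number of standard deviations (must exceed sqrt(8/3))
bound = 4.0 / (9.0 * lam ** 2)  # Vysochanskij-Petunin bound, here 1/9
trials = 50000

# Standard normal: unimodal, mean 0, sd 1, so the bound applies directly.
tail = sum(abs(random.gauss(0.0, 1.0)) >= lam for _ in range(trials)) / trials
```

The simulated tail probability (about 0.05 for the normal) sits comfortably inside the bound, as expected, since the inequality holds for every unimodal distribution with finite variance.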
W
Under the Wald statistical test, named after Abraham Wald, the maximum likelihood estimate of the parameter(s) of interest is compared with a proposed value, with the assumption that the difference between the two will be approximately normal. ...
In probability theory and statistics, the Weibull distribution (named after Waloddi Weibull) is a continuous probability distribution with probability density function f(x; k, λ) = (k/λ)(x/λ)^(k−1) e^(−(x/λ)^k) for x ≥ 0, where k > 0 is the shape parameter and λ > 0 is the scale parameter of the distribution. ...
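The density can be implemented directly from the formula; the shape and scale values below are hypothetical, and a crude Riemann sum confirms the density integrates to about 1:

```python
import math

def weibull_pdf(x, k, lam):
    """f(x; k, lam) = (k/lam) * (x/lam)**(k-1) * exp(-(x/lam)**k) for x >= 0, else 0."""
    if x < 0:
        return 0.0
    return (k / lam) * (x / lam) ** (k - 1) * math.exp(-((x / lam) ** k))

# Crude numerical check that the density integrates to about 1.
k, lam = 1.5, 2.0
dx = 0.001
area = sum(weibull_pdf(i * dx, k, lam) * dx for i in range(20000))
```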
In statistics and uncertainty analysis, the Welch-Satterthwaite equation is used to calculate an approximation to the effective degrees of freedom of a linear combination of sample variances. ...
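For the common two-sample case the approximation is a single formula; the variances and sample sizes below are hypothetical (note that with equal variances and equal n it recovers the pooled value n1 + n2 - 2):

```python
def welch_satterthwaite(s1_sq, n1, s2_sq, n2):
    """Effective degrees of freedom for the combination s1^2/n1 + s2^2/n2."""
    a, b = s1_sq / n1, s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

df = welch_satterthwaite(4.0, 10, 9.0, 15)   # hypothetical sample variances and sizes
```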
A Winsorized mean is a statistical measure of central tendency, much like the mean and median, and even more similar to the truncated mean. ...
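Unlike the truncated mean, Winsorizing replaces extreme points instead of discarding them; a sketch with the same hypothetical outlier-laden data:

```python
from statistics import mean

def winsorized_mean(xs, proportion):
    """Replace floor(n * proportion) points at each end with the nearest kept value, then average."""
    s = sorted(xs)
    k = int(len(s) * proportion)
    if k:
        s[:k] = [s[k]] * k
        s[-k:] = [s[-k - 1]] * k
    return mean(s)

data = [1, 2, 3, 4, 5, 6, 7, 100]
wm = winsorized_mean(data, 0.125)   # k = 1: the 1 becomes 2 and the 100 becomes 7
```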
Respondents to a census or other surveys sometimes inaccurately report their own or other household members' ages or dates of birth. ...
In statistics the White test is a test which establishes whether the residual variance of a variable in a regression model is constant (homoskedasticity). ...
The Wilcoxon signed-rank test is a nonparametric alternative to the paired Student's t-test for the case of two related samples or repeated measurements on a single sample. ...
In signal processing, a window function (or apodization function) is a function that is zero-valued outside of some chosen interval. ...
Winsorising is the transformation of outliers in statistical data. ...
In statistics, the Wishart distribution, named in honor of John Wishart, is any of a family of probability distributions for nonnegative-definite matrix-valued random variables (random matrices). These distributions are of great importance in the estimation of covariance matrices in multivariate statistics. ...
In statistics, Wold's theorem or the Wold representation theorem, named after Herman Wold, says that every covariance-stationary time series can be written as an infinite moving average (MA(∞)) process of its innovation process. ...
X
X-12-ARIMA is the U.S. Census Bureau's software package for seasonal adjustment. ...
An X-bar/R chart is a specific member of a family of control charts. ...
Y
The Yamartino method is an algorithm for calculating the standard deviation of wind direction during a single pass through the incoming data. ...
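A sketch of the method (the wind directions below are hypothetical, clustered around north so that a naive linear standard deviation would fail across the 0/360 wrap-around):

```python
import math

def yamartino_stdev(directions_deg):
    """Single-pass estimate of the standard deviation of wind direction, in degrees."""
    n = len(directions_deg)
    sa = sum(math.sin(math.radians(d)) for d in directions_deg) / n
    ca = sum(math.cos(math.radians(d)) for d in directions_deg) / n
    eps = math.sqrt(max(0.0, 1.0 - (sa * sa + ca * ca)))
    sigma = math.asin(eps) * (1.0 + (2.0 / math.sqrt(3.0) - 1.0) * eps ** 3)
    return math.degrees(sigma)

# Hypothetical wind directions clustered around north (0/360 degrees).
spread = yamartino_stdev([350.0, 10.0, 5.0, 355.0, 0.0])
```

For this sample the estimate is close to 7 degrees, matching the ordinary standard deviation of the equivalent angles -10, 10, 5, -5, 0.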
Yates correction for continuity, or Yates chi-square test, adjusts the formula for Pearson's chi-square test by subtracting 0.5 from the absolute difference between each observed value and its expected value. ...
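The corrected statistic is always at most the uncorrected one; a sketch over a hypothetical 2x2 table flattened to four cells:

```python
def yates_chi_square(observed, expected):
    """Pearson chi-square with Yates's continuity correction applied to each cell."""
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical 2x2 table flattened to four cells, with its expected counts.
obs, exp = [20, 30, 30, 20], [25, 25, 25, 25]
chi2 = yates_chi_square(obs, exp)
uncorrected = sum((o - e) ** 2 / e for o, e in zip(obs, exp))
```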
Youden's J statistic is a single statistic that captures the performance of a diagnostic test. ...
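J combines sensitivity and specificity into one number; the confusion counts below are hypothetical:

```python
def youden_j(tp, fn, tn, fp):
    """J = sensitivity + specificity - 1; ranges from -1 to 1, with 0 for a useless test."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0

j = youden_j(tp=80, fn=20, tn=90, fp=10)   # hypothetical confusion counts
```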
In probability and statistics, the Yule-Simon distribution is a discrete probability distribution. ...
Z
z-score
z-factor
z-statistic
Zipf-Mandelbrot law
