Pareto distribution
Probability density function (figure): Pareto probability density functions for various k with $x_\mathrm{m} = 1$. The horizontal axis is x. Note that as k → ∞ the distribution approaches δ(x − x_m), where δ is the Dirac delta function.

Cumulative distribution function (figure): Pareto cumulative distribution functions for various k with $x_\mathrm{m} = 1$. The horizontal axis is x.

Parameters: $x_\mathrm{m} > 0$, location (real); $k > 0$, shape (real)
Support: $x \in [x_\mathrm{m}, +\infty)$
pdf: $\frac{k\,x_\mathrm{m}^k}{x^{k+1}}$
cdf: $1-\left(\frac{x_\mathrm{m}}{x}\right)^k$
Mean: $\frac{k\,x_\mathrm{m}}{k-1}$ for k > 1
Median: $x_\mathrm{m} \sqrt[k]{2}$
Mode: $x_\mathrm{m}$
Variance: $\frac{x_\mathrm{m}^2 k}{(k-1)^2(k-2)}$ for k > 2
Skewness: $\frac{2(1+k)}{k-3}\,\sqrt{\frac{k-2}{k}}$ for k > 3
Kurtosis (excess): $\frac{6(k^3+k^2-6k-2)}{k(k-3)(k-4)}$ for k > 4
Entropy: $\ln\left(\frac{x_\mathrm{m}}{k}\right) + \frac{1}{k} + 1$
mgf: undefined; see text for raw moments
Char. func.: $k(-i x_\mathrm{m} t)^k\,\Gamma(-k,-i x_\mathrm{m} t)$

Pareto originally used this distribution to describe the allocation of wealth among individuals, since it seemed to show rather well how a larger portion of the wealth of any society is owned by a smaller percentage of the people in that society. This idea is sometimes expressed more simply as the Pareto principle or the "80-20 rule", which says that 20% of the population owns 80% of the wealth. The probability density function (PDF) graph shows that the "probability" or fraction of the population f(x) that owns a small amount of wealth per person (x) is rather high, and then decreases steadily as wealth increases. The distribution is not limited to describing wealth or income; it also applies to many situations in which an equilibrium is found in the distribution of the "small" to the "large". The following examples are sometimes seen as approximately Pareto-distributed:

• Frequencies of words in longer texts
• The size of human settlements (few cities, many hamlets/villages)
• File size distribution of Internet traffic which uses the TCP protocol (many smaller files, few larger ones)
• Clusters of Bose-Einstein condensate near absolute zero
• The value of oil reserves in oil fields (a few large fields, many small fields)
• The length distribution of jobs assigned to supercomputers (a few large ones, many small ones)
• The standardized price returns on individual stocks
• Size of sand particles
• Size of meteorites
• Number of species per genus (note the subjectivity involved: the tendency to divide a genus into two or more increases with the number of species in it)
• Areas burnt in forest fires


If X is a random variable with a Pareto distribution, then the probability that X is greater than some number x is given by:

$P(X>x)=\left(\frac{x}{x_\mathrm{m}}\right)^{-k}$

for all $x \ge x_\mathrm{m}$, where $x_\mathrm{m}$ is the (necessarily positive) minimum possible value of X, and k is a positive parameter. The family of Pareto distributions is parameterized by two quantities, $x_\mathrm{m}$ and k. When this distribution is used to model the distribution of wealth, the parameter k is called the Pareto index.

It follows that the probability density function is

$f(x;k,x_\mathrm{m}) = k\,\frac{x_\mathrm{m}^k}{x^{k+1}} \quad \mbox{for } x \ge x_\mathrm{m}.$
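
As an illustration, the survival function and the density above can be written as small Python functions. This is a minimal sketch: the function names are hypothetical, and the finite-difference check only suggests numerically that the density is the negative derivative of P(X > x).

```python
def pareto_sf(x, k, xm):
    """Survival function P(X > x); equals 1 below the minimum value xm."""
    if x < xm:
        return 1.0
    return (x / xm) ** (-k)

def pareto_cdf(x, k, xm):
    """Cumulative distribution function, 1 - P(X > x)."""
    return 1.0 - pareto_sf(x, k, xm)

def pareto_pdf(x, k, xm):
    """Density k * xm**k / x**(k + 1) for x >= xm, and 0 below xm."""
    if x < xm:
        return 0.0
    return k * xm ** k / x ** (k + 1)

# The density should match the negative slope of the survival function.
k, xm, x, h = 3.0, 1.0, 2.0, 1e-6
slope = (pareto_sf(x + h, k, xm) - pareto_sf(x - h, k, xm)) / (2 * h)
print(pareto_pdf(x, k, xm), -slope)
```

The central difference is used only as a sanity check; the closed forms are what one would normally evaluate.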

Pareto distributions are continuous probability distributions. Zipf's law, also sometimes called the zeta distribution, may be thought of as a discrete counterpart of the Pareto distribution.

The expected value of a random variable following a Pareto distribution is

$E(X)=\frac{k x_\mathrm{m}}{k-1}$

(if k ≤ 1, the expected value is infinite). Its variance is

$\mathrm{var}(X)=\left(\frac{x_\mathrm{m}}{k-1}\right)^2 \frac{k}{k-2}$

(if $k \le 2$, the variance is infinite). The raw moments are found to be:

$\mu_n'=\frac{k x_\mathrm{m}^n}{k-n}$

but they are only defined for k > n. This means that the moment generating function, which is just a Taylor series in t with $\mu_n'/n!$ as coefficients, is not defined. The characteristic function is given by:

$\varphi(t;k,x_\mathrm{m})=k(-i x_\mathrm{m} t)^k\,\Gamma(-k,-i x_\mathrm{m} t)$

where Γ(a,x) is the incomplete Gamma function. The Pareto distribution is related to the exponential distribution: if X is Pareto-distributed with parameters k and $x_\mathrm{m}$, then $\ln(X/x_\mathrm{m})$ is exponentially distributed with rate k:

$\ln(X/x_\mathrm{m}) \sim \mathrm{Exponential}(k)$
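
One practical use of this relation is sampling: an exponential variate with rate k can be turned into a Pareto variate by exponentiating it and scaling by the minimum value. The following is a minimal sketch using only the Python standard library; the function name `sample_pareto` and the chosen parameters are assumptions made for illustration.

```python
import math
import random

def sample_pareto(k, xm, rng):
    """Draw one Pareto(k, xm) value as xm * exp(Y), with Y ~ Exponential(rate k)."""
    y = rng.expovariate(k)  # exponential variate with rate k (mean 1/k)
    return xm * math.exp(y)

rng = random.Random(0)  # fixed seed so the run is repeatable
k, xm = 2.5, 1.0
draws = [sample_pareto(k, xm, rng) for _ in range(100_000)]

# Compare the empirical tail frequency with P(X > x) = (x / xm) ** (-k) at x = 2.
x = 2.0
empirical = sum(d > x for d in draws) / len(draws)
print(empirical, (x / xm) ** (-k))
```

With this many draws the empirical tail frequency should land close to the closed-form value, although the exact figure depends on the seed.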

The Dirac delta function is a limiting case of the Pareto distribution:

$\lim_{k\rightarrow \infty} f(x;k,x_\mathrm{m})=\delta(x-x_\mathrm{m}).$
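
The mean, variance and raw-moment formulas given earlier in this section can be checked against one another, and against the limiting behaviour for large k. This is a minimal sketch with hypothetical function names; it evaluates the closed forms only and does not simulate anything.

```python
def raw_moment(n, k, xm):
    """n-th raw moment k * xm**n / (k - n); infinite when k <= n."""
    if k <= n:
        return float("inf")
    return k * xm ** n / (k - n)

def mean(k, xm):
    """Expected value, i.e. the first raw moment."""
    return raw_moment(1, k, xm)

def variance(k, xm):
    """Closed-form variance (xm / (k - 1))**2 * k / (k - 2); infinite when k <= 2."""
    if k <= 2:
        return float("inf")
    return (xm / (k - 1)) ** 2 * k / (k - 2)

k, xm = 3.0, 2.0
# The variance should equal the second raw moment minus the squared mean.
print(mean(k, xm), variance(k, xm), raw_moment(2, k, xm) - mean(k, xm) ** 2)

# For very large k the mean approaches xm and the variance approaches 0,
# in line with the distribution concentrating at xm.
print(mean(1e6, xm), variance(1e6, xm))
```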

Pareto, Lorenz, and Gini

Lorenz curves for a number of Pareto distributions. Note that k = ∞ corresponds to a perfectly equal distribution (G = 0) and the k = 1 line corresponds to complete inequality (G = 1).

The Lorenz curve is often used to characterize income and wealth distributions. For any distribution, the Lorenz curve L(F) is written in terms of the PDF (f(x)) or the CDF (F(x)) as:

$L(F)=\frac{\int_{x_\mathrm{m}}^{x(F)} x f(x)\,dx}{\int_{x_\mathrm{m}}^\infty x f(x)\,dx} =\frac{\int_0^F x(F')\,dF'}{\int_0^1 x(F')\,dF'}$

where x(F) is the inverse of the CDF. For the Pareto distribution,

$x(F)=\frac{x_\mathrm{m}}{(1-F)^{1/k}}$

and the Lorenz curve is calculated to be:

$L(F) = 1-(1-F)^{1-1/k},$

where k must be greater than or equal to unity, since the denominator in the expression for L(F) is just the mean value of x. Examples of the Lorenz curve for a number of Pareto distributions are shown in the graph on the right.
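
The Lorenz curve also links the shape parameter to the "80-20 rule" mentioned earlier: one can ask for which k the poorest 80% of the population hold 20% of the wealth, that is, L(0.8) = 0.2. The sketch below solves that equation in closed form; the function and variable names are illustrative assumptions.

```python
import math

def lorenz(F, k):
    """Lorenz curve L(F) = 1 - (1 - F)**(1 - 1/k) of a Pareto distribution, k > 1."""
    return 1.0 - (1.0 - F) ** (1.0 - 1.0 / k)

# Solve L(0.8) = 0.2 for k:
#   0.2**(1 - 1/k) = 0.8, so 1 - 1/k = ln(0.8) / ln(0.2).
k_80_20 = 1.0 / (1.0 - math.log(0.8) / math.log(0.2))
print(k_80_20, lorenz(0.8, k_80_20))
```

The resulting value simplifies to ln 5 / ln 4, roughly 1.16, so an exact 80-20 split corresponds to a fairly heavy tail.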

The Gini coefficient is a measure of the deviation of the Lorenz curve from the equidistribution line, which is a line connecting [0,0] and [1,1] and is shown in black (k = ∞) in the Lorenz plot. Specifically, the Gini coefficient is twice the area between the Lorenz curve and the equidistribution line. The Gini coefficient for the Pareto distribution is then calculated to be:

$G = 1-2\int_0^1 L(F)\,dF = \frac{1}{2k-1}$

(see Aaberge 2005).
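
The closed form for G can be compared with a direct numerical integration of the Lorenz curve. This is a rough sketch that assumes a simple midpoint rule is accurate enough; the function names and the number of steps are arbitrary choices, not part of the cited result.

```python
def lorenz(F, k):
    """Lorenz curve L(F) = 1 - (1 - F)**(1 - 1/k) of a Pareto distribution."""
    return 1.0 - (1.0 - F) ** (1.0 - 1.0 / k)

def gini_numeric(k, steps=100_000):
    """Approximate G = 1 - 2 * integral of L(F) over [0, 1] with the midpoint rule."""
    h = 1.0 / steps
    area = sum(lorenz((i + 0.5) * h, k) for i in range(steps)) * h
    return 1.0 - 2.0 * area

def gini_closed_form(k):
    """Closed form G = 1 / (2k - 1)."""
    return 1.0 / (2.0 * k - 1.0)

for k in (1.5, 2.0, 3.0):
    print(k, gini_numeric(k), gini_closed_form(k))
```

The two columns should agree to several decimal places; agreement degrades slowly as k approaches 1, where the Lorenz curve becomes steep near F = 1.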

Parameter estimation

The likelihood function for the Pareto distribution parameters k and $x_\mathrm{m}$, given a sample $x = (x_1, x_2, \ldots, x_n)$, is

$L(k, x_\mathrm{m}) = \prod_{i=1}^n k \frac{x_\mathrm{m}^k}{x_i^{k+1}} = k^n x_\mathrm{m}^{nk} \prod_{i=1}^n \frac{1}{x_i^{k+1}}.$

Therefore, the logarithmic likelihood function is

$\ell(k, x_\mathrm{m}) = n \ln k + nk \ln x_\mathrm{m} - (k + 1) \sum_{i=1}^n \ln x_i.$

It can be seen that $\ell(k, x_\mathrm{m})$ is monotonically increasing with $x_\mathrm{m}$, that is, the greater the value of $x_\mathrm{m}$, the greater the value of the likelihood function. Hence, since every observation must satisfy $x_i \ge x_\mathrm{m}$, we conclude that

$\widehat{x}_\mathrm{m} = \min_i x_i.$

To find the estimator for k, we compute the corresponding partial derivative and determine where it is zero:

$\frac{\partial \ell}{\partial k} = \frac{n}{k} + n \ln x_\mathrm{m} - \sum_{i=1}^n \ln x_i = 0.$

Thus the maximum likelihood estimator for k is:

$\widehat{k} = \frac{n}{\sum_i \left( \ln x_i - \ln \widehat{x}_\mathrm{m} \right)}.$
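
The two estimators can be tried on synthetic data. The sketch below is illustrative only: it generates a sample by inverting the CDF, applies the formulas above, and prints the estimates. The function name `pareto_mle`, the seed and the "true" parameter values are assumptions chosen for the demonstration.

```python
import math
import random

def pareto_mle(sample):
    """Return (k_hat, xm_hat) from the maximum likelihood formulas.

    Assumes the sample has at least one value strictly above its minimum,
    otherwise the sum of log ratios is zero and k_hat is undefined.
    """
    xm_hat = min(sample)
    log_sum = sum(math.log(x) - math.log(xm_hat) for x in sample)
    k_hat = len(sample) / log_sum
    return k_hat, xm_hat

# Synthetic data by inverse transform: x = xm / (1 - u)**(1/k), u uniform on [0, 1).
rng = random.Random(42)
true_k, true_xm = 3.0, 2.0
sample = [true_xm / (1.0 - rng.random()) ** (1.0 / true_k) for _ in range(50_000)]

k_hat, xm_hat = pareto_mle(sample)
print(k_hat, xm_hat)
```

With a sample of this size the estimates should fall near the values used to generate the data; the minimum-based estimate of the location can never fall below the true minimum.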

References

• Lorenz, M. O. (1905). Methods of measuring the concentration of wealth. Publications of the American Statistical Association. 9: 209-219.

