FACTOID # 20: Statistically, Delaware bears more cost of the US Military than any other state.

 Home Encyclopedia Statistics States A-Z Flags Maps FAQ About

 WHAT'S NEW

SEARCH ALL

Search encyclopedia, statistics and forums:

(* = Graphable)

Encyclopedia > Markov chain Monte Carlo

Markov chain Monte Carlo (MCMC) methods (which include random walk Monte Carlo methods) are a class of algorithms for sampling from probability distributions based on constructing a Markov chain that has the desired distribution as its stationary distribution. The state of the chain after a large number of steps is then used as a sample from the desired distribution. The quality of the sample improves as a function of the number of steps. In mathematics, a (discrete-time) Markov chain, named after Andrei Markov, is a discrete-time stochastic process with the Markov property. ... Monte Carlo methods are a class of computational algorithms for simulating the behavior of various physical and mathematical systems. ... In mathematics and physics, a random walk is a formalization of the intuitive idea of taking successive steps, each in a random direction. ... In mathematics, a probability distribution assigns to every interval of the real numbers a probability, so that the probability axioms are satisfied. ... In mathematics, a (discrete-time) Markov chain, named after Andrei Markov, is a discrete-time stochastic process with the Markov property. ... In mathematics, a (discrete-time) Markov chain, named after Andrei Markov, is a discrete-time stochastic process with the Markov property. ...

Usually it is not hard to construct a Markov Chain with the desired properties. The more difficult problem is to determine how many steps are needed to converge to the stationary distribution within an acceptable error. A good chain will have rapid mixingâ€”the stationary distribution is reached quickly starting from an arbitrary position. Tools for proving rapid mixing include arguments based on conductance and the coupling method. Conductance can refer to: Electrical conductance, the reciprocal of electrical resistance. ...

Typical use of MCMC sampling can only approximate the target distribution, as there is always some residual effect of the starting position. More sophisticated MCMC-based algorithms such as coupling from the past can produce exact samples, at the cost of additional computation and an unbounded (though finite on average) running time. To analyze an algorithm is to determine the amount of resources (such as time and storage) necessary to execute it. ...

The most common application of these algorithms is numerically calculating multi-dimensional integrals. In these methods, an ensemble of "walkers" moves around randomly. At each point where the walker steps, the integrand value at that point is counted towards the integral. The walker then may make a number of tentative steps around the area, looking for a place with reasonably high contribution to the integral to move into next. Random walk methods are a kind of random simulation or Monte Carlo method. However, whereas the random samples of the integrand used in a conventional Monte Carlo integration are statistically independent, those used in MCMC are correlated. A Markov chain is constructed in such a way as to have the integrand as its equilibrium distribution. Surprisingly, this is often easy to do. In calculus, the integral of a function is a generalization of area, mass, volume, sum, and total. ... In physics, a statistical ensemble is a very large set of similar systems, considered all at once. ... Monte Carlo methods are a class of computational algorithms for simulating the behavior of various physical and mathematical systems. ... In probability theory, to say that two events are independent intuitively means that knowing whether or not one of them occurs makes it neither more probable nor less probable that the other occurs. ... In probability theory and statistics, correlation, also called correlation coefficient, is a numeric measure of the strength of linear relationship between two random variables. ... In mathematics, a (discrete-time) Markov chain, named after Andrei Markov, is a discrete-time stochastic process with the Markov property. ...

Multi-dimensional integrals often arise in Bayesian statistics and computational physics, so Markov chain Monte Carlo methods are widely used in those fields. Bayesian inference is statistical inference in which probabilities are interpreted not as frequencies or proportions or the like, but rather as degrees of belief. ... Computational physics is the study and implementation of numerical algorithms in order to solve problems in physics for which a quantitative theory already exists. ...

## Contents

Many Markov chain Monte Carlo methods move around the equilibrium distribution in relatively small steps, with no tendency for the steps to proceed in the same direction. These methods are easy to implement and analyse, but unfortunately it can take a long time for the walker to explore all of the space. The walker will often double back and cover ground already covered. Here are some random walk MCMC methods:

• Metropolis-Hastings algorithm: Generates a random walk using a proposal density and a method for rejecting proposed moves.
• Gibbs sampling: Requires that all the conditional distributions of the target distribution can be sampled exactly. Popular partly because when this is so, the method does not require any `tuning'.
• Slice sampling: Depends on the principle that one can sample from a distribution by sampling uniformly from the region under the plot of its density function. This method alternates uniform sampling in the vertical direction with uniform sampling from the horizontal `slice' defined by the current vertical position.

The Sampling distribution Q determines the next point to move to in the random walk. ... In mathematics and physics, a random walk is a formalization of the intuitive idea of taking successive steps, each in a random direction. ... In mathematics and physics, Gibbs sampling is an algorithm to generate a sequence of samples from the joint probability distribution of two or more random variables. ... Given two jointly distributed random variables X and Y, the conditional probability distribution of Y given X (written Y | X) is the probability distribution of Y when X is known to be a particular value. ...

## Avoiding random walks

More sophisticated algorithms use some method of preventing the walker from doubling back. These algorithms may be harder to implement, but may exhibit faster convergence (i.e. fewer steps for an accurate result).

• Simultaneous over-relaxation: can be seen as a variation on Gibbs sampling, which sometimes avoids random walks.
• Hybrid Monte Carlo (HMC) (Would be better called `Hamiltonian Monte Carlo'): Tries to avoid random walk behaviour by introducing an auxiliary momentum vector and implementing Hamiltonian dynamics where the potential function is the target density. The momentum samples are discarded after sampling. The end result of Hybrid MCMC is that proposals move across the sample space in larger steps and are therefore less correlated and converge to the target distribution more rapidly.
• Some variations on slice sampling also avoid random walks.

In physics, momentum is the product of the mass and velocity of an object. ... Hamiltonian mechanics is a re-formulation of classical mechanics that was invented in 1833 by William Rowan Hamilton. ...

## Changing dimension

The Reversible Jump method is a variant of Metropolis-Hastings that allows proposals that change the dimensionality of the space.

## References

• Bernd A. Berg. "Markov Chain Monte Carlo Simulations and Their Statistical Analysis". Singapore, World Scientific 2004.
• George Casella and Edward I. George. "Explaining the Gibbs sampler". The American Statistician, 46:167-174, 1992. (Basic summary and many references.)
• A.E. Gelfand and A.F.M. Smith. "Sampling-Based Approaches to Calculating Marginal Densities". J. American Statistical Association, 85:398-409, 1990.
• Andrew Gelman, John B. Carlin, Hal S. Stern, and Donald B. Rubin. Bayesian Data Analysis. London: Chapman and Hall. First edition, 1995. (See Chapter 11.)
• S. Geman and D. Geman. "Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images". IEEE Transactions on Pattern Analysis and Machine Intelligence, 6:721-741, 1984.
• Radford M. Neal, Probabilistic Inference Using Markov Chain Monte Carlo Methods, 1993.
• C.P. Robert and G. Casella. "Monte Carlo Statistical Methods" (second edition). New York: Springer-Verlag, 2004.

Results from FactBites:

 Markov Chain Monte Carlo - MLpedia (704 words) Markov chain Monte Carlo (MCMC) methods, sometimes called random walk Monte Carlo methods, are a class of algorithms for sampling from probability distributions based on constructing a Markov chain that has the desired distribution as its stationary distribution. A Markov chain is constructed in such a way as to have the integrand as its equilibrium distribution. Hybrid Markov chain Monte Carlo: Tries to avoid random walk behaviour by introducing an auxiliary momentum vector and implementing Hamiltonian dynamics where the potential function is the target density.
 The World-Wide Web Virtual Library: Random Numbers and Monte Carlo methods (1022 words) MCMC Preprint Service: this link provides a list of all registered papers on Markov chain Monte Carlo methodology currently submitted for publication. The Molecular Monte Carlo Home Page is meant to serve as an information resource for those who use "random walks" (stochastic methods) to simulate and analyze molecular systems throughout the world. The EGS4 computer code system is a general purpose package for the Monte Carlo simulation of the coupled transport of electrons and photons in an arbitrary geometry for particles with energies from a few keV up to several TeV.
More results at FactBites »

Share your thoughts, questions and commentary here