# Conditional probability

This article defines some terms which characterize probability distributions of two or more variables.

Conditional probability is the probability of some event A, given the occurrence of some other event B. Conditional probability is written P(A|B), and is read "the probability of A, given B".

Joint probability is the probability of two events in conjunction. That is, it is the probability of both events together. The joint probability of A and B is written $P(A \cap B)$ or P(A,B).

Marginal probability is the probability of one event, regardless of the other event. Marginal probability is obtained by summing (or integrating, more generally) the joint probability over the unrequired event. This is called marginalization. The marginal probability of A is written P(A), and the marginal probability of B is written P(B).

In these definitions, note that there need not be a causal or temporal relation between A and B. A may precede B or vice versa, or they may happen at the same time. A may cause B or vice versa, or they may have no causal relation at all. Notice, however, that causal and temporal relations are informal notions, not belonging to the probabilistic framework. They may apply in some examples, depending on the interpretation given to events.

Conditioning of probabilities, i.e. updating them to take account of (possibly new) information, may be achieved through Bayes' theorem.

## Definition

Given a probability space $(\Omega, F, P)$ and two events $A, B \in F$ with P(B) > 0, the conditional probability of A given B is defined by

$P(A \mid B) = \frac{P(A \cap B)}{P(B)}.$

If P(B) = 0, then $P(A \mid B)$ is undefined.
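As an illustration, the definition can be checked on a small finite sample space; the dice events below are invented for this sketch:

```python
from fractions import Fraction

# Sample space: all ordered rolls of two fair dice (36 equally likely outcomes).
outcomes = [(i, j) for i in range(1, 7) for j in range(1, 7)]

def prob(event):
    """P(event) under the uniform measure on the 36 outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

A = lambda o: o[0] == 6           # first die shows 6
B = lambda o: o[0] + o[1] == 10   # the sum is 10

# P(A | B) = P(A and B) / P(B): only (6, 4) lies in both events.
p_A_given_B = prob(lambda o: A(o) and B(o)) / prob(B)
print(p_A_given_B)  # 1/3
```

Of the three outcomes with sum 10, namely (4,6), (5,5) and (6,4), exactly one has a 6 on the first die, so conditioning on B raises the probability of A from 1/6 to 1/3.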

## Statistical independence

Two random events A and B are statistically independent if and only if

$P(A \cap B) = P(A) P(B).$

Thus, if A and B are independent, then their joint probability can be expressed as a simple product of their individual probabilities.

Equivalently, for two independent events A and B with nonzero probabilities,

$P(A \mid B) = P(A)$

and

$P(B \mid A) = P(B).$

In other words, if A and B are independent, then the conditional probability of A, given B is simply the individual probability of A alone; likewise, the probability of B given A is simply the probability of B alone.
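A minimal numeric check of the factorization, using invented dice events (one depending only on the first die, one only on the second, so they are independent by construction):

```python
from fractions import Fraction

outcomes = [(i, j) for i in range(1, 7) for j in range(1, 7)]

def prob(event):
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

A = lambda o: o[0] % 2 == 0   # first die is even:   P(A) = 1/2
B = lambda o: o[1] > 3        # second die is 4-6:   P(B) = 1/2

# Independence: the joint probability factors into the product of marginals.
assert prob(lambda o: A(o) and B(o)) == prob(A) * prob(B)  # both equal 1/4
```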

## Mutual exclusivity

Two events A and B are mutually exclusive if and only if

$P(A \cap B) = 0$

as long as

$P(A) \ne 0$

and

$P(B) \ne 0.$

Then

$P(A \mid B) = 0$

and

$P(B \mid A) = 0.$

In other words, the probability of A happening, given that B happens, is nil since A and B cannot both happen in the same situation; likewise, the probability of B happening, given that A happens, is also nil.
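For instance (a sketch with an invented single-die example), a die cannot show both 1 and 6, so the joint and conditional probabilities vanish:

```python
from fractions import Fraction

outcomes = range(1, 7)  # one fair die

def prob(event):
    return Fraction(sum(1 for o in outcomes if event(o)), 6)

A = lambda o: o == 1
B = lambda o: o == 6

p_joint = prob(lambda o: A(o) and B(o))  # mutually exclusive, so 0
p_A_given_B = p_joint / prob(B)          # also 0
```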

## Other considerations

• If B is an event and P(B) > 0, then the function Q defined by Q(A) = P(A | B) for all events A is a probability measure.


## The conditional probability fallacy

The conditional probability fallacy is the assumption that P(A|B) is approximately equal to P(B|A). The mathematician John Allen Paulos discusses this in his book Innumeracy, where he points out that it is a mistake often made even by doctors, lawyers, and other highly educated non-statisticians. It can be overcome by describing the data in actual numbers rather than probabilities.

The relation between P(A|B) and P(B|A) is as follows:

$P(B \mid A) = P(A \mid B) \cdot \frac{P(B)}{P(A)}.$
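This relation can be applied directly; the helper function and the numbers below are hypothetical, purely to illustrate the formula:

```python
def flip_conditional(p_a_given_b, p_b, p_a):
    """Return P(B|A) from P(A|B), via P(B|A) = P(A|B) * P(B) / P(A)."""
    return p_a_given_b * p_b / p_a

# Hypothetical values: P(A|B) = 0.9, P(B) = 0.01, P(A) = 0.2.
p_b_given_a = flip_conditional(0.9, 0.01, 0.2)  # ~ 0.045
```

Even when P(A|B) is large, P(B|A) can be small if B is much rarer than A, which is exactly the effect exploited in the screening example below.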

### An example

In the following constructed but realistic situation, the difference between P(A|B) and P(B|A) may be surprising, but is at the same time obvious.

In order to identify individuals having a serious disease in an early curable form, one may consider screening a large group of people. While the benefits are obvious, an argument against such screenings is the disturbance caused by false positive screening results: If a person not having the disease is incorrectly found to have it by the initial test, they will most likely be quite distressed until a more careful test shows that they do not have the disease. Even after being told they are well, their lives may be affected negatively.

The magnitude of this problem is best understood in terms of conditional probabilities.

Suppose 1% of the group suffer from the disease, and the rest are well. Choosing an individual at random,

P(disease) = 1% = 0.01 and P(well) = 99% = 0.99.

Suppose that when the screening test is applied to a person not having the disease, there is a 1% chance of getting a false positive result, i.e.

P(positive | well) = 1%, and P(negative | well) = 99%.

Finally, suppose that when the test is applied to a person having the disease, there is a 1% chance of a false negative result, i.e.

P(negative | disease) = 1% and P(positive | disease) = 99%.

Now, calculation shows that:

$P(\text{well} \cap \text{negative}) = P(\text{well}) \times P(\text{negative} \mid \text{well}) = 99\% \times 99\% = 98.01\%$ is the fraction of the whole group being well and testing negative.
$P(\text{disease} \cap \text{positive}) = P(\text{disease}) \times P(\text{positive} \mid \text{disease}) = 1\% \times 99\% = 0.99\%$ is the fraction of the whole group being ill and testing positive.
$P(\text{well} \cap \text{positive}) = P(\text{well}) \times P(\text{positive} \mid \text{well}) = 99\% \times 1\% = 0.99\%$ is the fraction of the whole group having false positive results.
$P(\text{disease} \cap \text{negative}) = P(\text{disease}) \times P(\text{negative} \mid \text{disease}) = 1\% \times 1\% = 0.01\%$ is the fraction of the whole group having false negative results.

Furthermore,

$P(\text{positive}) = P(\text{well} \cap \text{positive}) + P(\text{disease} \cap \text{positive}) = 0.99\% + 0.99\% = 1.98\%$ is the fraction of the whole group testing positive.
$P(\text{disease} \mid \text{positive}) = \frac{P(\text{disease} \cap \text{positive})}{P(\text{positive})} = \frac{0.99\%}{1.98\%} = 50\%$ is the probability that you actually have the disease if you tested positive.

In this example, it should be easy to relate to the difference between P(positive|disease)=99% and P(disease|positive)=50%: The first is the conditional probability that you test positive if you have the disease; the second is the conditional probability that you have the disease if you test positive. With the numbers chosen here, the last result is likely to be deemed unacceptable: Half the people testing positive are actually false positives.
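The whole calculation can be reproduced in a few lines (a sketch using the article's numbers):

```python
p_disease = 0.01
p_well = 0.99
p_pos_given_well = 0.01      # false positive rate
p_pos_given_disease = 0.99   # true positive rate

# Total probability of testing positive (law of total probability).
p_pos = p_well * p_pos_given_well + p_disease * p_pos_given_disease

# Bayes' theorem: probability of disease given a positive test.
p_disease_given_pos = p_disease * p_pos_given_disease / p_pos
# p_pos is about 1.98%, and p_disease_given_pos is about 50%.
```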

