This article needs additional citations for verification. Please help this article by adding reliable references. Unsourced material may be challenged and removed.(September 2007)  Information theory is a branch of applied mathematics and engineering involving the quantification of information to find fundamental limits on compressing and reliably communicating data. A key measure of information that comes up in the theory is known as information entropy, which is usually expressed by the average number of bits needed for storage or communication. Intuitively, entropy quantifies the uncertainity involved in a random variable. For example, a fair coin flip will have less entropy than a roll of a die. Information and communication technology spending in 2005 Information technology (IT), as defined by the Information Technology Association of America (ITAA), is the study, design, development, implementation, support or management of computerbased information systems, particularly software applications and computer hardware. ...
The Ancient Library of Alexandria, an early form of information storage and retrieval. ...
Informatics includes the science of information, the practice of information processing, and the engineering of information systems. ...
Applied mathematics is a branch of mathematics that concerns itself with the mathematical techniques typically used in the application of mathematical knowledge to other domains. ...
Claude Shannon In information theory, the Shannon entropy or information entropy is a measure of the uncertainty associated with a random variable. ...
In probability theory, a random variable is a quantity whose values are random and to which a probability distribution is assigned. ...
Applications of fundamental topics of information theory include lossless data compression (e.g. ZIP files), lossy data compression (e.g. MP3s), and channel coding (e.g. for DSL lines). The field is at the crossroads of mathematics, statistics, computer science, physics, neurobiology, and electrical engineering. Its impact has been crucial to success of the Voyager missions to deep space, the invention of the CD, the feasibility of mobile phones, the development of the Internet, the study of linguistics and of human perception, the understanding of black holes, and numerous other fields. Important subfields of information theory are source coding, channel coding, algorithmic complexity theory, algorithmic information theory, and measures of information. Lossless data compression is a class of data compression algorithms that allows the exact original data to be reconstructed from the compressed data. ...
The ZIP file format is a popular data compression and archival format. ...
A lossy data compression method is one where compressing data and then decompressing it retrieves data that may well be different from the original, but is close enough to be useful in some way. ...
For other uses, see MP3 (disambiguation). ...
This article does not cite any references or sources. ...
DSL may refer to: Damn Small Linux Dark and Shattered Lands, a MUD based loosely on Forgotten Realms and Dragonlance books. ...
Euclid, Greek mathematician, 3rd century BC, as imagined by by Raphael in this detail from The School of Athens. ...
This article is about the field of statistics. ...
Computer science, or computing science, is the study of the theoretical foundations of information and computation and their implementation and application in computer systems. ...
A magnet levitating above a hightemperature superconductor demonstrates the Meissner effect. ...
Neurobiology is the study of cells of the nervous system and the organization of these cells into functional circuits that process information and mediate behavior. ...
Electrical Engineers design power systemsâ€¦ â€¦ and complex electronic circuits. ...
Voyager Project redirects here. ...
Linguistics is the scientific study of language, which can be theoretical or applied. ...
For other uses, see Black hole (disambiguation). ...
Overview
The main concepts of information theory can be grasped by considering the most widespread means of human communication: language. Two important aspects of a good language are as follows: First, the most common words (e.g., "a," "the," "I") should be shorter than less common words (e.g., "benefit," "generation," "mediocre"), so that sentences will not be too long. Such a tradeoff in word length is analogous to data compression and is the essential aspect of source coding. Second, if part of a sentence is unheard or misheard due to noise—e.g., a passing car—the listener should still be able to glean the meaning of the underlying message. Such robustness is as essential for an electronic communication system as it is for a language; properly building such robustness into communications is done by channel coding. Source coding and channel coding are the fundamental concerns of information theory. â€œSource codingâ€ redirects here. ...
It has been suggested that this article or section be merged with Entropy encoding. ...
This article does not cite any references or sources. ...
Note that these concerns have nothing to do with the importance of messages. For example, a platitude such as "Thank you; come again" takes about as long to say or write as the urgent plea, "Call an ambulance!" while clearly the latter is more important and more meaningful. Information theory, however, does not involve message importance or meaning, as these are matters of the quality of data rather than the quantity of data, the latter of which is determined solely by probabilities. Information theory is generally considered to have been founded in 1948 by Claude Shannon in his seminal work, "A Mathematical Theory of Communication." The central paradigm of classical information theory is the engineering problem of the transmission of information over a noisy channel. The most fundamental results of this theory are Shannon's source coding theorem, which establishes that, on average, the number of bits needed to represent the result of an uncertain event is given by its entropy; and Shannon's noisychannel coding theorem, which states that reliable communication is possible over noisy channels provided that the rate of communication is below a certain threshold called the channel capacity. The channel capacity can be approached by using appropriate encoding and decoding systems. Claude Shannon Claude Elwood Shannon (April 30, 1916 â€“ February 24, 2001), an American electrical engineer and mathematician, has been called the father of information theory, and was the founder of practical digital circuit design theory. ...
The article entitled A Mathematical Theory of Communication, published in 1948 by mathematician Claude E. Shannon, was one of the founding works of the field of information theory. ...
In information theory, the source coding theorem (Shannon 1948) informally states that: N i. ...
Claude Shannon In information theory, the Shannon entropy or information entropy is a measure of the uncertainty associated with a random variable. ...
In information theory, the noisy channel coding theorem establishes that however contaminated with noise interference a communication channel may be, it is possible to communicate digital data (information) errorfree up to a given maximum rate through the channel. ...
Information theory is closely associated with a collection of pure and applied disciplines that have been investigated and reduced to engineering practice under a variety of rubrics throughout the world over the past half century or more: adaptive systems, anticipatory systems, artificial intelligence, complex systems, complexity science, cybernetics, informatics, machine learning, along with systems sciences of many descriptions. Information theory is a broad and deep mathematical theory, with equally broad and deep applications, amongst which is the vital field of coding theory. An adaptive system is a system that is able to adapt its behavior according to changes in its environment or in parts of the system itself. ...
In artificial intelligence, anticipation is the concept of an agent making decisions based on predictions, expectations, or beliefs about the future. ...
Bold text[[Link title]] â€œAIâ€ redirects here. ...
There are many definitions of complexity, therefore many natural, artificial and abstract objects or networks can be considered to be complex systems, and their study (complexity science) is highly interdisciplinary. ...
Complex adaptive systems, are a special case of complex systems. ...
For other uses, see Cybernetics (disambiguation). ...
Informatics includes the science of information, the practice of information processing, and the engineering of information systems. ...
As a broad subfield of artificial intelligence, machine learning is concerned with the design and development of algorithms and techniques that allow computers to learn. At a general level, there are two types of learning: inductive, and deductive. ...
Systems science is the science of complex systems. ...
Coding theory is a branch of mathematics and computer science dealing with the errorprone process of transmitting data across noisy channels, via clever means, so that a large number of errors that occur can be corrected. ...
Coding theory is concerned with finding explicit methods, called codes, of increasing the efficiency and reducing the net error rate of data communication over a noisy channel to near the limit that Shannon proved is the maximum possible for that channel. These codes can be roughly subdivided into data compression (source coding) and errorcorrection (channel coding) techniques. In the latter case, it took many years to find the methods Shannon's work proved were possible. A third class of information theory codes are cryptographic algorithms (both codes and ciphers). Concepts, methods and results from coding theory and information theory are widely used in cryptography and cryptanalysis. See the article ban (information) for a historical application. â€œSource codingâ€ redirects here. ...
In computer science and information theory, error correction consists of using methods to detect and/or correct errors in the transmission or storage of data by the use of some amount of redundant data and (in the case of transmission) the selective retransmission of incorrect segments of the data. ...
In the context of cryptography, a code is a method used to transform a message into an obscured form, preventing those not in on the secret from understanding what is actually transmitted. ...
This article is about algorithms for encryption and decryption. ...
The German Lorenz cipher machine, used in World War II for encryption of very highlevel general staff messages Cryptography (or cryptology; derived from Greek ÎºÏÏ…Ï€Ï„ÏŒÏ‚ kryptÃ³s hidden, and the verb Î³ÏÎ¬Ï†Ï‰ grÃ¡fo write or Î»ÎµÎ³ÎµÎ¹Î½ legein to speak) is the study of message secrecy. ...
Cryptanalysis (from the Greek kryptÃ³s, hidden, and analÃ½ein, to loosen or to untie) is the study of methods for obtaining the meaning of encrypted information, without access to the secret information which is normally required to do so. ...
A ban, sometimes called a hartley, is a logarithmic unit which measures information or entropy, based on base 10 logarithms and powers of 10, rather than the powers of 2 and base 2 logarithms which define the bit. ...
Information theory is also used in information retrieval, intelligence gathering, gambling, statistics, and even in musical composition. Information retrieval (IR) is the science of searching for information in documents, searching for documents themselves, searching for metadata which describe documents, or searching within databases, whether relational standalone databases or hypertext networked databases such as the Internet or World Wide Web or intranets, for text, sound, images or...
Intelligence (abbreviated or ) is the process and the result of gathering information and analyzing it to answer questions or obtain advance warnings needed to plan for the future. ...
Caravaggio, The Cardsharps, c. ...
This article is about the field of statistics. ...
Musical composition is a phrase used in a number of contexts, the most commonly used being a piece of music. ...
Historical background 
The landmark event that established the discipline of information theory, and brought it to immediate worldwide attention, was the publication of Claude E. Shannon's classic paper "A Mathematical Theory of Communication" in the Bell System Technical Journal in July and October of 1948. The decisive event which established the discipline of information theory, and brought it to immediate worldwide attention, was the publication of Claude E. Shannon (1916â€“2001)s classic paper A Mathematical Theory of Communication in the Bell System Technical Journal in July and October of 1948. ...
Claude Elwood Shannon (April 30, 1916  February 24, 2001) has been called the father of information theory, and was the founder of practical digital circuit design theory. ...
The article entitled A Mathematical Theory of Communication, published in 1948 by mathematician Claude E. Shannon, was one of the founding works of the field of information theory. ...
Bell System Technical Journal was the inhouse journal of Bell Laboratories. ...
Prior to this paper, limited information theoretic ideas had been developed at Bell Labs, all implicitly assuming events of equal probability. Harry Nyquist's 1924 paper, Certain Factors Affecting Telegraph Speed, contains a theoretical section quantifying "intelligence" and the "line speed" at which it can be transmitted by a communication system, giving the relation W = Klogm, where W is the speed of transmission of intelligence, m is the number of different voltage levels to choose from at each time step, and K is a constant. Ralph Hartley's 1928 paper, Transmission of Information, uses the word information as a measurable quantity, reflecting the receiver's ability to distinguish that one sequence of symbols from any other, thus quantifying information as H = logS^{n} = nlogS, where S was the number of possible symbols, and n the number of symbols in a transmission. The natural unit of information was therefore the decimal digit, much later renamed the hartley in his honour as a unit or scale or measure of information. Alan Turing in 1940 used similar ideas as part of the statistical analysis of the breaking of the German second world war Enigma ciphers. Harry Nyquist (pron. ...
Ralph Vinton Lyon Hartley (November 30, 1888  May 1, 1970) was an electronics researcher. ...
A ban, sometimes called a hartley, is a logarithmic unit which measures information or entropy, based on base 10 logarithms and powers of 10, rather than the powers of 2 and base 2 logarithms which define the bit. ...
Alan Mathison Turing, OBE, FRS (23 June 1912 â€“ 7 June 1954) was an English mathematician, logician, and cryptographer. ...
This article or section contains information that has not been verified and thus might not be reliable. ...
Much of the mathematics behind information theory with events of different probabilities was developed for the field of thermodynamics by Ludwig Boltzmann and J. Willard Gibbs. Connections between informationtheoretic entropy and thermodynamic entropy, including the important contributions by Rolf Landauer in the 1960s, are explored in Entropy in thermodynamics and information theory. Thermodynamics (from the Greek Î¸ÎµÏÎ¼Î·, therme, meaning heat and Î´Ï…Î½Î±Î¼Î¹Ï‚, dynamis, meaning power) is a branch of physics that studies the effects of changes in temperature, pressure, and volume on physical systems at the macroscopic scale by analyzing the collective motion of their particles using statistics. ...
Ludwig Eduard Boltzmann (Vienna, Austrian Empire, February 20, 1844 â€“ Duino near Trieste, September 5, 1906) was an Austrian physicist famous for his founding contributions in the fields of statistical mechanics and statistical thermodynamics. ...
Josiah Willard Gibbs (February 11, 1839 – April 28, 1903) was an American physical chemist. ...
Rolf Landauer (1927 â€“ 1999) was an IBM physicist who in 1961 demonstrated that when information is lost in an irreversible circuit, the information becomes entropy and an associated amount of energy is dissipated as heat. ...
This article or section is in need of attention from an expert on the subject. ...
In Shannon's revolutionary and groundbreaking paper, the work for which had been substantially completed at Bell Labs by the end of 1944, Shannon for the first time introduced the qualitative and quantitative model of communication as a statistical process underlying information theory, opening with the assertion that  "The fundamental problem of communication is that of reproducing at one point, either exactly or approximately, a message selected at another point."
With it came the ideas of Claude Shannon In information theory, the Shannon entropy or information entropy is a measure of the uncertainty associated with a random variable. ...
Redundancy in information theory is the number of bits used to transmit a message minus the number of bits of actual information in the message. ...
In information theory, the source coding theorem (Shannon 1948) informally states that: N i. ...
In probability theory and, in particular, information theory, the mutual information, or transinformation, of two random variables is a quantity that measures the mutual dependence of the two variables. ...
This article does not cite any references or sources. ...
In information theory, the noisy channel coding theorem establishes that however contaminated with noise interference a communication channel may be, it is possible to communicate digital data (information) errorfree up to a given maximum rate through the channel. ...
In information theory, the Shannonâ€“Hartley theorem is an application of the noisy channel coding theorem to the archetypal case of a continuoustime analog communications channel subject to Gaussian noise. ...
This article is about the unit of information. ...
Quantities of information 
Information theory is based on probability theory and statistics. The most important quantities of information are entropy, the information in a random variable, and mutual information, the amount of information in common between two random variables. The former quantity indicates how easily message data can be compressed while the latter can be used to find the communication rate across a channel. The mathematical theory of information is based on probability theory and statistics, and measures information with several quantities of information. ...
Probability theory is the branch of mathematics concerned with analysis of random phenomena. ...
This article is about the field of statistics. ...
Claude Shannon In information theory, the Shannon entropy or information entropy is a measure of the uncertainty associated with a random variable. ...
In probability theory, a random variable is a quantity whose values are random and to which a probability distribution is assigned. ...
In probability theory and, in particular, information theory, the mutual information, or transinformation, of two random variables is a quantity that measures the mutual dependence of the two variables. ...
â€œSource codingâ€ redirects here. ...
Channel, in communications (sometimes called communications channel), refers to the medium used to convey information from a sender (or transmitter) to a receiver. ...
The choice of logarithmic base in the following formulae determines the unit of information entropy that is used. The most common unit of information is the bit, based on the binary logarithm. Other units include the nat, which is based on the natural logarithm, and the hartley, which is based on the common logarithm. The former Weights and Measures office in Middlesex, England. ...
Claude Shannon In information theory, the Shannon entropy or information entropy is a measure of the uncertainty associated with a random variable. ...
This article is about the unit of information. ...
Plot of log2 x In mathematics, the binary logarithm (log2 n) is the logarithm for base 2. ...
A nat (sometimes also nit or even nepit) is a logarithmic unit of information or entropy, based on natural logarithms and powers of e, rather than the powers of 2 and base 2 logarithms which define the bit. ...
The natural logarithm, formerly known as the hyperbolic logarithm, is the logarithm to the base e, where e is an irrational constant approximately equal to 2. ...
A ban, sometimes called a hartley, is a logarithmic unit which measures information or entropy, based on base 10 logarithms and powers of 10, rather than the powers of 2 and base 2 logarithms which define the bit. ...
In mathematics, the common logarithm is the logarithm with base 10. ...
In what follows, an expression of the form is considered by convention to be equal to zero whenever p is. This is justified because for any logarithmic base.
Entropy
Entropy of a Bernoulli trial as a function of success probability, often called the binary entropy function, H_{b}(p). The entropy is maximized at 1 bit per trial when the two possible outcomes are equally probable, as in an unbiased coin toss. The entropy, H, of a discrete random variable X is a measure of the amount of uncertainty associated with the value of X. Image File history File links This is a lossless scalable vector image. ...
Image File history File links This is a lossless scalable vector image. ...
In the theory of probability and statistics, a Bernoulli trial is an experiment whose outcome is random and can be either of two possible outcomes, called success and failure. ...
Entropy of a Bernoulli trial as a function of success probability, called the binary entropy function. ...
Claude Shannon In information theory, the Shannon entropy or information entropy is a measure of the uncertainty associated with a random variable. ...
Suppose one transmits 1000 bits (0s and 1s). If these bits are known ahead of transmission (to be a certain value with absolute probability), logic dictates that no information has been transmitted. If, however, each is equally and independently likely to be 0 or 1, 1000 bits (in the information theoretic sense) have been transmitted. Between these two extremes, information can be quantified as follows. If is the set of all messages x that X could be, and p(x) = Pr(X = x), then the entropy of X is defined:^{[1]} (Here, I(x) is the selfinformation, which is the entropy contribution of an individual message.) An important property of entropy is that it is maximized when all the messages in the message space are equiprobable—i.e., most unpredictable—in which case In information theory, selfinformation is measure of the information content associated with the outcome of a random variable. ...
The special case of information entropy for a random variable with two outcomes is the binary entropy function: Entropy of a Bernoulli trial as a function of success probability, called the binary entropy function. ...
Joint entropy The joint entropy of two discrete random variables X and Y is merely the entropy of their pairing: (X,Y). This implies that if X and Y are independent, then their joint entropy is the sum of their individual entropies. The joint entropy is an entropy measure used in information theory. ...
For example, if (X,Y) represents the position of a chess piece — X the row and Y the column, then the joint entropy of the row of the piece and the column of the piece will be the entropy of the position of the piece. For other uses, see Chess (disambiguation). ...
Despite similar notation, joint entropy should not be confused with cross entropy. In information theory, the cross entropy between two probability distributions measures the overall difference between the two distributions. ...
Conditional entropy (equivocation) The conditional entropy or conditional uncertainty of X given random variable Y (also called the equivocation of X about Y) is the average conditional entropy over Y:^{[2]} The conditional entropy is an entropy measure used in information theory. ...
Because entropy can be conditioned on a random variable or on that random variable being a certain value, care should be taken not to confuse these two definitions of conditional entropy, the former of which is in more common use. A basic property of this form of conditional entropy is that: Mutual information (transinformation) Mutual information measures the amount of information that can be obtained about one random variable by observing another. It is important in communication where it can be used to maximize the amount of information shared between sent and received signals. The mutual information of X relative to Y is given by: In probability theory and, in particular, information theory, the mutual information, or transinformation, of two random variables is a quantity that measures the mutual dependence of the two variables. ...
where SI is the pointwise mutual information. Pointwise Mutual Information (PMI) is a measure of association used in information theory and statistical probability theory. ...
A basic property of the mutual information is that That is, knowing Y, we can save an average of I(X;Y) bits in encoding X compared to not knowing Y. Mutual information is symmetric: In mathematics, the theory of symmetric functions is part of the theory of polynomial equations, and also a substantial chapter of combinatorics. ...
Mutual information can be expressed as the average Kullback–Leibler divergence (information gain) of the posterior probability distribution of X given the value of Y to the prior distribution on X: In probability theory and information theory, the Kullbackâ€“Leibler divergence (or information divergence, or information gain, or relative entropy) is a natural distance measure from a true probability distribution P to an arbitrary probability distribution Q. Typically P represents data, observations, or a precise calculated probability distribution. ...
The posterior probability can be calculated by Bayes theorem from the prior probability and the likelihood function. ...
A prior probability is a marginal probability, interpreted as a description of what is known about a variable in the absence of some evidence. ...
In other words, this is a measure of how much, on the average, the probability distribution on X will change if we are given the value of Y. This is often recalculated as the divergence from the product of the marginal distributions to the actual joint distribution: Mutual information is closely related to the loglikelihood ratio test in the context of contingency tables and the multinomial distribution and to Pearson's χ^{2} test: mutual information can be considered a statistic for assessing independence between a pair of variables, and has a wellspecified asymptotic distribution. A likelihoodratio test is a statistical test relying on a test statistic computed by taking the ratio of the maximum value of the likelihood function under the constraint of the null hypothesis to the maximum with that constraint relaxed. ...
In probability theory, the multinomial distribution is a generalization of the binomial distribution. ...
Pearsons chisquare test (Ï‡2) is one of a variety of chisquare tests â€“ statistical procedures whose results are evaluated by reference to the chisquare distribution. ...
Kullback–Leibler divergence (information gain) The Kullback–Leibler divergence (or information divergence, information gain, or relative entropy) is a way of comparing two distributions: a "true" probability distribution p(X), and an arbitrary probability distribution q(X). If we compress data in a manner that assumes q(X) is the distribution underlying some data, when, in reality, p(X) is the correct distribution, the Kullback–Leibler divergence is the number of average additional bits per datum necessary for compression. It is thus defined In probability theory and information theory, the Kullbackâ€“Leibler divergence (or information divergence, or information gain, or relative entropy) is a natural distance measure from a true probability distribution P to an arbitrary probability distribution Q. Typically P represents data, observations, or a precise calculated probability distribution. ...
In mathematics and statistics, a probability distribution is a function of the probabilities of a mutually exclusive and exhaustive set of events. ...
Although it is sometimes used as a 'distance metric', it is not a true metric since it is not symmetric and does not satisfy the triangle inequality (making it a semiquasimetric). In mathematics a metric or distance function is a function which defines a distance between elements of a set. ...
In mathematics, triangle inequality is the theorem stating that for any triangle, the measure of a given side must be less than the sum of the other two sides but greater than the difference between the two sides. ...
Other quantities Other important information theoretic quantities include Rényi entropy (a generalization of entropy) and differential entropy (a generalization of quantities of information to continuous distributions.) In information theory, the RÃ©nyi entropy, a generalisation of Shannon entropy, is one of a family of functionals for quantifying the diversity, uncertainty or randomness of a system. ...
Differential entropy (also referred to as continuous entropy) is a concept in information theory which tries to extend the idea of (Shannon) entropy, a measure of average surprisal of a random variable, to continuous probability distributions. ...
Coding theory 
A picture showing scratches on the readable surface of a CDR. Music and data CDs are coded using error correcting codes and thus can still be read even if they have minor scratches using error detection and correction. Coding theory is the most important and direct application of information theory. It can be subdivided into source coding theory and channel coding theory. Using a statistical description for data, information theory quantifies the number of bits needed to describe the data, which is the information entropy of the source. Coding theory is a branch of mathematics and computer science dealing with the errorprone process of transmitting data across noisy channels, via clever means, so that a large number of errors that occur can be corrected. ...
Image File history File links No higher resolution available. ...
Image File history File links No higher resolution available. ...
It has been suggested that this article or section be merged with SECDED. (Discuss) In mathematics, computer science, telecommunication, and information theory, error detection and correction has great practical importance in maintaining data (information) integrity across noisy channels and lessthanreliable storage media. ...
Coding theory is a branch of mathematics and computer science dealing with the errorprone process of transmitting data across noisy channels, via clever means, so that a large number of errors that occur can be corrected. ...
â€œSource codingâ€ redirects here. ...
In computer science and information theory, error correction consists of using methods to detect and/or correct errors in the transmission or storage of data by the use of some amount of redundant data and (in the case of transmission) the selective retransmission of incorrect segments of the data. ...
 Data compression (source coding): There are two formulations for the compression problem:
 lossless data compression: the data must be reconstructed exactly;
 lossy data compression: allocates bits needed to reconstruct the data, within a specified fidelity level measured by a distortion function. This subset of Information theory is called rate–distortion theory.
 Errorcorrecting codes (channel coding): While data compression removes as much redundancy as possible, an error correcting code adds just the right kind of redundancy (i.e. error correction) needed to transmit the data efficiently and faithfully across a noisy channel.
This division of coding theory into compression and transmission is justified by the information transmission theorems, or source–channel separation theorems that justify the use of bits as the universal currency for information in many contexts. However, these theorems only hold in the situation where one transmitting user wishes to communicate to one receiving user. In scenarios with more than one transmitter (the multipleaccess channel), more than one receiver (the broadcast channel) or intermediary "helpers" (the relay channel), or more general networks, compression followed by transmission may no longer be optimal. Network information theory refers to these multiagent communication models. Lossless data compression is a class of data compression algorithms that allows the exact original data to be reconstructed from the compressed data. ...
A lossy data compression method is one where compressing data and then decompressing it retrieves data that may well be different from the original, but is close enough to be useful in some way. ...
Rateâ€“distortion theory is a major branch of information theory; it addresses the problem of determining the minimal amount of entropy (or information) R that should be communicated over a channel, so that the source (input signal) can be approximately reconstructed at the receiver (output signal) without exceeding a given...
Redundancy in information theory is the number of bits used to transmit a message minus the number of bits of actual information in the message. ...
In computer science and information theory, error correction consists of using methods to detect and/or correct errors in the transmission or storage of data by the use of some amount of redundant data and (in the case of transmission) the selective retransmission of incorrect segments of the data. ...
In information theory, a relay channel is a probability model on the communication between a sender and a receiver aided by one or more intermediate relay nodes. ...
For the scientific and engineering discipline studying computer networks, see Computer networking. ...
Source theory Any process that generates successive messages can be considered a source of information. A memoryless source is one in which each message is an independent identicallydistributed random variable, whereas the properties of ergodicity and stationarity impose more general constraints. All such sources are stochastic. These terms are well studied in their own right outside information theory. A source is one of the basic concepts of communication and information processing. ...
In probability theory, a sequence or other collection of random variables is independent and identically distributed (i. ...
In mathematics, a measurepreserving transformation T on a probability space is said to be ergodic if the only measurable sets invariant under T have measure 0 or 1. ...
In the mathematical sciences, a stationary process (or strict(ly) stationary process) is a stochastic process in which the probability density function of some random variable X does not change over time or position. ...
In the mathematics of probability, a stochastic process is a random function. ...
Rate Information rate is the average entropy per symbol. For memoryless sources, this is merely the entropy of each symbol, while, in the case of a stationary stochastic process, it is The entropy rate of a stochastic process is, informally, the time density of the average information in a stochastic process. ...
that is, the conditional entropy of a symbol given all the previous symbols generated. For the more general case of a process that is not necessarily stationary, the average rate is that is, the limit of the joint entropy per symbol. For stationary sources, these two expressions give the same result.^{[3]} It is common in information theory to speak of the "rate" or "entropy" of a language. This is appropriate, for example, when the source of information is English prose. The rate of a source of information is related to its redundancy and how well it can be compressed, the subject of source coding. Redundancy in information theory is the number of bits used to transmit a message minus the number of bits of actual information in the message. ...
â€œSource codingâ€ redirects here. ...
Channel capacity 
Communications over a channel—such as an ethernet wire—is the primary motivation of information theory. As anyone who's ever used a telephone (mobile or landline) knows, however, such channels often fail to produce exact reconstruction of a signal; noise, periods of silence, and other forms of signal corruption often degrade quality. How much information can one hope to communicate over a noisy (or otherwise imperfect) channel? In information theory, the noisychannel coding theorem establishes that however contaminated with noise interference a communication channel may be, it is possible to communicate digital data (information) errorfree up to a given maximum rate through the channel. ...
Ethernet is a large, diverse family of framebased computer networking technologies that operate at many speeds for local area networks (LANs). ...
Consider the communications process over a discrete channel. A simple model of the process is shown below: Here X represents the space of messages transmitted, and Y the space of messages received during a unit time over our channel. Let p(y  x) be the conditional probability distribution function of Y given X. We will consider p(y  x) to be an inherent fixed property of our communications channel (representing the nature of the noise of our channel). Then the joint distribution of X and Y is completely determined by our channel and by our choice of f(x), the marginal distribution of messages we choose to send over the channel. Under these constraints, we would like to maximize the rate of information, or the signal, we can communicate over the channel. The appropriate measure for this is the mutual information, and this maximum mutual information is called the channel capacity and is given by: Image File history File links This is a lossless scalable vector image. ...
This article defines some terms which characterize probability distributions of two or more variables. ...
In science, and especially in physics and telecommunication, noise is fluctuations in and the addition of external factors to the stream of target information (signal) being received at a detector. ...
In the fields of communications, signal processing, and in electrical engineering more generally, a signal is any timevarying quantity. ...
In probability theory and, in particular, information theory, the mutual information, or transinformation, of two random variables is a quantity that measures the mutual dependence of the two variables. ...
This article does not cite any references or sources. ...
This capacity has the following property related to communicating at information rate R (where R is usually bits per symbol). For any information rate R < C and coding error ε > 0, for large enough N, there exists a code of length N and rate ≥ R and a decoding algorithm, such that the maximal probability of block error is ≤ ε; that is, it is always possible to transmit with arbitrarily small block error. In addition, for any rate R > C, it is impossible to transmit with arbitrarily small block error. Channel coding is concerned with finding such nearly optimal codes that can be used to transmit data over a noisy channel with a small coding error at a rate near the channel capacity. Channel Codes In a wireless network following the IEEE 802. ...
It has been suggested that this article or section be merged with SECDED. (Discuss) In mathematics, computer science, telecommunication, and information theory, error detection and correction has great practical importance in maintaining data (information) integrity across noisy channels and lessthanreliable storage media. ...
Channel capacity of particular model channels  A continuoustime analog communications channel subject to Gaussian noise — see Shannon–Hartley theorem.
 A binary symmetric channel (BSC) with crossover probability p is a binary input, binary output channel that flips the input bit with probability p. The BSC has a capacity of 1 − H_{b}(p) bits per channel use, where H_{b} is the binary entropy function:


 A binary erasure channel (BEC) with erasure probability p is a binary input, ternary output channel. The possible channel outputs are 0, 1, and a third symbol 'e' called an erasure. The erasure represents complete loss of information about an input bit. The capacity of the BEC is 1  p bits per channel use.


In information theory, the Shannonâ€“Hartley theorem is an application of the noisy channel coding theorem to the archetypal case of a continuoustime analog communications channel subject to Gaussian noise. ...
In coding theory, a binary symmetric channel (or BSC) is an idealized model of a communications channel that sends bits. ...
Entropy of a Bernoulli trial as a function of success probability, called the binary entropy function. ...
Image File history File links Binarysymmetricchannel. ...
Image File history File links Binaryerasurechannel. ...
Applications to other fields Intelligence uses and secrecy applications Information theoretic concepts apply to cryptography and cryptanalysis. Turing's information unit, the ban, was used in the Ultra project, breaking the German Enigma machine code and hastening the end of WWII in Europe. Shannon himself defined an important concept now called the unicity distance. Based on the redundancy of the plaintext, it attempts to give a minimum amount of ciphertext necessary to ensure unique decipherability. The German Lorenz cipher machine, used in World War II for encryption of very highlevel general staff messages Cryptography (or cryptology; derived from Greek ÎºÏÏ…Ï€Ï„ÏŒÏ‚ kryptÃ³s hidden, and the verb Î³ÏÎ¬Ï†Ï‰ grÃ¡fo write or Î»ÎµÎ³ÎµÎ¹Î½ legein to speak) is the study of message secrecy. ...
Cryptanalysis (from the Greek kryptÃ³s, hidden, and analÃ½ein, to loosen or to untie) is the study of methods for obtaining the meaning of encrypted information, without access to the secret information which is normally required to do so. ...
Turing may refer to: Alan Turing Turing (programming language) Turing (cipher) Turing machine Turing completeness Turing test Reverse Turing test Turing Award Turing Police ChurchTuring thesis TURing interface Turing (novel), by Christos Papadimitrou, published in 2003 Turing Scholars Category: ...
A ban, sometimes called a hartley, is a logarithmic unit which measures information or entropy, based on base 10 logarithms and powers of 10, rather than the powers of 2 and base 2 logarithms which define the bit. ...
Ultra (sometimes capitalized ULTRA) was the name used by the British for intelligence resulting from decryption of German communications in World War II. The term eventually became the standard designation in both Britain and the United States for all intelligence from highlevel cryptanalytic sources. ...
For other uses, see Enigma. ...
Churchill waves to crowds in Whitehall on the day he broadcast to the nation that the war with Germany had been won, 8 May 1945. ...
Unicity distance is a term used in cryptography referring to the length of an original ciphertext needed to break the cipher by reducing the number of possible spurious keys to zero in a brute force attack. ...
Redundancy in information theory is the number of bits used to transmit a message minus the number of bits of actual information in the message. ...
In cryptography, plaintext is information used as input to an encryption algorithm; the output is termed ciphertext. ...
This article is about algorithms for encryption and decryption. ...
Information theory leads us to believe it is much more difficult to keep secrets than it might first appear. A brute force attack can break systems based on publickey cryptography or on most commonly used methods of privatekey cryptography, such as block ciphers. The security of such methods comes from the assumption that no known attack can break them in a practical amount of time. The EFFs US$250,000 DES cracking machine contained over 1,800 custom chips and could brute force a DES key in a matter of days â€” the photograph shows a DES Cracker circuit board fitted with several Deep Crack chips. ...
A big random number is used to make a publickey pair. ...
This article does not cite any references or sources. ...
Encryption Decryption In cryptography, a block cipher is a symmetric key cipher which operates on fixedlength groups of bits, termed blocks, with an unvarying transformation. ...
Information theoretic security refers to methods such as the onetime pad that are not vulnerable to such brute force attacks. In such cases, the positive conditional mutual information between the plaintext and ciphertext (conditioned on the key) can ensure proper transmission, while the unconditional mutual information between the plaintext and ciphertext remains zero, resulting in absolutely secure communications. In other words, an eavesdropper would not be able to improve his or her guess of the plaintext by gaining knowledge of the ciphertext but not of the key. However, as in any other cryptographic system, care must be used to correctly apply even informationtheoretically secure methods; the Venona project was able to crack the onetime pads of the Soviet Union due to their improper reuse. A cryptosystem is informationtheoretically secure or perfectly secure if a ciphertext produced using it provides no information about the plaintext to anyone other than the key holders. ...
Excerpt from a onetime pad. ...
In probability theory and, in particular, information theory, the mutual information, or transinformation, of two random variables is a quantity that measures the mutual dependence of the two variables. ...
A key is a piece of information that controls the operation of a cryptography algorithm. ...
The Venona project was a longrunning and highly secret collaboration between intelligence agencies of the United States and United Kingdom that involved the cryptanalysis of messages sent by several intelligence agencies of the Soviet Union. ...
Pseudorandom number generation Cryptographically secure pseudorandom number generators need effectively random seeds, which can be obtained via extractors. The measure of sufficient randomness for extractors is minentropy, a value related to Shannon entropy through Rényi entropy; Rényi entropy is also used in evaluating randomness in cryptographic systems. Although related, the distinctions among these measures mean that a random variable with high Shannon entropy is not necessarily satisfactory for use in an extractor. A cryptographically secure pseudorandom number generator (CSPRNG) is a pseudorandom number generator (PRNG) with properties that make it suitable for use in cryptography. ...
A random seed (or seed state, or just seed) is a number (or vector) used to initialize a pseudorandom number generator. ...
Mathematics An (N,M,D,K,e)extractor is a bipartite graph with N nodes on the left and M nodes on the right such that each node on the left has D neighbors (on the right), which has the added property that for any subset A of N of...
A probability distribution has minentropy k if no value in the distribution has probability higher than 2−k. ...
In information theory, the RÃ©nyi entropy, a generalisation of Shannon entropy, is one of a family of functionals for quantifying the diversity, uncertainty or randomness of a system. ...
In probability theory, a random variable is a quantity whose values are random and to which a probability distribution is assigned. ...
Miscellaneous applications Information theory also has applications in gambling and investing, black holes, bioinformatics, and music. Kelly gambling or proportional gambling is an application of information theory to gambling and (with some ethical and legal reservations) investing. ...
This article or section cites very few or no references or sources. ...
Map of the human X chromosome (from the NCBI website). ...
References Footnotes  ^ Fazlollah M. Reza (1961, 1994). An Introduction to Information Theory. Dover Publications, Inc., New York. ISBN 0486682102.
 ^ Robert B. Ash (1965, 1990). Information Theory. Dover Publications, Inc.. ISBN 0486665216.
 ^ Jerry D. Gibson (1998). Digital Compression for Multimedia: Principles and Standards. Morgan Kaufmann. ISBN 1558603697.
The classic work  Ludwig Boltzmann formally defined entropy in 1870. Compare: Boltzmann, Ludwig (1896, 1898). Vorlesungen über Gastheorie : 2 Volumes  Leipzig 1895/98 UB: O 52626. English version: Lectures on gas theory. Translated by Stephen G. Brush (1964) Berkeley: University of California Press; (1995) New York: Dover ISBN 0486684555
Claude Shannon Claude Elwood Shannon (April 30, 1916 â€“ February 24, 2001), an American electrical engineer and mathematician, has been called the father of information theory, and was the founder of practical digital circuit design theory. ...
The article entitled A Mathematical Theory of Communication, published in 1948 by mathematician Claude E. Shannon, was one of the founding works of the field of information theory. ...
Ludwig Eduard Boltzmann (Vienna, Austrian Empire, February 20, 1844 â€“ Duino near Trieste, September 5, 1906) was an Austrian physicist famous for his founding contributions in the fields of statistical mechanics and statistical thermodynamics. ...
Other journal articles  R.V.L. Hartley, "Transmission of Information," Bell System Technical Journal, July 1928
 J. L. Kelly, Jr., "A New Interpretation of Information Rate," Bell System Technical Journal, Vol. 35, July 1956, pp. 91726
 R. Landauer, Information is Physical Proc. Workshop on Physics and Computation PhysComp'92 (IEEE Comp. Sci.Press, Los Alamitos, 1993) pp. 14.
 R. Landauer, "Irreversibility and Heat Generation in the Computing Process" IBM J. Res. Develop. Vol. 5, No. 3, 1961
Textbooks on information theory  Claude E. Shannon, Warren Weaver. The Mathematical Theory of Communication. Univ of Illinois Press, 1949. ISBN 0252725484
 Robert Gallager. Information Theory and Reliable Communication. New York: John Wiley and Sons, 1968. ISBN 0471290483
 Robert B. Ash. Information Theory. New York: Interscience, 1965. ISBN 0470034459. New York: Dover 1990. ISBN 0486665216
 Thomas M. Cover, Joy A. Thomas. Elements of information theory, 1st Edition. New York: WileyInterscience, 1991. ISBN 0471062596.
 2nd Edition. New York: WileyInterscience, 2006. ISBN 0471241954.
 Imre Csiszar, Janos Korner. Information Theory: Coding Theorems for Discrete Memoryless Systems Akademiai Kiado: 2nd edition, 1997. ISBN 9630574403
 Raymond W. Yeung. A First Course in Information Theory Kluwer Academic/Plenum Publishers, 2002. ISBN 0306467917
 David J. C. MacKay. Information Theory, Inference, and Learning Algorithms Cambridge: Cambridge University Press, 2003. ISBN 0521642981
 Stanford Goldman. Information Theory. New York: Prentice Hall, 1953. New York: Dover 1968 ISBN 0486622096, 2005 ISBN 0486442713
 Fazlollah Reza. An Introduction to Information Theory. New York: McGrawHill 1961. New York: Dover 1994. ISBN 0486682102
 Masud Mansuripur. Introduction to Information Theory. New York: Prentice Hall, 1987. ISBN 0134846680
 Christoph Arndt: Information Measures, Information and its Description in Science and Engineering (Springer Series: Signals and Communication Technology), 2004, ISBN 9783540408550, [1];
Claude Elwood Shannon (April 30, 1916  February 24, 2001) has been called the father of information theory, and was the founder of practical digital circuit design theory. ...
Robert G. Gallager (born May 29, 1931 in Philadelphia, PA) is an American electrical engineer known for his work on information theory and communications networks. ...
Thomas M. Cover (born August 7, 1938) is Professor jointly in the Departments of Electrical Engineering and Statistics at Stanford University. ...
Imre CsiszÃ¡r is a Hungarian mathematician with contributions to information theory and probability theory. ...
Fazlollah Reza (born on January 1, 1915 in Rasht, Iran) graduated from University of Tehran in 1938 by receiving his bachelors degree in electrical engineering. ...
Other books  James Bamford, The Puzzle Palace, Penguin Books, 1983. ISBN 0140067485
 Leon Brillouin, Science and Information Theory, Mineola, N.Y.: Dover, [1956, 1962] 2004. ISBN 0486439186
 A. I. Khinchin, Mathematical Foundations of Information Theory, New York: Dover, 1957. ISBN 0486604349
 H. S. Leff and A. F. Rex, Editors, Maxwell's Demon: Entropy, Information, Computing, Princeton University Press, Princeton, NJ (1990). ISBN 069108727X
 Tom Siegfried, The Bit and the Pendulum, Wiley, 2000. ISBN 0471321745
 Charles Seife, Decoding The Universe, Viking, 2006. ISBN 067003441X
See also There is much discussion in the academic world of communication as to what actually constitutes communication. ...
This is a list of important publications in computer science, organized by field. ...
The philosophy of information (PI) is a new area of research, which studies conceptual issues arising at the intersection of computer science, information technology, and philosophy. ...
Applications The German Lorenz cipher machine, used in World War II for encryption of very highlevel general staff messages Cryptography (or cryptology; derived from Greek ÎºÏÏ…Ï€Ï„ÏŒÏ‚ kryptÃ³s hidden, and the verb Î³ÏÎ¬Ï†Ï‰ grÃ¡fo write or Î»ÎµÎ³ÎµÎ¹Î½ legein to speak) is the study of message secrecy. ...
Cryptanalysis (from the Greek kryptÃ³s, hidden, and analÃ½ein, to loosen or to untie) is the study of methods for obtaining the meaning of encrypted information, without access to the secret information which is normally required to do so. ...
This article or section is in need of attention from an expert on the subject. ...
Intelligence (abbreviated or ) is the process and the result of gathering information and analyzing it to answer questions or obtain advance warnings needed to plan for the future. ...
Caravaggio, The Cardsharps, c. ...
For other uses, see Cybernetics (disambiguation). ...
History The decisive event which established the discipline of information theory, and brought it to immediate worldwide attention, was the publication of Claude E. Shannon (1916â€“2001)s classic paper A Mathematical Theory of Communication in the Bell System Technical Journal in July and October of 1948. ...
A timeline of events related to information theory, data compression, error correcting codes and related subjects. ...
Claude Shannon Claude Elwood Shannon (April 30, 1916 â€“ February 24, 2001), an American electrical engineer and mathematician, has been called the father of information theory, and was the founder of practical digital circuit design theory. ...
Ralph Vinton Lyon Hartley (November 30, 1888  May 1, 1970) was an electronics researcher. ...
Professor Hubert P. Yockey (b. ...
Theory Coding theory is a branch of mathematics and computer science dealing with the errorprone process of transmitting data across noisy channels, via clever means, so that a large number of errors that occur can be corrected. ...
It has been suggested that this article or section be merged with Entropy encoding. ...
Detection theory, or signal detection theory, is a means to quantify the ability to discern between signal and noise. ...
Estimation theory is a branch of statistics and signal processing that deals with estimating the values of parameters based on measured/empirical data. ...
In statistics and information theory, the Fisher information (denoted ) is the variance of the score. ...
In computer science, the Kolmogorov complexity (also known as descriptive complexity, KolmogorovChaitin complexity, stochastic complexity, algorithmic entropy, or programsize complexity) of an object such as a piece of text is a measure of the computational resources needed to specify the object. ...
Classical information theory goes back to Claude Shannon. ...
This article or section is in need of attention from an expert on the subject. ...
This article or section is in need of attention from an expert on the subject. ...
The logic of information, or the logical theory of information, considers the information content of logical signs and expressions along the lines initially developed by Charles Sanders Peirce. ...
Network coding is a field of information theory and coding theory and is a method of attaining maximum information flow in a network. ...
Quantum information science is a field of research at the interface of quantum mechanics and computer science. ...
Semiotic information theory considers the information content of signs and expressions as it is conceived within the semiotic or signrelational framework developed by Charles Sanders Peirce. ...
The philosophy of information (PI) is a new area of research, which studies conceptual issues arising at the intersection of computer science, information technology, and philosophy. ...
Concepts In information theory, selfinformation is measure of the information content associated with the outcome of a random variable. ...
Claude Shannon In information theory, the Shannon entropy or information entropy is a measure of the uncertainty associated with a random variable. ...
The joint entropy is an entropy measure used in information theory. ...
The conditional entropy is an entropy measure used in information theory. ...
Redundancy in information theory is the number of bits used to transmit a message minus the number of bits of actual information in the message. ...
Channel, in communications (sometimes called communications channel), refers to the medium used to convey information from a sender (or transmitter) to a receiver. ...
A source is one of the basic concepts of communication and information processing. ...
The receiver in information theory is the receiving end of a communication channel (in particular the binary symmetric channel) in information theory. ...
In information theory, the RÃ©nyi entropy, a generalisation of Shannon entropy, is one of a family of functionals for quantifying the diversity, uncertainty or randomness of a system. ...
Variety was a term introduced to cybernetics by Ashby to describe how systems can be stable under perturbation. ...
In probability theory and, in particular, information theory, the mutual information, or transinformation, of two random variables is a quantity that measures the mutual dependence of the two variables. ...
Pointwise Mutual Information (PMI) is a measure of association used in information theory and statistical probability theory. ...
Differential entropy (also referred to as continuous entropy) is a concept in information theory which tries to extend the idea of (Shannon) entropy, a measure of average surprisal of a random variable, to continuous probability distributions. ...
In probability theory and information theory, the KullbackLeibler divergence (or information divergence, or information gain, or relative entropy) is a natural distance measure from a true probability distribution P to an arbitrary probability distribution Q. Typically P represents data, observations, or a precise calculated probability distribution. ...
This article does not cite any references or sources. ...
Unicity distance is a term used in cryptography referring to the length of an original ciphertext needed to break the cipher by reducing the number of possible spurious keys to zero in a brute force attack. ...
A ban, sometimes called a hartley, is a logarithmic unit which measures information or entropy, based on base 10 logarithms and powers of 10, rather than the powers of 2 and base 2 logarithms which define the bit. ...
In information theory, a covert channel is a communications channel that does a writingbetweenthelines form of communication. ...
An encoder is a device used to encode a signal (such as a bitstream) or data into a form that is acceptable for transmission or storage. ...
A Digitrax DH163AT DCC decoder in an Athearn locomotive before the shell goes on. ...
External links  Gibbs, M., "Quantum Information Theory", Eprint
 Schneider, T., "Information Theory Primer", Eprint
 Srinivasa, S. "A Review on Multivariate Mutual Information" PDF.
 Challis, J. Lateral Thinking in Information Retrieval
 Journal of Chemical Education, Shuffled Cards, Messy Desks, and Disorderly Dorm Rooms  Examples of Entropy Increase? Nonsense!
 IEEE Information Theory Society and the review articles.
 Online textbook: Information Theory, Inference, and Learning Algorithms, by David MacKay  gives an entertaining and thorough introduction to Shannon theory, including stateoftheart methods from coding theory, such as arithmetic coding, lowdensity paritycheck codes, and Turbo codes.
