### From Wikipedia

In cryptography, a **message authentication code based on universal hashing**, or **UMAC**, is a type of message authentication code (MAC) calculated by choosing a hash function from a class of hash functions according to some secret (random) process and applying it to the message. The resulting digest or fingerprint is then encrypted to hide the identity of the hash function used. As with any MAC, it may be used to simultaneously verify both the *data integrity* and the *authenticity* of a message. A UMAC has provable cryptographic strength and is usually much less computationally intensive than other MACs.

## Universal Hashing

Suppose the hash function is chosen from a class of hash functions H, which maps messages into D, the set of possible message digests. This class is called universal if, for any distinct pair of messages, there are at most |H|/|D| functions that map them to the same member of D.

This means that if an attacker wants to replace one message with another and, from his point of view, the hash function was chosen completely at random, the probability that the UMAC will not detect his modification is at most 1/|D|.
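The universality condition can be checked by brute force for very small classes. The sketch below is illustrative only: the function `is_universal` and the two toy classes (multiplication by a key modulo a small prime, and a single constant function) are assumptions chosen for the example, not part of any UMAC specification.

```python
from itertools import combinations

def is_universal(hash_class, messages, digest_count):
    """Check that no distinct message pair collides under more than |H|/|D| functions."""
    for a, b in combinations(messages, 2):
        collisions = sum(1 for h in hash_class if h(a) == h(b))
        # collisions > |H| / |D|, written without floating-point division
        if collisions * digest_count > len(hash_class):
            return False
    return True

P = 5  # small prime, so messages and digests are both {0, ..., 4}

# Toy class h_k(m) = k * m mod P for every key k in {0, ..., P-1}.
multiply_class = [lambda m, k=k: (k * m) % P for k in range(P)]

# A class with a single constant function: every pair of messages collides.
constant_class = [lambda m: 0]

print(is_universal(multiply_class, range(P), P))
print(is_universal(constant_class, range(P), P))
```

In the multiplicative class, two distinct messages collide only under the key 0, which meets the bound |H|/|D| = 1 exactly; the constant class exceeds it for every pair.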

But this definition is not strong enough: if the possible messages are 0 and 1, D={0,1}, and H consists of the identity operation and *not*, then H is universal. Yet if the digest is then encrypted by modular addition, the attacker can change the message and the digest at the same time, and the receiver would not detect the change.
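The counterexample above is small enough to enumerate completely. The following sketch assumes the encryption is addition of a one-bit pad modulo 2; the names `tag`, `forge`, and `forgery_always_accepted` are hypothetical helpers introduced for illustration.

```python
from itertools import product

# The two-member class from the text: identity and logical not on {0, 1}.
HASHES = {"identity": lambda m: m, "not": lambda m: 1 - m}

def tag(h, pad, message):
    """Digest h(message) encrypted by adding a secret pad modulo 2."""
    return (h(message) + pad) % 2

def forge(message, intercepted_tag):
    """Attacker flips both the message bit and the tag bit, knowing neither h nor pad."""
    return 1 - message, (intercepted_tag + 1) % 2

def forgery_always_accepted():
    """Try every hash function, pad, and message; report whether the forgery always verifies."""
    for (name, h), pad, message in product(HASHES.items(), (0, 1), (0, 1)):
        fake_message, fake_tag = forge(message, tag(h, pad, message))
        if tag(h, pad, fake_message) != fake_tag:
            return False
    return True

print(forgery_always_accepted())
```

Both functions change their output whenever the input bit changes, so flipping the tag along with the message always yields a valid pair, even though the class satisfies the universality bound.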

## Strongly universal hashing

A class of hash functions H that is good to use will make it difficult for an attacker to guess the correct digest *d* of a fake message *f* after intercepting one message *a* with digest *c*. In other words, the probability

$$\Pr_{h \in H}[h(f) = d \mid h(a) = c]$$

needs to be very small, preferably 1/|D|.

It is easy to construct a class of hash functions when D is a field. For example, if |D| is prime, all the operations are taken modulo |D|. The message *a* is then encoded as an n-dimensional vector over D (a_{1},a_{2},..,a_{n}). H then has |D|^{n+1} members, each corresponding to an (n+1)-dimensional vector over D (h_{0},h_{1},..,h_{n}). If we let

$$h(a) = h_0 + \sum_{i=1}^{n} h_i a_i$$

we can use the rules of probabilities and combinatorics to prove that, for *f* ≠ *a*,

$$\Pr_{h \in H}[h(f) = d \mid h(a) = c] = \frac{1}{|D|}$$
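For small parameters the conditional probability can be computed by exhaustive enumeration instead of proof. The sketch below assumes the affine form h(a) = h_0 + Σ h_i a_i (mod |D|) with a key vector (h_0, ..., h_n) over a prime field; the function names and the sample values are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def affine_hash(key, message, p):
    """h(a) = h_0 + sum_i h_i * a_i (mod p), where key = (h_0, h_1, ..., h_n)."""
    return (key[0] + sum(h * a for h, a in zip(key[1:], message))) % p

def conditional_probability(p, a, c, f, d):
    """Pr over all keys that h(f) = d, given that h(a) = c, by enumeration."""
    n = len(a)
    # Keys consistent with the intercepted pair (a, c).
    consistent = [k for k in product(range(p), repeat=n + 1)
                  if affine_hash(k, a, p) == c]
    # Among those, keys that also map the fake message f to the guessed digest d.
    hits = sum(1 for k in consistent if affine_hash(k, f, p) == d)
    return Fraction(hits, len(consistent))

# Illustrative values: field of size 5, messages of length 2.
print(conditional_probability(5, a=(1, 2), c=3, f=(4, 0), d=1))
```

Enumeration is only feasible for tiny fields, since the key space has |D|^{n+1} elements, but it gives a quick sanity check of the 1/|D| claim for any chosen *a*, *c*, *f* ≠ *a*, and *d*.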

If we properly encrypt all the digests (e.g. with a one-time pad), an attacker cannot learn anything from them and the same hash function can be used for all communication between the two parties. This may not be true for ECB encryption, because it may be quite likely that two messages produce the same hash value. In that case some kind of initialization vector should be used, which is often called the nonce. It has become common practice to set h_{0}=f(nonce), where f is also secret.
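A minimal sketch of the h_{0}=f(nonce) idea follows. It assumes an affine hash over a prime field and uses HMAC-SHA256 as a stand-in for the secret function f; that choice, the prime 2^31 - 1, and all function names are assumptions made for illustration and are not taken from the UMAC draft.

```python
import hashlib
import hmac

P = 2**31 - 1  # a prime, so arithmetic modulo P forms a field

def nonce_offset(prf_key: bytes, nonce: bytes) -> int:
    """Stand-in for the secret function f: HMAC-SHA256 output reduced modulo P."""
    digest = hmac.new(prf_key, nonce, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % P

def make_tag(prf_key, coefficients, nonce, message):
    """tag = f(nonce) + sum_i h_i * a_i (mod P); coefficients are h_1, ..., h_n."""
    if len(message) != len(coefficients):
        raise ValueError("message and coefficient vectors must have equal length")
    total = sum(h * a for h, a in zip(coefficients, message))
    return (nonce_offset(prf_key, nonce) + total) % P

def verify(prf_key, coefficients, nonce, message, tag):
    """Recompute the tag and compare (a real implementation would compare in constant time)."""
    return make_tag(prf_key, coefficients, nonce, message) == tag

key = b"k" * 16            # hypothetical secret key for f
coeffs = (3, 5, 7)         # hypothetical secret h_1, h_2, h_3
t = make_tag(key, coeffs, b"nonce-1", (1, 2, 3))
print(verify(key, coeffs, b"nonce-1", (1, 2, 3), t))
```

Because the offset depends on the nonce, the coefficients h_1, ..., h_n can stay fixed across messages, provided no nonce is ever reused under the same key.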

Notice that having massive amounts of computing power does not help the attacker at all. If the recipient limits the number of forgeries it will process (by sleeping whenever it detects one), |D| can be 2^{32} or smaller.
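The effect of rate limiting can be estimated with simple arithmetic. The sketch below assumes each forgery attempt succeeds with probability at most 1/|D| and applies a union bound over the attempts; the one-second delay and the function names are hypothetical.

```python
from fractions import Fraction

def attempts_allowed(seconds: float, delay_per_failure: float) -> int:
    """Number of forgery attempts that fit in a time window if each failure costs a delay."""
    return int(seconds // delay_per_failure)

def forgery_bound(attempts: int, digest_bits: int = 32) -> Fraction:
    """Union bound: at most attempts / |D| chance that any attempt succeeds, capped at 1."""
    return min(Fraction(1), Fraction(attempts, 2**digest_bits))

# Hypothetical policy: the recipient sleeps one second after every detected forgery.
per_day = attempts_allowed(24 * 60 * 60, 1.0)
print(per_day, float(forgery_bound(per_day)))
```

The bound depends only on how many attempts the recipient is willing to process, not on how fast the attacker can compute, which is why a digest as short as 32 bits can be acceptable.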

## References

- UMAC was published as an Internet Draft and later as RFC 4418. It is fast and based on AES.


## See Also

- Poly1305-AES is another fast MAC based on strongly universal hashing and AES.