Gilbert Vernam invented and patented his cipher in 1917 while working at AT&T. The teletype had been recently introduced, and along with this the commerical Baudot code. Now messages were uniformly thought of as streams of zero's and one's (But the word "bit" was not yet invented. This is due to Shannon in the 40's.)

Vernam proposed a bit-wise exclusive or of the message stream with a truely random zero-one stream which was shared by sender and receipient.

**Example:**

SENDING ------- message: 0 0 1 0 1 1 0 1 0 1 1 1 ... pad: 1 0 0 1 1 1 0 0 1 0 1 1 ... XOR --------------------------- cipher: 1 0 1 1 0 0 0 1 1 1 0 0 ... RECEIVING --------- cipher: 1 0 1 1 0 0 0 1 1 1 0 0 ... pad: 1 0 0 1 1 1 0 0 1 0 1 1 ... XOR --------------------------- message: 0 0 1 0 1 1 0 1 0 1 1 1 ...This cipher is unbreakable in a very strong sense. The intuition is that any message can be transformed into any cipher (of the same length) by a pad, and all transformations are equally likely. Given a two letter message, there is a pad which adds to the message to give OK, and another pad which adds to the message to give NO. Since either of these pads are equally likely, the message is equally likely to be OK or NO.

**Formal argument:**

How do we capture the intuition for the security of a one-time-pad in a mathematical proof? As we state the proof, the reader might have to be reminded of some concepts in probability. In particular, probability distributions, conditional probability, and independence of events.

We will take as our definition of knowledge a probability distribution. Given a message m, the probability of m, P(m) will be the likelihood that the message was m. For messages which are gibberish, P(m) will be zero. Some messages are sensible, but in context implausible. For these P(m) will be small. In fact, the details of the distribution P(m) are unimportant. Just that it can represent our initial knowledge of the message m.

Given an observed ciphertext c, we attempt to update our knowledge
of the distribution P(m). This is called the conditional probability
of m given c, and written P(m|c). If we learn nothing by observing
c, then for every m, P(m) does not change, P(m|c)=P(m). If this
is true for every c and every m, then we have *perfect
secrecy*, our knowledge of m has not been improved by observing
c.

The definition of conditional probability is,

P(m|c) = P(m and c)/P(c)The event (m and c) is the same as the event (m and p) where p is the pad which equals m(+)c. Since the message and the pad are independent events,

P(m and c) = P(m and p) = P(m) P(p)We now show that P(c) = P(p), so they cancel and we have P(m|c) = P(m).

The probability of P(c) is the probability that a message m and
a pad p came together to form c. For every message m_{i} there
is exactly one pad p_{i} yielding c, namely,

pSo,_{i}= m_{i}(+) c

P(c) = Sumusing the law of total probability (c must be formed by exactly one of the events (m_i and p_i)), independence of m_i and p_i, and that the sum of probabilities of all m_i equals one._{i}P(m_{i}and p_{i}) = Sum_{i}P(m_{i}) P(p_{i}) = (1/2^{n}) Sum_{i}P(m_{i}) = 1/2^{n}

Also, we have used that,

P(pfor any pad p_{i}) = P(p) = 1/2^{n},

**A word of caution:**

The conclusion that the Verman cipher gives perfect
secrecy depends on the assumption that each pad
is equally likely. If the pad is used to encipher
more than one message, this is no longer true, and
the message may be discovered.
It is important that a pad once used is discarded. That is
the reason for the name *one-time-pad*, also known
as OTP.
If this warning is not heeded, the two ciphertexts can
be subtracted, thus eliminating the pad. What is left
is the difference of messages, which has a distribution
reflecting back on the possiblity of choice of pad.
This has been known to completely break the cipher.

The calculation is,

c = m (+) p and c' = m' (+) p implies c (+) c'= (m(+)p) (+) (m'(+)p)= (m(+)m') (+) (p(+)p) = m (+) m'The pad has been subtracted off.

Although the distribution on pads is uniform, P(p)=1/2^{n}, for
any p, the conditional probabilty of pads given a ciphertext,
P(p|c), is not.
It is exactly the probability of the message being m where m = c (+) p.
Given two ciphertexts and the undestanding that the messages must be
plausible for each ciphertext under a single pad, we can modify
P(p|c) and consequently, P(m|c).
Whether this information is enough to determine the messages and the pads
depends on the situation. However, we have violated our absolute requirement
for perfect secrecy.

*
Author: Burton Rosenberg
Creation: Mon Sep 6 15:11:20 EDT 2004
*