Vernam Cipher, a perfect cipher

As introduction to stream ciphers, and to demonstrate that a perfect cipher does exist, we describe the Vernam Cipher, also known as the one-time-pad.

Gilbert Vernam invented and patented his cipher in 1917 while working at AT&T. The teletype had been recently introduced, and along with this the commerical Baudot code. Now messages were uniformly thought of as streams of zero's and one's (But the word "bit" was not yet invented. This is due to Shannon in the 40's.)

Vernam proposed a bit-wise exclusive or of the message stream with a truely random zero-one stream which was shared by sender and receipient.

Example:

   SENDING
   -------
   message: 0 0 1 0 1 1 0 1 0 1 1 1 ...
   pad:     1 0 0 1 1 1 0 0 1 0 1 1 ...
   XOR      ---------------------------
   cipher:  1 0 1 1 0 0 0 1 1 1 0 0 ...

   RECEIVING
   ---------
   cipher:  1 0 1 1 0 0 0 1 1 1 0 0 ...
   pad:     1 0 0 1 1 1 0 0 1 0 1 1 ...
   XOR      ---------------------------
   message: 0 0 1 0 1 1 0 1 0 1 1 1 ...
This cipher is unbreakable in a very strong sense. The intuition is that any message can be transformed into any cipher (of the same length) by a pad, and all transformations are equally likely. Given a two letter message, there is a pad which adds to the message to give OK, and another pad which adds to the message to give NO. Since either of these pads are equally likely, the message is equally likely to be OK or NO.

Formal argument:

How do we capture the intuition for the security of a one-time-pad in a mathematical proof? As we state the proof, the reader might have to be reminded of some concepts in probability. In particular, probability distributions, conditional probability, and independence of events.

We will take as our definition of knowledge a probability distribution. Given a message m, the probability of m, P(m) will be the likelihood that the message was m. For messages which are gibberish, P(m) will be zero. Some messages are sensible, but in context implausible. For these P(m) will be small. In fact, the details of the distribution P(m) are unimportant. Just that it can represent our initial knowledge of the message m.

Given an observed ciphertext c, we attempt to update our knowledge of the distribution P(m). This is called the conditional probability of m given c, and written P(m|c). If we learn nothing by observing c, then for every m, P(m) does not change, P(m|c)=P(m). If this is true for every c and every m, then we have perfect secrecy, our knowledge of m has not been improved by observing c.

The definition of conditional probability is,

   P(m|c) = P(m and c)/P(c)
The event (m and c) is the same as the event (m and p) where p is the pad which equals m(+)c. Since the message and the pad are independent events,
   P(m and c) = P(m and p) = P(m) P(p)
We now show that P(c) = P(p), so they cancel and we have P(m|c) = P(m).

The probability of P(c) is the probability that a message m and a pad p came together to form c. For every message mi there is exactly one pad pi yielding c, namely,

   pi = mi (+) c
So,
   P(c) = Sumi P(mi and pi)
        = Sumi P(mi) P(pi)
        = (1/2n) Sumi P(mi) = 1/2n
using the law of total probability (c must be formed by exactly one of the events (m_i and p_i)), independence of m_i and p_i, and that the sum of probabilities of all m_i equals one.

Also, we have used that,

   P(pi) = P(p) = 1/2n, 
for any pad pi. This gives the result.

A word of caution:

The conclusion that the Verman cipher gives perfect secrecy depends on the assumption that each pad is equally likely. If the pad is used to encipher more than one message, this is no longer true, and the message may be discovered. It is important that a pad once used is discarded. That is the reason for the name one-time-pad, also known as OTP. If this warning is not heeded, the two ciphertexts can be subtracted, thus eliminating the pad. What is left is the difference of messages, which has a distribution reflecting back on the possiblity of choice of pad. This has been known to completely break the cipher.

The calculation is,

    c = m (+) p
  and
    c' = m' (+) p
  implies 
    c (+) c'= (m(+)p) (+) (m'(+)p)= (m(+)m') (+) (p(+)p) = m (+) m'
The pad has been subtracted off.

Although the distribution on pads is uniform, P(p)=1/2n, for any p, the conditional probabilty of pads given a ciphertext, P(p|c), is not. It is exactly the probability of the message being m where m = c (+) p. Given two ciphertexts and the undestanding that the messages must be plausible for each ciphertext under a single pad, we can modify P(p|c) and consequently, P(m|c). Whether this information is enough to determine the messages and the pads depends on the situation. However, we have violated our absolute requirement for perfect secrecy.


Author: Burton Rosenberg
Creation: Mon Sep 6 15:11:20 EDT 2004