WWII Code Breaking at Bletchley Park

Adversarial Indistinguishability

by: burt rosenberg
at: university of miami
date: sep 2019

Overview

This project uses Python to implement the adversarial indistinguishability game PRIVA,Πeav, a game between the adversary A and "The Protocol" Π. (a.k.a., an Anastasia–Piotr game.)

Recall the purpose of that definition. The Shannon Perfect Secrecy definition,

   Prob[ M | C ] = Prob[ M ]
is succinct and, once the notation is clear, intuitive. From a Bayesian point of view, in which conditional probability is the key concept, and are about updating beliefs about the state of the world, this definition says that in seeing the ciphertext the eaves dropper learns nothing about the message.

However, this definition does not provide us entry into the nature of the eaves dropper, also known as the Adversary. Since an adversary of unbounded power provides such a disappointing theory from a practical standpoint, restricting the power of the adversary moves the theory forward and provides useful results. We will later introduce Adversaries bounded by Probabilistic Polynomial Time (PPT). For this project, this is not a concern.

Problem 1: Indistinguishability for Vigenère

We will construct in Python the adversarial indistinguishability game that is the subject of Definition 2.5 in the class text, and described on page 31. We will run the code to verify and experiment with the problem given as Exercise 2.8 on page 39.
NAME
    adv-indistin-vign

SYNOPSIS
    adv-indistin-vign.py [-v][-k key _plaintext_] 

DESCRIPTION
    Implements the vigenere cipher, and wins the adversarial indistinguishability game
    against it. 

    When called without the -k option, runs the adversary and prints out the probability
    that the adversary decides the protocol's coin.

    When called with the -k option, encrypts _plaintext_ and prints the result.

OPTIONS 
    -k key  run in encrypt mode, to test encryption. Argument is plaintext to encrypt
    -v      verbose

    The plaintext alphabet is a–z, and the ciphertext alphabet is A–Z.
    The encryption key is any length of small letters, where the i-th letter of
    a k length key pertains to the shift of the (i mod k) letter of the plaintext.
    The letter a is shift value 0 (just capitalize the letter), b is 1, etc.
    
HISTORY
    First proposed in csc609/507 semester 182 (spring 2017-2018).

The code templates are given in [repo]/class/proj2/. Comments indicate where you are to complete the code. Copy to your [repo]/[user]/proj2, complete adv-indistin-vign.py and submit. A fully correct solution completes part (a) of the exercise by both calculating the theoretic advantage of the method given, and comparing with the experimental result; also by completing part (b) and finding a stronger method, with greater advantage, showing that in the code, and providing an experimental result.

The adversary_advantage function repeatedly runs adversary_sample to get the probability that the adversary wins the indistinguishability game, i.e. that the adversary can distinguish whether it was message 0 or message 1 that was encrypted. The adversary provides the messages by the return value of adversary_challenge and the logic to decide based on the ciphertext is in adversary_decision.

Please run the experiment as in the exercise. The messages used in the challenge must be of the same size.

To test the encryption function of your code, the -k option provides a keyword, the program simply outputs the argument encrypted under that keyword. The verbose option -v asks the code to report extra information for debugging or checking. If the -k option is not present, the argument is the integer number of trials to run the game to calculate the average advantage.

Historial notes: The Vigenere was first described in 1553 by Giovan Battista Bellaso, but was misattributed to Blaise de Vigenère who published something similar in 1586. Bellaso received a law degree from the University of Padua in 1538. De Vigenere was a French diplomat, and studied Greek, Hebrew and Italian at the Collège Royal (now called Collège de France) in Paris.

Problem 2: Indistinguishability for an Enigma

The Enigma was an almost non-repeating sequence of one letter permutations. It was created mechanically by a cascade of rotating wheels that had wires running through them. There were 26 contacts on the front and 26 contacts on the back of each wheel. If the "a" contact on the front was wired to the "C" contact on the back, the wheel would replace a plaintext "a" by a ciphertext "C""; and so forth for all the 26 letters.

The wheels cascaded these permutations while rotating against each other, in the manner of a mechanical odometer. Therefore, the encryption in each letter position was a simple substitution cipher, but there was a different substitution for each letter position.

I propose a cycle-enigma, a simplified form a enigma. It begins with a key permutation σ. Let the plaintext be

    p1, p2,...
and the ciphertext be
    c1, c2,...
The encryption equation is
    ci = σi(pi)
That is, to the first letter, substitute according to σ, on the next letter, substitute by σ ∘ σ, on the third by σ ∘ σ ∘ σ, and so on.

Example: Assume a five letter alphabet, 'a', 'b', 'c', 'd', and 'e'. The key permutation is -k badec.

It is easier to work with permutations when expressed in cycle notation. (See below for an explanation of cycle notation for permutations.)

    ('a','b')('c','d','e')
Then the successive permutations for the first three letters are,
    ('a','b')('c','d','e'), ('a')('b')('c','e','d'), ('a','b')('c')('d')('e')
So "cab" encrypts to "DAA", and "ace" encrypts to "BEE".

In adv-indistin-enig.py, implement

You should be able to get a significant advantage in distinguishing the challenge messages.

Note: The messages used in the challenge must be of the same size.

To test the encryption function of your code, the -k option provides a keyword, the program simply outputs the argument encrypted under that keyword. The verbose option -v asks the code to report extra information for debugging or checking. If the -k option is not present, the argument is the integer number of trials to run the game to calculate the average advantage.

NAME
    adv-indistin-enig

SYNOPSIS
    adv-indistin-enig.py [-v][-k key _plaintext_] 

DESCRIPTION
    Implements the mini-enimgma cipher, and wins the adversarial indistinguishability game
    against it. 

    When called without the -k option, runs the adversary and prints out the probability
    that the adversary decides the protocol's coin.

    When called with the -k option, encrypts _plaintext_ and prints the result.

OPTIONS 
    -k key  run in encrypt mode, to test encryption. Argument is plaintext to encrypt
    -v      verbose

    The plaintext alphabet is the first n letters of an infinite alphabet, of which
    the first 26 letters are a–z, and the ciphertext alphabet is the uppercase
    variants of the infinite alphabeta, of which the first 26 letters are A–Z.
    The encryption key is a permutation of the n letters composed with itself i times,
    for the i-th letter in the message. The first letter is 1, so the permutation 
    is applied. 
    
    Examples: The key abcde  would be the identity where the plaintext is written 
    only over the five letters a through e. The ciphertext is equal to the plaintext,
    except written in upper-case A through E.
    
    The key badec would be powers of the permutation (ab)(cde).
    
BUGS
    The original project did not allow for keysize to continue to infinity. However,
    in this version, I do not solve the "beyond z" problem, but a solution will extend
    theoretically and we do not need to program that.
    
HISTORY
    First proposed in csc507/608 semester 201 (Fall 2019-2020).
    Discussed the "beyond z" problem for key size.

Historial notes: The Enigma is the famous Nazi encryption broken by Alan Turing, as show in the movie The Imitation Game. There were in fact many people involved in breaking the Enigma, the most important being the Polish mathematician Marian Rejewski.

Rejewski had worked out the operation of the Enigma and had some methods to crack the cipher. On 25 July, 1939, in a town south of Warsaw, Rejewski disclosed his work to French and British intelligence. Poland was invaded just a month later, on September 1, 1939.

Representing and composing permutations

Single letter substitution ciphers are based on a permutation of the plaintext alphabet. A permutation σ assigns to each letter "x" and letter "y" in a way that an inverse permutation, σ-1 exists, that assigns to the letter "y" the letter "x".

        ∀ x, σ-1 (σ (x) )  = σ-1 (y) = x
Since the input and output alphabets of a permutation have to same number of elements, they can be the same alphabet, and so σ can be applied twice. If σ assigns to "y" the letter "z",
        σ2(x) = σ(σ(x)) = σ(y) = z, 
The cycle enigma uses a different permutation for each letter position by applying σ i-times for the i-th position. This can be done by keeping a permutation πi, and following the process,
	π0(x) = σ(x)
	πi+1(x) = πi(σ(x))
There are three ways to representing permutations. The first is to list the map,
     | 0 1 2 3 4 |
     | 1 0 3 4 2 |
This says σ(0) = 1, σ(1) = 0, σ(2) = 3, and so on.

In Python, it is sufficient to create the bottom row as a list,

    p = [ 1, 0, 3, 4, 2 ]
Then σ(i) is equal to p[i]. Then σ2(i) is equal to p[p[i]].

The second way to represent a permutation is cycle notation, where the table is rearranged to form cycles. In the given σ, 0 maps to 1, and 1 maps to 0. This makes a cycle (0,1). Also 2 maps to 3, 3 to 4 and 4 back to 2, so this makes another cycle (2,3,4). The permutation is therefore (0,1)(2,3,4).

Convert to this notation by starting with 0, then listing σ(0), and σ(σ(0)), until the result returns to 0. This creates a cycle which gives the mappings for everything in the cycle. Pick the smallest number not in this cycle and repeat, forming a second cycle. Continue until all numbers are in some cycle.

As a hint how to create an adversary for the cycle enigma, note that when written in cycle notation, the first position is mapped by finding the letter and moving forward (right) by one, the second position is mapped by moving forward by two, and so on. We see immediately from the notation (0,1)(2,3,4) that cycle enigma with this key would give the identity permutation every 6th position, and then repeat with (0,1)(2,3,4),

        ((0,1)(2,3,4))6 = (((0,1)(2,3,4))2)3 = ((0)(1)(2,4,3))3 
            = ((0)(1)(2,4,3)) ∘ ((0)(1)(2,4,3))2 = ((0)(1)(2,4,3)) ∘ ((0)(1)(2,3,4))
            = (0)(1)(2)(3)(4)
As a further hint, Rejewski investigated the enigma's permutations in cycle form. You might notice that letters in a cycle could possibly encrypt to each other, but it is impossible for letters in different cycles to encrypt to each other.

Enigma diagrams for permutations

The third way to represent a permutation takes its inspiration from the enigma machine. Show a line between in and out elements, and to compose, chain these diagrams from let to right. We show using this diagram that permutations are not a commutative group by showing (12)(34) ∘ (1)(23)(4) does not equal (1)(23)(4) ∘ (12)(34),
     ENIGMA DIAGRAMS FOR PERMUATIONS AND THE COMPOSITION    
     
     
    (1)(23)(4)    (12)(34)        ⇒             (1243) 
    
 1 o ---------- o ---+  +--- o 1         1 o ------+  +---- o 1
                      \/                            \/
                      /\                            /\
 2 o ---+  +--- o ---+  +--- o 2         2 o ---+  /  +---- o 2
         \/                                      \/   
         /\                                      /\
 3 o ---+  +--- o ---+  +--- o 3         3 o ---+  \  +---- o 3
                      \/                            \/
                      /\                            /\
 4 o ---------- o ---+  +--- o 4         4 o ------+  +---- o 4
  
  
  
  
    (12)(34)     (1)(23)(4)        ⇒           (1342)
    
 1 o ---+  +--- o ---------- o 1         1 o ---+  +------ o 1
         \/                                      \/ 
         /\                                      /\
 2 o ---+  +--- o ---+  +--- o 2         2 o ---+  \  +--- o 2
                      \/                            \/
                      /\                            /\
 3 o ---+  +--- o ---+  +--- o 3         3 o ---+  /  +--- o 3
         \/                                      \/
         /\                                      /\
 4 o ---+  +--- o ---------- o 4         4 o ---+  +------ o 4
Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

author: burton rosenberg
created: 4 feb 2018
update: 4 nov 2019