NAME
letter-utils.py
SYNOPSIS
letter-utils.py [-h] [-s LETTER_SHIFT] [-c COLUMN_WIDTH] [-p] [-v] reference_text
DESCRIPTION
Compute the letter frequencies of stdin and calculate their correlation with
the letter frequencies of the given reference text. The text is assumed to use
the 26 ASCII letters; in general, all non-letters are discarded and all letters
are converted to lower case.
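The counting step described above might be sketched as follows. This is a minimal illustration, not the template's code: the function name letter_counts is hypothetical, and it assumes the result is a list of 26 counts indexed from 'a' to 'z'.

```python
import string


def letter_counts(text):
    """Return 26 counts, index 0 for 'a' through index 25 for 'z'.

    Hypothetical helper: non-letters are discarded and upper case is
    folded to lower case, as the DESCRIPTION states.
    """
    counts = [0] * 26
    for ch in text:
        # Only the 52 ASCII letters are counted; everything else is skipped.
        if ch in string.ascii_letters:
            counts[ord(ch.lower()) - ord("a")] += 1
    return counts
```

For example, letter_counts("Hello, World!") counts ten letters in total, three of them the letter l.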
The stdin frequency count is subject to an index rotation before correlating
against the frequency count of the reference text. For instance, if the specified
rotation is 1, the number of a's in the reference text is matched to the number of
b's in the stdin text, etc. If no specific rotation is given, the calculation
is done for all possible rotations.
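The rotation convention above can be sketched as below. The text does not say which correlation measure the program uses; this sketch assumes the Pearson correlation coefficient. The names rotate, pearson and correlations are hypothetical and are not taken from the template.

```python
import math


def rotate(counts, shift):
    """Rotate so that position i holds counts[(i + shift) % 26].

    With shift 1, position 0 (paired with the reference a's) holds the
    stdin count of b's, matching the convention in the DESCRIPTION.
    """
    return [counts[(i + shift) % 26] for i in range(26)]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (assumed measure)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant sequence
    return cov / math.sqrt(var_x * var_y)


def correlations(ref_counts, stdin_counts, shift=None):
    """Map each shift to its correlation; all 26 shifts if shift is None."""
    shifts = range(26) if shift is None else [shift]
    return {s: pearson(ref_counts, rotate(stdin_counts, s)) for s in shifts}
```

Under these assumptions, the shift with the largest correlation is the rotation that best aligns the stdin distribution with the reference distribution.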
Optionally outputs a histogram of the letter frequencies of stdin.
POSITIONAL ARGUMENTS
reference_text file of reference text for the reference distribution
OPTIONAL ARGUMENTS
-h, --help       show this help message and exit
-s LETTER_SHIFT  shift the distribution
                 (default: calculate for all shifts)
-c COLUMN_WIDTH  column width for histogram
-p               print histogram
-v               verbose
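An argument parser matching the SYNOPSIS could look roughly like the sketch below. It is an illustration only: the dest names and the build_parser function are hypothetical, and the default for COLUMN_WIDTH is not stated in this page, so it is left as None here.

```python
import argparse


def build_parser():
    """Build a parser that accepts the options listed in the SYNOPSIS."""
    parser = argparse.ArgumentParser(
        prog="letter-utils.py",
        description="Correlate stdin letter frequencies with a reference text.",
    )
    parser.add_argument("reference_text",
                        help="file of reference text for the reference distribution")
    # No -s given means None, read here as "calculate for all shifts".
    parser.add_argument("-s", dest="letter_shift", metavar="LETTER_SHIFT",
                        type=int, default=None, help="shift the distribution")
    # The real default width is unknown; None is a placeholder assumption.
    parser.add_argument("-c", dest="column_width", metavar="COLUMN_WIDTH",
                        type=int, default=None, help="column width for histogram")
    parser.add_argument("-p", dest="print_histogram", action="store_true",
                        help="print histogram")
    parser.add_argument("-v", dest="verbose", action="store_true",
                        help="verbose")
    return parser
```

Calling build_parser().parse_args(["-s", "3", "-p", "ref.txt"]) would then yield a shift of 3, the histogram flag set, and ref.txt as the reference file.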
HISTORY
Introduced in the 221 offering (Fall 2021–2022).
BUGS
Copy the template, test and Makefile from the [repo]/class/proj2 directory to your folder, keeping the same folder and file names. Then run svn add and svn commit -m "initial commit".
Short coding assignment
Modify the letter-utils.py program so that it runs the three tests in the Makefile without error. This would be a good place to make a Subversion commit.
Frequency distribution exploration
There are three encryption keys in the Makefile. For each key, run letter-utils.py with no -s option to get the full output of 26 correlations. See the awk command in the Makefile for how to extract just a column of 26 numbers from the output.
Transfer these to three columns in a spreadsheet and get the mean and standard deviation for each column. Make a bar chart for each column. Remark on how you might go about cracking Vigenère for one-, two- and three-letter keywords.
Commit your spreadsheet.
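To cross-check the spreadsheet figures, the mean and standard deviation of a column can also be computed with Python's statistics module. This is an optional sketch; the function name summarize is hypothetical. Note that spreadsheets usually offer both a sample and a population standard deviation, so both are returned here.

```python
import statistics


def summarize(column):
    """Return (mean, sample stdev, population stdev) for a list of numbers."""
    mean = statistics.mean(column)
    sample_sd = statistics.stdev(column)       # divides by n - 1
    population_sd = statistics.pstdev(column)  # divides by n
    return mean, sample_sd, population_sd
```

Pass in the 26 correlations of one column as a list of floats and compare the result with the spreadsheet's values.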
The file challenge.txt is a Vigenère-encrypted text. Find the keyword.
Since the keyword is longer than 3 characters, what techniques does the previous exercise suggest for the decryption?

author: burton rosenberg
created: 31 aug 2021
update: 31 aug 2021