
HW4 - Strings Assignment

due on Sunday, 10/9 before 11:59 PM

Sue Evans & Travis Mayberry


Writing a Program in a File

For every remaining assignment, whether a homework or a project, you'll need to create a file that contains your code.

Change into your hw4 directory, then use emacs to create a file called hw4.py by entering emacs hw4.py & at the Linux prompt.

Again, if you are working from home, the & probably won't work. Instead, log in twice so that you have two windows, one for your text editor and the other for running your program, and just type emacs hw4.py to start emacs.

Documentation

At the top of each file, we require a file header comment.

The file header comment has six required parts:

UNIX Redirection

For this homework you are going to write a program that tries to break the cipher that you wrote in your lab last week.

I have already run the Caesar cipher with a fixed rotation length on the Preamble of the U.S. Constitution and captured the output into a file called code.txt. You can get a copy of it from my directory:

cp /afs/umbc.edu/users/b/o/bogar/pub/code.txt .

UNIX allows us to use the contents of a file as input to our program and also to capture output from our program into a file. These operations are called redirection. The < is used when a file supplies the input, and the > is used when you want to capture the output into a file. You can think of these symbols as arrows pointing to or from your program's name. Here are some examples of their use:

python lab3.py < preamble.txt

where preamble.txt is the preamble written in English, all on one line. This redirection causes the raw_input statement in lab3.py to read the contents of the preamble.txt file instead of a message that the user enters from the keyboard. No modifications have to be made to the lab3.py file for this to work.

python lab3.py > code.txt

would run the program, taking input from the user at the keyboard and capturing the output in a file called code.txt.

Could we do both at the same time? Sure.

python lab3.py < preamble.txt > code.txt

and it would run very fast, since nothing has to be typed. This is what I did to produce the code.txt file you just copied.

You will run your program using UNIX redirection to provide its input, like this:

python hw4.py < code.txt
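As a rough illustration of why redirection needs no changes to the program, the sketch below reads one line from standard input, which is the same place raw_input reads from. The function name read_message is hypothetical and not part of the assignment, and the in-memory stream only stands in for a redirected file so the example runs without a real code.txt.

```python
import io
import sys


def read_message(stream=None):
    """Return the first line of the stream, without its line ending.

    With no argument, the line comes from standard input. When the
    program is started as "python hw4.py < code.txt", the shell connects
    standard input to the file, so the same code reads the file.
    """
    if stream is None:
        stream = sys.stdin
    return stream.readline().rstrip("\r\n")


# Stand-in for a redirected file (an assumption made for this demo only).
fake_file = io.StringIO(u"BJ YMJ UJTUQJ\n")
print(read_message(fake_file))
```

Calling read_message() with no argument inside hw4.py would read from the keyboard when the program is run normally, and from code.txt when it is run with < code.txt.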

Decoding

Recall that the Caesar cipher shifts characters by a fixed amount to encrypt the message. How would you break something like this?

It turns out that letters in English text follow a very specific frequency distribution. The more text you have, the closer its letter frequencies come to this ideal distribution.
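One way to see a letter distribution is to count how often each letter appears in a piece of text. The sketch below is only an illustration: the name count_letters is hypothetical, and it assumes the text uses the plain letters A through Z, ignoring spaces and punctuation.

```python
def count_letters(text):
    """Return a dictionary mapping each letter in text to its count.

    Letters are counted without regard to case; anything that is not
    one of the letters A-Z (spaces, commas, periods) is skipped.
    """
    counts = {}
    for ch in text.upper():
        if "A" <= ch <= "Z":
            counts[ch] = counts.get(ch, 0) + 1
    return counts


sample = "BJ YMJ UJTUQJ"
counts = count_letters(sample)
for letter in sorted(counts):
    print("%s %d" % (letter, counts[letter]))
```

In this short sample the letter J appears four times, more often than any other letter.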

Your program must calculate the correct shift amount by finding the letter used most often in the coded text and using the information in the letter frequencies chart shown below. You may not just look at the coded text to figure out the shift amount yourself and hard-code that shift amount into your program.

Here are the English frequencies for each letter:

a	8.167%
b	1.492%
c	2.782%
d	4.253%
e	12.702%
f	2.228%
g	2.015%
h	6.094%
i	6.966%
j	0.153%
k	0.772%
l	4.025%
m	2.406%
n	6.749%
o	7.507%
p	1.929%
q	0.095%
r	5.987%
s	6.327%
t	9.056%
u	2.758%
v	0.978%
w	2.360%
x	0.150%
y	1.974%
z	0.074%
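The chart shows e as the most frequent letter in English, so one plausible approach is to assume that the most frequent letter in the coded text stands for e and derive the shift from that. The sketch below follows that assumption and is not necessarily the intended solution. The names guess_shift and decode are hypothetical, the message is assumed to be in uppercase like code.txt, and the guess can be wrong for short texts where e is not the most common letter.

```python
def guess_shift(coded_text):
    """Guess the rotation by assuming the most common letter stands for E."""
    counts = {}
    for ch in coded_text:
        if "A" <= ch <= "Z":
            counts[ch] = counts.get(ch, 0) + 1
    if not counts:
        return 0  # no letters at all, so there is nothing to shift
    # Ties are broken alphabetically because the keys are sorted first.
    top = max(sorted(counts), key=counts.get)
    return (ord(top) - ord("E")) % 26


def decode(coded_text, shift):
    """Shift each uppercase letter back by shift; leave other characters."""
    result = []
    for ch in coded_text:
        if "A" <= ch <= "Z":
            position = (ord(ch) - ord("A") - shift) % 26
            result.append(chr(position + ord("A")))
        else:
            result.append(ch)
    return "".join(result)


coded = "BJ YMJ UJTUQJ"
shift = guess_shift(coded)
print(shift)
print(decode(coded, shift))
```

Because the shift is computed from the text itself, the same code should work for a message encoded with a different rotation, which is what the no-hard-coding requirement asks for.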

Hints:

Sample Output

linuxserver1.cs.umbc.edu[101] more code.txt
BJ YMJ UJTUQJ TK YMJ ZSNYJI XYFYJX, NS TWIJW YT KTWR F RTWJ UJWKJHY ZSNTS, JXYFGQNXM OZXYNHJ, NSXZWJ ITRJXYNH YWFSVZNQNYD, UWTANIJ KTW YMJ HTRRTS IJKJSXJ, UWTRTYJ YMJ LJSJWFQ BJQKFWJ, FSI XJHZWJ YMJ GQJXXNSLX TK QNGJWYD YT TZWXJQAJX FSI TZW UTXYJWNYD, IT TWIFNS FSI JXYFGQNXM YMNX HTSXYNYZYNTS KTW YMJ ZSNYJI XYFYJX TK FRJWNHF.
linuxserver1.cs.umbc.edu[102] python hw4.py < code.txt
WE THE PEOPLE OF THE UNITED STATES, IN ORDER TO FORM A MORE PERFECT UNION, ESTABLISH JUSTICE, INSURE DOMESTIC TRANQUILITY, PROVIDE FOR THE COMMON DEFENSE, PROMOTE THE GENERAL WELFARE, AND SECURE THE BLESSINGS OF LIBERTY TO OURSELVES AND OUR POSTERITY, DO ORDAIN AND ESTABLISH THIS CONSTITUTION FOR THE UNITED STATES OF AMERICA.
linuxserver1.cs.umbc.edu[103]

Notice that each of these files contains just one long string that runs off the edge of the page when printed. If you looked at these files using less, you would see a lot of wrapping. For this exercise, that's perfectly okay.

Extra Credit

For 5 points of extra credit, break the string up so that it prints out normally in an 80-character window without wrapping. Each line must be filled with words most of the way across the window (one word per line doesn't count). This must be done without hard-coding specifically for the Preamble of the Constitution.

Sample output for Extra Credit

WE THE PEOPLE OF THE UNITED STATES, IN ORDER TO FORM A MORE PERFECT UNION,
ESTABLISH JUSTICE, INSURE DOMESTIC TRANQUILITY, PROVIDE FOR THE COMMON DEFENSE,
PROMOTE THE GENERAL WELFARE, AND SECURE THE BLESSINGS OF LIBERTY TO OURSELVES
AND OUR POSTERITY, DO ORDAIN AND ESTABLISH THIS CONSTITUTION FOR THE UNITED
STATES OF AMERICA.
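Line breaks like the ones above can be produced by a greedy approach: keep adding words to the current line until the next word would not fit, then start a new line. The sketch below is one possible way to do it, not a required method. The name wrap_words is hypothetical, and the default limit of 79 characters is an assumption chosen so that a full line still fits in an 80-character window.

```python
def wrap_words(text, width=79):
    """Split text into a list of lines no longer than width characters.

    Words are never split; a single word longer than width is placed
    on a line of its own.
    """
    lines = []
    current = ""
    for word in text.split():
        if not current:
            current = word
        elif len(current) + 1 + len(word) <= width:
            # The extra 1 accounts for the space before the word.
            current += " " + word
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


message = "WE THE PEOPLE OF THE UNITED STATES, IN ORDER TO FORM A MORE PERFECT UNION"
for line in wrap_words(message, 30):
    print(line)
```

Since the function only looks at word lengths, it does not depend on the text being the Preamble.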

Submitting your work

When you've finished your homework, use the submit command to submit the file.

Don't forget to watch for the confirmation that submit worked correctly. Specifically, the confirmation will say:

Submitting hw4.py...OK

If you don't see this confirmation, try again.

You can check your submission by entering
submitls cs201 HW4
You should see the name of the file that you just submitted, in this case, hw4.py.