Consolidation: CryptoPyThy

General     Shi muska di scensand dravenka oblomov Engleska Solzhenitzhin.
SUBTITLE: ‘FORGIVE ME IF I CONTINUE IN ENGLISH IN ORDER TO SAVE TIME’

So far we have learned about:

outputting information via the print statement (implicitly, there wasn’t a specific tutorial on it)

getting information from the user (raw_input)

the difference between strings and numbers, and how to manipulate them a little

how to loop over a range of numbers

conditionals – the if statement

how to get a list of numbers from 0 up to (but not including) a specified number (range) and the remainder operator %

conditional looping – the while statement

a bit about variables and how they can be used to store information

Some of the reasons why we’re learning Python and not some other language

The List type, and how they can be used to store similar sorts of information

Functions and how they can be used to transform the arguments that you pass to them

How to use other people’s functions through the import statement: we imported a module someone else wrote, called random, and made a guessing game with it; and, finally

We learned a little more about the string type and some ways of cutting them up and putting them back together again

Enter CryptoPyThy

In retrospect, this is probably a bit too much to cover without a breather.  However, we’ve now got a good slice of the basic concepts down and are able to write a program which will  make use of them.  I’m calling it CryptoPyThy (‘crypt-oh-pih-thee‘) and we can use it to make secret messages that no one else can read.  What we will do is write some code which:

* asks for a message to encrypt (or decrypt);

* takes that message and passes it to an encryption function;

* the encryption function breaks it down to each character in the message;

* it passes each character to another function which encrypts the characters; and

* after it has been encrypted, prints the encrypted message on the screen (then translates it back to show).

Or, at least that’s what I wanted it to do, but there’s some glitch in my system which is replacing some of the code by ?? So, in this tutorial, you’re going to have to work out what should go wherever  ?? appears and replace the ??.

To start, I’m going to introduce you to a builtin function – one that you have automatically without having to import it (like range()).  It’s called chr().  The argument you pass to chr() needs to be a number (an integer in fact).  Then, chr() takes that number and returns a character (which can then be printed). Examples:

>>> chr(65)
'A'
>>> chr(104)+chr(105)+chr(32)+chr(116)+chr(104)+chr(101)+chr(114)+chr(101)+chr(33)
'hi there!'

If you want to see a list of characters try this:

>>> for i in range(32,127):
...     print chr(i),

While I’m here, I want to note that: we’ve got two numbers (ie arguments) in the range function (the first number is now where it starts from), but earlier we only saw one.  Also, we’ve put a comma at the end of the print statement.  This tells Python to keep printing from the same place next time it prints (normally it will start a new line).
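
To see the two-argument form of range() on its own, here is a small sketch.  The names codes and letters are made up for this example, and print is written with brackets (and list() wrapped around range()) only so that the same lines should also run on newer versions of Python:

```python
# range(start, stop) counts from start up to, but not including, stop.
codes = list(range(65, 70))
print(codes)                 # [65, 66, 67, 68, 69]

# Turn each number into its character and glue them into one string.
letters = ''
for i in codes:
    letters = letters + chr(i)
print(letters)               # ABCDE
```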

There is a reverse function to chr(), called ord(), which takes a character and returns a number.

>>> ord('A')
65

So, chr(65) gave ‘A’ and ord(‘A’) gave 65.  You can test it by doing these:

>>> chr(ord('A'))
'A'  <- ord converted 'A' to 65,
        then chr converted 65 back to A
>>> ord(chr(65))
65   <- as above, but reversed.
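
If you would rather not test them one at a time, a loop can do the checking.  This is only a sketch (the variable name mismatches is invented here), and it covers just the numbers 32 to 127 that this tutorial cares about:

```python
# Count how many numbers fail to survive the trip chr() -> ord().
mismatches = 0
for code in range(32, 128):
    if ord(chr(code)) != code:
        mismatches += 1
print(mismatches)            # 0, so every number came back unchanged
```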

We’re going to define our encryption function as follows:

>>> def encryptChar(character=None):
...    if character is None:
...      return ''
...    something = ??
...    spam = ord(character) + something
...    if spam > 127:
...      spam = spam - (128-32)
...    return chr(spam)

Put a number in where the question marks are.  It needs to be that number which, when doubled and then added to 32, equals 128.

Exercise: work out what the number should be and replace ‘??’ by that number in the something = ?? line in the function’s definition.   Hint: you can use Python to do the calculations for you and work backwards from 128 (actually, that’s two hints).

Explanation of the code:

The function encryptChar() expects to receive a single character as an argument.  If no argument is passed in, the character defaults to None (not ‘None’, note the quotes – Python actually has a value called None).  If the character is None, then return an empty string.  Otherwise, calculate the number representing the character and add something to it, so the number we now have represents another character.  Unfortunately, we’re limited to characters represented by the numbers 32 through 127 (this was decided many decades ago, don’t question it).  So if, after we’ve added something to it, we have a number bigger than 127, we need to get the number back in that range.   And, in particular, we want 128 to go to 32 (so what’s the number which, when added to 128, gives 32? [1]).
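
The wrap-around step can be tried out on plain numbers, without any characters involved.  The sketch below is an illustration only: the names LOW, HIGH and wrap_code are made up, and the offset of 5 is an arbitrary example (it is not the answer to the exercise):

```python
LOW = 32     # smallest character number we allow
HIGH = 127   # largest character number we allow

def wrap_code(code, offset):
    """Add offset to code; if the result passes HIGH, wrap around to LOW."""
    shifted = code + offset
    if shifted > HIGH:
        # HIGH + 1 (that is, 128) must land on LOW (32), so subtract 128 - 32.
        shifted = shifted - (HIGH + 1 - LOW)
    return shifted

print(wrap_code(65, 5))    # 70: no wrapping needed
print(wrap_code(123, 5))   # 32: 128 wraps round to the start
print(wrap_code(125, 5))   # 34
```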

Next we need a function which takes a message and splits it into characters.  We could do this by using the [:] operator we met earlier to extract each character one by one – eg message[0:1] etc.  However, we’d need to know how to work out the length of the message, so we’d know when to stop (answer = the len() function).  Rather, we’re going to rely on a neat thing about strings – they have their own implicit iterator. Let me explain by way of an example:

>>> for i in 'ABCDE':
...     print i
...
A
B
C
D
E
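
For comparison, here is a rough sketch of both approaches side by side: the slicing-and-len() route mentioned before the example, and the for loop over the string.  The variable names are invented for this illustration:

```python
message = 'ABCDE'

# The long way: count positions with range() and len(), then slice.
by_index = []
for position in range(len(message)):
    by_index.append(message[position:position + 1])

# The short way: let the string hand over its characters one by one.
by_iteration = []
for character in message:
    by_iteration.append(character)

print(by_index == by_iteration)   # True
```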

So, when Python encounters a statement like for i in someString: in the for loop it assigns to the variable i each of the characters in the string, one by one.  We will use this to split our message up:

>>> def encryptMsg(message= None):
...   if message is None:
...     return ''
...   encrypted = ''
...   for character in message:
...     encrypted += encryptChar(??)
...   return encrypted

As with the encryptChar() function, this function expects a message as its first argument, and defaults the message to None if none is provided.  Then it checks to see if the message is None, and if so, it returns an empty string ('').  As above, you need to work out what goes in the place where the question marks ‘??’ are.   The encryptChar function expects to receive a single character, so the ?? needs to be replaced by a variable which contains a single character.

Finally we need a way of inputting a message – make sure you replace ?? by  a function call that we’ve seen before for getting user input.

>>> def cryptoPyThy():
...   while True:
...     message = ??('Type your message for encryption: ')
...     encrypted = encryptMsg(message)
...     print 'Encrypted message = \n',encrypted
...     print 'Decrypted message = \n',encryptMsg(encrypted)
...     print
...     
...     if message == 'q':
...       break
...
>>> cryptoPyThy()
Type your message for encryption: This is a secret message
Encrypted message =
$89CP9CP1PC53B5DP=5CC175
Decrypted message =
This is a secret message

Note that this is a special encryption function – if you encrypt the encrypted message, then you get the original message (called the plain text)  back.  You can see that from the printout above.
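
This ‘do it twice and you are back where you started’ behaviour is not unique to CryptoPyThy.  A well-known cousin is ROT13, which moves each letter 13 places round the 26-letter alphabet.  The sketch below is only an illustration (the names rot13_char and rot13 are made up, and it deliberately leaves anything that isn’t a letter alone); it is not part of the CryptoPyThy code:

```python
def rot13_char(character):
    """Shift one letter 13 places, wrapping round the alphabet."""
    if 'a' <= character <= 'z':
        base = ord('a')
    elif 'A' <= character <= 'Z':
        base = ord('A')
    else:
        return character          # leave spaces, digits etc. alone
    return chr((ord(character) - base + 13) % 26 + base)

def rot13(message):
    result = ''
    for character in message:
        result += rot13_char(character)
    return result

secret = rot13('Hello there')
print(secret)             # Uryyb gurer
print(rot13(secret))      # Hello there
```

Thirteen is half of 26, so two shifts of 13 make one full trip round the alphabet.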

Extra Points:

(hard): This code replaces one character by another in a consistent way.  Someone who was determined might be able to decrypt the messages if they see a lot of them – or if they see the plain text of the message. In fact, this is the correspondence:

>>> spam = ''
>>> for i in range(32,128):
...     spam +=chr(i)
...
>>>
>>> print spam+'\n'+encryptMsg(spam)
 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
PQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNO

(Easy):  explain what this bit of code did.

(Hard): Change the functions so that the amount of the offset (that is, the variable something) can be passed as an argument to the encryptChar() function.   That way you could encrypt each message slightly differently – but you’d need to tell the recipient the offset so they could decrypt the message (how could you use encryptChar to decrypt one of these messages? – hint: think of negative numbers).

Notes:

[1] Answer ->

128+x = 32

=> 128+x-128 = 32-128

=>  128-128 on the left hand side is zero, so

x = 32-128,

but the function subtracts rather than adds, so the number to subtract is the negative of x

=> -x = -(32-128)

=> -x = 128-32

String


I have this large quantity of string, a hundred and twenty-two thousand miles of it to be exact,…

We have already met strings indirectly throughout some of the other tutorials.  If you’ll recall, strings are the things you get when you put inverted commas around stuff – like this: ‘stuff’.   As far as Python is concerned, the inverted commas transform what might be other sorts of data into a string.  There are actually two... three... four... Amongst our ways of making a string are such diverse elements as: using inverted commas ('), using quotation marks ("), using triple inverted commas (''') [update:] and triple quotes (""").  Which sort you use depends largely on your preference.  Sometimes, you need to use a certain manner of creating the string because the string itself contains (eg) an inverted comma.

>>> 'a string'
'a string'
>>> 'This isn't really a string because the inverted comma in the middle cuts the string off early.'
  File "<stdin>", line 1
    'This isn't really a string because the inverted comma in the middle cuts the string off early.'
              ^
SyntaxError: invalid syntax
>>> "However, you can use quotation marks when there's a need to include an inverted comma in a string"
"However, you can use quotation marks when there's a need to include an inverted comma in a string"
>>> 'Conversely, if you need to use "quotation marks" in a string, you can make the string using inverted commas'
'Conversely, if you need to use "quotation marks" in a string, you can make the string using inverted commas'
>>> '''Finally, you can pretty much always use triple inverted commas.  When you do, you don't have to worry about whether there are "quotation marks" in the string or inverted commas.'''
'Finally, you can pretty much always use triple inverted commas.  When you do, you don\'t have to worry about whether there are "quotation marks" in the string or inverted commas.'

Can you see that, in the last string, Python has represented the ' by using \'?  This is called ‘escaping’ a character.  Some escaped characters have specific meanings.  The ones which you will see most often are: \t which means tab, and \n which means new line:

>>> print '1\t2\n3'
1       2
3

When Python printed the string it didn’t actually print ‘\t’; instead it interpreted it as a control character and put a tab space between the 1 and the 2 (and a new line between the 2 and the 3).
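
One way to convince yourself that \t and \n really are single characters is to count them.  The lines below are a small sketch; repr() is a builtin function not otherwise covered in these tutorials, and it shows a string the way you would have to type it:

```python
text = '1\t2\n3'
print(len(text))      # 5: '\t' and '\n' each count as a single character
print(repr(text))     # '1\t2\n3' - repr() shows the escapes instead of acting on them
print(len('\\t'))     # 2: a doubled backslash gives a real backslash followed by t
```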

Cut String

Mr. Simpson:     Ah, but there’s a snag, you see. Due to bad planning, the hundred and twenty-two thousand miles is in three inch lengths. So it’s not very useful.
Wapcaplet:     Well, that’s our selling point! “SIMPSON’S INDIVIDUAL STRINGETTES!”

Sometimes  you don’t want the whole of the string.  Sometimes you just want a bit of it.   Python comes with a number of ways to chop up strings (and to put them back together again).  The chop function (actually a method, but that’s for another time) is called .split() and the function for putting them together is called .join() (if it flows off the side where you can’t see it you can still copy and paste it – sorry):

>>> simpson = '''the hundred and twenty-two thousand miles is in three inch lengths'''
>>> print simpson
the hundred and twenty-two thousand miles is in three inch lengths
>>> simpsonSplit = simpson.split(' ')
>>> print simpsonSplit
['the', 'hundred', 'and', 'twenty-two', 'thousand', 'miles', 'is', 'in', 'three', 'inch', 'lengths']
>>> print ' '.join(simpsonSplit)
the hundred and twenty-two thousand miles is in three inch lengths

Can you see that .split(' ') seems to have ‘split’ the string?  (It’s actually still there; the ‘pieces’ are in a separate list object.)  It split it wherever there was a blank space (' ').  We could just as easily have split it on a different (sub-)string, like the letter ‘e’:

>>> print simpson.split('e')
['th', ' hundr', 'd and tw', 'nty-two thousand mil', 's is in thr', '', ' inch l', 'ngths']
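
Notice the empty string '' in the middle of that list.  It comes from the two ‘e’s sitting side by side in ‘three’: there is nothing between them, so the piece between them is empty.  A smaller sketch of the same effect (the variable name pieces is made up):

```python
pieces = 'three'.split('e')
print(pieces)               # ['thr', '', '']
print('e'.join(pieces))     # three
```

The empty pieces matter: they are what lets 'e'.join() put both ‘e’s back.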

You should also note that: there is a dot . at the front of .split(); its usage is someString.split('someOtherString'); and after .split() split the string, we had a list of strings. Unlike poor Mr Simpson, if our strings are chopped up into little pieces, we can .join() them back together again.  The structure of .join() is the reverse of .split().  At the front of .join() you put the thing you want to join the strings with.  In the brackets you put the list that you want to join up (every item in it must be a string). As with .split() you can put any string as the joining string.  See what you get from:

>>> print '\n'.join(simpson.split(' '))

You can .split() and .join() things in chains if you really want to, to make some 133t speak:

>>> '1'.join(('4'.join('3'.join(simpson.split('e')).split('a')).split('l')))

'th3 hundr3d 4nd tw3nty-two thous4nd mi13s is in thr33 inch 13ngths'
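
Taken one step at a time, the split-then-join pattern used above looks like this.  It is a sketch with made-up variable names (phrase, words, dashed), not a new technique:

```python
phrase = 'three inch lengths'

words = phrase.split(' ')
print(words)                          # ['three', 'inch', 'lengths']

# Joining with the same string we split on gives back the original.
print(' '.join(words) == phrase)      # True

# Joining with a different string swaps every space for that string.
dashed = '-'.join(words)
print(dashed)                         # three-inch-lengths
```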

Due to Bad Planning

You can also slice strings up, but without actually cutting them.  You do this by using the [:] operator.  It takes up to two number (actually int) arguments. Instead of explaining I will give you some examples (I have typed in the comments after each <- myself; don’t expect to see them in your output):

>>> aString = '0123456789ABCDEFGHIJ'
>>> aString[0]
'0'  <- get the first character (number 0) in the string
>>> aString[1]
'1' <- get the second character (number 1
       - yes, 1, just believe me on this) in the string
>>> aString[10]
'A' <- get 11th character
>>> aString[1:1]
''  <- get the characters from the second character up to,
       but not including the second character - ie empty
>>> aString[1:2]
'1' <- starting on the second up to
       but not including (utbni) the third (ie number 2)
>>> aString[1:3]
'12' <- starting on the second utbni the fourth (number 3)
>>> aString[1:10]
'123456789' <- starting on the second utbni the 11th
>>> aString[:10]
'0123456789' <- starting from the start of the string
                utbni the 11th
             <- note there are 10 numbers in the string
             <- note also nothing in front of the colon
>>> aString[10:]
'ABCDEFGHIJ' <- starting from the 11th to the end of the string.
             <- nothing after the colon
>>> aString[9:]
'9ABCDEFGHIJ'<- starting from the 10th to the end of the string
>>> aString[-1]
'J' <- last character putting a - before the number says start
       from the end of the string and work back
>>> aString[-4:]
'GHIJ' <- last 4 characters
>>> aString[:-10]
'0123456789' <- starting from the start up to but not including
                the 10th character from the end
>>> aString[:-11]
'012345678' <- utbni the 11th character from the end
>>> aString
'0123456789ABCDEFGHIJ'  <- the string has not changed by doing
                           any of this
>>> start=5
>>> end=12
>>> aString[start:end]
'56789AB'   <- you can even use variables around the colon
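
A few of the patterns in those examples can be checked with Python itself.  This is just a sketch; the position n = 7 is an arbitrary choice, and print is written with brackets so the lines should also run on newer versions of Python:

```python
aString = '0123456789ABCDEFGHIJ'
n = 7   # 7 is just an example position

# The two halves of a slice at the same position fit back together.
print(aString[:n] + aString[n:] == aString)            # True

# A negative position counts back from the end of the string.
print(aString[-4:] == aString[len(aString) - 4:])      # True

# The length of aString[start:end] is end - start (when both are in range).
print(len(aString[5:12]))                              # 7
```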

HomeWork:

Change this for loop so that it separately prints each character in aString above:

>>> for i in range(20):
...     print i,': '+ aString[10]

Change the following to store the name of your school in the variable a.  Then use aInBits = a.split() to break it on the ‘o’s, and .join(aInBits) to join it up with zeroes where the ‘o’s were. You need to work out what goes in the brackets of a.split() and what goes in front of .join(aInBits) (hint: it’s a string and it has zero in it):

a='Baloney Public School'

Extra (or if there were no ‘o’s in your school’s name):

Store what you’ve done and replace the ‘e’s with ‘3’s, the ‘a’s with ‘4’s, the ‘S’s with ‘$’s and the ‘b’s with ‘6’s (you can recycle the variable aInBits).

