A Big Jar of Pickles

Pither (voice over) As I lay down to the sound of the Russian gentlemen practising their shooting, I realised I was in a bit of a pickle. My heart sank as I realised I should never see the Okehampton by-pass again…

In the last tutorial we learned how to pickle our objects.  Pickling is a way of storing an object (on the computer’s file system) so that it can be used later.  This means that if we want to reuse an object we can simply save it and load it when we need it, rather than re-creating it each time.  This is very useful when our object is a list of questions for our trivia game: we really only want to type the questions in once and then reload them later.

Now we need to settle on a way to structure our data.  We saw in our earlier tutorial that each question was a list, and that the list itself had a certain structure.  We also need to think about how a number of questions will be stored.  We will use a list to do that as well!  In this case we will have a list of questions.  Each of the elements in the list will itself be a list.  Let’s build one.  First we make an empty list to store all the questions:

triviaQuestions=[]

It is empty:

len(triviaQuestions)

Next, let’s make a sample question to add to that list.  Feel free to use your own question/ answers if you want to use your own topic:

sampleQuestion = []

Now, we populate the sample question:

sampleQuestion.append("Who expects the Spanish Inquisition?")
# first entry must be the question
sampleQuestion.append("Nobody")
# second entry must be the correct answer
sampleQuestion.append("Eric the Halibut")
sampleQuestion.append("An unladen swallow")
sampleQuestion.append("Brian")
sampleQuestion.append("Me!")
# any number of incorrect answers can follow
# but they must all be incorrect

There are 6 elements in the sampleQuestion list:

len(sampleQuestion)

Now, we add the sample question (as the first entry) to the list of trivia questions:

triviaQuestions.append(sampleQuestion)

It now has one question in it:

len(triviaQuestions)

To add more questions we “rinse and repeat”:

sampleQuestion = []
# this starts a fresh, empty list for the next question
# if we append without doing this, the new entries
# would be tacked onto the end of the first question
sampleQuestion.append("What is the air-speed velocity of an unladen swallow?")
sampleQuestion.append("What do you mean?  African or European swallow?")
sampleQuestion.append("10 m/s")
sampleQuestion.append("14.4 m/s")
sampleQuestion.append("23.6 m/s")

triviaQuestions.append(sampleQuestion)

Now, the sampleQuestion has five entries and there are two questions in total:

len(sampleQuestion)
len(triviaQuestions)
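If typing all those appends gets tiring, the same steps can be wrapped up in a small helper.  This is only a sketch: makeQuestion is a made-up name of my own, not something built into Python or used elsewhere in these tutorials.  It builds one question in the layout we agreed on, and the lines at the end show how indexes pick the pieces back out of the list of lists:

```python
def makeQuestion(question, correctAnswer, wrongAnswers):
    # hypothetical helper: builds one question in the agreed layout
    # [question, correct answer, wrong answer, wrong answer, ...]
    newQuestion = [question, correctAnswer]
    newQuestion.extend(wrongAnswers)
    return newQuestion

triviaQuestions = []
triviaQuestions.append(makeQuestion(
    "Who expects the Spanish Inquisition?",
    "Nobody",
    ["Eric the Halibut", "An unladen swallow", "Brian", "Me!"]))
triviaQuestions.append(makeQuestion(
    "What is the air-speed velocity of an unladen swallow?",
    "What do you mean?  African or European swallow?",
    ["10 m/s", "14.4 m/s", "23.6 m/s"]))

firstQuestion = triviaQuestions[0]   # the first question (itself a list)
questionText = firstQuestion[0]      # element 0 is the question
correctAnswer = firstQuestion[1]     # element 1 is the correct answer
wrongAnswers = firstQuestion[2:]     # everything after that is wrong
```

Because each call makes a brand new list, there is no need to remember to empty sampleQuestion between questions.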

Now we need to save the question list so we can use it again later.  We will save it to a file called “p4kTriviaQuestions.txt”.  Ideally we would test to see whether this file already exists before first creating it (so that we don’t inadvertently wipe some valuable file).  Today however, we’re just crossing our fingers and hoping that you don’t already have a file of this name in your directory:

import pickle
fileName = "p4kTriviaQuestions.txt"
fileObject = open(fileName,'w')
pickle.dump(triviaQuestions,fileObject)
# oops! earlier draft had these in the wrong order!
fileObject.close()
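If you would rather not cross your fingers, here is one way the "does this file already exist?" test might look.  Treat it as a sketch under a few assumptions of mine: it uses os.path.exists from the standard os module to do the check, it works in a scratch directory made by tempfile.mkdtemp so it cannot touch your real files, and it opens the file in the binary modes 'wb' and 'rb', which newer versions of Python insist on for pickle files.  The names scratchDir, saved and loadedQuestions are just for illustration:

```python
import os
import pickle
import tempfile

# a scratch directory, so this sketch cannot touch your real files
scratchDir = tempfile.mkdtemp()
fileName = os.path.join(scratchDir, "p4kTriviaQuestions.txt")

triviaQuestions = [["Who expects the Spanish Inquisition?", "Nobody", "Brian"]]

if os.path.exists(fileName):
    # something is already there, so leave it alone
    saved = False
else:
    fileObject = open(fileName, 'wb')   # 'wb' = write, binary
    pickle.dump(triviaQuestions, fileObject)
    fileObject.close()
    saved = True

# load the list back to check that it survived the trip
fileObject = open(fileName, 'rb')       # 'rb' = read, binary
loadedQuestions = pickle.load(fileObject)
fileObject.close()
```

In a real program you would decide what to do when the file is already there, for example asking for a different file name.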

So far we have spent a lot of time on how to store the data used by the game.  However, in order to hang the various parts of the trivia game together we need to learn about storing a different part of the game – the program itself.  We will be looking at that in the coming tutorials.

An Awful Pickle

Specialist     Come in.
The door opens and Raymond Luxury Yacht enters. He cannot walk straight to the desk as his passage is barred by the strip of wood carrying the degrees, but he discovers the special hinged part of it that opens like a door. Mr Luxury Yacht has his enormous polystyrene nose. It is a foot long.
Specialist     Ah! Mr Luxury Yacht. Do sit down, please.
Mr Luxury Yacht     Ah, no, no. My name is spelled ‘Luxury Yacht’ but it’s pronounced ‘Throatwobbler Mangrove’.
Specialist     Well, do sit down then Mr Throatwobbler Mangrove.
Mr Luxury Yacht     Thank you.

So, we know how to save trivia questions to a file, and how to read them back from a file in the future.  Moreover, we have decided on a particular way of structuring the data which makes a question.  That is, the question is followed by the correct answer and then a number of incorrect answers.   Now we have to translate between a list (which has a concept of elements), and a file (which doesn’t).  Files are “flat” – which is to say that they have no sense of structure, they are simply a stream of data.  A file may record all of the characters which are the questions and answers, but it wouldn’t record the fact that they are a list or, indeed, that they are any kind of Python object.  I was originally just going to run with this to let you find out about files, but I have instead decided to introduce a further concept – the Python pickle!

pickle is a module which allows you to store Python objects including their structure.  That means after you have pickled an object to a file, you can later load that object back up from the file and all the structure associated with that object will be preserved.  While, at the moment, we are only dealing with a list, most kinds of object can be pickled, including objects which have methods and attributes (i.e. functions and data which are packaged with the object).  The object’s data is saved in the file; for an object made from a class, the file records which class it belongs to, so Python needs to be able to find that class again when the object is loaded.  What pickle does is “serialise” the object (turn it into a stream of data) and then “persist” it (store that stream in the file).
To use pickle you must first import it:

import pickle

pickle has two main functions – dump, which dumps an object to a file object, and load, which loads an object from a file object.  Note that the file object referred to here is what is returned by the open() function.  It is not the name of the file.  So to use pickle you must first open() the file (either as ‘w’ if you are dumping an object or as ‘r’ if you are loading one) and store the object that the open() function returns.  I will demonstrate by making a demo list object and pickling it to a file called ‘testfile’:

>>> a = ['A dummy question','The correct answer','A wrong answer']
>>> a
['A dummy question', 'The correct answer', 'A wrong answer']
>>> fileName = "testfile"
>>> fileObject = open(fileName,'w') # open the file for writing
>>> import pickle
>>> pickle.dump(a,fileObject)   # this writes the object a to the file named 'testfile'
>>> fileObject.close()
>>> fileObject = open(fileName,'r')  #open the file for reading
>>> b = pickle.load(fileObject)  #load the object from the file into b
>>> b
['A dummy question', 'The correct answer', 'A wrong answer']
>>> a==b
True

You can see that what is now in b is the same as what is in a (because a==b is True, Python considers them equal).  Moreover, this dump/load procedure allows you to preserve the object even when you quit out of Python and come back to it later (which is the whole point of this exercise):

>>> fileObject.close()
>>> exit()  # leave python and restart
/home/user> python
Python 2.5 (release25-maint, Dec  9 2006, 14:35:53)
[GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pickle
>>> fileName = 'testfile'
>>> fileObject = open(fileName,'r')
>>> c = pickle.load(fileObject)  #load the old object
>>> c
['A dummy question', 'The correct answer', 'A wrong answer']

However, when we now try to compare c to the original, we see that Python forgot a when we exited:

>>> c==a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined

Which is to say that the only place Python could have got the object c from was the file, when we called pickle.load().
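Going back to the a==b test for a moment: a loaded object has the same contents as the one that was dumped, but it is a separate copy, not the very same object.  The sketch below tries to show that.  It makes a few assumptions of my own: the file lives in a scratch directory made by tempfile.mkdtemp, it is opened in the binary modes 'wb' and 'rb' (which newer versions of Python require for pickle files), and the names sameContents, sameObject and aLength are only there for illustration:

```python
import os
import pickle
import tempfile

# put the test file in a scratch directory (an assumption of this sketch)
fileName = os.path.join(tempfile.mkdtemp(), "testfile")

a = ['A dummy question', 'The correct answer', 'A wrong answer']

fileObject = open(fileName, 'wb')   # open the file for writing
pickle.dump(a, fileObject)
fileObject.close()

fileObject = open(fileName, 'rb')   # open the file for reading
b = pickle.load(fileObject)
fileObject.close()

sameContents = (a == b)   # True: same elements in the same order
sameObject = (a is b)     # False: b is a separate copy of a

b.append('Another wrong answer')   # changing b ...
aLength = len(a)                   # ... leaves a with its 3 elements
```

So once you have loaded an object you can change it as much as you like without touching the original, or the file, until you dump it again.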

Homework:
Make some other objects, dump them to a file and load them again.  Make sure that you name the file first, then open() it before you pickle, and .close() it afterwards.  Use the mode ‘w’ when you open a file to dump an object and ‘r’ when you are going to load an object.

Pickle vs cPickle:

Python actually has two pickle modules – pickle, which we used above, and cPickle.  There are some technical differences between them but for most purposes they can be treated as being exactly the same.  The main difference is that cPickle has been written in the C programming language and, as a result, runs much faster.  While I am using pickle here, in future tutorials I will (try to remember to) use cPickle instead.  When you write your own programs you should use the cPickle module by default as it will run faster (i.e. wherever you see pickle, use cPickle instead).  Otherwise the usage is exactly the same.
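One common way to follow that advice without retyping anything is to import cPickle under the name pickle, and fall back to the ordinary module if cPickle is not there (Python 3, for instance, has no module called cPickle).  This is a sketch rather than the one true way to do it.  To keep it short it uses dumps and loads, two other functions in the module which have not appeared in these tutorials: they work like dump and load but skip the file.  The name pickledData is just for illustration:

```python
try:
    import cPickle as pickle   # the faster C version, where it exists
except ImportError:
    import pickle              # otherwise fall back to plain pickle

a = ['A dummy question', 'The correct answer', 'A wrong answer']
pickledData = pickle.dumps(a)   # dumps hands back the pickled form directly
b = pickle.loads(pickledData)   # loads turns it back into an object
```

Because the module is called pickle either way, the rest of your program does not need to change.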

Spelling Note: It is pickle not pickel.

