Increasingly Trivial Questions

First Hermit     Hello, are you a hermit by any chance?
Second Hermit     Yes that’s right. Are you a hermit?
First Hermit     Yes, I certainly am.
Second Hermit     Well I never. What are you getting away from?
First Hermit     Oh you know, the usual – people, chat, gossip, you know.
Second Hermit     Oh I certainly do – it was the same with me. I mean there comes a time when you realize there’s no good frittering your life away in idleness and trivial chit-chat.

If you remember back a few tutorials ago, we learnt how to pickle objects so that we could get them back later.  To pickle the object triviaQuestions (which was a list object) we first imported the pickle module, then opened a file and dumped the object into the file, then close()d the file:

->code snippet removed to note [1] because I don't want you typing it in<- 

Note here that when we opened the file with ‘w’ this meant we were ‘w’riting to it.  If the file existed, then Python wiped it ready for us to write something new to it. When you read from a file, be careful that you have ‘r’ (for ‘r’ead) in the open() command.  If you use ‘w’ by mistake then, instead of reading the data, you’ll wipe it!!!!
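
If you'd like to see the wiping happen without risking your questions, here is a small sketch.  It assumes a throwaway file with a made-up name (wipeDemo.txt) in a temporary folder, so it never touches p4kTriviaQuestions.txt.  It checks the size of the file after reading from it and again after opening it with ‘w’:

```python
import os
import tempfile

# A throwaway file in a temporary folder (the name is made up for this demo)
folder = tempfile.mkdtemp()
demo_name = os.path.join(folder, "wipeDemo.txt")

f = open(demo_name, 'w')
f.write("precious data")
f.close()
size_before = os.path.getsize(demo_name)

f = open(demo_name, 'r')    # 'r' leaves the file alone
contents = f.read()
f.close()
size_after_read = os.path.getsize(demo_name)

f = open(demo_name, 'w')    # 'w' wipes the file as soon as it is opened
f.close()
size_after_write = os.path.getsize(demo_name)

print(size_before)
print(size_after_read)
print(size_after_write)
```

The first two sizes should be the same, and the last one should be 0 because nothing was written after the file was opened with ‘w’.
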
Let’s load the triviaQuestions list:

>>> import cPickle
>>> filename = "p4kTriviaQuestions.txt"
>>> fileObject = open(filename,'r')  # note the 'r' for 'read'
>>> triviaQuestions = cPickle.load(fileObject)
>>> len(triviaQuestions)
2
>>> triviaQuestions[0]
['Who expects the Spanish Inquisition?', 'Nobody', 'Eric the Hallibut', 'An unladen swallow', 'Brian', 'Me!']
>>> triviaQuestions[1]
['What is the air-speed velocity of an unladen swallow?', 'What do you mean?  African or European swallow?', '10 m/s', '14.4 m/s', '23.6 m/s']

Wow! We have preserved the list between tutorials. So, rather than continually typing out all the questions every time, we only have to add new questions.

In this code we’ve imported cPickle. We mentioned cPickle earlier.  It does (for our purposes) the same thing as pickle, only it does it faster.  cPickle and pickle behave a little differently in some special circumstances, but that’s not relevant for us.  We also used ‘r’ rather than ‘w’ when we opened the file.  Also, cPickle doesn’t care about the name the object had when it was dump()ed.  We could have loaded the pickle into an object with a different name:

>>> fileObject.close()
>>> fileObject = open(filename,'r')
>>> aList = cPickle.load(fileObject)
>>> fileObject.close()
>>> aList == triviaQuestions
True

We had to first close() the fileObject and then open() it again so that cPickle read from the start of the file.  If there are multiple objects pickled to a file it loads them out in the order they were dumped in.  So once it load()s an object, cPickle has moved to the end of that object in the file, ready to read the next object.  If there is only one object, then cPickle will be at the end of the file and won’t be able to read any more.
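
Here is a rough sketch of that ordering.  It is not part of the trivia game: the file name (twoPickles.dat) is made up and lives in a temporary folder.  It also uses plain pickle and opens the file with ‘wb’ and ‘rb’ (the ‘b’ is for ‘b’inary), which is an assumption made so that the same code runs on newer versions of Python too:

```python
import os
import pickle
import tempfile

# hypothetical throwaway file, nothing to do with your real questions
demo_name = os.path.join(tempfile.mkdtemp(), "twoPickles.dat")

f = open(demo_name, 'wb')
pickle.dump(['first', 'list'], f)    # goes in first
pickle.dump("second object", f)      # goes in second
f.close()

f = open(demo_name, 'rb')
first = pickle.load(f)     # comes out first because it went in first
second = pickle.load(f)    # load() carries on from where the last one stopped
try:
    pickle.load(f)         # there is no third object...
    hit_end = False
except EOFError:
    hit_end = True         # ...so Python complains that the file has run out
f.close()

print(first)
print(second)
print(hit_end)
```
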

As we can see, we’ve load()ed the same object under a different name (aList).  You should also note here that when an object is load()ed, it doesn’t change the file (unlike when an object is dump()ed).  We could load multiple copies of the object for ever if we wanted to.
Let’s add another question and then dump the list to the file:

sampleQuestion = []
# this clears the earlier entries
# if we append without doing this
# we'll have multiple questions in the wrong list
# first the question
sampleQuestion.append("Is this the right room for an argument?")
# then the correct answer
sampleQuestion.append("I've told you once.")
# now one or more incorrect answers
sampleQuestion.append("No")
sampleQuestion.append("Down the hall, first on the left")
sampleQuestion.append("Yes")

Now we append() this question to the end of the list:

triviaQuestions.append(sampleQuestion)

Finally, we dump the object back into the file.  Be careful – when we open the file to write (‘w’) we lose what’s saved in it.  That’s no problem if we fill it back up with data, so be sure to dump the question list!

>>> fileObject = open(filename,'w')  # data that was in the file is now gone!
>>> cPickle.dump(triviaQuestions,fileObject)   # ok, put new data in
>>> fileObject.close()

If you were worried about losing your data you would first dump() the object to a file with a temporary name, then when that was successful, rename the existing file as ‘p4kTriviaQuestions.bak’  and the newly created file to ‘p4kTriviaQuestions.txt’.  Then if anything went wrong with saving the data you would still have the data backed up in the “.bak” file.
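
One way those steps might look in code is sketched below.  The function name safe_dump, the “.tmp” ending for the temporary file and the demo file names are all made up for this example, and it uses pickle with ‘wb’/‘rb’ so that it also runs on newer versions of Python.  Treat it as a starting point rather than the one right way to do it:

```python
import os
import pickle
import tempfile

def safe_dump(obj, filename):
    '''Dump obj to filename, keeping the previous file as a .bak copy.'''
    temp_name = filename + ".tmp"                          # made-up temporary name
    backup_name = os.path.splitext(filename)[0] + ".bak"   # e.g. p4kTriviaQuestions.bak
    f = open(temp_name, 'wb')
    pickle.dump(obj, f)
    f.close()
    # only now that the dump has worked do we touch the existing file
    if os.path.exists(filename):
        if os.path.exists(backup_name):
            os.remove(backup_name)     # some systems won't rename over an existing file
        os.rename(filename, backup_name)
    os.rename(temp_name, filename)

# try it out in a temporary folder
folder = tempfile.mkdtemp()
demo_name = os.path.join(folder, "demoQuestions.txt")
safe_dump(['old question'], demo_name)
safe_dump(['old question', 'new question'], demo_name)

f = open(os.path.join(folder, "demoQuestions.bak"), 'rb')
backup = pickle.load(f)
f.close()
f = open(demo_name, 'rb')
current = pickle.load(f)
f.close()
```

After the second save, the “.bak” file holds the list as it was before, and the “.txt” file holds the new one.
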

Homework:

  • add some more questions to your triviaQuestions list by load()ing the list from your file, adding questions, then dump()ing the list back to the file. Make sure you use ‘r’ when you open a file you are going to load() from and ‘w’ when you open a file to dump() to.

Notes:

[1] This is repeated from the penultimate tute (or antepenultimate if you are counting this tute):

>>> import pickle  
>>> # This is just a flashback don't retype this code because it will wipe your data!!!!
>>> filename = "p4kTriviaQuestions.txt"
>>> fileObject = open(filename,'w')
>>> pickle.dump(triviaQuestions,fileObject)
>>> fileObject.close()

 

Time for Some Introspection

“…The illusion is complete; it is reality, the reality is illusion and the ambiguity is the only truth. But is the truth, as Hitchcock observes, in the box? No there isn’t room, the ambiguity has put on weight. The point is taken, the elk is dead, the beast stops at Swindon, Chabrol stops at nothing, I’m having treatment and La Fontaine can get knotted.”

Did you notice some errors in the previous tutorial? One was fatal.  The fact that no one commented on them indicates to me that no one is actually typing in the code – naughty naughty!  Type it in.  It’s important.

The errors have been corrected now, but they were:

pickle.dump(fileObject, triviaQuestions)

(the order of the arguments is wrong: the object to dump goes first, and the file object to dump it into goes second); and
there was a stray full stop at the end of one line.

If you typed in the previous tutorial you should have received the following error:

>>> pickle.dump(fileObject,triviaQuestions)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.6/pickle.py", line 1362, in dump
    Pickler(file, protocol).dump(obj)
  File "/usr/lib64/python2.6/pickle.py", line 203, in __init__
    self.write = file.write
AttributeError: 'list' object has no attribute 'write'

Or something like it – the exact error may be different depending on what version of Python you are running.

If you receive an error like this you can always use the interpreter’s built in help function to assist:

>>> help(pickle.dump)
Help on function dump in module pickle:

dump(obj, file, protocol=None)

This is not entirely enlightening, but it does tell you the order of the arguments: the object first, then the file, then a third, optional, argument (protocol).  We know it is optional because it is assigned a default value.
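
To see how a default value makes an argument optional, here is a tiny made-up function (greet is not part of pickle, it is just for illustration).  The second argument has a default, so you can leave it out:

```python
def greet(name, greeting="Hello"):
    '''Return a greeting for name.  greeting is optional.'''
    return greeting + ", " + name + "!"

with_default = greet("Brian")               # greeting falls back to "Hello"
with_override = greet("Brian", "Goodbye")   # here we supply our own greeting

print(with_default)
print(with_override)
```

Just like with pickle.dump(), the order matters: the name goes first and the greeting goes second.
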

The object itself is also able to tell you about itself.  This is called “introspection”.  In English introspection means looking inward.  People who are introspective spend time thinking about themselves.   In Python, introspection is the ability of the program to examine, or give information about, itself.   For example, try this:

>>> print pickle.__doc__
Create portable serialized representations of Python objects.
See module cPickle for a (much) faster implementation.
See module copy_reg for a mechanism for registering custom picklers.
See module pickletools source for extensive comments.
Classes:
Pickler
Unpickler
Functions:
dump(object, file)
dumps(object) -> string
load(file) -> object
loads(string) -> object
Misc variables:
__version__
format_version
compatible_formats

This shows the “docstring” for the pickle module.  A docstring is a string which holds documentation about the object.   We have learnt from the docstring that pickle has functions for dumping objects to strings as well as to files.   Any object can have a docstring, for example, our triviaQuestions list has one [if you redo the previous tute to reconstruct it, since we haven't instantiated it this time]:

>>> triviaQuestions.__doc__
"list() -> new empty list\nlist(iterable) -> new list initialized from iterable's items"

In this case, the docstring is the same for all lists (try [].__doc__).  However, some objects, particularly classes (which we haven’t met yet) and functions, are able to have their own docstrings which are particular to that object.   A docstring can be created for an object by putting a string in triple single quotes (''') as the first thing in the object’s definition.  Other kinds of string, like ones in single quotes, work too, but triple quotes are the convention so that you can include apostrophes etc in the docstring.  (Python's own style guide for docstrings, PEP 257, prefers triple double quotes, but triple single quotes work just as well):

>>> def square(x):
...     '''Given a number x return the square of x (ie x times x)'''
...     return x*x
...
>>> square(2)
4
>>> square.__doc__
'Given a number x return the square of x (ie x times x)'

When you write code you should also write docstrings which explain what the code does.  While you may think you’ll remember what it does in the future, the reality is that you won’t!
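
As a sketch of what a slightly longer docstring might look like, here is a made-up cube() function.  Its docstring runs over several lines and has an apostrophe in it, which the triple quotes let us get away with:

```python
def cube(x):
    '''Given a number x return the cube of x.

    That's x times x times x, so cube(2) is 8.
    '''
    return x * x * x

# the first line of a docstring is usually a short summary
first_line = cube.__doc__.splitlines()[0]
print(first_line)
print(cube(2))
```
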

How did I know that pickle had its own docstring?  Well, I read it somewhere, like you read it here.  However, if you ever find yourself needing to work out what forms part of an object, Python has a function to do it – it’s called dir().  You can use it on any object.  Let’s have a look at it on the square() function we just made up:

>>> dir(square)
['__call__', '__class__', '__closure__', '__code__', '__defaults__', '__delattr__', '__dict__', '__doc__', '__format__', '__get__', '__getattribute__', '__globals__', '__hash__', '__init__', '__module__', '__name__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'func_closure', 'func_code', 'func_defaults', 'func_dict', 'func_doc', 'func_globals', 'func_name']

I bet you didn’t realise that the function we just defined now had so many attributes/methods!!  You can see that __doc__ is one of them.  Where an attribute starts with two underscores ‘__’ it’s got a special meaning in Python.   You can pronounce the two underscores in a number of different ways including: “underscore underscore”, “under under”, “double underscore”, “double under” and, my favourite, “dunder”.

To tell whether these are methods (think functions) rather than attributes (think values) you can use the callable() function:

>>> callable(square.__repr__)
True
>>> callable(square.__doc__)
False

If it is callable, then you can add parentheses to it and treat it like a function (sometimes you will need to know what arguments the callable takes):

>>> square.__repr__()
'<function square at 0x7f0b977fab90>'

The __repr__() method of an object gives a printable version of the object.
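
Rather than testing each name by hand, you could sort the whole dir() listing into two piles.  This sketch uses getattr(), a built-in function we haven't met in these tutes, which fetches an attribute when you give it the attribute's name as a string.  The list names (methods and values) are made up for the example:

```python
def square(x):
    '''Given a number x return the square of x (ie x times x)'''
    return x * x

methods = []   # names we can call (think functions)
values = []    # names that just hold something (think values)
for name in dir(square):
    if callable(getattr(square, name)):
        methods.append(name)
    else:
        values.append(name)

print(methods)
print(values)
```

The exact names in each pile depend on which version of Python you are running, but __repr__ should land in the first pile and __doc__ in the second.
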

When something goes wrong with your program you can use Python’s introspection capabilities to get more information about what might have gone wrong and why.  Also, don’t forget to check the Python docs!

Homework:

  • go over previous tutes and identify 3 objects
  • for each of these objects:
    • re-do the relevant tute to instantiate (ie create) each of these objects;
    • look at the docstring for the object (print objectName.__doc__); and
    • look at the directory listing for the object (print dir(objectName)).
  • Extra marks:
    • find some callable methods in one listing and call them.