The Python IAQ:
Infrequently Answered Questions

by Peter Norvig

Q: What is an Infrequently Answered Question?

A question is infrequently answered either because few people know the answer or because it concerns an obscure, subtle point (but a point that may be crucial to you). I thought I had invented the term for my Java IAQ, but it also shows up at the very informative About.com Urban Legends site. There are lots of Python FAQs around, but this is the only Python IAQ, except for the Chinese translation of this page by Weiyang Chen, the Russian translation by Alexander Sviridenko, and the Japanese translation by Akihiro Takizawa. (There are a few Infrequently Asked Questions lists, including a satirical one on C.)

Q: The code in a finally clause will never fail to execute, right?

What never? Well, hardly ever. The code in a finally clause does get executed after the try clause whether or not there is an exception, and even if sys.exit is called. However, the finally clause will not execute if execution never gets to it. This would happen regardless of the value of choice in the following:

import time

try:
    if choice:
        while 1:
            pass
    else:
        print "Please pull the plug on your computer sometime soon..."
        time.sleep(60 * 60 * 24 * 365 * 10000)
finally:
    print "Finally ..."
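The first claim is easy to check. The following is a minimal sketch, written for Python 3 rather than the Python 2 used elsewhere on this page; the helper name run is made up for illustration. It records whether the finally clause ran after a normal return, after an exception, and after a call to sys.exit:

```python
import sys

def run(action):
    """Call action inside try/finally and report what happened."""
    log = []
    try:
        try:
            action()
        finally:
            log.append("finally ran")
    except BaseException as e:   # SystemExit is not a subclass of Exception
        log.append(type(e).__name__)
    return log

def raise_error():
    raise ValueError("boom")

print(run(lambda: None))         # normal completion
print(run(raise_error))          # an ordinary exception
print(run(lambda: sys.exit(1)))  # sys.exit raises SystemExit
```

In all three cases the log starts with "finally ran". The cases in the example above never get that far, because control never leaves the try clause.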

Q: Polymorphism is great; I can sort a list of elements of any type, right?

Wrong. Consider this:

>>> x = [1, 1j]
>>> x.sort()
Traceback (most recent call last):
  File "<pyshell#13>", line 1, in ?
    x.sort()
TypeError: cannot compare complex numbers using <, <=, >, >=

(In Python notation 1j is an imaginary number, a square root of -1.) The problem is that the sort method (in the current implementation), compares elements using the __lt__ method, which refuses to compare complex numbers (because they are not orderable). Curiously, complex.__lt__ has no qualms about comparing complex numbers to strings, lists, and every other type except complex numbers. So the answer is you can sort a sequence of objects that support the __lt__ method (and possibly other methods if the implementation happens to change).
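If you do need to sort such a list, you have to say what order you mean. A sketch, assuming Python 3 (where the key argument to sorted is available) and some arbitrary sample data:

```python
x = [3+4j, 1, 1j, -2]

try:
    sorted(x)
    sortable = True
except TypeError:
    sortable = False      # complex numbers still have no < ordering

# Supply an explicit ordering through a key function instead.
by_magnitude = sorted(x, key=abs)
by_parts = sorted(x, key=lambda z: (complex(z).real, complex(z).imag))

print(sortable)
print(by_magnitude)
print(by_parts)
```

Sorting by abs orders the numbers by distance from zero; sorting by the (real, imag) pair gives a dictionary-style order. Neither is "the" order of the complex numbers, which is exactly why Python refuses to pick one.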

As for the first part of the question, "Polymorphism is great", I would agree, but Python sometimes makes it difficult because many Python types (such as sequence, and number) are defined informally.

Q: Can I do ++x and x++ in Python?

Literally, yes and no; but for practical purposes, no. What do I mean by that?
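A short sketch of both halves of that answer, assuming Python 3 (the same holds in Python 2): ++x is accepted because it parses as two unary plus operators, +(+x), which leave a number unchanged, while x++ is rejected by the parser. The call to compile is there only so that the syntax error can be caught and shown.

```python
x = 5
y = ++x              # parsed as +(+x); nothing is incremented
print(x, y)

try:
    compile("x++", "<example>", "eval")
    postfix_ok = True
except SyntaxError:
    postfix_ok = False
print(postfix_ok)

x += 1               # the practical way to increment
print(x)
```

So ++x is legal but does not do what a C programmer expects, and x++ is not legal at all; for practical purposes, use x += 1.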

The deeper question is: why doesn't Python allow x++? I believe it is the same reason why Python does not allow assignments in expressions: Python wants to clearly separate statements and expressions. If you believe they should be distinct, then disallowing ++ is probably the best decision. On the other hand, advocates of functional languages argue that statements should be expressions. I'm with my fellow Dane, Bjarne Stroustrup, on this one. He said in The Design and Evolution of C++ ``If I were to design a language from scratch, I would follow the Algol68 path and make every statement and declaration an expression that yields a value''.

Q: Can I use C++'s syntax for ostreams: cout << x << y ... ?

You can. If you don't like writing ``print x, y'' then you can try this:

import sys

class ostream:
    def __init__(self, file):
        self.file = file
        
    def __lshift__(self, obj):
        self.file.write(str(obj))
        return self

cout = ostream(sys.stdout)
cerr = ostream(sys.stderr)
nl = '\n'

cout << x << " " << y << nl

(This document shows code that belongs in a file above the horizontal line and example uses of it below the line.) This gives you a different syntax, but it doesn't give you a new convention for printing--it just packages up the str convention that already exists in Python. This is similar to the toString() convention in Java. C++ has a very different convention: instead of a canonical way to convert an object to a string, there is a canonical way to print an object to a stream (well, semi-canonical---a lot of C++ code still uses printf). The stream approach is more complicated, but it does have the advantage that if you need to print a really huge object you needn't create a really huge temporary string to do it.

Q: What if I like C++'s printf?

It's not a bad idea to define a printf in Python. You could argue that printf("%d = %s", num, result) is more natural than print "%d = %s" % (num, result), because the parens are in a more familiar place (and you get to omit the %). Furthermore, it's oh-so-easy:

def printf(format, *args): print format % args,

Even in a one-liner like this, there are a few subtleties. First, I had to decide whether to add the comma at the end or not. To be more like C++, I decided to add it (which means that if you want a newline printed, you have to add it yourself to the end of the format string). Second, this will still print a trailing space. If you don't want that, use sys.stdout.write instead of print. Third, is this good for anything besides being more C-like? Yes; you need a printing function (as opposed to a print statement) for use in places that accept functions, but not statements, like in lambda expressions and as the first argument to map. In fact, such a function is so handy, that you probably want one that does not do formatting:

def prin(x): print x,

Now map(prin, seq) will print each element of seq, but map(print, seq) is a syntax error. I've seen some careless programmers (well, OK, it was me, but I knew I was being careless) think it would be a good idea to fit both these functions into one, as follows:

def printf(format, *args): print str(format) % args,

Then printf(42), printf('A multi-line\n message') and printf('%4.2f', 42) all work. But the ``good idea'' thought gets changed to ``what was I thinking'' as soon as you do printf('100% guaranteed'), or anything else with a % character that is not meant as a formatting directive. If you do implement this version of printf, it needs a comment like this:

def printf(format, *args): 
    """Format args with the first argument as format string, and print.
    If the format is not a string, it is converted to one with str.
    You must use printf('%s', x) instead of printf(x) if x might
    contain % or backslash characters."""
    print str(format) % args,
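One way out of the '100% guaranteed' trap, sketched here for Python 3 (where print is already a function), is to apply the % operator only when arguments are actually supplied. This is an assumption about what a caller would want, not the version described above. It writes with the stream's write method, so no trailing space or newline is added. The optional file parameter is an addition made for the sake of the demonstration.

```python
import io
import sys

def printf(format, *args, file=None):
    """Write format % args with no trailing space or newline.
    With no args the text is written as is, so a stray % is harmless."""
    text = str(format)
    if args:
        text = text % args
    if file is None:
        file = sys.stdout
    file.write(text)

out = io.StringIO()
printf("%d = %s\n", 42, "answer", file=out)
printf("100% guaranteed\n", file=out)
printf(42, file=out)
print(out.getvalue())
```

The price is that printf('%d') with no arguments is silently printed as is instead of raising an error.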

Q: Is there a better syntax for dictionary literals? All my keys are identifiers.

Yes! I agree that it can be tedious to have to type the quote marks around your keys, especially for a large dictionary literal. At first I thought it might be a useful change to Python to add special syntax for this; maybe {a=1, b=2} for what you now have to write as {'a':1, 'b':2}. As of Python 2.3 you can use the syntax dict(a=1, b=2, c=3, dee=4), which is good enough as far as I'm concerned. Before Python 2.3 I used the one-line function def Dict(**dict): return dict

A reader suggested that Perl has a similar special notation for hashes; you can write either ("a", 1, "b", 2) or (a => 1, b => 2) for hash literals in Perl. This is the truth, but not the whole truth. "man perlop" says "The => digraph is mostly just a synonym for the comma operator ..." and in fact you can write (a, 1, b, 2), where a and b are barewords. But, as Dag Asheim points out, if you turn strict on, you'll get an error with this; you must use either strings or the => operator. And Larry Wall has proclaimed that "There will be no barewords in Perl 6."

Q: Is there a similar shortcut for objects?

Indeed there is. When all you want to do is create an object that holds data in several fields, the following will do:

class Struct:
    def __init__(self, **entries): self.__dict__.update(entries)

>>> options = Struct(answer=42, linelen = 80, font='courier')
>>> options.answer
42
>>> options.answer = 'plastics'
>>> vars(options)
{'answer': 'plastics', 'font': 'courier', 'linelen': 80}

Essentially what we are doing here is creating an anonymous class. OK, I know that the class of options is Struct, but because we are adding slots to it, it's like creating a new, unnamed class (in much the same way that lambda creates anonymous functions). I hate to mess with Struct because it is so concise the way it is, but if you add the following method then you will get a nice printed version of each structure:

    def __repr__(self):
        args = ['%s=%s' % (k, repr(v)) for (k,v) in vars(self).items()]
        return 'Struct(%s)' % ', '.join(args)

>>> options
Struct(answer='plastics', font='courier', linelen=80)
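For readers on Python 3.3 or later, the standard library's types.SimpleNamespace behaves much like Struct with the __repr__ method already included. A brief sketch, reusing the example values from above:

```python
from types import SimpleNamespace

options = SimpleNamespace(answer=42, linelen=80, font='courier')
options.answer = 'plastics'      # fields can be reassigned
options.margin = 4               # and new ones added on the fly

print(options.answer)
print(vars(options))
print(options)                   # a readable repr comes for free
```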

Q: That's great for creating objects; How about for updating?

Well, dictionaries have an update method, so you could do d.update(dict(a=100, b=200)) when d is a dictionary. There is no corresponding method for objects, so you have to do obj.a = 100; obj.b = 200. Or you could define one function to let you do update(x, a=100, b=200) when x is either a dictionary or an object:

import types

def update(x, **entries):
    if type(x) == types.DictType: x.update(entries)
    else: x.__dict__.update(entries)
    return x

This is especially nice for constructors:

    def __init__(self, a, b, c, d=42, e=None, f=()):
        update(self, a=a, b=b, c=c, d=d, e=e, f=f)
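A sketch of the same idea for Python 3, where types.DictType no longer exists and isinstance does the type check. The Box class and its field names are invented for the example:

```python
def update(x, **entries):
    "Update a dict or an object with the given entries; return x."
    target = x if isinstance(x, dict) else vars(x)
    target.update(entries)
    return x

class Box:
    def __init__(self, width, height, label=None):
        update(self, width=width, height=height, label=label)

b = Box(1, 2)
update(b, label='big', height=5)
d = update({'a': 1}, a=100, b=200)

print(vars(b))
print(d)
```

One limitation follows from the signature: because the first parameter is named x, you cannot use update to set a field that is itself called x.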
  

Q: Can I have a dict with a default value of 0 or [ ] or something?

I sympathize that if you're keeping counts of something, it's much nicer to be able to say count[x] += 1 than to have to say count[x] = count.get(x, 0) + 1. And as of Python 2.2, it is easy to subclass the builtin dict class to do this. I call my version DefaultDict. Note the use of copy.deepcopy; it wouldn't do to have every key in the dict share the same [] as the default value (we waste time copying 0, but the time lost is not too bad if you do more updates and accesses than initializations):

import copy

class DefaultDict(dict):
    """Dictionary with a default value for unknown keys."""
    def __init__(self, default):
        self.default = default

    def __getitem__(self, key):
        if key in self: return self.get(key)
        return self.setdefault(key, copy.deepcopy(self.default))

>>> d = DefaultDict(0)
>>> d['hello'] += 1
>>> d
{'hello': 1}
>>> d2 = DefaultDict([])
>>> d2[1].append('hello')
>>> d2[2].append('world')
>>> d2[1].append('there')
>>> d2
{1: ['hello', 'there'], 2: ['world']}

def bigrams(words):
    "Counts of word pairs, in a dict of dicts."
    d = DefaultDict(DefaultDict(0))
    for (w1, w2) in zip([None] + words, words + [None]):
        d[w1][w2] += 1
    return d

>>> bigrams('i am what i am'.split())
{None: {'i': 1}, 'i': {'am': 2}, 'what': {'i': 1}, 'am': {None: 1, 'what': 1}}

Note that without DefaultDict the d[w1][w2] += 1 in the bigram example would have to be something like:

d.setdefault(w1,{}).setdefault(w2, 0); d[w1][w2] += 1
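Since Python 2.5 the standard library has offered collections.defaultdict, which takes a zero-argument factory function instead of a value to copy. A sketch of the same examples under that assumption, in Python 3 syntax:

```python
from collections import defaultdict

count = defaultdict(int)         # int() returns 0
count['hello'] += 1

groups = defaultdict(list)       # list() returns a fresh []
groups[1].append('hello')
groups[2].append('world')
groups[1].append('there')

def bigrams(words):
    "Counts of word pairs, in a dict of dicts."
    d = defaultdict(lambda: defaultdict(int))
    for (w1, w2) in zip([None] + words, words + [None]):
        d[w1][w2] += 1
    return d

print(dict(count))
print(dict(groups))
print(bigrams('i am what i am'.split()))
```

Because the factory is called afresh for each missing key, there is no need for copy.deepcopy and no risk of two keys sharing one list.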

Q: Hey, can you write code to transpose a matrix in 0.007KB or less?

I thought you'd never ask. If you represent a matrix as a sequence of sequences, then zip can do the job:

>>> m = [(1,2,3), (4,5,6)]
>>> zip(*m)
[(1, 4), (2, 5), (3, 6)]

To understand this, you need to know that f(*m) is like apply(f, m). This is based on an old Lisp question, the answer to which is Python's equivalent of map(None,*m), but the zip version, suggested by Chih-Chung Chang, is even shorter. You might think this is only useful for an appearance on Letterman's Stupid Programmer's Tricks, but just the other day I was faced with this problem: given a list of database rows, where each row is a list of ordered values, find the list of unique values that appear in each column. So I wrote:

possible_values = map(unique, zip(*db))
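In Python 3, zip and map return iterators rather than lists, so the same trick needs a list call or a comprehension. A sketch with made-up data; the unique helper is a hypothetical stand-in for whatever function was used above:

```python
m = [(1, 2, 3), (4, 5, 6)]
transposed = list(zip(*m))       # rows become columns

def unique(values):
    "Hypothetical helper: the distinct values, in sorted order."
    return sorted(set(values))

db = [['ann', 'red', 1],
      ['bob', 'red', 2],
      ['ann', 'blue', 1]]
possible_values = [unique(column) for column in zip(*db)]

print(transposed)
print(possible_values)
```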

Q: The f(*m) trick is cool. Does the same syntax work with method calls, like x.f(*y)?

This question reveals a common misconception. There is no syntax for method calls! There is a syntax for calling a function, and there is a syntax for extracting a field from an object, and there are bound methods. Together these three features conspire to make it look like x.f(y) is a single piece of syntax, when actually it is equivalent to (x.f)(y), which is equivalent to (getattr(x, 'f'))(y). I can see you don't believe me. Look:

class X:
    def f(self, y): return 2 * y

>>> x = X()
>>> x.f
<bound method X.f of <__main__.X instance at 0x009C7DB0>>
>>> y = 21
>>> x.f(y)
42
>>> (x.f)(y)
42
>>> (getattr(x, 'f'))(y)
42
>>> xf = x.f
>>> xf(y)
42
>>> map(x.f, range(5))
[0, 2, 4, 6, 8]

So the answer to the question is: you can put *y or **y (or anything else that you would put into a function call) into a method call, because method calls are just function calls.
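To make that concrete, here is a small sketch (Python 3; the two-argument method is invented for the example) that passes *args and **kwargs through each of the equivalent spellings:

```python
class X:
    def f(self, a, b=0):
        return 2 * a + b

x = X()
args = (21,)
kwargs = {'b': 1}

direct = x.f(*args, **kwargs)
via_getattr = getattr(x, 'f')(*args, **kwargs)
bound = x.f                          # a bound method is just a callable
via_bound = bound(*args, **kwargs)
via_class = X.f(x, *args, **kwargs)  # plain function, self passed by hand

print(direct, via_getattr, via_bound, via_class)
```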

Q: Can you implement abstract classes in Python in 0 lines of code? Or 4?

Java has an abstract keyword so you can define abstract classes that cannot be instantiated, but can be subclassed if you implement all the abstract methods in the class. It is a little known fact that you can use abstract in Python in almost the same way; the difference is that you get an error at runtime when you try to call the unimplemented method, rather than at compile time. Compare:

## Python
class MyAbstractClass:
    def method1(self): abstract

class MyClass(MyAbstractClass): 
    pass

>>> MyClass().method1()
Traceback (most recent call last):
  ...
NameError: name 'abstract' is not defined

/* Java */
public abstract class MyAbstractClass {
    public abstract void method1();
}

class MyClass extends MyAbstractClass {}

% javac MyAbstractClass
MyAbstractClass.java:5: class MyClass must be declared abstract.
It does not define void method1() from class MyAbstractClass.

Don't spend too much time looking for the abstract keyword in the Python Language Reference Manual; it isn't there. I added it to the language, and the great part is, the implementation is zero lines of code! What happens is that if you call method1, you get a NameError because there is no abstract variable. (You might say that's cheating, because it will break if somebody defines a variable called abstract. But then any program will break if someone redefines a variable that the code depends on. The only difference here is that we're depending on the lack of a definition rather than on a definition.)

If you're willing to write abstract() instead of abstract, then you can define a function that raises a NotImplementedError instead of a NameError, which makes more sense. (Also, if someone redefines abstract to be anything but a function of zero arguments, you'll still get an error message.) To make abstract's error message look nice, just peek into the stack frame to see who the offending caller is:

def abstract():
    import inspect
    caller = inspect.getouterframes(inspect.currentframe())[1][3]
    raise NotImplementedError(caller + ' must be implemented in subclass')

>>> MyClass().method1()
Traceback (most recent call last):
  ...
NotImplementedError: method1 must be implemented in subclass
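Putting the pieces together, a sketch for Python 3 with the classes rewritten to call abstract(). The method2 method is added here only to show that an implemented method is unaffected:

```python
import inspect

def abstract():
    # Frame 0 is abstract itself; frame 1 is the method that called it.
    caller = inspect.getouterframes(inspect.currentframe())[1][3]
    raise NotImplementedError(caller + ' must be implemented in subclass')

class MyAbstractClass:
    def method1(self): abstract()
    def method2(self): abstract()

class MyClass(MyAbstractClass):
    def method2(self): return 'implemented'

obj = MyClass()
print(obj.method2())
try:
    obj.method1()
except NotImplementedError as e:
    message = str(e)
print(message)
```

Later versions of Python also ship an abc module whose abstractmethod decorator moves the error from call time to instantiation time, which is a different mechanism from the one shown here.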

Q: How do I do Enumerated Types (enums) in Python?

The reason there is no one answer to this question in Python is that there are several answers, depending on what you expect an enum to be. If you just want some variables, each with a unique integer value, you can do:

red, green, blue = range(3)

The drawback is that whenever you add a new variable on the left, you have to increment the number on the right. This is not so bad, though, because if you get it wrong Python will raise an error. It's probably better hygiene to isolate your enums in a class:

class Colors:
    red, green, blue = range(3)

Now Colors.red yields 0, and dir(Colors) may be useful (although you need to ignore the __doc__ and __module__ entries). If you need control over what values each enum variable will have, you can use the Struct class from several questions ago as follows:

Enum = Struct
Colors = Enum(red=0, green=100, blue=200)

While these simple approaches usually suffice, some people want more. There are Enum implementations at python.org, ASPN, and faqts. Here is my version, which is (almost) all things to all people, while still being reasonably concise (44 lines, 22 of which are code):

class Enum:

    """Create an enumerated type, then add var/value pairs to it.
    The constructor and the method .ints(names) take a list of variable names,
    and assign them consecutive integers as values.    The method .strs(names)
    assigns each variable name to itself (that is variable 'v' has value 'v').
    The method .vals(a=99, b=200) allows you to assign any value to variables.
    A 'list of variable names' can also be a string, which will be .split().
    The method .end() returns one more than the maximum int value.
    Example: opcodes = Enum("add sub load store").vals(illegal=255)."""
  
    def __init__(self, names=[]): self.ints(names)

    def set(self, var, val):
        """Set var to the value val in the enum."""
        if var in vars(self).keys(): raise AttributeError("duplicate var in enum")
        if val in vars(self).values(): raise ValueError("duplicate value in enum")
        vars(self)[var] = val
        return self
  
    def strs(self, names):
        """Set each of the names to itself (as a string) in the enum."""
        for var in self._parse(names): self.set(var, var)
        return self

    def ints(self, names):
        """Set each of the names to the next highest int in the enum."""
        for var in self._parse(names): self.set(var, self.end())
        return self

    def vals(self, **entries):
        """Set each of var=val pairs in the enum."""
        for (var, val) in entries.items(): self.set(var, val)
        return self

    def end(self):
        """One more than the largest int value in the enum, or 0 if none."""
        try: return max([x for x in vars(self).values() if type(x)==type(0)]) + 1
        except ValueError: return 0
    
    def _parse(self, names):
        ### If names is a string, parse it as a list of names.
        if type(names) == type(""): return names.split()
        else: return names

Here's an example of how to use it:

>>> opcodes = Enum("add sub load store").vals(illegal=255)
>>> opcodes.add
0
>>> opcodes.illegal
255
>>> opcodes.end()
256
>>> dir(opcodes)
['add', 'illegal', 'load', 'store', 'sub']
>>> vars(opcodes)
{'store': 3, 'sub': 1, 'add': 0, 'illegal': 255, 'load': 2}
>>> vars(opcodes).values()
[3, 1, 0, 255, 2]

Notice that the methods are cascaded, so you can combine .strs, .ints and .vals on a single line after the constructor. Notice the helpful use of dir and vars, and that they are free of clutter with anything other than the variables you define. To iterate over all the enumerated values, you can use for x in vars(opcodes).values(). Notice that you can have non-integer values for enum variables if you want, using the .strs and .vals methods. Finally, notice that it is an error to duplicate a variable name or value. Sometimes you want to have duplicate values (e.g. for aliases); if you need that, either delete the line that raises a ValueError, or use, for example, vars(opcodes)['first_op'] = 0. If there's one thing I dislike most, it's the potential for confusion between vals and values; maybe I can think of a better name for vals.
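Python 3.4 added an enum module to the standard library that covers much of the same ground. A sketch of the opcodes example using it; the class name Opcode is chosen for illustration, and IntEnum is used so that members still compare equal to plain integers:

```python
from enum import IntEnum

class Opcode(IntEnum):
    add = 0
    sub = 1
    load = 2
    store = 3
    illegal = 255

print(Opcode.add == 0)               # members compare equal to ints
print(Opcode(255) is Opcode.illegal) # lookup by value
print([op.name for op in Opcode])    # iteration in definition order
print(max(Opcode) + 1)               # plays the role of .end()
```

With this module a repeated name in the class body is an error, while a repeated value quietly becomes an alias unless the class is decorated with enum.unique.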

Q: Why is there no ``Set'' data type in Python?

When this question was first posed there wasn't one, and programmers mostly used dictionaries instead, but now in Python 2.4 there is good native support for the set type.

Q: Should I, could I use a Boolean type?

When this question was first posed there was no Boolean type in Python, but as of Python 2.3 there is a bool type.

Q: Can I do the equivalent of (test ? result : alternative) in Python?

Java and C++ have the ternary conditional operator (test ? result : alternative). Python has resisted this, but in the upcoming Python 2.5 it will allow expressions of the form (result if test else alternative). This undermines Python's clear distinction between expressions and statements, but it is a compromise that many people have asked for.

Until Python 2.5 arrives, what can you do? Here are some options:

  1. You could try [alternative, result][test]. Note that this evaluates both alternative and result, so it is not good if one is a recursive call or an expensive computation. It also assumes that test is a bool (or 0 or 1).
  2. If test could return a non-bool, then try [result, alternative][not test]. Neither of these is very readable.
  3. test and result or alternative. Some find this idiomatic while others find it confusing. It only works when result is guaranteed to be non-false.
  4. (test and [result] or [alternative])[0] avoids that restriction.
  5. [lambda: alternative, lambda: result][not not test]() gets around all the restrictions posed so far (except readability), but don't say I told you to do it. You can even package this up in a call. The approved naming convention for variables that mimic keywords is to add a trailing underscore. So we have:
  6. if_(test, result, lambda: alternative), where we define

    def if_(test, result, alternative=None):
        "If test is true, 'do' result, else alternative. 'Do' means call if callable."
        if test:
            if callable(result): result = result()
            return result
        else:
            if callable(alternative): alternative = alternative()
            return alternative
    
    >>> fact = lambda n: if_(n <= 1, 1, lambda: n * fact(n-1))
    >>> fact(6)
    720

  7. Now, suppose for some reason you strongly prefer the syntax "if (test) ..." over "if(test, ..." (and, you never want to leave off the alternative part). You could try this:

    def _if(test):
        return lambda result: \
                   lambda alternative: \
                       [delay(alternative), delay(result)][not not test]()
    
    def delay(f):
        if callable(f): return f
        else: return lambda: f
    
    >>> fact = lambda n: _if (n <= 1) (1) (lambda: n * fact(n-1))
    >>> fact(100)
    93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000L

    If u cn rd ths, u cn gt a jb in fncnl prg (if thr wr any).
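The difference between these options shows up when result is a false value. A sketch comparing them side by side, in Python 3 and with arbitrary sample values; the last line uses the conditional expression that arrived with Python 2.5:

```python
test, result, alternative = True, 0, 'alt'

indexed   = [alternative, result][test]
and_or    = test and result or alternative      # goes wrong: result is false
wrapped   = (test and [result] or [alternative])[0]
delayed   = [lambda: alternative, lambda: result][not not test]()
cond_expr = result if test else alternative

print(indexed, and_or, wrapped, delayed, cond_expr)
```

Only the and/or form picks the alternative here, because 0 counts as false; the others all return 0 as intended.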

Q: What other major types are missing from Python?

One great thing about Python is that you can go a long way with numbers, strings, lists, and dicts (and now sets and bools). But there are a few major types that are still missing. For me, the most important is a mutable string. Doing str += x over and over is slow, and manipulating lists of characters (or lists of sub-strings) means you give up some of the nice string functions. One possibility is array.array('c'). Another is UserString.MutableString, although its intended use is more educational than practical. A third is the mmap module and a fourth is cStringIO. None of these is perfect, but together they provide enough choices. After that, I find I often want a queue of some sort. There is a standard library Queue module, but it is specialized for passing items between threads. Because there are so many options, I won't lobby for a standard library implementation of queues. However, I will offer my implementation of three types of queue, FIFO, LIFO, and priority:

"""
This module provides three types of queues, with these constructors:
  Stack([items])  -- Create a Last In First Out queue, implemented as a list
  Queue([items])  -- Create a First In First Out queue
  PriorityQueue([items]) -- Create a queue where minimum item (by <) is first
Here [items] is an optional list of initial items; if omitted, queue is empty.
Each type supports the following methods and functions:
  len(q)          -- number of items in q (also q.__len__())
  q.append(item)  -- add an item to the queue
  q.extend(items) -- add each of the items to the queue
  q.pop()         -- remove and return the "first" item from the queue
"""

def Stack(items=None):
    "A stack, or last-in-first-out queue, is implemented as a list."
    return items or []

class Queue:
    "A first-in-first-out queue."
    def __init__(self, items=None): self.start = 0; self.A = items or []
    def __len__(self):                return len(self.A) - self.start
    def append(self, item):           self.A.append(item)
    def extend(self, items):          self.A.extend(items)

    def pop(self):
        A = self.A
        item = A[self.start]
        self.start += 1
        if self.start > 100 and self.start > len(A)/2:
            del A[:self.start]
            self.start = 0
        return item

import operator

class PriorityQueue:
    "A queue in which the minimum element (as determined by cmp) is first."
    def __init__(self, items=None, cmp=operator.lt):
        self.A = []; self.cmp = cmp
        if items: self.extend(items)

    def __len__(self): return len(self.A)

    def append(self, item):
        A, cmp = self.A, self.cmp
        A.append(item)
        i = len(A) - 1
        # The parent of node i is (i-1)//2, matching the children
        # 2*i + 1 and 2*i + 2 that heapify uses.
        while i > 0 and cmp(item, A[(i-1)//2]):
            A[i], i = A[(i-1)//2], (i-1)//2
        A[i] = item

    def extend(self, items):
        for item in items: self.append(item)

    def pop(self):
        A = self.A
        if len(A) == 1: return A.pop()
        e = A[0]
        A[0] = A.pop()
        self.heapify(0)
        return e

    def heapify(self, i):
        """Assumes the left and right children of A[i] are heaps;
        moves A[i] into the correct position.  See CLR&S p. 130"""
        A, cmp = self.A, self.cmp
        left, right, N = 2*i + 1, 2*i + 2, len(A)-1
        if left <= N and cmp(A[left], A[i]):
            smallest = left
        else:
            smallest = i
        if right <= N and cmp(A[right], A[smallest]):
            smallest = right
        if smallest != i:
            A[i], A[smallest] = A[smallest], A[i]
            self.heapify(smallest)
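Later versions of Python ship two standard library modules that cover similar ground: collections.deque for FIFO queues and heapq for priority queues. As a hedged sketch (Python 3, arbitrary sample data), the following pops items in the order that each of the three disciplines promises, which is also a handy reference when testing a home-grown queue:

```python
from collections import deque
import heapq

items = [5, 3, 8, 1, 9, 2]

fifo = deque(items)                  # first in, first out
fifo_order = [fifo.popleft() for _ in range(len(items))]

lifo = list(items)                   # a plain list is already a stack
lifo_order = [lifo.pop() for _ in range(len(items))]

heap = []                            # heapq keeps the minimum at heap[0]
for item in items:
    heapq.heappush(heap, item)
priority_order = [heapq.heappop(heap) for _ in range(len(items))]

print(fifo_order)
print(lifo_order)
print(priority_order)
```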

Notice the idiom ``items or [].'' It would be very wrong to do something like

def Stack(items=[]): return items

to indicate that the default is an empty list of items. If we did that, then different stacks would share the same list. By making the default value be None (a false value that is outside the range of valid inputs), we can arrange so that each instance gets its own fresh list. One possible objection to the use of this idiom in this example: a user who does

s = Stack(items)

might expect that s and items become identical, but that only happens when items is not empty. I would say that this objection is not too serious, because no such promise is explicitly made. (Indeed, a user might also expect that items remains unmodified, which is only the case when items is empty.)
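Both points can be seen in a few lines. A sketch in Python 3; the function names bad_stack and stack are chosen here to keep the two versions apart:

```python
def bad_stack(items=[]):     # the default list is created once, at def time
    return items

def stack(items=None):       # a fresh list for every call without items
    return items or []

a = bad_stack(); a.append(1)
b = bad_stack()
print(b, a is b)             # the "empty" stack already holds a's item

c = stack(); c.append(1)
d = stack()
print(d, c is d)

empty, full = [], [1, 2]
print(stack(empty) is empty) # an empty argument is replaced by a new list
print(stack(full) is full)   # a non-empty argument is used as is
```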

Q: How do I do the Singleton Pattern in Python?

I assume you mean that you want a class that can only be instantiated once, and raises an exception if you try to make another one. The simplest way I know to do that is to define a function that enforces the idea, and call the function from the constructor in your class:

def singleton(object, instantiated=[]):
    "Raise an exception if an object of this class has been instantiated before."
    assert object.__class__ not in instantiated, \
        "%s is a Singleton class but is already instantiated" % object.__class__
    instantiated.append(object.__class__)

class YourClass:
    "A singleton class to do something ..."
    def __init__(self, args):
        singleton(self)
        ...

You could also mess around with metaclasses so that you could write class YourClass(Singleton), but why bother? Before the Gang of Four got all academic on us, ``singleton'' (without the formal name) was just a simple idea that deserved a simple line of code, not a whole religion.
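A sketch of how this behaves in use, in Python 3; the class names Config and Logger are invented for the example. Here the shared default list is exactly what is wanted, since it is the one place that remembers which classes have been instantiated:

```python
def singleton(obj, instantiated=[]):
    "Raise an exception if an object of this class has been instantiated before."
    assert obj.__class__ not in instantiated, \
        "%s is a Singleton class but is already instantiated" % obj.__class__
    instantiated.append(obj.__class__)

class Config:
    def __init__(self):
        singleton(self)

class Logger:
    def __init__(self):
        singleton(self)

first = Config()
other = Logger()             # a different class is unaffected
try:
    Config()
    second_allowed = True
except AssertionError:
    second_allowed = False
print(second_allowed)
```

Because the check is an assert, it disappears when Python is run with the -O option; raise an exception explicitly if the guarantee must hold there too.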

Q: Is no "news" good news?

I presume you mean is it good that Python has no new keyword. It is indeed. In C++, new is used to mark allocation on the heap rather than the stack. As such, the keyword is useful. In Java, all objects are heap-allocated, so new has no real purpose; it only serves as a reminder of the distinction between a constructor and other static methods. But making this distinction probably does more harm than good in Java, because the distinction is a low-level one that forces implementation decisions that really should be delayed. I think Python made the right choice in keeping the syntax of a constructor call the same as the syntax of a normal function call.

For example, before there was a bool class, we might have wanted to implement it. Let's call it Bool to keep it distinct from the built-in. Suppose we wanted to enforce the idea that there should be only one true and one false object of type Bool. One way to do that is to rename the class Bool to _Bool (so that it won't be exported), and then define a function Bool as follows:

def Bool(val):
    if val: return true
    else: return false

true, false = _Bool(1), _Bool(0)

This makes the function Bool a factory for _Bool objects (although admittedly a factory with an unusually small capacity). The point is that the programmer who calls Bool(1) should not know or care if the object returned is a new one or a recycled one (at least in the case of immutable objects). Python syntax allows that distinction to be hidden, while Java syntax does not.

There is some confusion in the literature; some people use the term "Singleton Pattern" for this type of factory, where there is a singleton object for each different argument to the constructor. I vote with what I believe is the majority in my definition of Singleton in the previous question. You can also encapsulate this pattern in a class. We'll call it "CachedFactory." The idea is that you write

class Bool:
    ... ## see here for Bool's definition

Bool = CachedFactory(Bool)

and then the first time you call Bool(1) the argument list (1,) gets delegated to the original Bool class, but any subsequent calls to Bool(1) return that first object, which gets kept in a cache:

class CachedFactory:
    def __init__(self, klass):
        self.cache = {}
        self.klass = klass

    def __call__(self, *args):
        if self.cache.has_key(args):
            return self.cache[args]
        else:
            object = self.cache[args] = self.klass(*args)
            return object

One thing to notice is that nothing rests on classes and constructors; this pattern would work with any callable. When applied to functions in general, it is called the "Memoization Pattern". The implementation is the same, only the names are changed:

class Memoize:
    def __init__(self, fn):
        self.cache = {}
        self.fn = fn

    def __call__(self, *args):
        if self.cache.has_key(args):
            return self.cache[args]
        else:
            object = self.cache[args] = self.fn(*args)
            return object

Now you can do fact = Memoize(fact) and get factorials computed in amortized O(1) time, not O(n).
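To see the cache at work, here is a sketch in Python 3 (where has_key is gone and the in operator is used instead) that counts how often the underlying function really runs. The Fibonacci function and the calls list are illustrative additions:

```python
class Memoize:
    def __init__(self, fn):
        self.cache = {}
        self.fn = fn

    def __call__(self, *args):
        if args not in self.cache:
            self.cache[args] = self.fn(*args)
        return self.cache[args]

calls = []

def fib(n):
    calls.append(n)          # record every real computation
    return n if n < 2 else fib(n - 1) + fib(n - 2)

fib = Memoize(fib)           # recursive calls now go through the cache too

print(fib(30))
print(len(calls))
```

Rebinding the name fib is what makes the recursive calls hit the cache; each of the 31 distinct arguments is computed exactly once, instead of over a million calls without memoization.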

Q: Can I have a history mechanism like in the shell?

Yes. Is this what you want?

>>> from shellhistory import h
h[2] >>> 7*8
56
h[3] >>> 9*9
81
h[4] >>> h[2]
56
h[5] >>> 'hello' + ' world'
'hello world'
h[6] >>> h
[None, 9, 56, 81, 56, 'hello world']
h[7] >>> h[5] * 2
'hello worldhello world'
h[8] >>>  h[7] is _ is h[-1]
1

How does this work? The variable sys.ps1 is the system prompt. By default it is the string '>>> ' but you can set it to anything else. If you set it to a non-string object, the object's __str__ method gets called. So we'll create an object whose string method appends the most recent result (the variable _) to a list called h (for history), and then returns a prompt string that includes the length of the list followed by '>>>'. Or at least that was the plan. As it turns out (at least on the IDLE 2.2 implementation on Windows), sys.ps1.__str__ gets called three times, not just once before the prompt is printed. Don't ask me why. To combat this, I only append _ when it is not already the last element in the history list. And I don't bother inserting None into the history list, because it's not displayed by the Python interactive loop, and I don't insert h itself into h, because the circularity could lead to problems printing or comparing. Another complication was that the Python interpreter actually attempts to print '\n' + sys.ps1, (when it should print the '\n' separately, or print '\n' + str(sys.ps1)) which means that sys.ps1 needs an __radd__ method as well. Finally, my first version would fail if imported as the very first input in a Python session (or in the .python startup file). After some detective work it turns out this is because the variable _ is not bound until after the first expression is evaluated. So I catch the exception if _ is unbound. That gives us:

import sys

h = [None]

class Prompt:
    "Create a prompt that stores results (i.e. _) in the array h."
    def __init__(self, str='h[%d] >>> '):
        self.str = str;
        
    def __str__(self):
        try:
            if _ not in [h[-1], None, h]: h.append(_);
        except NameError:
            pass
        return self.str % len(h);
    
    def __radd__(self, other):
        return str(other) + str(self)

sys.ps1 = Prompt()

Q: How do I time the execution of my functions?

Here's a simple answer:

def timer(fn, *args):
    "Time the application of fn to args. Return (result, seconds)."
    import time
    start = time.clock()
    return fn(*args), time.clock() - start

>>> timer(max, range(1e6))
(999999, 0.4921875)

There's a more complex answer in my utils module.
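Note that time.clock was removed in Python 3.8. A sketch of the same function for Python 3, assuming time.perf_counter as the replacement clock:

```python
import time

def timer(fn, *args):
    "Time the application of fn to args. Return (result, seconds)."
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

result, seconds = timer(max, range(10**6))
print(result, seconds)
```

The elapsed time will of course vary from machine to machine and from run to run.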

Q: What does your .python startup file look like?

Currently it looks like this, but it's been changing a lot:

from __future__ import nested_scopes
import sys, os, string, time
from utils import *

################ Interactive Prompt and Debugging ################

try:
    import readline
except ImportError:
    print "Module readline not available."
else:
    import rlcompleter
    readline.parse_and_bind("tab: complete")

h = [None]

class Prompt:
    def __init__(self, str='h[%d] >>> '):
        self.str = str;

    def __str__(self):
        try:
            if _ not in [h[-1], None, h]: h.append(_);
        except NameError:
            pass
        return self.str % len(h);

    def __radd__(self, other):
        return str(other) + str(self)


if os.environ.get('TERM') in [ 'xterm', 'vt100' ]:
    sys.ps1 = Prompt('\001\033[0:1;31m\002h[%d] >>> \001\033[0m\002')
else:
    sys.ps1 = Prompt()
sys.ps2 = ''


Thanks to Amit J. Patel, Max M, Dan Winkler, Chih-Chung Chang, Bruce Eckel, Kalle Svensson, Mike Orr, Steven Rogers and others who contributed ideas and corrections.

Peter Norvig