Monday, August 23, 2010

And what about performance (closures and classes)?

In this post on classes and closures we entirely skipped the question of performance.
Which of the three solutions is "better"? In order to do this I wrote the following snippet and I put it at the end of the file.

if __name__ == "__main__":
    import timeit
    setup = '''import roman
numbers = range(1, 1000)
numbers += [3678, 3286, 5483, 8546, 1234, 5869, 97342]
'''
    statements = [
        '''int2roman = roman.int2roman%s
for n in numbers: int2roman(n)'''% ending
        for ending in ['_cl', '_dec', '']]
    for stmt in statements:
        print stmt, timeit.timeit(stmt, setup, number=1000)

Of course I also renamed the three functions:

  • roman.int2roman_dec is the version with decorators
  • roman.int2roman_cl is the version with callable classes
  • roman.int3roman is the version with closures
Notice that the roman.int2roman_dec needed an internal change as well, since it accesses the attribute through the "external" function name. Thus, it became (note line 15):

if __name__ == "__main__":
def with_weights(weights):
    def _aux(f):
        f.weights = weights
        f.sorted_weights_tuples = sorted(weights.iteritems(), reverse=True)
        return f
    return _aux

@with_weights({ 900 : 'CM', 500 : 'D', 400 : 'CD', 100 : 'C',
                90  : 'XC', 50  : 'L', 40  : 'XL', 10  : 'X',
                9   : 'IX', 5   : 'V', 4   : 'IV', 1   : 'I'})
def int2roman_dec(n):
    thousands, n = divmod(n, 1000)
    roman_digits = ['M', ] * thousands

    for m, s in int2roman_dec.sorted_weights_tuples:
        while n >= m:
            roman_digits.append(s)
            n -= m

    return ''.join(roman_digits)

The times are:
Python 2.7 (r27:82508, Jul  3 2010, 21:12:11) 
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin

% python roman.py
int2roman = roman.int2roman_cl
for n in numbers: int2roman(n) 4.19191408157
int2roman = roman.int2roman_dec
for n in numbers: int2roman(n) 3.88606595993
int2roman = roman.int2roman
for n in numbers: int2roman(n) 3.98288297653

So, the fastest is the class based. The version with closures is a little bit slower. And the slowest (though not very much slower) is the version with decorators. Since most of the code is cut and pasted, the difference in timings depend in the access of the sorted_weights_tuples list.

Please notice that essentially each run applied the function to millions of numbers.
Each test consists of repeating 1000 the transformation of each of the > 1000 elements in the number list.

The slowest version runs < 1.08 times slower than the fastest. With this numbers choosing one solution because of performance is clearly useless optimization. Design an algorithm using classes instead of closures because of performance, is premature optimization.

No comments: