Tuesday, September 6, 2011

Clojure: Quicksort in Continuation Passing Style and Trampolines

Introduction

In this post we are going to write a completely recursive stack-wary version of the classical quicksort algorithm in continuation passing style, making use of trampolines.
I already discussed quicksort in continuation passing style in this post. The idea is to refer to the post as much as possible and introduce here more information.
From Wikipedia:
Instead of "returning" values as in the more familiar direct style, a function written in continuation-passing style takes an explicit "continuation" argument, i.e. a function which is meant to receive the result of the computation performed within the original function. Similarly, when a subroutine is invoked within a CPS function, the calling function is required to supply a procedure to be invoked with the subroutine's "return" value. Expressing code in this form makes a number of things explicit which are implicit in direct style. These include: procedure returns, which become apparent as calls to a continuation; intermediate values, which are all given names; order of argument evaluation, which is made explicit; and tail calls, which simply is calling a procedure with the same continuation that was passed to the caller, unmodified.
Essentially there are two issues in CPS with Clojure. The first one is recursion on the stack. CPS is tail recursive by default; however, most of the times it is about mutually recursive calls. Which essentially means trampolines.
The second issue relates to the continuation themselves. First, they may occupy heap space (as the continuations stay in memory until evaluated), second, the continuations themselves need to be treated with trampolines, otherwise, when evaluated consume stack space.
However, with massive usage of trampolines, both issues can be solved.

Append

The first function we need to write is append (i.e., concat in Clojure Jargon). We need to write a new version because we need to have it deal with the continuation. And as we will see, append is a false friend.
(defn append-notramp [lst1 lst2 k]
  (cond
   (empty? lst1) (k lst2)
   :else (recur (rest lst1) lst2
                 (fn [rst]
                   (k (cons (first lst1) rst))))))
At first sight, his implementation looks correct. We can try it with simple inputs and convince ourselves that the semantics is right. Essentially if there are elements in the first list, the new continuation is a function taking a list rst and calling the continuation we received with a list where we consed the first element of lst1 on rst.
However, running with increasingly large values for lst1 results in a stack overflow.
user> (cps/append-notramp (range 100000) (range 4) identity)
; Evaluation aborted.
In this case, recur is a false friend. We are lead to think that since append-notramp is tail-recursive, we do not have to worry about stack overflow. The issue here are the chain of continuations, however. And they cannot use recur, because even though they *are* tail recursive, they are mutually recursive. Luckily enough, Clojure provides trampolines and in this case, they are also relatively easy to use.
(defn append [lst1 lst2 k]
  (cond
   (empty? lst1) #(k lst2)
   :else (recur (rest lst1) lst2
                 (fn [rst]
                   #(k (cons (first lst1) rst))))))
Here append plainly returns a chain of functions which returns functions and they can be resolved calling trampoline on them. I pass empty as the continuation because I do not want a list of some thousand of elements printed on standard output. The other function just checks the 0s are at the right places.
user> (cps/append (range 100000) (range 4) empty)
#[cps$append$fn__5392 cps$append$fn__5392@30ea3e3c]
user> (trampoline (cps/append (range 100000) (range 4) empty))
()
user> (trampoline (cps/append (range 100000) (range 4)
                              #(and (= (nth % 0) 0)
                                    (= (nth % 100001)))))
true

Partition

Partition suffers from a similar problem than append, in the sense that a naive version blows the stack.
(defn- partition-notramp [coll p? k]
  (loop [coll coll k k]
    (cond
     (empty? coll) (k () ())
     (p? (first coll)) (recur (rest coll)
                              (fn [p-true p-false]
                                (k (cons (first coll) p-true)
                                   p-false)))
     :else (recur (rest coll)
                  (fn [p-true p-false]
                    (k p-true
                       (cons (first coll)
                             p-false)))))))
The solution is also similar. When a continuation k is called, instead return a function which calls k, like #(k ...).
(defn- partition [coll p? k]
  (loop [coll coll k k]
    (cond
     (empty? coll) #(k () ())
     (p? (first coll)) (recur (rest coll)
                              (fn [p-true p-false]
                                #(k (cons (first coll) p-true)
                                   p-false)))
     :else (recur (rest coll)
                  (fn [p-true p-false]
                    (k p-true
                       (cons (first coll)
                             p-false)))))))

Quicksort

Now things become complicated. Quicksort is a doubly recursive function and is called inside the continuations. We basically have each continuation return a function which runs what would have been the continuation weren't we using trampolines. This works like a breeze: we just have to call trampoline before returning.
(defn quicksort
  ([coll less? k]
     (letfn [(qs [coll k]
                 (if (empty? coll) #(k ())
                     (let [[pivot & coll1] coll]
                       (partition coll1
                                  (fn [x] (less? x pivot))
                                  (fn [less-than greater-than]
                                    (fn []
                                      (qs greater-than
                                          (fn [sorted-gt]
                                            (fn []
                                              (qs less-than
                                                  (fn [sorted-lt]
                                                    (append sorted-lt (cons pivot sorted-gt) k))))))))))))]
       (trampoline (qs coll k))))
     ([coll less?]
        (quicksort coll less? identity)))

A word on performace

Of course, performance is rather terrible. First, quicksort in functional settings rather sucks. I find it so much imperative oriented that it is even hard to think about it in this context (notice here we are using rather complex stuff to make it functionally looking).
Then there are a number of issues:
  1. Typical optimizations like switching to other algorithms with lower multiplicative constants is not easy to do. Every self-respecting implementation of quicksort when the list is sufficiently small switches to something like insertion sort because it is just faster (say... sublists of k elements).
  2. Other optimizations like recursing first in the smaller half of the array are also hard to do (because we are using lists... we should essentially use a custom variant of partition which takes care of such nuisances).
  3. Quicksort performs poorly if the pivot element is chosen as the first element in the list. However, in functional settings, where the core data structure is likely to (or just may be) a list, taking an element in the middle has an inacceptable algorithmic complexity
  4. Moreover, partition is likely to use far more memory than what would use in imperative settings
Consequently, Clojure sort is plainly faster, considering that it should call Java sort (which now is the blazingly fast tim-sort, a difficult if not impossible to beat algorithm). Comparisons in this sense are embarassing. That is why I did not even try to code smarter in order to make this quicksort faster. And no, I'm definitely not going to turn tim-sort in CPS+trampolines by hand.
Technorati Tags: , , , ,

No comments: