I found rather unusual the apparent lack of an unfold function in Clojure. Perhaps I was not able to find it in the documentation (which I hope is the case), in this case, happy to have done a useful, although slightly dull, exercise. However, if this is not the case, here it is unfold in Clojure.
Unfold is a very powerful function which abstracts the idea of generators. I tried to make it lazy (please, feel free to break it) as we expect from clojure functions. Some examples are provided:
Here, we simply generate a list of elements from 1 to 11 (included):
(unfold #(= 0 %) identity dec 10)
The following example is more interesting: here I show that unfold is at least as powerful as iterate. It is possible to regard iterate as a simpler (and easier to use) variant of unfold.
(unfold (fn [x] false) identity inc 0)
The code is equivalent to (iterate inc 0). Now the idea is that:
- The first argument is a predicate that receives the seed (last argument). If it is true, the function returns the empty list; otherwise returns the cons where the first cell if the second argument applied to the seed and the second cell is a recursive call to unfold. All the arguments but the last one are the same. The last argument becomes g applied to seed.
- The second argument is a unary function which is applied to each individual "value" which is generated. In this case we just use identity, because we want just the plain arguments. We could have used #(* %1 %1) to have the squares of the values or whatever makes sense.
- The third argument is the function that generates the next seed given the current one
- The last argument is just the seed
Unfold is extremely general. For example the classic function "tails" that returns all the non-empty sublists of a list ((1 2 3) -> ((1 2 3) (2 3) (3))) could be implemented as:
(defn unfold-tails [s] (unfold empty? identity rest s))
The standard map, becomes:
(defn unfold-map [f s] (unfold empty? #(f (first %)) rest s))
Now, here the implementation:
Upgrade
With this update, I present an upgraded version with the tail-g parameter. The classic srfi-1 Scheme version also has this argument. Originally, I thought that in Clojure I could use it to have different sequence types. On a second thought, it can't work because of the way Clojure manages sequences (at least, not in the sense that I intended it).
On a second thought, however, it is highly useful. Consider for example the tails function I defined above. In fact it returns all non-empty substrings of a given string. This makes sense in lots of contexts, however it is different from the definition of the Haskell function.
tails :: [a] -> [[a]] tails xs = xs : case xs of [] -> [] _ : xs' -> tails xs'That could be roughly translated (losing laziness) as:
(defn tails [xs] (cons xs (when (seq xs) (tails (rest xs)))))
This function also returns the empty list. Surely, it would be possible to (cons () (unfold-tails xs)), but that would break the property that each (finite) element in the resulting sequence is longer than the following. Without the kludge, the unfold-tails function breaks the strictness property that the resulting list contains xs if xs is empty. Accordingly, tails should really be defined to include the empty list. Appending the empty list to the end of the list returned from unfold-tails would maintain both properties, but would be excessively inefficient.
On the other hand, specifying the tail-g function allows a cleaner definition
(defn tails [xs] (unfold empty? identity rest xs #(list %)))
Essentially, without tail-g it is not possible to include in the resulting sequence elements which depend from the element which makes the sequence stop.
Updates
- Substituted (if ($1) (list) ($2)) with (when-not ($1) ($2)) after some tweets from @jneira and @hiredman_.
- Added version with tail-g
- Here more functions and experiments with unfold.
- Removed HTML crap from cut and paste
Technorati Tags: Clojure, Anamorphism, Unfold, Functional Programming, Programming