Wednesday, October 29, 2008

The Y combinator in Python

I’ve been asked how would the Y combinator look like in Python (see comments of this post; comments are lost with the migration). In that post the Y combinator is also explained detailedly (and may be an interesting reading).

Python is an imperative language, however, its functions are first class citizens and can be manipulated in a quite functional way. We have closures, anonymous functions and recursion. That’s all we need to build the Y combinator in Python.

Moreover, the Y combinator in Python looks like a decorator. Quite unfortunately, using recursion in Python is usually a poor choice. Python has a rather low limit on recursive calls (and even if you can rise it, it’s not good programming practice).

In Python function calls are quite expensive and there is no tail recursion optimization.
Indeed, even though Python is great in manipulating functions (and is something I use relatively often), recursion should not the preferred method to loop. It’s much better to use iterators/generators in for loops ore as a second (usually inferior) choice, while loops.

However, this is the implementation of the Y combinator in Python (along with the traditional factorial example):

def y(f):
    def g(k):
        return f(lambda a: (k(k))(a))
    return g(g)

@y
def fact(h):
    def fact_aux(n):
        if n == 0:
            return 1
        else:
            return n * h(n-1)
    return fact_aux

print fact(5)

Friday, October 24, 2008

Anonymous Recursive Erlang Functions with and without the Y combinator

In Erlang creating recursive anonymous recursive functions is not trivial. Basically the point is that the anonymous function has no name (gee...), thus you can’t reference a function inside it’s own definition. Which makes sense, by the way. In Haskell, for example, this problem does not exist. You can write:

main =
    let fact 0 = 1
        fact n = n * fact (n-1)
    in show $ mainfact 5

and everything works as expected. If we want to write this in Erlang we can’t just write:

main() ->;
    fact = fun(0) -> 1;
              (N) -> N * fact(N-1)
           end,
    io:format("~p~n", [fact(5)]).

because fact is not bound when we examine the anonymous function body. The solution is dramatically trivial.

First think about this:

X = fun(X) -> X(X) end,
    X(X).

Ok: you got it, it loops forever (Ctrl-C is your friend). Basically you forever call X.

However, it’s not hard to see how we can implement some kind of counter:

-module(recursive).
-compile(export_all).

main() ->
    F = fun(0, F) -> io:fwrite("Stop.\n");
           (N, F) -> io:format("~p~n", [N]),
    F(N-1, F)
end,
F(3, F).

We just did recursion with Erlang anonymous functions. If you, like me, just wanted a solution to this problem, you may stop reading here. Now we will get to the Y combinator solution extending this way of thinking and then will present the Y combinator from a theoretical point of view.

We’ve got a function with 2 parameters, one of which is just a way to carry around the function itself. What happens if we decide to curry the function? What does currying mean? From wikipedia:

In computer science, currying, invented by Moses Schönfinkel and Gottlob Frege, and independently by Haskell Curry, is the technique of transforming a function that takes multiple arguments (or more accurately an n-tuple as argument) in such a way as it can be called as a chain of functions each with a single argument.

Of course, you want to reverse the parameters so that we have:

main2() ->
    F = fun(F, 0) -> io:fwrite("Stop.\n");
           (F, N) -> io:format("~p~n", [N]),
    F(F, N-1)
    end,
    F(F, 3).

Then the idea is that:

f: A x B → C

becomes:

f: A → (B → C)

so that the return value of f(x) is a function h: B → C. Thus f(x, y) becomes (f(x))(y).

Ok, nothing new under the sun. Haskell programmers curry and uncurry functions all the time. You can do that even in Python! Well, kind of.

main3() ->
    F = fun(F) ->
            fun(0) -> io:fwrite("Stop.\n");
               (N) -> io:format("~p~n", [N]), (F(F))(N-1)
            end
        end,
    (F(F))(3).

Don’t you find that using (F(F))(x) all the time sucks? I do. The first thing I would do is to define:

g(F, N) ->
    (F(F))(N).

main4() ->
    F = fun(F) ->
            fun(0) -> io:fwrite("Stop.\n");
               (N) -> io:format("~p~n", [N]), g(F, N-1)
            end
        end,
    g(F, 3).

Well, quite more readable (as soon as you find some fancy name for g). However, this approach has at least one major drawback: we can use g to obtain some number, but functional programming is about function manipulation. Being able to build a recursive function from scratch is out ultimate goal.

The idea is that we should “inject” g behaviour into “F”. That is to say we should put into F something that has g behaviour as well. A key idea could be currying g:

g2(H) ->
    fun(N)->
        (H(H))(N)
    end.

main5() ->
    F = fun(H) ->
            fun(0) -> io:fwrite("Stop.\n");
               (N) -> io:format("~p~n", [N]), (g2(H))(N-1)
            end
        end,
    (g2(F))(3).

We are getting nearer to the solution. In fact g2(F)isalmost the function we are looking for.

Unfortunately it is not the right candidate to be passed to F as well.

Basically we want to write :

fun(H) ->
    fun(0) -> io:fwrite("Stop.\n");
       (N) -> io:format("~p~n", [N]),  H(N-1)
    end
end

and H should “do the right thing”. Moreover, we want to build H from out anonymous function in a fashion similar to what we got with g2(F). The idea of g2 is (using anonymous functions):

main6() ->
    F = fun(H) ->
           fun(0) -> io:fwrite("Stop.\n");
              (N) -> io:format("~p~n", [N]),
                ((fun(K) -> fun(A) -> (K(K))(A) end end)(H))(N-1)
           end
        end,
    ((fun(K) -> fun(A) -> (K(K))(A) end end)(F))(3).

this is done “factoring out” (fun(K) -> fun(A) -> (K(K))(A) end end). We leave H in its place and pass (fun(K) -> fun(A) -> (K(K))(A) end end) from the caller. Code is easier than the explanation.

main7() ->
    F = fun(H) ->
                fun(0) -> io:fwrite("Stop.\n");
                   (N) -> io:format("~p~n", [N]), H(N-1)
                end
        end,
    ((fun(K) -> F(fun(A) -> (K(K))(A) end) end)
        (fun(K) -> F(fun(A) -> (K(K))(A) end) end))(3).

Now things are quite messy and you have almost as many parentheses as if you were coding in Lisp. However, this way we were able to put the recursion logic out of F. And we are almost done.

Let’s give the (fun(K) -> F(fun(A) -> (K(K))(A) end) end) a name (G) and rewrite the messy code:

main8() ->
    F = fun(H) ->
            fun(0) -> io:fwrite("Stop.\n");
               (N) -> io:format("~p~n", [N]), H(N-1)
            end
        end,
    G = (fun(K) -> F(fun(A) -> (K(K))(A) end) end),
    (G(G))(3).

We notice that:
  1. iteration logic is not in F
  2. G does not substantially depend from how F is done.
  3. We can define a function that takes F and using G returns the G(G) function

y(F) ->
    G = (fun(K) -> F(fun(A) -> (K(K))(A) end) end),
    G(G).

main9() ->
    F = fun(H) ->
                fun(0) -> io:fwrite("Stop.\n");
                   (N) -> io:format("~p~n", [N]), H(N-1)
                end
        end,

Recursive_F = y(F),
    Recursive_F(3),
    Recursive_F(7).

Well... the function we called yis the Erlang version of the Y combinator. Basically Y can be used to make recursive any erlang function with one argument. Besides, there are versions which can deal with functions with more formal parameters. However, the construction is quite the same as the one I presented for this version.

Tuesday, October 21, 2008

CMake (Part 2 - External Libraries)

After the introductory article on CMake, it is time to tackle with libraries, one of the first problems not easily solvable with plain Makefiles.

While quite often these libraries are installed in default places (which means no need for -L and -I options), sometimes they are not (for example in a given system these libraries may be placed in /opt), for example because developers may want to link against development versions built with debugging symbols.

Moreover, there can be version mismatches, and even though both this and the former problem may be solved if the library comes with some *-config program (which answers question about the library), some don’t.

Generally speaking, things were even worse before Linux relegated most proprietary unices to smaller market shares. Still, even today, something like the autotools is needed to solve dependency problems swiftly.

Luckily enough, CMake is even easier to do. CMake comes with a lot of pre-packaged “finder” scripts which are able to detect installation places for libraries and set appropriate environment variables.

Suppose you are writing C++ software, and you are wisely using Boost. Then you want to include Boost headers and perhaps link against some boost libraries (most Boost packages are header only, since rely heavily on templates and extern templates are yet somewhat experimental).

FIND_PACKAGE(BOOST REQUIRED)

That’s it. Variables ${BOOST_INCLUDE_PATH} and ${BOOST_LIBRARIES} are set. The next step is to tell cmake that some headers are in ${BOOST_INCLUDE_PATH} and some libraries we may link are in ${BOOST_LIBRARIES}. This is accomplished through the INCLUDE_DIRECTORIES and LINK_LIBRARIES commands:

INCLUDE_DIRECTORIES(${CMAKE_SOURCE_DIR} ${BOOST_INCLUDE_PATH}) 
LINK_LIBRARIES(${BOOST_LIBRARIES}) 

The REQUIRED keyword means that cmake should not proceed if the library is not found. Suppose Boost is an optional dependency; for example you only use shared_ptrs, which are in tr:: as well. Thus, if you don’t find boost, you simply use tr:: and the build can go on. Then the command is:

FIND_PACKAGE(BOOST) # (without REQUIRED)

The ${BOOST_FOUND} variable is true if and only if Boost was found. You can test with the IF command:

IF(BOOST_FOUND) 
# do something with boost 
ELSE(BOOST_FOUND) 
# do something without boost 
ENDIF(BOOST_FOUND) 

Notice that cmake requires the condition to be repeated in ELSE and ENDIF clauses to improve readability. This can be disabled. However, I quite like it and I prefer to leave the default on.

Friday, October 17, 2008

Python considered harmful?

Yesterday morning I started writing a simple tool which should compile all C++ files in a given directory tree. Ok, simple enough, it’s just a matter of a simple os.walk and some subprocess.call.

After that, I thought that it would be nice if the tool did something more. For example, I could write down a function that was able to “make clean” the executables I built. This is simple as well. You just have to be smart since on *nix executables do not have extensions. And I could make the system Windows friendly. It was a matter of a couple of functions more.

However, at that point I needed some kind of interface. A simple command line tool was nice. But... well, I remembered using a Prolog tool which did some heuristic to get wrong commands right (zsh does this as well)... so, why not? A levenshtein distance + some more checks could do. And it did. Nice.

But... what if I just want to compile one single file? Well, I had already most of the infrastructure. And why not letting me specify a single file to clean? As easy as breathing. Done. By the way, the system at that point supported building and cleaning object files as well.

And I was already wondering that I left out C files. After all I needed to process C++ files, but the tool would be surely more useful if I could use it with C files too. And why not Pascal? Ok, I have the right design which could support kind of configuration to map file types to compilers... and I could parametrize compiler options as well. Somewhere in my mind an automated compiler detection system was already lurking.

I refactored the tool. It became a package, with a lot of useful functions... But well... why rebuilding all the sources? I need only to rebuild executables which are older than their source file. This is easy, a couple of stat...

At that point I realized that if only I added some kind of file dependency discovery I would have basically reinvented make. That is to say I was reinventing a square wheel. One of these days I’ll get the very first script and modify so that instead of compiling files generates a CMakeLists.txt, calls cmake and then Make.

The end of the story is developing in Python is too fast. :P

BTW: the author is not suggesting Python should not be used, just the opposite!

Wednesday, October 15, 2008

CMake (Part 1)

autoreconf, autoconf, automake, libtool... We are talking about the autotools. Everybody (almost) uses autotools, and it seems that everybody has to use them for good.

Luckily enough, this is not the case. There are alternatives. There are many alternatives. scons and cmake just to name two of the most used.

The problem the autotools solve is not an easy one. Unfortunately the solution is unnecessarily complex. Using autotools for simple projects is just a mess and often their full power is not needed. Consequently I looked for alternatives.

Even though I’m quite a Python enthusiast, I chose to learn cmake first (over scons).

Basically a cmake build script can be as simple as:

project(project_name)
add_executable(foo foo.c bar.h bar.c)

And building a library (shared or static) is not any more difficult:

project(mylib) 
add_library(mylib SHARED foo.c bar.h bar.c) 

Put the lines above in a CMakeLists.txt and run cmake. You can generate unix makefiles, code::blocks projects, xcode projects, visual studio projects, etc.

Nice. Moreover, writing a CMakeLists.txt is quite easier than writing the corresponding Makefile (and the repetitive stages of writing install and clean targets is automated as well).

What about more detailed instructions? Ok, ok, you are right. So...

Suppose you have this project (directories are followed by ‘:’, while files... well, you should recognize them ;) ):

foo:
    bar:
        bar.c
        bar.h
        barbar.c
        barbar.h
    foo.c
    opts.h
    opts.h

If we run ls -R, we get:

% ls -R
CMakeLists.txt bar            foo.c          opts.c         opts.h
./bar:
CMakeLists.txt bar.c          bar.h          barbar.c       barbar.h

Actually you want to create a libbar shared library and the main foo executable (which links libbar).

% cat foo.c
#include <bar/bar.h>
#include <bar/barbar.h>
#include "opts.h"

int main () {
    bar(1);
    barbar();
    opts("ciao");
    return 0;
}

Notice that <bar/bar.h> and <bar/barbar.h> are not local includes. Now we write the CMakeLists.txt for the foo directory. We start examining the main CMakeLists.txt:

% cat CMakeLists.txtPROJECT(foo) 
CMAKE_MINIMUM_REQUIRED(VERSION 2.6)# bar must be processed 
ADD_SUBDIRECTORY(bar)# sets variable SOURCES to the project source files. 
SET(SOURCES foo.c opts.c opts.h) 
# puts the bar directory in the headers search path 
INCLUDE_DIRECTORIES(${CMAKE_CURRENT_SOURCE_DIR}) 
# puts the bar directory in the dynamic libraries search path 
LINK_DIRECTORIES(${CMAKE_CURRENT_BINARY_DIR}/bar) 

# links with libbar 
LINK_LIBRARIES(bar) 
# creates the executable 
ADD_EXECUTABLE(foo ${SOURCES}) 

I added comments in order to explain statements as you read them. Basically the ADD_SUBDIRECTORY command tells cmake to look for another CMakeLists.txt in the specified directory. Then we set the SOURCES variable. SET sets variables. If you need the value of a variable ${VARIABLE} is the way to get it (remember Make?).

${CMAKE_CURRENT_SOURCE_DIR} and ${CMAKE_CURRENT_BINARY_DIR} are variables set by cmake in order to simplify configuration.

Another useful command is MESSAGE (a ‘print’ command which can be used for debugging purposes).

Now, it is time for the CMakeLists.txt in bar:

% cat bar/CMakeLists.txtPROJECT(libbar) 
SET(SOURCES bar.c bar.h barbar.c barbar.h) 
ADD_LIBRARY(bar SHARED${SOURCES}) 

We are done!

% cmake . 
-- The C compiler identification is GNU 
-- The CXX compiler identification is GNU 
-- Check for working C compiler: /usr/bin/gcc 
-- Check for working C compiler: /usr/bin/gcc -- works 
-- Detecting C compiler ABI info 
-- Detecting C compiler ABI info - done 
-- Check for working CXX compiler: /usr/bin/c++ 
-- Check for working CXX compiler: /usr/bin/c++ -- works 
-- Detecting CXX compiler ABI info 
-- Detecting CXX compiler ABI info - done 
/Users/riko/src/foo/bar 
-- Configuring done 
-- Generating done 
-- Build files have been written to: /Users/riko/src/foo 

% make 
Scanning dependencies of target bar 
[ 25%] Building C object bar/CMakeFiles/bar.dir/bar.c.o 
[ 50%] Building C object bar/CMakeFiles/bar.dir/barbar.c.o 
Linking C shared library libbar.dylib 
[ 50%] Built target bar 
Scanning dependencies of target foo 
[ 75%] Building C object CMakeFiles/foo.dir/foo.c.o 
[100%] Building C object CMakeFiles/foo.dir/opts.c.o 
Linking C executable foo 
[100%] Built target foo 

More info on CMake can be found in these posts:
  1. CMake and Libraries

Tuesday, October 14, 2008

New tab in current directory (Terminal.app)

One of the most annoying things is when you have to open a new tab
in the current position. Say the shell in the current tab has x
as the current working directory and you want another shell already
in that directory. 
It seems there is no way to tell Terminal.app "open new tab"
from apple script. Mixing solutions found on the web, I came up with:

tell application "Terminal"
    set oldMem to (do shell script "pbpaste")
    set theTab to selected tab of the front window
    (do script "pwd | pbcopy" in theTab)
    set theText to (do shell script "pbpaste")
    tell application "System Events" ¬
        to tell process "Terminal" ¬
            to keystroke "t" using command down
    do script with command ¬
        "cd "& theText & ";clear "¬
        in selected tab of the front window

    do shell script "echo \""& oldMem & "\" | pbcopy"
end tell

Ugly, but works.

Monday, October 13, 2008

Sunday, October 5, 2008

Terminal.app ANSI Colors

Since Leopard, I did not have a good application able to change ANSI colours in Terminal.app.
Even though it does not seem a particularly pressing issue, the defaults are quite bad: the standard blue is quite too dark to be used on a dark background, so you have to use a light one. Unfortunately enough, some applications assume a dark background (for example, ipython). Of course, these applications can be personalized, but using a dark background can also be just a matter of taste.
Luckily enough, today I found TerminalColoreopard. It works well, it is easy to use, and solved my problem.

iPython and the less pager

When I develop software in Python, one of the most valuable tools is iPython. Among the first functions one learns to love, there are the ? and ?? commands.

Putting a ? after an identifier, gets help for that identifier, putting ?? shows the source. Lets use a couple of examples.

Friday, October 3, 2008

iCab

Today iCab is discounted on mupromo (50%).

Here it is the wikipedia article and here the website of the author.

iCab is a browser for the Mac. It is much more: it is a piece of WWW and Mac history. For a long time it has been the only relatively functional browser for MacOS Classic (after Moz dropped support for the platform).

It's a one man affair and once it used to have its custom rendering engine (until version 3). Now it's a nifty commercial browser with some nice features. Of course WebKit is great and means good support, but I kind of regret it does not use its own engine anymore. Diversity is good...

What has iCab that Firefox hasn't? Nothing. It has a great affective value. Of course, if you've got a MacOS 9, 8 or 7.5 system, it is also the only browser (version 3 from MacOS 8.1 and version 2.x from MacOS 7.5).

Still, its a good tool. It has some nice features, for example the smiley which signals when a website (or CSS) is not standard and opens a windows with the errors.

It has got some relics of when browsing was costly and slow: an offline mode and an acoustic signal triggered when it finishes to load a page.

And it's the only tabbed browser for older macs (but this isn't technical, is it?).

It's quite fast, indeed. And it has a very good filter system. Filters are quite general personalization tools: for example they can be used to add a "download" link to YouTube videos or create an AdFilter. iCab has a very good session management, too.

And that's it: it's not rocket science. It's a little old browser. But at $12.5 it's a bargain!

Thursday, October 2, 2008

The Year of the Kludge

Some days ago, I put back on this blog. Today I dedicated about half an hour to the forum we used to run here on akropolix.

This means reading a lot of PHP. More PHP than I like to see in a whole year. I've yet to recover.

It's exciting to see this big PHP applications work after you have read their code: I just couldn't believe my eyes. The application kind of works, with all its bugs and vulnerabilities, but does work. I'm actually using it. And as long as I don't have to see its code... After all, I have no guarantees on the code quality of most proprietary desktop applications I use (not that open source applications are necessarily better as some suggest). Why bothering about PHP web applications after all?

The point is that their code seems a kind of ordered set of kludges that lead to some approximation of a working application. It's the way most PHP applications are developed.

Many people write PHP code without the slightest clue of software engineering good practices. They often do not know typical web design patterns as well, let alone "classical" design patterns. They just write the code and fix it until it kind of works. The point is that they can do that: PHP allows (and encourages, in my opinion) this strategy.

Using frameworks like Django, you need to write sensibly less code. Most functionalities are already implemented in the framework. But you have to study the framework. You just can't improvise on a dubious architecture gluing together code snippets. And surprisingly the "good" strategy is perceived as harder. You have to be tidy and precise. You have to think. You have to design. You are strongly encouraged to test your software. You are led towards good software practices. And most people do not want to learn. They just want to get their job done. No matters they could save a lot of time doing things properly.

On the other hand, in standard PHP you can just write some messy code and fix it until it seems to work. This is the very idea on which Visual Basic was created. You don't need to know how to do things. You can just stammer code. But the point is not PHP or Visual Basic. There's plenty of languages that allow code stammer. You can write horrible kludges in C, C++, Java, Perl, whatever. And many people just do that.

Moreover, they tend to favour technologies that do not make it blatantly clear how poorly skilled and unwilling to learn they are. The result is that inferior technologies proliferate as incompetents do. By the way, this does not imply that every PHP developer is incompetent, I'm just talking about technologies and how people perceive their ease of use.