Wednesday, February 16, 2011

Is removing parentheses worth it?

Introduction

The first objection to Lisp is "too many parentheses". It is roughly the same kind of objection that programmers raise against Python (well, not "too many parentheses", but the fact that you have to follow minimum decency rules for indentation). And Clojure claims to have reduced parentheses... let's go into this.

Back to the indentation vs. parentheses thing: the objections are similar because they share a single pattern. People who do not use the language feel that it is a major drawback, people who use it usually like it, and people who have to use it without particular enthusiasm are usually agnostic about it. That is to say, the language feature under examination looks "worse" from the outside than from the inside.

In fact, Python programmers usually justify it by saying "you would indent it that way in any case", which is true to some degree (programmers may use slightly different styles, but essentially not different enough to make anybody uncomfortable). Besides, as someone who has to read and fix code from complete novices (being an assistant for a basic undergraduate programming course), most of the time the first thing I have to do is run the programs through a beautifier (and most of the time even the novices spot the error once the program is properly indented, but that is another story, and I keep telling them about indenting properly).

Lisp programmers usually answer by objecting that they do not even see the parentheses when typing (in fact I do see them) and that they are automatically inserted by Emacs (which is true). As a consequence, they do not have to type more. I would add that Emacs is very good at using parentheses to understand the program structure and gets the indentation right. I have to say that the major drawback of the Python approach is just that automatic indenters cannot do their job properly (not that I ever felt it was a real issue).

Use all the parentheses!

Common Lisp basically uses only () because the language implementors feel that the other kinds of brackets should stay free for the language user to abuse in defining custom languages. I understand that this may be an advantage in some situations. On the other hand, I also feel that having those brackets in the language allows interesting possibilities.

For example, I quite like that in R6RS Scheme the possibility of liberally using [] can improve readability, at least to my eyes. I often use them in case expressions and to hold let bindings. In general, when I have multiple S-expressions in the form ((LHS1 RHS1 ...) ... (LHSN RHSN ...)) I often write them as ([LHS1 RHS1 ...] ... [LHSN RHSN ...]), especially in cond/let forms, which is, as far as I can tell, a usage explicitly sanctioned in Appendix C of [1].

The good part is that you are free to use square brackets or not to use them. The bad part is that you can choose to use square brackets or not to use them. That is to say: there is a non-normative suggestion, but you are not forced to follow it. As a consequence, code can legitimately use or not use square brackets. Moreover, square brackets are not standard in R5RS; consequently, if the code is meant to work under both standards, using square brackets is not an option.

In this, I praise Clojure's choice. Using square brackets and braces is a good idea for a new Lisp dialect, especially for one that is not meant to win over Common Lisp programmers, but to be used in Java environments and to bring object-oriented people into a more functional world.

I really like the idea that vectors are just [foo bar ...], even though it was not particularly hard to write #(foo bar) instead. However, the syntax for hash maps is really a plus, in my opinion. And so is the one for sets. It is not something you leave Common Lisp for, but it is a legitimate choice and a rather wise one: once you decide that you are definitely not going to let users customize the language with reader macros, then you had better use the symbols you have! Moreover, considering that the JVM is more optimized for array-like stuff, I believe that the wider usage of vectors [] also justifies their new syntax. Where a Scheme programmer uses a list of atoms, a Clojure programmer uses a vector of symbols.
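To make the comparison concrete, here is a minimal sketch (assuming a standard Clojure REPL; the var names are just placeholders of mine) that builds each collection once with the literal syntax and once with the plain constructor function, and checks that the two are equal:

```clojure
;; Literal syntax understood directly by the Clojure reader.
(def v [1 2 3])            ; vector
(def m {:a 1 :b 2})        ; hash map
(def s #{:x :y})           ; set

;; The same values built with ordinary function calls, no special brackets.
(def v2 (vector 1 2 3))
(def m2 (hash-map :a 1 :b 2))
(def s2 (hash-set :x :y))

;; Equality is by value, so each pair compares equal.
(println (= v v2) (= m m2) (= s s2))   ; prints: true true true
```

The brackets buy nothing semantically; they are only a shorter and, to my eyes, more recognizable way to write the same data.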

I am quite agnostic on using [] to delimit binding forms. I don't think that more than two lines should be written on the subject (even though depending on the width of your screen, these may be far more than two lines). I don't feel it is less lisp because we write (let [...] ...) instead of (let (...) ...).

Readability counts

On the other hand, I feel that the removal of parentheses is a drawback rather than an advantage. But I may be biased: for example, in languages with infix syntax, I tend to use additional parentheses when precedence issues are slightly more complicated than trivial (which basically happens as soon as we use more than the elementary arithmetic operations).

The point is that using fewer parentheses decreases the redundancy of the code (which would be a good thing, if it weren't for the fact that redundancy means better error correction). For example, consider this code:

(let [x 1 y 2] (+ x y))

First, I do believe that having the parameters all together makes the code less clear. Syntax can be used to visually aid the programmer. In this case, while it is easy for a machine to match the arguments pairwise, the eye is not helped at all. Besides, even if I indent it properly, I still feel some sense of unease because of the way the parameters are divided [notice that the parameters should be lined up; however, it seems that something between Blogger and ScribeFire sometimes randomly destroys proper indentation; if that happens, it is not what I meant]:

(let [x 1
      y 2]
  (+ x y))

I believe that this is far more readable (but it may be my bias, as I said):

(let ([x 1]
      [y 2])
  (+ x y))

because parentheses logically group arguments. Notice that even if in C I could write int foo 8; I would still prefer int foo = 8;. OK, that syntax would be ambiguous (int foo (8); could be read both as a function declaration and as a variable definition, since I can place () around expressions), in a way that reminds me of the classic error mytype foo(); which most beginners make: that is a function declaration, not a variable definition.

Anyway... back to clojure. Consider the following code:
(let [x 1 y] (+ x y))

It is an obvious mistake (well, not so obvious, IMHO). And the compiler says:
java.lang.IllegalArgumentException: 
  let requires an even number of forms in binding vector 
  (NO_SOURCE_FILE:0)

Good. But what if we are type hinting freaks and write:
(let [^Integer x 1 y] (+ x y))

I know that in this case it would be better to use (int 1), but that's just an example. OK... the compiler still complains (rightly):
java.lang.IllegalArgumentException: 
  let requires an even number of forms in binding vector 
  (NO_SOURCE_FILE:0)

Now I should check what a "form in a binding vector" is. Perhaps the definition explicitly excludes type hints, so the compiler is not lying. But what the heck... the message is extremely unclear. I believe that the lack of parentheses is rather annoying in this case. But I've seen worse.
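A quick way to look into this, sketched under the assumption of a standard Clojure REPL (the name bindings is mine): read the binding vector as plain data, without evaluating it, and see how many elements the reader actually produces and where the hint ends up.

```clojure
;; Read the binding vector as data; nothing is evaluated here.
(def bindings (read-string "[^Integer x 1 y]"))

;; How many forms does the reader put in the vector?
(println (count bindings))

;; Where did the type hint go? Look at the metadata of the first symbol.
(println (meta (first bindings)))
```

As far as I can tell, the reader attaches ^Integer to the symbol x as :tag metadata instead of producing a separate element, so the vector holds three forms (x, 1 and y) and the "even number of forms" message is technically accurate, even if it is not very helpful.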

The cond form is not very good either...
(cond
  (instance? Integer x) :int 
  (instance? String x) :string
  :default :dontknow)

Once again, a few more parentheses could group things more tightly. But this is just a matter of taste.
A discussion about the condp form is here.
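For those who share my taste, the extra grouping can be recovered with a few lines of macro. What follows is only a sketch: grouped-cond and classify are hypothetical names of mine, not part of clojure.core, and the macro simply splices [test result] pairs back into an ordinary cond.

```clojure
;; Hypothetical macro (NOT in clojure.core): every clause must be a
;; [test result] vector; the pairs are flattened into a plain cond.
(defmacro grouped-cond [& clauses]
  ;; Checked at macro-expansion time, so a malformed clause is reported
  ;; where it is written rather than silently shifting the pairs.
  (doseq [c clauses]
    (assert (and (vector? c) (= 2 (count c)))
            "each clause must be a [test result] vector"))
  `(cond ~@(apply concat clauses)))

;; Example use, mirroring the cond above.
(defn classify [x]
  (grouped-cond
    [(instance? Integer x) :int]
    [(instance? String x)  :string]
    [:default              :dontknow]))

(println (classify "abc"))   ; prints: :string
(println (classify 1.5))     ; prints: :dontknow
```

The price is the usual one for custom syntax: anybody reading the code has to learn the macro first, which is probably why the flat form was chosen for the core language.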

References

  1. M. Sperber, R. K. Dybvig, M. Flatt and A. van Straaten, "Revised^6 Report on the Algorithmic Language Scheme: Non-Normative Appendices".



4 comments:

Mike Klein said...

Don't forget the option comma-as-whitespace provides you:

(let [x 1, y 2] (+ x y))

Also, I think something funky has gone on with the indentation of the next few examples. Shouldn't the x and y line up vertically? I'd demonstrate but I can't seem to get code or pre tags to work in the comments.

Unknown said...

Well, x and y should of course be lined up vertically. I kind of fixed this. However, sometimes blogger just screws it up [ it is far worse with Python samples, though... ].

And no, I don't think you can use code or pre in the comments. As far as I know few tags are allowed and I don't think I have control over this. If there is some blogger plugin or something like that I'm more than happy to use it.

I agree that the comma improves readability, especially when we are writing one-liners. Still, I prefer an additional set of braces. But to me it does not add much readability when we split parameters (as in the example we would like to write but cannot in the comments... ;) ).

fogus said...

"I prefer an additional set of braces"

Clojure is nothing if not flexible ;-)

https://gist.github.com/831788

Unknown said...

However, such characters are simply ignored, which means that, for example, they cannot be used when building/taking apart code in macros.

However, this is the same old discussion between having more/less parentheses.

Although I generally agree that some parentheses are actually redundant, in some cases I just like them: what happens for condp is an example of this.