Sunday, August 26, 2012

Does Object-Oriented Programming really suck?

I recently read Armstrong essay on how much OOP sucks. While such opinions would probably have considered pure blasphemy a few years ago, nowadays they are becoming more popular. So, what’s the matter with OOP?

The first thing that comes to mind is that OOP promised too much. As already occurred to other paradigms and ideas (e.g., Logic Programming and Artificial Intelligence) sometimes researchers promise too much and then they (or the industry) cannot stay true to their promises.

OOP promised better modularity, better code-reuse [0], better “whatever”. Problem is, bad developers write crap with every single programming paradigm out there. With crap I really mean “utter crap”. A good programmer using structured programming and a bad programmer using OOP, is that the good programmer’s structured program would probably look “more object oriented” than the other one.

Another problem occurred in the OOP community: OOP “done right” was about message-passing. Information Hiding, Encapsulation and the other buzz-words are a consequence of the approach (because object communicate with messages, their internal state is opaque) not a goal of OOP. Objects are naturally “dynamic” and the way we reason about code is separating concerns in objects having related concerns and having the objects communicate in order to achieve the task at hand. And I’m a bit afraid of talking about “OOP done right”: I really just have my vision. OOP is very under-specified, so that it becomes very hard to criticize (or defend) the approach.

However, OOP becomes “data + functions”. To me data is simply *not* part of OOP. It’s an implementation detail of how objects maintain state. As a consequence, I do not really see data-driven applications as good candidates to OOP. Once again, OOP was sold as universal. It is not. Consider the successes that functional programming is achieving (performance and concept-wise) in this area.

This “data + functions” comes from C++ being the first commercially successful OOP platform. The great advantage was that the transition from structured programming to OOP was quite more easier, that programs could be far more efficient (read “less runtime”) than message passing dynamic OOP systems, at least back in the day. However, there was so much missing from the original idea!

Since back then classes were considered good (and C++ had — and has — no separate concept of interface) + static typing and a relatively verbose language with no introspection, it became rather natural to focus on inheritance. Which later was proved to be a bad strategy. Consequently, there is less experience in building OO software than one would think, considering that for a large part of its history OOP was dominated by sub-optimal practices.

So, what is really bad with OOP? Following Armstrong:

Data structure and functions should not be bound together

Agreed! But I believe that Data Structures are not *truly* part of OOP. Let me explain.

Data is laid out in some way. There is a “physical layout”, for example. You can express that very precisely in C with extensions, to the point of exactly specifying the offsets of the various datas. A phyisical layout also exists in high level languages such as Python. However, it is not part of the Python model. Of course with the struct module or ctypes you can fiddle with it (but they are libraries), however, for the most part, it is just outside the conceptual language used to describe the problem.

Data is also laid out in logical ways. Computer Science defined many data structures and algorithms over them. You can use OOP to express such structures and such algorithms. Still they are not “part” of OOP, they are just expressed in an object oriented way (or maybe not, and remain at structured programming level).

One very good practice in OOP is not using together in the same body of code objects reasoning at different levels. So, for example, you have your low level business logic code that is expressed in terms of data structures, high level business logic expressed in terms of low level business code… and that’s it. Functions is not really together with data. You don’t have “data”. You have some layers of objects.

The point is that OOP is about creating languages at semantic level (you usually do not get to change the syntax). That’s it. If the language is good, well… good. If the language is bad, ok, we’ve got a problem.

Is this suboptimal? In a way, yes. All the indirection may be very expensive. And since abstractions, well… abstract, you may find yourself with a language that is not expressive enough (at least not at the expenses of additional performance costs). Still functions and data structure is not bound together. You just have objects and messages. No “functions” and no “data”. Please notice: this may not be the right thing to do. Still, the problem is not with data + functions, is just related to applying OOP to a specific domain that is ill suited for the approach. OOP is not for every task in the world. But the same objections applies to every system that is built as a stack of layered abstractions.

Please also consider the Qc Na story! Objects are not necessarily opposed to functional programming (you can see them as closures that make different actions available). Objects are just about state + behavior + identity.

Everything has to be an object

And this is something that I don’t think is a problem. It is like saying that Scheme is bad because “everything is a list” (which, strictly speaking, is not true, there are atoms and lambdas etc). If you do not want to program OO, then don’t. If you want, you probably want that everything is an object.

The only issue with everything is an object is, sometimes, performance. From a variety of points of view, such strategy may kill performance. Objects usually have indirections (read pointer). 64 bit pointer for every integer is bad. So we have hybrid stuff like Java and then we have to deal with boxing (either manually in the past or automatically right now). Performance issues can be somewhat “fixed” using proper JIT systems or other optimization techniques. Or providing libraries that do the right thing (see Numpy, for example).

Other than that, I prefer objections in the line of “this thing that should be an object is something else” than those requesting that something that is an object really is. So, I can favorably consider an objection that says that, in Python, “if” is not an object (true). Even though I’m convinced that having if as a statement is not bad either.

And the whole objections with “time”… well, time (and dates) are a bitch to handle. But I find that Python datetime module gets as close to perfection as humanely possible. I really can’t see describing time with a bunch of enums + some structures as an improvement. On the other hand, it looks to me as one of the cases where it is *easy* to see that the OO approach works better.

Consider the related problem of representing a date in a locale. If you introduce a new representation of time, you need either to modify the original function or create a new function (maybe one that works with both things). Creating an interface, on the other hand makes it very easy to introduce new representations of time.

Objection 3 - In an OOPL data type definitions are spread out all over the place


And once again, in OOPL languages data type definitions are not spread out all over the place. They are in C++ and perhaps in Java. And even in Java, if things are done properly, you reason in terms of interface, i.e., in terms of typed messages, not in terms of data structure.
“”“In an OOPL I have to choose some base object in which I will define the ubiquitous data structure, all other objects that want to use this data structure must inherit this object. Suppose now I want to create some “time” object, where does this belong and in which object…
””“
Here it is pretty clear what he has in mind. I think that it is clear that, conceptually, time is an interface and that there is really no need for it to define “data structure”. And if your language is duck typed, the interface is implicit and you have finished before you started.

Objection 4 - Objects have private state


And here I have to agree: state is the root of all evil. Problem is that entirely stateless systems are simply unpractical. We reason in terms of state. We can describe lots of things as stateless, but somewhat we have state. We have files. We have documents. We want them saved and retrieved.
So, we have to deal with state. Agreed. But: 1. imperative programming is about state2. object oriented programming is not, strictly speaking
You can use a “functional” style in OOP, for example. Most of the times, you don’t do it because the code becomes harder to write. But in many situations, on the contrary, you do it because it makes it easier. Often I write immutable objects that cannot be modified: they are not more stateful than a parameter in a function.
Sometimes this is not practical. Ok. And surely using a very stateful style of programming is bad. State in OOP is expressed as behavior. That’s it. Sometimes is done properly, sometimes it is not. Some strategy to deal with state are extremely nice (STM, as an example). Others are not. Also consider how “modern” functional languages really merge concepts from OOP (without being OOP) and how “modern” OOP languages merge concepts from FP (without being functional).
Examples: Python from the OOP side, using generators, lazy evaluation, list comprehensions, closures, etc etc etc.Clojure from the FP side… and all the features it has to express what resembles interfaces, STM and so on…

——

[0] someone once told me they feel lucky when they can actually use the code effectively once, let alone reusing it.

Sunday, August 19, 2012

Java 7 for Mac OS X

And it was about time...

Download Page

Luckily now Java for OS X should not lag to much behind.