Monday, October 11, 2010

Design of Sorting in Java and Python (pt. 2)

In this post I presented a pattern which arises very quickly when doing comparisons among objects. The pattern is essentially:


def name_cmp(fst, snd):
 return cmp(f(fst), f(snd))

In fact I see huge repetition in this kind of code, because we write code in that shape for many different "f"s (that is to say, getter or more complicated computation, but it boils down to that pattern).

For example, let us define a Person object in Java and also define comparators to sort a list of persons according to both the number and the phone number.

public class Person {
    private String name;
    private String phone;

    public Person(String name, String phone) {
        this.name = name;
        this.phone = phone;
    }

    public String getPhone() {
        return phone;
    }

    public String getName() {
        return name;
    }

    static final private PhoneComparator PHONE_COMPARATOR = 
     new PhoneComparator();
    static final private NameComparator NAME_COMPARATOR = 
     new NameComparator();
    
    static private class PhoneComparator implements Comparator<Person> {
        public int compare(Person o1, Person o2) {
            return o1.getPhone().compareTo(o2.getPhone());
        }
    }

    static private class NameComparator implements Comparator<Person> {
        public int compare(Person o1, Person o2) {
            return o1.getName().compareTo(o2.getName());
        }
    }

    static Comparator<Person> ageComparator() { return PHONE_COMPARATOR; }
    static Comparator<Person> nameComparator() { return NAME_COMPARATOR; }
    
    // ide generated toString, equals etc. here\
};

I see huge repetition in this kind of code. But even in Python, defining all the compatators is boring and repetive. Of course in Python something like:

def cmp_buider(f):
 def cmp_(fst, snd):
  return cmp(f(fst), f(snd))
 return cmp_

would come quite natural. For the ones who do not like functionally flavoured code, this is the Java variant (which is longer and uglier).

public interface Key<K extends Comparable<K>, E> {
    public K index(E obj);
}

public class KeyToComparator {
    public static <K extends Comparable<K>, T> Comparator<T> build(final Key<K, T> key) {
        return new Comparator<T>() {
            public int compare(T o1, T o2) {
                return key.index(o1).compareTo(key.index(o2));
            }
        };
    }
}

and then we use it like:

static public void sortWithKey(List<Person> lstPerson) {
    Collections.sort(lstPerson, KeyToComparator.build(
            new Key<String, Person>() {
                public String index(Person obj) {
                    return obj.getName();
                }
            }));
}

I hope I'm not the only one feeling that this stuff is ugly as hell. However, it is quite general and powerful. And it lets you avoid code repetition. At the very least, the code to write a key is way shorter than the one to write a Comparator. Of course, if methods where first class objects...

No comments: