Thursday, September 12, 2013

Lazy Lists

Let's talk about laziness. No, let's not just talk about being lazy - let's do something about it. Let's be lazy.

There's a common pattern that I use in Android programming where I will create objects lazily, especially when these objects will not necessarily or frequently be needed at runtime. This is especially true for code that I write for the Android framework, where we like to be as lean as possible and leave more memory available for the system and the application.

Often, these objects are collections. My personal favorite is ArrayList (ever since I moved on years ago from the original, but somewhat crusty, Vector class), although I am also known to use Hashmap for those key/value pair situations (especially after I got used to ActionScript's paradigm of everything-is-a-hashmap).

Of course, you can't simply start poking into a lazily-created collection willy-nilly, unless you're a big fan of NullPointerExceptions. So you need to first check whether the thing is null, and then create it as necessary. Similarly, if you're removing an item from the collection, you might want to check if it's now empty and reset the field to null again.

None of this is difficult... but it is tedious. And it tends to bloat the code that you have to read and write and maintain all over the place. And it's exactly the kind of thing that is easy to get wrong because your mind just blurs over the repeated pattern like it's yet another pile of dirty laundry next to the bed.

So I wondered if there was a way I could encapsulate the behavior I wanted to make my laziness easier, simpler, and more readable. And it turns out there was such a way (or else this would be a very short article that would have ended right around... now).

But first, let's look at the problem a bit more, to understand what we're trying to fix.

I have a class called LazyLists with a couple of List fields and some methods for adding and removing items of various types. First, there are the fields:

List<Integer> intList = null;
List<Float> floatList = null;

Then there are add/remove fields for the int/float types I care about:

public void addItem(int item) {
    if (intList == null) {
        intList = new ArrayList();
    }
    if (!intList.contains(item)) {
        intList.add(item);
    }
}

public void removeItem(int item) {
    if (intList != null) {
        intList.remove((Object) item);
        if (intList.isEmpty()) {
            intList = null;
        }
    }
}

public void addItem(float item) {
    if (floatList == null) {
        floatList = new ArrayList();
    }
    if (!floatList.contains(item)) {
        floatList.add(item);
    }
}

public void removeItem(float item) {
    if (floatList != null) {
        floatList.remove(item);
        if (floatList.isEmpty()) {
            floatList = null;
        }
    }
}

There are a few things to notice about these methods:
  • There's all of that boilerplate code I mentioned before that lazily creates and nulls out the appropriate list based on the state of the list at the time. This is what we'd like to clean up, since this code is repeated as many times as we have to access these list fields.
  • I run a uniqueness check in the addItem() methods because it suits me; I only want to add unique items, not the same items over and over. That's kind of a detail that's specific to my situation, but produces more boilerplate that I'd love to get rid of.
  • There an interesting nuance to the int variation of removeItem(). Do you see it? It's the cast to Object prior to removing the item from intList. This is because of the awkward crossover between primitive types (int, float, etc.) and Object types (Integer, Float, etc.) in Java. There are actually two remove() methods on List, one that takes an int and one that takes an Integer. The one that takes an int removes the item at that index, whereas the Integer variant removes that item itself. That's a pretty huge distinction. And maybe it's well-known to you if you've worked with Lists and ints, but I hit it when working on this example, and thought it was interesting enough to call out.
Anyway, moving on.

We can call the methods above and produce lists that dynamically change with the items that we add/remove. For example, this code creates the class, adds items to the two lists, and removes those items:

LazyLists lists = new LazyLists();
lists.addItem(0);
lists.addItem(1f);
lists.removeItem(0);
lists.removeItem(1f);

Adding a bit of tracing code gives us this output:

starting lists = null, null
populated lists = [0], [1.0]
ending lists = null, null

So there's not too much cruft above, but I figure the second time I'm repeating the same code, I should think about refactoring it in a way that avoids the repetition. And it's easy to imagine that there might be several places in real code that wants to add/remove items, or several different types going into several different types of collections. Then it's easy to see the little bits of repeated code above bloating into more than you might want to manage in the long haul.

There are various ways that you could do this, depending on the collection(s) you want to support, the extra logic you'd like (like my requirement for uniqueness in the lists), and stylistic elements about static methods, etc. But here's what I wound up with:

public class LazyListManager {

    public static  List add(List list, T item) {
        if (list == null) {
            list = new ArrayList();
            list.add(item);
        } else if (!list.contains(item)) {
            list.add(item);
        }
        return list;
    }

    public static  List remove(List list, T item) {
        if (list != null) {
            list.remove(item);
            if (list.isEmpty()) {
                list = null;
            }
        }
        return list;
    }
}

This simple class has two static methods on it to support adding and removing from an arbitrary List object. As needed, it will create a new List (actually, an ArrayList, but that's an implementation detail). It will check for uniqueness in the add() method, check for nullness in the remove() method, and null out an empty list in remove() as appropriate.

There is an important piece here that makes this work - callers must supply their target list as both a parameter to the function and as the recipient of the return value; this is what makes it possible for these utility methods to allocate or null-out the list as appropriate (since they do not have access to the original list, but only have a reference to it).

Given these two static utility methods, we can now write new addItem() and removeItem() methods that are a significantly better (you can't get less than one line of code, unless I missed that part in my CS education):

public void addItemBetter(int item) {
    intList = LazyListManager.add(intList, item);
}

public void removeItemBetter(int item) {
    intList = LazyListManager.remove(intList, item);
}

public void addItemBetter(float item) {
    floatList = LazyListManager.add(floatList, item);
}

public void removeItemBetter(float item) {
    floatList = LazyListManager.remove(floatList, item);
}

Calling these methods looks remarkably similar to what we saw before:

lists.addItemBetter(0);
lists.addItemBetter(1f);
lists.removeItemBetter(0);
lists.removeItemBetter(1f);

and results in exactly the same output (which shouldn't be a surprise. If the results were different, this approach wouldn't be a utility as much as a bug):

starting lists = null, null
populated lists = [0], [1.0]
ending lists = null, null
populated lists = [0], [1.0]
ending lists = null, null

The LazyListManager class has taken out all of the tedious boilerplate related to null checks, uniqueness, allocation, and nullification, and has left us with just one line of code to write whenever we want to add or remove items to/from one of our lists. That's just about the right amount of code for me to write without making a typo or a copy/paste error along the way.

If this were public API, I could envision the manager class offering various different kinds of collections and options, or maybe wrapping more of the capabilities of collections classes (like isEmpty() or contains()). But for now, it's a nice little internal class that can help me simplify my code whenever I need to use this lazy-allocation pattern.

All of the interesting code is inline above, but if you want the two source files, you can download them here.


7 comments:

Gulshan Singh said...

Why didn't I ever think of this? This is great!

I think you missed some capitalization here:
List intList = null;
List floatList = null;

Gulshan Singh said...

And in my previous comment, I was referring to the template parameters, which don't show up since they get parsed as HTML.

Chet Haase said...

@Gulshan: Yep, thanks for catching it. I hand-edited that piece since my pasted code really confused blogger's html engine.

Dan Rice said...

It seems that you really want a Set instead of a List. At them moment, inserting N elements will take N^2 work. Not very lazy!

Andrew Kelly said...

For extra type safety in your LazyListManager you could have specified the type for the list on the method definition to make sure you're not adding floats to the intList.

Andrew Kelly said...

Grrr <T> could have been added to the method definition...my previous comment seemed to remove this important bit due to HTML tags.

Daniel said...

Here's a gist without the template params accidentally removed: https://gist.github.com/dlew/6554307