Sunday, August 1, 2010

Varargh!

Varargs, a feature of the C language since roughly the late Victorian era, was introduced into the Java language in JDK1.5.

I love varargs. They allow me to declare a function flexibly with the ability to take zero, one, or several parameters. This ability is useful when the user may have an unpredictable number of things to add to some data structure, eliminating the need for them to create some collection to pass the parameters into your function. It makes for a nice API. And I'm all about nice API. (And donuts).
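As a minimal sketch of that flexibility (the VarargsBasics class and its addAll helper are hypothetical names made up for illustration, not part of the library discussed here), a single varargs method can accept zero, one, or several values without the caller building a collection first:

```java
import java.util.ArrayList;
import java.util.List;

final class VarargsBasics {
    // Hypothetical helper: appends any number of ints to the target list.
    static List<Integer> addAll(List<Integer> target, int... values) {
        for (int v : values) {
            target.add(v); // each int is autoboxed to Integer
        }
        return target;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        addAll(list);           // zero values
        addAll(list, 1);        // one value
        addAll(list, 2, 3, 4);  // several values
        System.out.println(list.size());
    }
}
```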

But sometimes varargs don’t work so well. I suppose it’s because I expect too much of them. Like when you meet someone that you really like, so you call them every few minutes or so, then hang around outside their apartment and workplace and friends' houses until they get a restraining order on you. It’s not that they weren’t really cool and worth getting to know, but that your needs weren't necessarily compatible.

I was working recently on some API improvements for a library I'm writing. I had some constructors that users would call with exactly two values of different types, like this:

  public MyObject(int value1, int value2) {}

  public MyObject(float value1, float value2) {}

This worked well, vectoring the code path off in different directions depending on the types of the values that the user passed in.

But sometimes, the users might have just one value to pass in. Or three. Or nineteen. Rather than having variations for all of these situations, multiplied by the number of types that I wanted to support, I wanted something simpler that took variable numbers of parameters.

So I rewrote my API with varargs. It looked something like this:

  public MyObject(int... values) {}

  public MyObject(float... values) {}

This new version was awesome. Now the user could call my functions with the appropriate values and it would do the right thing, no matter what the types were or how many values they passed in. I compiled the code for the library, wrote my test code:

MyObject obj1 = new MyObject(1f);
MyObject obj2 = new MyObject(1);

and

FAIL

I got a compiler error. Specifically, Eclipse gave me the following completely unhelpful message for the test code: “The constructor MyObject(float[]) is ambiguous” for the second line of that test code (MyObject obj2 = new MyObject(1)).

Never mind the fact that the error was telling me that the constructor takes an array instead of varargs; I know that varargs is syntactic sugar for a construct that gets turned into an array. The real problem was that my code wouldn’t compile.

Apparently, the compiler cannot resolve this kind of overloading once varargs are involved: it fails to infer which constructor I mean and refuses to compile. Although it can choose correctly between a constructor taking float and one taking int, it can’t make that choice between float[] and int[]. So it bails.

Meanwhile, I discovered another nuance of using varargs: you can’t pass them around between constructors willy-nilly. One of the things I wanted to do in my rewrite was to record the type that the user wanted to use and send it to a private constructor like this:

private MyObject(Class valueType, Object... values) {}

I then wrote a single type-specific constructor (to avoid the previous compiler error for now) that called this more generic constructor:

public MyObject(float... values) {
 this(float.class, values);
}

Again, my code compiled fine. And when I eliminated one of the float/int varargs combinations, I was able to write a test that successfully called my varargs constructor:

MyObject blah = new MyObject(1f, 2f, 3f);

This code called the private Object-based constructor above. But then I encountered bugs at runtime where the number of values was not what I expected. In particular, the number of values being stored by the private constructor was 1, not 3.

But, but, but, ...

It turns out that the call to the private constructor turned a perfectly fine threesome of parameters of type float into a single parameter of type float[]. That is, my varargs had just turned into the array that it was pretending it wasn't when I first wrote the code.
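The collapse can be reproduced in a few lines. This is only a sketch: the VarargsCollapse class and its two static methods are hypothetical stand-ins for the public float constructor and the private Object-based constructor. Because a float[] is an Object but not an Object[], the compiler wraps the whole array as a single element of the Object... parameter:

```java
final class VarargsCollapse {
    // Stand-in for the private Object-based constructor: reports how many values arrived.
    static int countObjects(Object... values) {
        return values.length;
    }

    // Stand-in for the public float constructor that forwards its varargs.
    static int countFloats(float... values) {
        // values is a float[]; it is passed on as ONE Object, not as three floats.
        return countObjects(values);
    }

    public static void main(String[] args) {
        System.out.println(countFloats(1f, 2f, 3f)); // prints 1
    }
}
```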

At this point in an article, the climax as it were, I would typically say, “And here’s how you work around these issues.” I would love to do that, because I like neat tricks and workarounds. And I’d love to actually have that workaround in my code.

Unfortunately, I have no such workaround. I frankly don’t know how to make this work in any reasonable way. You can be clearer with the compiler, by telling it things like new MyObject(new float[]{1f, 2f, 3f}) to get around the compiler error. And maybe you even like the way that that looks for an API. If you are sadistic. Or hallucinating. You could also find a different way to store the individual parameters in the varargs rather than having the VM think that it’s just a single array item. But at this point, my original attempt at a more attractive API and flexible implementation was looking like a bad prototype or an industrial accident.
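One possible way to store the individual values, offered as a sketch rather than a recommendation: box each float into an Object[] before forwarding, so the private constructor receives a real Object array. The BoxedForwarding class, its box helper, and its count method are hypothetical names for illustration, and the approach pays for an extra array plus one boxed Float per value:

```java
final class BoxedForwarding {
    private final Class<?> valueType;
    private final Object[] values;

    // Generic constructor: records the declared type and the individual values.
    private BoxedForwarding(Class<?> valueType, Object... values) {
        this.valueType = valueType;
        this.values = values;
    }

    // Type-specific constructor: boxes each float before forwarding.
    BoxedForwarding(float... values) {
        this(float.class, box(values));
    }

    // Copies the primitive array into an Object[] of boxed Floats.
    private static Object[] box(float[] values) {
        Object[] boxed = new Object[values.length];
        for (int i = 0; i < values.length; i++) {
            boxed[i] = values[i]; // autoboxed to Float
        }
        return boxed;
    }

    int count() {
        return values.length;
    }

    Class<?> valueType() {
        return valueType;
    }
}
```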

So I did the only thing that any responsible captain would do after his vessel had hit an iceberg and was going down with all hands: abandon ship on the first lifeboat.

For my situation, I could limit the flexibility to just one or two parameters for the main case, and then have a completely different constructor with a custom data structure for the more flexible, and less common, case of having several parameters. So I walked away from varargs completely for my code with nary a look back. The second problem of calling the private constructor with varargs then magically disappeared because I no longer had to pass the varargs around.

For other situations that may benefit from varargs, but may run afoul of method overloading, I can only recommend that you know and understand the limitations of the nifty varargs feature. I searched around on the web for answers to my compiler problem and didn’t find anything immediately, so I thought it might be worth noting for posterity. Or at least for the pleasure of venting.

Varargs: great when they work. But when they don’t? Nobody knows. Maybe that's why the syntax uses ellipses. It's like a little language joke that says, "Sure, use varargs. Let's see what happens..."

13 comments:

Deluxe said...

If you have at least one mandatory parameter, maybe you could try:
public MyObject(int mandatoryValue, int... values) {}
public MyObject(float mandatoryValue, float... values) {}

I didn't test it (no time =( ), but the type of the mandatoryValue could help distinguish between the two constructors.

Chet Haase said...

@Deluxe: That was worth a try, but it fails similarly; now the compiler complains about (float, float[]) being ambiguous. I think I'll stay in my lifeboat.

Unknown said...

Hi Chet,
Yes I had the same fun some time ago. Good and entertaining article. Thanks for that.
By the way your java.net blog header still states that you work on flex.

Have fun,
- Rossi

Unknown said...

Hi Chet,

You can try something like this :

public class MyObject {

public MyObject(Integer... values) {

for (Integer i : values){
int i1 = i;
System.out.println(i1);
}
}
public MyObject(Float... values) {

for (Float f : values){
float f1 = f;
System.out.println(f1);
}
}
}

and for the test :

public class Main {

public static void main(String[] args) {

MyObject obj1 = new MyObject(1f,2f);
MyObject obj2 = new MyObject(1,3);
}
}

a user said...

Hi Chet,
Your Java posts are as interesting as the Flex ones. Just out of curiosity, will you still be experimenting with Flex, or have you definitely left the Flex side?

Unknown said...

Did I miss something?

public class CuriousChetTest {

private CuriousChetTest(T... vars) {
// satisfy Chet's curiosity
}

public static void main(String[] args) {
CuriousChetTest c1 = new CuriousChetTest(1,2,3,4);
CuriousChetTest c2 = new CuriousChetTest(1f,2f,3f);
}
}

Cheers,
Jan

Unknown said...

oh, this crappy commenting filter stripping out HTML ... see the full code at http://pastebin.com/UpP5ZsT2

Chet Haase said...

@Julien: My Flex output will definitely decrease simply because I don't have time to do it all; my work on the Android team promises to keep me pretty busy for some time. But there will certainly be some Flex content off and on, especially with the book coming out soon and related articles and presentations appearing now and then.
Beyond that content, I have to decide between working even more outside of a busy job or staying married. I think I'll choose the latter.
But just like my Flex scribblings and videos, I hope the content I post here goes beyond the single platform and language I happen to be writing about or working on, and is generally useful to graphics geeks.

Chet Haase said...

@Jan
@Champion:
Thanks for the suggestions.
Something like these could work. I like the ability to use varargs by using the object types, and I like the further simplification of the multiple type-specific constructors by using generics.
Unfortunately, there are some problems with my specific requirements. For one thing, I use the input type to find methods on other objects through reflection. And sadly, foo(Float) is not the same as foo(float), so my reflection logic will have to become more flexible to find all real possibilities.
Also, the current APIs tend to use primitive types, so these object type equivalents stand out oddly. And I don't relish all of that auto-boxing happening to convert from what will surely be primitive types in user code to object types in these constructors.
Finally, in the generics approach, the user code just became uglier. That is, I'd like the developer to be able to write "Foo obj = new Foo(1, 2, 3)" and have the right thing happen. To have them now write "Foo<Integer> obj = new Foo<Integer>(1, 2, 3)" seems like, er, what's the word for "opposite of improvement"? I like the convenience of generics, and their ability to reduce the number of methods/classes in an API, but they don't result in terse or beautiful user code. At least not in this case.
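The reflection point, that foo(Float) is not the same as foo(float), can be seen with a small sketch. The ReflectionPrimitives class, its nested Target, and the hasFoo helper are hypothetical names for illustration; only Class.getMethod is the real reflection API. A lookup with the boxed type does not find a method declared with the primitive type:

```java
import java.lang.reflect.Method;

final class ReflectionPrimitives {
    // Hypothetical target object with a primitive-typed method.
    public static final class Target {
        public void foo(float value) {
        }
    }

    // Returns true if Target declares a public foo(paramType).
    static boolean hasFoo(Class<?> paramType) {
        try {
            Method m = Target.class.getMethod("foo", paramType);
            return m != null;
        } catch (NoSuchMethodException e) {
            return false; // exact parameter type did not match
        }
    }

    public static void main(String[] args) {
        System.out.println(hasFoo(float.class)); // true
        System.out.println(hasFoo(Float.class)); // false
    }
}
```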

Cyril said...

I would have used a static factory method like fromInts(int... values), but your "For one thing, I use the input type to find methods on other objects through reflection" indicates it's not an option for you.

Unknown said...

There is always the possibility for a nicely named static factory method. Then you could use different names which resolves the ambiguity.

Unknown said...

Oh no you already wrote that. Looks like blogger needs Ajax reload of incoming comments ;-)

Chet Haase said...

@Cyril & Patrick: In fact, I did end up with static factory methods in the final API, e.g., ObjectAnimator.ofFloat(...).
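A rough sketch of the static factory idea, assuming a hypothetical FactoryObject class (this is not the actual ObjectAnimator implementation, only an illustration of the naming pattern): because each factory has its own name, the int and float varargs versions never compete during overload resolution, and each one can record its primitive type directly.

```java
final class FactoryObject {
    private final Class<?> valueType;
    private final int count;

    // Private constructor: callers must go through the named factories.
    private FactoryObject(Class<?> valueType, int count) {
        this.valueType = valueType;
        this.count = count;
    }

    // Distinct names replace overloading, so there is nothing to be ambiguous.
    static FactoryObject ofInt(int... values) {
        return new FactoryObject(int.class, values.length);
    }

    static FactoryObject ofFloat(float... values) {
        return new FactoryObject(float.class, values.length);
    }

    Class<?> valueType() {
        return valueType;
    }

    int count() {
        return count;
    }
}
```

User code then reads as FactoryObject.ofInt(1) or FactoryObject.ofFloat(1f, 2f, 3f), with no array syntax at the call site.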