Thursday, August 23, 2012

Thanks for the Memories

(This tip and anecdote are specifically about Android, but the same technique applies to every platform I've ever worked on, which at this point is quite a few. In fact, it's a technique that Romain and I have stressed in our book and in most graphics/GUI talks that we've given over the years, such as this one from Devoxx in 2008.)
I was debugging an application recently (names withheld to protect the completely and utterly guilty) and discovered that the source of a serious performance bottleneck was simply the size of the bitmaps involved.

The application's job is to display lots of pictures, so using bitmaps is a given. And the size of the bitmaps being loaded and displayed is significant, so there were going to be issues around memory and performance anyway. But it was the way in which the application was treating the source and destination sizes that was at the root of the problem.

In particular, the application was loading each image into a bitmap of size X x Y. Meanwhile, they wanted to draw that bitmap into a destination rectangle half that size in each dimension, 0.5X x 0.5Y. This is easy to do; you just call the appropriate Canvas.drawBitmap() method with the relevant source/destination rectangles, or specify a scale on the Canvas object, and we'll take care of the details.

Simple.

But.

Wrong.

That is: Scaling on the fly is easy, and it works. But asking us to do your work for you on every frame in which you draw that bitmap might cost you significantly in performance and memory when there's a very easy way for you to do this once and simplify all future operations with that bitmap.

Here's the right thing to do: pre-scale the bitmap to exactly the size you need. Then when you need to copy it into the destination, you call Canvas.drawBitmap(bitmap, left, top, paint) (the version that doesn't take a dest rectangle) and then all that Android needs to do is copy the bitmap. Much simpler. And what's more: it requires potentially far less memory than keeping the full-size bitmap around and downscaling it every time you draw.
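Since the technique is not Android-specific, here is a minimal sketch of it in desktop Java 2D. The class and method names (PreScaler, preScale) are made up for illustration, and bilinear filtering is just one reasonable choice, not a recommendation from the framework.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper: PreScaler and preScale are illustrative names only.
final class PreScaler {
    // Scale src to exactly dstWidth x dstHeight ONCE, so that every later
    // draw of the result can be a plain 1-to-1 copy.
    static BufferedImage preScale(BufferedImage src, int dstWidth, int dstHeight) {
        BufferedImage scaled =
                new BufferedImage(dstWidth, dstHeight, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = scaled.createGraphics();
        try {
            // Assumption: bilinear filtering is good enough for this downscale.
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            // The only scaling draw; it happens here and never again.
            g.drawImage(src, 0, 0, dstWidth, dstHeight, null);
        } finally {
            g.dispose();
        }
        return scaled;
    }
}
```

After this, the per-frame code draws the returned image with the unscaled drawImage(image, x, y, null) variant and the original can be dropped. On Android, Bitmap.createScaledBitmap(src, dstWidth, dstHeight, true) plays the same role, followed by the plain Canvas.drawBitmap() call described above.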

This is probably (hopefully? please?) obvious when you're running in a software-rendering situation (e.g., all releases prior to 3.0, or any app targeting pre-4.0 releases and not specifying android:hardwareAccelerated="true" in the manifest); having the framework scale the image every time it's drawn means going through a much slower path than a simple 1-to-1 copy entails.

But what about on GPUs, with our wonderful new hardware-accelerated world of Android apps as of version 3.0+? Aren't GPUs supposed to be faster at stuff like this? What are we paying them for, anyway?

Yes.

But.

Here's the problem: the actual scaling operation is quite cheap on a GPU, even negligible. But that's not all that you should be concerned about as a mobile developer. Mobile developers should always worry about memory. You should profile your application. You should think about memory consumption at night when you can't sleep. You should bring it up on first dates*, and fester on it while on vacation.

If you're displaying several images per frame, you want to be very aware of how much memory those bitmaps are soaking up. This is true for the bitmaps in CPU memory, but also true for bitmaps that we upload to textures. Just because it's cheap for a GPU to scale a large texture into a small space on the GPU doesn't mean it's fast to upload it into texture memory, or cheap for the GPU to have that large image sitting around in memory. Memory is a constrained resource and should be treated as such. I'm sure your date will tell you as much (possibly as they leave the date in search of more interesting prospects).
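To put rough numbers on "soaking up", here is a back-of-the-envelope sketch. It assumes 4 bytes per pixel (a 32-bit format such as ARGB_8888) and ignores row padding and any driver overhead, so treat the results as estimates. BitmapMemory and bytesFor are illustrative names, not framework APIs.

```java
// Hypothetical helper for estimating raw pixel storage.
final class BitmapMemory {
    // Assumption: 32-bit pixels (e.g. ARGB_8888), no padding, no mipmaps.
    static final int BYTES_PER_PIXEL = 4;

    // Estimated bytes needed to hold a width x height bitmap.
    static long bytesFor(int width, int height) {
        // Widen to long before multiplying so large images cannot overflow int.
        return (long) width * height * BYTES_PER_PIXEL;
    }
}
```

For example, bytesFor(1024, 768) is 3,145,728 bytes, while the half-size-per-side version, bytesFor(512, 384), is 786,432 bytes: a quarter of the memory, both in CPU memory and again in any texture it gets uploaded to.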

If you're going to scale from a 2k x 2k image into a 32x32 icon, wouldn't it make more sense to pre-scale it once, chuck the original one, and thenceforth deal with only the smaller version instead?
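If the big image only exists as an encoded file, one way to avoid ever holding the full-size bitmap is to ask the decoder for a subsampled version. On Android that is what BitmapFactory.Options.inSampleSize is for (typically after a bounds-only pass with inJustDecodeBounds). The sketch below shows one common way to pick that value; SampleSize and calculate are hypothetical names, and "2k" is taken here to mean 2048.

```java
// Hypothetical helper: picks a power-of-two subsampling factor.
final class SampleSize {
    // Returns the largest power of two such that the decoded image is still
    // at least reqWidth x reqHeight. Returns 1 if the source is already small.
    static int calculate(int srcWidth, int srcHeight, int reqWidth, int reqHeight) {
        int sampleSize = 1;
        if (srcHeight > reqHeight || srcWidth > reqWidth) {
            int halfWidth = srcWidth / 2;
            int halfHeight = srcHeight / 2;
            // Keep doubling while the NEXT step would still cover the request.
            while ((halfWidth / sampleSize) >= reqWidth
                    && (halfHeight / sampleSize) >= reqHeight) {
                sampleSize *= 2;
            }
        }
        return sampleSize;
    }
}
```

For the 2048 x 2048 to 32 x 32 case, calculate(2048, 2048, 32, 32) gives 64, so the decoder can hand back a 32 x 32 bitmap directly. When the result does not land exactly on the target size (calculate(1000, 800, 100, 100) gives 8, i.e. 125 x 100), a final pre-scale to the exact size is still needed, but it starts from a much smaller bitmap.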

Of course, if you really need full-size images, then go ahead and do what you need to do. And if you're animating an image's size (such as zooming in on it), then pre-scaling to each intermediate size probably doesn't make much sense. But if you know that you're going to be using a smaller version for a while, then you should probably pre-scale to that size rather than drag around the memory and performance baggage associated with the original version, no matter what the hardware acceleration situation on the target device is like.

* The advantage of discussing memory consumption on first dates is not only that it will help you keep it in mind at all times, but also that this will inevitably lead to more first dates on which you can continue discussing it. Or it will at least result in fewer second dates.