I've been contracted by
Train Player International to help port their TrackLayer model railroad design software to OS X. It has been going quite well, and I'm grateful the proprietor, Jim, has allowed me to apply my theories of cross-platform development and create a first-class Cocoa application. But it is a work in progress, and there are both bugs and unimplemented features.
Regardless of its fragile state, Jim sent our development version out to select users for feedback and to prove it isn't completely vaporous. And feedback we got. Some of it positive, some negative, but the one thing everybody hated was the launch time; it was abysmal on older hardware and not particularly peppy even on my MacBook. I will not tolerate intolerable performance in any app I write, so I've made it my first priority to fix.
It didn't take long to track down the culprit, the reason TrackLayer was taking 13 seconds to launch on my MacBook: I was pre-loading 83 small image files representing train car tops and sides. These files were 24-bit BMP files with neither masks nor alpha channels; they were chroma keyed--a color was set aside to represent blank space--and my code was responsible for converting them to a form with an alpha channel which could be composited with other images. Here's an example file of the same format:
My code replaced the pink area with transparency. The problem was with my original code to manipulate BMP images. I had even written a comment to myself saying it would be slow and inappropriate for heavy use. The major problem is that NSImages are not easily manipulated; their API is extremely limited in that regard. Basically, there are colorAtX:y: and setColor:atX:y:, which I had been using to change the chroma key color one pixel at a time:
// don't do this, it is slow
NSColor *clearColor = [NSColor clearColor];
int width  = (int)originalSize.width;
int height = (int)originalSize.height;
NSBitmapImageRep *repWithAlpha = [[[NSBitmapImageRep alloc]
    initWithBitmapDataPlanes:NULL
                  pixelsWide:width
                  pixelsHigh:height
               bitsPerSample:8
             samplesPerPixel:4
                    hasAlpha:YES
                    isPlanar:NO
              colorSpaceName:NSCalibratedRGBColorSpace
                 bytesPerRow:0
                bitsPerPixel:32] autorelease];
for (int x = 0; x < width; x++)
{
    for (int y = 0; y < height; y++)
    {
        // creates and compares an NSColor object for every pixel
        NSColor *originalColor = [(NSBitmapImageRep *)anImageRep colorAtX:x y:y];
        if ([originalColor isEqual:chromaColor])
        {
            [repWithAlpha setColor:clearColor atX:x y:y];
        }
        else
        {
            [repWithAlpha setColor:originalColor atX:x y:y];
        }
    }
}
It's slow just looking at it, what with the creation of an NSColor object for each pixel. And I needed to come up with something much faster. The problem is there is not a lot of documentation on doing what I want with either NSImages or CGImageRefs. There was this
short section of the Quartz 2D Programming Guide, which recommends using the CGImageCreateWithMaskingColors function new to OS X 10.4, but isn't very clear on its use, and that's why I'm writing this blog entry to clear it up.
In order to create a masked image from my BMP, I had to:
- Create an NSImage from the BMP file
- Use colorAtX:y: to get the color at the top-left corner, which I assume is the chroma key
- Ask QuickTime to create a CGImageRef from the file
- Use CGImageCreateWithMaskingColors to create a masked copy
- Convert the CGImageRef to the NSImage I can use
That's right, I create 4 separate versions of the image to get what I need. And it is still more than 10x as fast as the original.
NSColor *chromaColor = [(NSBitmapImageRep *)anImageRep colorAtX:0 y:0];
size_t bitsPerComponent = ::CGImageGetBitsPerComponent(originalImage);
float maxComponent = (float)((1 << bitsPerComponent) - 1);
float redF   = rintf([chromaColor redComponent]   * maxComponent);
float greenF = rintf([chromaColor greenComponent] * maxComponent);
float blueF  = rintf([chromaColor blueComponent]  * maxComponent);
// min/max pairs per component: { red min, red max, green min, green max, blue min, blue max }
const float maskingMinMax[] = { redF, redF, greenF, greenF, blueF, blueF };
CGImageRef maskedImage = ::CGImageCreateWithMaskingColors(originalImage, maskingMinMax);
Using this new code dropped the launch times on my MacBook from 13 seconds to under 2; pretty peppy.
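For completeness, steps 3 and 5 from the list above look roughly like the following. This is a sketch of the general QuickTime-era recipe rather than my production code verbatim; bmpPath is a placeholder NSString, and error handling is omitted:

// Step 3: ask QuickTime's graphics importer for a CGImageRef.
FSRef fsRef;
FSSpec fsSpec;
GraphicsImportComponent importer = NULL;
CGImageRef originalImage = NULL;
FSPathMakeRef((const UInt8 *)[bmpPath fileSystemRepresentation], &fsRef, NULL);
FSGetCatalogInfo(&fsRef, kFSCatInfoNone, NULL, NULL, &fsSpec, NULL);
GetGraphicsImporterForFile(&fsSpec, &importer);
GraphicsImportCreateCGImage(importer, &originalImage, 0);
CloseComponent(importer);

// ... CGImageCreateWithMaskingColors as above, giving maskedImage ...

// Step 5: draw the masked CGImageRef into a fresh NSImage.
NSSize size = NSMakeSize(CGImageGetWidth(maskedImage), CGImageGetHeight(maskedImage));
NSImage *result = [[[NSImage alloc] initWithSize:size] autorelease];
[result lockFocus];
CGContextRef context =
    (CGContextRef)[[NSGraphicsContext currentContext] graphicsPort];
CGContextDrawImage(context,
                   CGRectMake(0, 0, size.width, size.height), maskedImage);
[result unlockFocus];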
I've put together a
sample project: ChromaKeyTester.dmg which contains all the code for this, so I hope you Google searchers found what you sought. This only works with the variety of BMPs I needed to open, but it could be made more general.
[UPDATE]
I received an e-mail from a coder with evidently more experience than I at manipulating images in Cocoa. He made a number of suggestions, none of which I found worthwhile pursuing because I'm just not going to get my launch times much faster. I've gotten the launch time on my MacBook down to 1.5 seconds (and this is the worst case, where an image populated document is automatically re-opened, forcing the main thread to wait for the image loading thread to complete). Even if his conversion methods took zero time they could not push the launch time below 0.9 seconds. I realize 0.6 seconds on my MacBook might be 1.2 seconds on a PowerBook G4, but still, it's fast enough.
Still, other programmers might have a more dire need for squeezing performance, so here is a summary of his suggestions with my parenthetical reasons for rejecting them:
- Use a reasonable format like PNG instead of BMP (Can't do this for backwards compatibility. That was the first thing I asked.)
- Skip the reading top/left corner for chroma key color step. (Can't, different images use different colors.)
- My method leaves traces of the chroma key in the anti-aliasing (Sorry, that's just the test image. The actual production images are hand tweaked pixel by pixel with no anti-aliasing. And I could always expand the range of colors masked by CGImageCreateWithMaskingColors. )
- Do the conversion with a Quartz Composer script. (This is a cool idea, but sounds complicated to set up, and could you really load the Quartz Composer framework and execute a script in less than 0.3 seconds or so? Maybe; the Quartz Composer application launches in less than a second, and can merrily load and display my BMPs at a rate-limited 60 fps and 7% rendering load.)
- Use a NSCIImageRep--which is a Cocoa wrapper around a Core Image Image. (I had not been aware of this very useful looking object, and this seems the most practical suggestion. If I have the time for extra speed tuning, I will explore this.)
- Get some C or C++ BMP reading code and read and convert the raw data myself. (This would have to be a big win performance-wise for me to have to deal with the vagaries of the BMP format myself. And there would be no guarantee of a big win, as Apple's converter code is presumably optimized for Altivec or SSE3 or whatever.)
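As for expanding the range of colors masked, that is a small change to the maskingMinMax array from the code above. A sketch, assuming a hypothetical tolerance of ±8 component values around the keyed color (real code should clamp the results to [0, maxComponent]):

// Hypothetical tolerance: mask everything within ±8 of the keyed color.
const float tolerance = 8.0f;
const float maskingMinMax[] = {
    redF - tolerance,   redF + tolerance,
    greenF - tolerance, greenF + tolerance,
    blueF - tolerance,  blueF + tolerance
};
CGImageRef maskedImage = ::CGImageCreateWithMaskingColors(originalImage, maskingMinMax);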
I will remind myself to read up on using NSCIImageRep and the Core Image framework. It sounds extremely useful.
[Another Update: Turns out there is code in Apple's examples for dealing with converting arbitrary bitmap formats to a reference format: either RGBA integer, or RGBA float. This is in the Tableau example for the Accelerate framework. Look for the message "constructIntegerReferenceImage". You can easily refactor this code to run through the reference image buffer and do your chroma keying before taking the step to create an NSBitmapImageRep].
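The refactor described above boils down to a single pass over the reference image buffer. A minimal sketch in plain C, assuming a non-premultiplied RGBA buffer with 8 bits per component (the integer reference format); the function name and parameters are mine, not Apple's:

```c
#include <stdint.h>
#include <stddef.h>

/* Replace every pixel matching the chroma key with transparent black,
   and force every other pixel fully opaque.
   buffer: non-premultiplied RGBA, 8 bits per component, width*height pixels. */
static void chroma_key_rgba8(uint8_t *buffer, size_t width, size_t height,
                             uint8_t keyR, uint8_t keyG, uint8_t keyB)
{
    size_t count = width * height;
    for (size_t i = 0; i < count; i++) {
        uint8_t *px = buffer + i * 4;
        if (px[0] == keyR && px[1] == keyG && px[2] == keyB) {
            px[0] = px[1] = px[2] = px[3] = 0; /* fully transparent */
        } else {
            px[3] = 255; /* fully opaque */
        }
    }
}
```

Run that over the buffer, then hand the result to initWithBitmapDataPlanes: as usual.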