Thursday, January 24, 2008

Microsoft Office Supports PDF on the Clipboard, And Why That is a Big Deal

For months, I've wanted to know whether Microsoft Office 2008 supported copying and pasting PDF data from the OS X clipboard. I couldn't find out, and it's not like I didn't ask. Two days ago, Office 2008 appeared on my chair, and the answer is yes.

Backing up a moment. When you copy content in one application and paste it in another, you are using a system service to transfer the data, be it via the old-style Carbon Scrap Manager, the Cocoa NSPasteboard class, or the newish Pasteboard framework available to the non-Objective-C crowd. The two applications must agree on the format of the data exchanged, so typically only widespread standards are used. For text, as I've outlined before, the RTF format is preferred. For bitmap images, a good choice is lossless TIFF. Vectored images, however, were a quandary.
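The "agreeing on a format" step can be sketched in a few lines of Python. This is only a toy model, not the NSPasteboard or Carbon API: the function name `choose_flavor` and the two preference lists are made up for illustration, and the byte strings are placeholders rather than real PDF or PICT data. The type identifiers are the uniform type identifiers OS X uses for these formats.

```python
# Toy model of clipboard format negotiation; not a real pasteboard API.

def choose_flavor(offered, accepted):
    """Return the first type in the reader's preference order that the writer offered."""
    for type_id in accepted:
        if type_id in offered:
            return type_id
    return None

# What a source application might put on the clipboard for one drawing.
offered_by_source = {
    "com.adobe.pdf": b"%PDF-1.3 ...",  # placeholder bytes, not a real PDF
    "com.apple.pict": b"...",          # placeholder bytes, not a real PICT
}

# Hypothetical preference lists for two kinds of destination application.
pdf_aware_reader = ["com.adobe.pdf", "public.tiff", "com.apple.pict"]
pict_only_reader = ["com.apple.pict"]

print(choose_flavor(offered_by_source, pdf_aware_reader))
print(choose_flavor(offered_by_source, pict_only_reader))
# A PDF-only source has nothing to offer a PICT-only reader.
print(choose_flavor({"com.adobe.pdf": b""}, pict_only_reader))
```

The last call is the interesting one: if the source offers only PDF and the destination reads only PICT, nothing can be pasted at all.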

Vectored images are pictures composed of individual drawing operations such as MoveTo, LineTo, AddToPath, FillPath, etc. Because they are not limited in resolution like a bitmap, they look good on screen and tend to print out with lovely crisp lines. They also tend to be smaller than bitmap files. Every application on Classic Mac OS used a convenient format called PICT, which is basically a recording of the QuickDraw operations needed to generate the onscreen display. PICT is a primitive format, something more in tune with the computers of 1984 than 2008. Off the top of my head: it lacks fractional coordinates, paths, Bezier curves, gradient fills, and pagination; it is limited to QuickDraw fonts; its coordinate matrix manipulations are limited to rotating text; it has poor support in Cocoa applications; and it's ugly. The only two good things I can say about it are that it does allow for high quality printing via embedded PostScript, and that you can squirrel away your own data in it in case the same PICT gets copied and pasted back into your application.
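To make the "recording of drawing operations" idea and the cost of missing fractional coordinates concrete, here is a minimal Python sketch. It is a toy model under my own assumptions, not a PICT or PDF encoder: the path data, the helper names, and the zoom factor of 8 are all invented for illustration.

```python
# Toy model of a vector picture: a list of (operation, x, y) tuples.
# The operation names echo the ones in the text; nothing here reads or writes real files.
path = [
    ("MoveTo", 10.25, 10.25),
    ("LineTo", 20.75, 10.25),
    ("LineTo", 15.5, 18.6),
]

def scale(ops, factor):
    """Replay the drawing at a different resolution by scaling every coordinate."""
    return [(op, x * factor, y * factor) for op, x, y in ops]

def snap_to_integers(ops):
    """Mimic a format that can only store whole-number coordinates."""
    return [(op, round(x), round(y)) for op, x, y in ops]

def max_error(a, b):
    """Largest per-coordinate distance between two versions of the same path."""
    return max(
        max(abs(xa - xb), abs(ya - yb))
        for (_, xa, ya), (_, xb, yb) in zip(a, b)
    )

zoom = 8  # e.g. zooming in, or printing at 8x the screen resolution
exact = scale(path, zoom)
snapped = scale(snap_to_integers(path), zoom)

# The rounding error baked in by integer storage is multiplied by the zoom factor.
print(max_error(exact, snapped))
```

The fractional path scales cleanly to any resolution, while the integer-snapped copy carries its rounding error along and magnifies it.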

When OS X arrived, legacy Carbon applications kept on generating their PICT clipboards for both bitmap and vectored material even though the superior PDF format was available and universally used by newer Cocoa applications. QuickDraw became obsolete and onscreen drawing is most often done with Quartz calls, and yet applications still maintain ways to generate PICT clipboards at great expense of maintenance and design. Why?

Because Microsoft Office didn't support PDF, and if you want to sell business applications on the Mac, you have to share data with Office, and that content had better print nicely from within Office. I know from personal experience the aggravation of maintaining the portion of an application which renders content into QuickDraw PICTs (ugly, kludgy QuickDraw PICTs) when I could be easily generating PDF clipboards: beautiful, lightweight, lithe PDF files.

To illustrate what I mean, I created this PDF in an application which supports creating EPS files but not putting PDF on the clipboard. I opened it in the OS X Preview application (a Cocoa app):

Copied the image into TextEdit (another Cocoa App) from Preview:

TextEdit has a bug where it doesn't re-render embedded PDFs when it zooms.
Copied the image from TextEdit and pasted it back into Preview and zoomed in on a detail:

Now compare with a zoomed detail of the PICT version from the clipboard of the original application (a Carbon app), pasted into Preview (ignore the checkerboard):


Presumably, OS X could provide a service where it would extract embedded PostScript from PICTs (if available) and generate a pleasing PDF pasteboard, but it doesn't, and I doubt that Apple wants to encourage developers to keep on using PICT.


Getting back to the big news, there was Office 2008 on my chair. Install. Draw a moon:

Copy. Launch Pasteboard Peeker and see this output (... means omitted content):
PasteboardRef: 1116096 ItemCount: 1
Index: 1 item ID: 1112493904
...
"com.adobe.pdf"
"Apple PDF pasteboard type"
'PDF ' P_____ 21056 PDF-1.3 4 0 obj << /Length 5 0 R /Filter /FlateDecode >> stream x

"com.apple.pict"
"Apple PICT pasteboard type"
'PICT' P_____ 409198 >n C 0 H
...


Yay.

And notice how svelte the PDF (21,056 bytes) is compared to the PICT (409,198 bytes). Rendering a gradient fill in QuickDraw is not pretty.

Go back to Word and add a star:

Copy and paste into Pasteboard Peeker and:
PasteboardRef: 1116096 ItemCount: 1
Index: 1 item ID: 1112493904
...
"com.adobe.pdf"
"Apple PDF pasteboard type"
'PDF ' P_____ 24935 PDF-1.3 4 0 obj << /Length 5 0 R /Filter /FlateDecode >> stream x

"com.apple.pict"
"Apple PICT pasteboard type"
'PICT' P_____ 498658 H
...
The PICT version bloats by roughly 89K while the PDF grows by a mere 4K. Not that size matters any more with RAM and hard drive prices the way they are.
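The growth figures can be recomputed directly from the byte counts in the two Pasteboard Peeker dumps. A small Python sketch, where the dictionary layout and the `growth` helper are just my own scaffolding around the numbers shown above:

```python
# Byte counts copied from the two Pasteboard Peeker dumps above.
sizes = {
    "moon": {"PDF": 21056, "PICT": 409198},
    "moon and star": {"PDF": 24935, "PICT": 498658},
}

def growth(flavor):
    """Bytes added to a flavor when the star joins the moon."""
    return sizes["moon and star"][flavor] - sizes["moon"][flavor]

for flavor in ("PDF", "PICT"):
    added = growth(flavor)
    print(f"{flavor}: +{added} bytes ({added / 1024:.1f} KiB)")

# How many times larger the PICT is than the PDF for each drawing.
for drawing, by_flavor in sizes.items():
    print(f"{drawing}: PICT is {by_flavor['PICT'] / by_flavor['PDF']:.1f}x the PDF")
```

With these numbers the PDF grows by 3,879 bytes and the PICT by 89,460 bytes.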


Word in Office 2004 had a clipboard which looked like this:
PasteboardRef: 3323920 ItemCount: 1
Index: 1 item ID: 1112493904
...
"com.apple.pict"
"Apple PICT pasteboard type"
'PICT' P_____ 2222 U , , f
...


Open up the pasted PDF in Preview and Zoom:

Look at that beautiful shadow detail! Try to do that in a PICT!

Draw something in Sketch, copy, paste, and yep, there it is: a PDF pasted into Word. Yay again.

But what does this all mean? It means that once Office 2008 sees widespread adoption, the rest of the Mac content creation software industry can rip out every last QuickDraw call in their applications. It means we can build 64-bit versions of our applications. It means we had best start putting PDF on our own clipboards. It means Cocoa apps can generate content and with no extra effort have it look great inside Office apps. It means there will be a new, higher minimum quality for interchanged content. It means we can forget everything we ever knew about Classic Mac programming.

The future is finally here.
[Update: One fly in the ointment is that Word has a bug wherein if you paste a PDF graphic into a Word document, and subsequently copy and paste it from Word and into another application (such as Preview), it loses its vectored quality:


But while the graphic is still within Word, it scales, prints, and zooms beautifully, so presumably this is just a bug in the copy code and not a design flaw. The vectored PDF is being maintained internally in some vectored form.]