Acrobat Plugins
Volume Number: 16
Issue Number: 6
Column Tag: Programming Techniques
How to Write Plug-Ins for Adobe Acrobat
by Kas Thomas
The Acrobat SDK provides a bewilderingly diverse
assortment of tools for manipulating PDF files
Adobe's Portable Document Format (PDF) is the premier electronic document format
for "rich content" on the web, not just because of its ability to convey the appearance
attributes of a paper document (which it is very good at, since the underlying
technology is PostScript-based) but also because of its annotation features, hyperlink
embedding, support for form widgets (and JavaScript), Unicode compatibility,
security (encryption) handlers, and accessibility features. (PDF is also evolving to
become more adept at encapsulating document metastructure and private data types.)
Access to all these types of functionality is available to creators of PDF documents via
Adobe Acrobat, the commercial version of Reader. Unlike Reader, the full Acrobat
product (street price: $200 or so) lets you mark up a PDF document in various ways,
add links and form fields, attach JavaScript to documents, etc. - and save the modified
PDF file to a storage device.
Most of the kinds of functionality you'd ever want to see in a PDF file can be added
using Acrobat or created in the source file at the time of the document's creation (via
MS Word, FrameMaker, or whatever). But in the PDF world, as in the world of
Photoshop, Quark, After Effects, etc., there are times when the functionality you need
just isn't available from the host application. In such a situation, it makes sense to
consider a plug-in.
As with its other flagship products, Adobe has incorporated a plug-in architecture into
Acrobat to enable third-party developers to extend the functionality of the core
product. Quite a few of the key features of Acrobat (features most users think of as
being "core functionality") are, in fact, implemented in the form of plug-ins. For
example, form-editing and form-fillout functions are done as plug-ins. Acrobat also
uses plug-ins to implement weblinks, annotations, text touchup, QuickTime movie
attachments, scanning ("paper capture"), image-import, and quite a few other
functions. (Check the plug-ins folder of your Acrobat installation.)
Extending Acrobat's capabilities via custom plug-ins is a powerful (and popular) way
to give users extra control over PDF documents. But this capability comes at a cost.
Learning to use the Acrobat plug-in API (one of Adobe's most complex APIs) can be a
formidable challenge. There are well over 1,000 functions in the Acrobat API; just
finding the ones you need to carry out a specific task can be daunting. Suffice it to say,
this is one API you won't learn in a weekend.
Because of the sheer size of the API, there is no way I can give you more than a
very abbreviated introduction to Acrobat plug-ins in the limited space available here.
What I can do is briefly outline the available tools, talk about a few of the most
common idioms of Acrobat programming, and point you toward resources for further
study. In addition, just so you won't think writing an Acrobat plug-in is totally beyond
reach, we'll walk through the code for a small plug-in designed to do something at least
vaguely useful: namely, add a text-based "watermark" to a page, in response to a menu
command.
Getting Started with the SDK
You can download a copy of the Acrobat software developer's kit (SDK) from
http://partners.adobe.com/asn/developer/acrosdk/acrobat.html. The SDK is also
available on CD-ROM to members of the Adobe Solutions Network Developer Program.
(See http://partners.adobe.com/asn/developer for details.) The complete SDK
requires over 30 megabytes of disk space, including 27 megs for documentation alone.
Example code comes in the form of CodeWarrior projects for two dozen sample
plug-ins and a handful of AppleScripts showing how to control Acrobat via IAC
(Inter-Application Communication). The code is well-packaged, thoroughly
commented, and well-summarized in a separate docfile, Samples.pdf, which gives a
brief overview of what each example is meant to demonstrate.
The documentation that accompanies the SDK is so overwhelming that Adobe
thoughtfully includes a "Start Here" document as well as a flowchart-style roadmap
diagram in which every box is a clickable link taking you to the appropriate docfile.
The roadmap occurs at the top of nearly every docfile in the package, which is a nice
touch. Just by clicking on various portions of the roadmap, you can jump from doc to
doc. Of course, all the docs are PDF files with navigation bookmarks.
The 213-page Acrobat Core API Overview (Tech Note No. 5190) is a good document to
start with, since it serves as a kind of giant FAQ for developers, explaining how the
various components of the API work together. But the two most important docfiles in
the package - the two you'll spend the most time with - are The PDF Reference Manual
and the Acrobat Core API Reference. The 513-page PDF Reference Manual, also known
as the PDF Specification, contains the formal language specification for PDF. It spells
out all the keywords, data types, architectural schemas, and syntax constructs that
make a PDF file a PDF file. Just as it would be hard to write a Netscape browser
plug-in without understanding how HTTP and HTML work, it's inconceivable that you
would write an Acrobat plug-in without knowing a fair amount about the low-level
organization of PDF files. The PDF Reference Manual gives you this information. Plan
on spending many hours reading it.
Your primary reference manual for the plug-in API itself will be the 2,795-page
Acrobat Core API Reference (Tech Note No. 5191). Yes, I said 2,795 pages. But don't
worry, that's sure to get bigger as Acrobat evolves. (Version 5.0 is set to appear
sometime within the next nine months.)
The Acrobat Core API Reference is not a book that you sit down and read cover to cover;
rather, it's a reference you consult as the need arises. Inside it you'll find descriptions
(calling syntax, etc.) of the 1,100-or-so core routines that make up the Acrobat API.
You should understand that one reason this API is so voluminous is that it comprises
the bulk of the library routines from which Acrobat itself was (and is) written. Adobe
uses these same routines internally for Acrobat development.
What You Can and Can't Do
It's possible to write a plug-in for Reader as well as for Acrobat. (There are also
mini-APIs for other products in the Acrobat suite: Capture, Distiller, Catalog, etc.)
By default, any plug-ins you write using Adobe's SDK will run under Acrobat (the
commercial product) with no problem. Writing plug-ins for Reader is another matter.
To get Reader to load your plug-in, you will need to obtain an Acrobat Reader
Integration Key. Getting a key is a matter of filling out the Acrobat Reader Integration
Key License Agreement (available on Adobe's developer web site), signing it, and
submitting it - along with a $100 payment (by credit card only) - to
Adobe's Klamath Falls, Oregon group (which administers the developer programs).
When you sign the agreement, you are agreeing to some fairly serious limitations as to
what you will and won't try to do in your plug-in. Specifically, you are agreeing never
to include any of the following kinds of functionality in your Reader plug-in:
• Anything that would allow Reader to permanently modify, save, or write
files, including PDF files, annotations for PDF files, form data, etc. Included
in this is any functionality that would enable another process (on a server,
say) to save such data.
• Anything that opens encrypted documents by bypassing normal security
measures.
• Anything that would display a PDF file in the window of another
application.
• Accepting navigational commands from an application other than Reader
itself.
• Making use of any function call in the Forms HFT (host function table).
• Implementing a replacement file system for Reader.
• Anything that would remove the menu item that calls up the "About"
screen for Reader.
In other words, forget about making permanent changes to files (or doing disk writes)
in a Reader plug-in. The whole idea is that Reader must remain read-only. (Otherwise
it wouldn't be called Reader!) If your plug-in does any of the things listed above, it has
to be written as an Acrobat plug-in, targeting the commercial product only (which
does modify and save files). This cuts the audience for your plug-in down substantially
(probably to one percent or less of the audience of Reader). But there's still a sizable
market left. In fact, the market for Acrobat plug-ins is surprisingly brisk, reflecting
the product's widespread use in prepress and enterprise settings.
Other Limitations
Not everything you might want to do to a PDF file can be done using the Acrobat API
anyway, it turns out. It's important to realize up-front that some things just can't be
done. For example, let's say you wanted to implement a new image-compression codec.
(PDF already implements JPEG, LZW, CCITT, and Flate compression filters for image
streams.) Unfortunately, the Acrobat plug-in API does not expose this kind of
functionality - purposely, to ensure maximum interoperability of PDF files. On the
other hand, you can write an entirely new security handler, should you want to
implement a different kind of encryption or passwording than is available by default in
Acrobat. With security, interoperability is not a consideration.
Another thing you'll have trouble changing is the way
Acrobat draws things to the screen. For example, if you want to change the default
order in which various objects are drawn, apply a different kind of antialiasing, or
improve the blitting performance - you're basically out of luck. These kinds of
functionality aren't exposed in the API. That's not to say you can't try going outside the
API; but in that case, maybe what you're trying to do would best be implemented as a
helper app, system extension, WDEF, or something other than an Acrobat plug-in per
se.
Layers in the API
Figure 1 shows how the Acrobat plug-in API is organized into a number of layers,
designed to provide functionality at high and low levels. At the highest level is the
Acrobat Viewer (AV) Layer. In this layer are hundreds of routines for manipulating
the Viewer's windows, menus, toolbars, etc. as well as for querying the runtime
environment. This is where, for example, you can call
AVAppRegisterForPageViewClicks() if you want to track mouse hits. Most menu
commands can be programmatically driven at this level. That means you can control
the page view, open and close windows, control the cursor's appearance, and drive just
about any GUI-related action. But you cannot do file I/O, inspect or modify stream
contents, or perform low-level operations on document internals from this layer.
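To make the callback idea behind a call like AVAppRegisterForPageViewClicks() concrete, here is a minimal stand-alone sketch in plain C. It does not use the Acrobat SDK: every name in it (DemoClickProc, DemoAppRegisterForClicks, DemoAppDispatchClick, demo_count_corner_clicks) is hypothetical, and the convention that a callback returns nonzero to mean "I handled this click" is an assumption made for illustration. Consult the Acrobat Core API Reference for the real signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical click callback: returns nonzero if it handled the click. */
typedef int (*DemoClickProc)(int x, int y, void *clientData);

#define DEMO_MAX_CLICK_PROCS 4

/* Private registry of callbacks, in registration order. */
static struct { DemoClickProc proc; void *data; } demo_procs[DEMO_MAX_CLICK_PROCS];
static size_t demo_proc_count = 0;

/* Register a callback plus caller-owned data. Returns 1 on success,
   0 if the callback is NULL or the registry is full. */
int DemoAppRegisterForClicks(DemoClickProc proc, void *clientData) {
    if (proc == NULL || demo_proc_count == DEMO_MAX_CLICK_PROCS) return 0;
    demo_procs[demo_proc_count].proc = proc;
    demo_procs[demo_proc_count].data = clientData;
    demo_proc_count++;
    return 1;
}

/* Offer a click to each callback in turn; stop at the first one that
   reports it handled the click. Returns 1 if anyone handled it. */
int DemoAppDispatchClick(int x, int y) {
    for (size_t i = 0; i < demo_proc_count; i++) {
        if (demo_procs[i].proc(x, y, demo_procs[i].data)) return 1;
    }
    return 0;
}

/* Example callback: counts clicks that land in the top-left 100x100 area. */
int demo_count_corner_clicks(int x, int y, void *clientData) {
    int *counter = clientData;
    if (x < 100 && y < 100) {
        (*counter)++;
        return 1;
    }
    return 0;
}
```

A caller would register demo_count_corner_clicks with a pointer to its own counter and then let the host deliver clicks. The point of the pattern is that the plug-in never polls for events; it hands the host a function and waits to be called.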
Figure 1. Layers in the Acrobat API.
The Portable Document Layer implements the PDModel, which is an assortment of
objects allowing access to individual components of the PDF file, such as annotations,
bookmarks, links, or text selections. The objects in this model are generally opaque,
which is to say they are neither pointers to data structures nor pointers to pointers.
Access to the contents of objects that live at this level occurs via special accessor
functions. For example, to get an annotation you would use PDPageGetAnnot(); then you
might inspect its flags with PDAnnotGetFlags() and change the flags with
PDAnnotSetFlags().
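The opaque-object idiom is easier to see in miniature. The following is a rough, self-contained model in plain C, not SDK code: DemoAnnot, DemoPageAddAnnot, DemoPageGetAnnot, DemoAnnotGetFlags, DemoAnnotSetFlags, and the two flag constants are all invented for this sketch, and the storage scheme (a handle that is just an index into a private table) is only one assumed way such a layer could be built.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opaque handle: callers hold an id, never a pointer to data. */
typedef struct { int32_t id; } DemoAnnot;

/* Illustrative flag bits (assumed values for this sketch). */
enum { DEMO_FLAG_HIDDEN = 0x02, DEMO_FLAG_PRINT = 0x04 };

#define DEMO_MAX_ANNOTS 8

/* Private storage, reachable only through the accessor functions below. */
static uint32_t demo_flags[DEMO_MAX_ANNOTS];
static int32_t  demo_count = 0;

/* Create an annotation on the (single, implicit) demo page. */
DemoAnnot DemoPageAddAnnot(uint32_t initialFlags) {
    assert(demo_count < DEMO_MAX_ANNOTS);
    demo_flags[demo_count] = initialFlags;
    DemoAnnot a = { demo_count++ };
    return a;
}

/* Fetch a handle to the annotation at the given index. */
DemoAnnot DemoPageGetAnnot(int32_t index) {
    assert(index >= 0 && index < demo_count);
    DemoAnnot a = { index };
    return a;
}

/* Accessors: the only sanctioned way to read or change the flags. */
uint32_t DemoAnnotGetFlags(DemoAnnot a)             { return demo_flags[a.id]; }
void     DemoAnnotSetFlags(DemoAnnot a, uint32_t f) { demo_flags[a.id] = f; }

/* Typical read-modify-write: get the object, read its flags, set one bit. */
uint32_t demo_hide_first_annot(void) {
    DemoAnnot a = DemoPageGetAnnot(0);
    uint32_t flags = DemoAnnotGetFlags(a);
    DemoAnnotSetFlags(a, flags | DEMO_FLAG_HIDDEN);
    return DemoAnnotGetFlags(a);
}
```

Because callers never touch the table directly, the layer is free to change how annotations are stored without breaking any plug-in, which is presumably a large part of why the PD objects are opaque.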
At the very lowest level, the Cos Layer allows direct access to stream contents and page
resources. (Cos stands for "Carousel Object System," Carousel being Acrobat's original code name.) Cos objects come in
seven main varieties, mirroring the data types upon which PDF is built: booleans,
numbers, names, dictionaries, streams, strings, and arrays. Cos methods go by names
like CosArrayGet() and CosArrayInsert(). They allow "subatomic particle" access to
the PDF file. Of course, as with other low-level programming tools, operating at the
Cos level brings risk as well as rewards. As Adobe says in the Core API Overview:
"Unlike the AcroView and PDModel layers, it is very easy to produce an invalid file
using Cos methods. For this reason, they should not be used unless necessary, for
example to add private data to portions of a PDF file that cannot be accessed in other
ways." In other words, don't go here unless you know darn well what you are doing. And
even then, look out for land mines.
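For a feel of what "subatomic" access means, here is a toy model of Cos-style objects in plain C. It is a sketch under stated assumptions, not the SDK: the DemoCos names are invented, arrays have a fixed capacity, and dictionaries and streams are listed as types but carry no payload here. The get and insert functions only imitate the general shape of calls like CosArrayGet() and CosArrayInsert(); their real signatures are in the Acrobat Core API Reference.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The seven object varieties named in the text. */
typedef enum {
    DEMO_COS_BOOLEAN, DEMO_COS_NUMBER, DEMO_COS_NAME, DEMO_COS_DICT,
    DEMO_COS_STREAM, DEMO_COS_STRING, DEMO_COS_ARRAY
} DemoCosType;

#define DEMO_ARRAY_CAP 16   /* assumed fixed capacity, to avoid malloc */

typedef struct DemoCosObj DemoCosObj;
struct DemoCosObj {
    DemoCosType type;
    union {
        int         boolean;
        double      number;
        const char *text;   /* used for both names and strings */
        struct { DemoCosObj *items[DEMO_ARRAY_CAP]; size_t len; } array;
    } u;                    /* dict and stream payloads omitted in this toy */
};

DemoCosObj demo_cos_new_number(double v) {
    DemoCosObj o;
    memset(&o, 0, sizeof o);
    o.type = DEMO_COS_NUMBER;
    o.u.number = v;
    return o;
}

DemoCosObj demo_cos_new_array(void) {
    DemoCosObj o;
    memset(&o, 0, sizeof o);
    o.type = DEMO_COS_ARRAY;
    return o;
}

/* Insert item at pos, shifting later elements right.
   Returns 0 on success, -1 if arr is not an array, pos is out of
   range, or the array is full. */
int demo_cos_array_insert(DemoCosObj *arr, size_t pos, DemoCosObj *item) {
    if (arr->type != DEMO_COS_ARRAY) return -1;
    if (pos > arr->u.array.len || arr->u.array.len == DEMO_ARRAY_CAP) return -1;
    memmove(&arr->u.array.items[pos + 1], &arr->u.array.items[pos],
            (arr->u.array.len - pos) * sizeof arr->u.array.items[0]);
    arr->u.array.items[pos] = item;
    arr->u.array.len++;
    return 0;
}

/* Return the element at pos, or NULL if arr is not an array or pos is bad. */
DemoCosObj *demo_cos_array_get(const DemoCosObj *arr, size_t pos) {
    if (arr->type != DEMO_COS_ARRAY || pos >= arr->u.array.len) return NULL;
    return arr->u.array.items[pos];
}

/* Build a page-box-like array [0 0 612 792] and compute its width. */
double demo_cos_media_box_width(void) {
    static DemoCosObj nums[4];
    const double values[4] = { 0, 0, 612, 792 };
    DemoCosObj box = demo_cos_new_array();
    for (size_t i = 0; i < 4; i++) {
        nums[i] = demo_cos_new_number(values[i]);
        demo_cos_array_insert(&box, i, &nums[i]);
    }
    return demo_cos_array_get(&box, 2)->u.number
         - demo_cos_array_get(&box, 0)->u.number;
}
```

Notice that nothing in this model stops you from inserting a string where a number belongs. That is the hazard Adobe's warning describes: at this level the structure is valid as far as the object system is concerned, even when the resulting file is not.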
Cutting across all three major layers of API support is one additional layer, known as
the Acrobat Support Layer. You can think of this as a "utility" layer where you can find
platform-independent functions for doing file I/O, memory allocation, etc.
PDFEdit Layer
One "mini-layer" we didn't talk about (which is shown in Fig. 1 as a small layer
residing midway between the PD Layer and the Cos Object level) is the PDFEdit Layer.
In a nutshell, this layer was created in order to provide easier (and safer) access to
page contents than would otherwise be obtainable. At the PD Layer level, you have
access to page content objects, but the objects are opaque and not always easy to modify
using the available accessor functions. At the Cos level, on the other hand, you're
dealing with raw streams consisting of arbitrary mixtures of page operators,
numbers, and text. To manipulate those items, you have to "parse them out" yourself,
which is not always easy. The PD Layer offers some parsing and enumeration methods,
but in general the objects returned are not easily modified. The problem is even worse
than that, however. Resource handling is difficult because stream contents and
resources are treated as disjoint entities. (A given piece of text is not easily associated