Segment DAs
Volume Number: 5
Issue Number: 5
Column Tag: C Workshop
Segmenting DA's
By Tom Saxton, Bellevue, WA
Note: Source code files accompanying article are located on MacTech CD-ROM or source code disks.
Segmented DAs: Breaking The 32K Limit
In a variety of contexts on the Macintosh we find ourselves running up against
“the 32K limit.” One area that has been particularly limiting and persistent is the
32K limit on resources containing code. The reason for this has to do with the way the
MC68000 addresses memory; specifically one can address code within ±32K (and the
minus is generally ignored) of some base address very efficiently, so this mode is used
for jumping to subroutines and for building jump tables. In particular, Lightspeed C
limits the code-containing resources it builds to 32K. This isn’t a big problem with
applications, since one can break up the code in an application into multiple segments.
This segmentation is actually nice in itself, since it allows programs that are too big to
fit entirely in memory to run by swapping code resources in and out as needed.
But Desk Accessories and stand-alone code resources (INITs, FKEYs, etc.) are a
different story. The Mac OS has nothing built in to handle segmentation for anything
but applications, so we’re stuck with a 32K size limit. Fortunately, this can all be
fixed up with a little inline assembler. I have been working with these sorts of things
on several different projects, and have developed a fairly painless way of segmenting
either DAs or code resources. This is all done with stand-alone code resources, which I
will call PROC’s. “Fairly painless” means:
1) One shouldn’t have to write and debug a bunch of assembly code every time a
PROC or routine is added to the project.
2) A PROC should be allowed to have multiple entry points. Further, accessible
routines should be able to have different numbers of arguments. (A routine with a
variable number of arguments is a bit trickier, but not an unreasonable variation on
the code given below.)
3) Lightspeed’s ability to check argument types via function prototypes should
be left intact.
4) Calling a procedure in an external PROC should be completely invisible to the
calling C routine.
5) It should be possible to have multiple entries into a code resource. That is, it
should be possible to jump in and out of a given PROC, even to the point of allowing
recursion between two PROCs. In short, there should be no more limitations in calling
routines than exist in a normal segmented application.
6) There should be a mechanism for loading and unloading PROCs the same way
application code resources are handled, or at least a reasonable approximation
thereto.
The biggest drawback of doing your own segmentation is that you have to handle
part of the link yourself. That is, when you make a change to one of the PROCs in a
project, you have to make it accessible to the “parent” code either by copying the
PROC into the parent’s resource file, or by having the parent explicitly open resource
file(s) containing the PROC(s). At the moment, I am working on a ShareWare DA that
will make this process considerably less painful.
The LSC 2.01 documentation supplement shows how to write “glue” to handle a
single-entry PROC. Their code breaks down if you want to call different
routines within the PROC, or if the PROC can be called from two
different places in the same call chain; more on this below. Working with all of this, I
found out several interesting side effects of some ToolBox calls. For instance,
PrJobDialog() and PrStlDialog() both go to the trouble of redrawing any dialog
windows in the window list that need updating. This can really hose things if you have
user item-drawing routines in a PROC that isn’t set up to handle being called again
before it returns from the current call.
The Big Issues
There are three basic problems to solve. The first is how to handle multiple
entry points. This is the easiest. The Mac PACK resources do this by tacking on a
routine selector to the arguments for a routine within the PACK. Upon entry into the
PACK, a bit of glue pulls the selector off of the stack and jumps to the right place. A
variation on this theme will serve our needs.
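As a rough illustration of the selector idea, here is a portable C model of the dispatch step. It is only a sketch: the real glue is 68000 assembler and copes with routines that take different numbers of arguments, whereas this model gives every routine the same signature. The names DoubleIt, NegateIt, kSelDouble, kSelNegate and DispatchBySelector are made up for the example.

```c
#include <stddef.h>

/* Hypothetical selectors: they start at zero and follow the table order. */
enum { kSelDouble = 0, kSelNegate, kSelCount };

typedef long (*Routine)(long);

static long DoubleIt(long x) { return 2 * x; }
static long NegateIt(long x) { return -x; }

/* Stands in for the jump table at the head of the PROC. */
static const Routine gJumpTable[kSelCount] = { DoubleIt, NegateIt };

/* Models the entry glue: take the selector, index the table, call.
   Returns 0 and leaves *result alone if the selector is out of range. */
int DispatchBySelector(int selector, long arg, long *result)
{
    if (selector < 0 || selector >= kSelCount || result == NULL)
        return 0;
    *result = gJumpTable[selector](arg);
    return 1;
}
```

The range check is a nicety of the model; assembler glue that jumps through a table would typically trust the selector, which is why a mismatched selector is so destructive in practice.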
Next, LSC (and other environments such as Aztec C) allow non-application code
resources to have globals by addressing off of register A4 instead of A5 (which is
reserved for applications and QuickDraw). This means that upon entry into a PROC (or
a DA) the value of A4 has to be set to the “correct” value to address the PROC’s globals.
This is relatively easy; the tricky bit is to restore the old value when you exit so that
whoever called you can still find their globals. The obvious solution (which LSC
describes) is to store the original value of A4 in a fixed location (actually, they use a
pc-relative location, which can run afoul of the instruction cache in a 68020, but
that’s another story). This fails if the PROC gets re-entered from somewhere having a
different value of A4, since we can only store one.
The third problem has to do with handling the first two. The simple schemes for
handling the first two problems involve three pieces of glue. The first handles putting
the routine selector onto the stack and jumping to the (locked!) PROC. The next chunk
sits at the head of the PROC; it sets up A4, then uses the routine selector to jump to the
desired routine. Finally, when the called routine is done, it can’t just jump back to the
caller; we must go through a last bit of glue to restore the caller’s A4. Since we
would like to avoid tacking on this glue at every exit point of every externally
accessible routine, we want to doctor the return address to trick the called routine into
returning to a third bit of glue which handles the clean-up. This means we have to pull
the real return address off of the stack and store it somewhere. Storing it in a fixed
place leads to the same problems we had with A4 above.
The Solution
To avoid storing the two values which need to be saved and recovered in fixed
locations, we will store them on the stack. We can’t store them below the arguments to
the called routine, since then the routine will think it is getting two extra (long word)
arguments, and this is supposed to be invisible to the C code. So instead, the first bit of
glue shuffles the routine’s arguments down on the stack to make room for the 8 bytes
of storage we need. We then replace the old return address with the address of the third
bit of glue (which does the clean-up). We then jump to the second piece of glue which
lives at the entry point to the PROC. It sets up A4 then pulls the selector off of the
stack and jumps to the target routine. When the target routine “exits” the clean-up
code retrieves the saved values, restores A4 then jumps back to the original caller.
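To see why the stack is the right place for the saved values, here is a small portable C model, not the actual glue. A global variable stands in for register A4, and the numbers 100, 200 and 300 are arbitrary stand-ins for three different A4 "worlds". All names (gA4, EnterFixed, EnterStacked and so on) are invented for the sketch.

```c
#include <stddef.h>

static long gA4;        /* stands in for register A4 */
static long gFixedSlot; /* the single fixed save location */

typedef void (*Body)(void);

/* Entry/exit glue that saves the caller's A4 in one fixed location. */
static void EnterFixed(long procA4, Body body)
{
    gFixedSlot = gA4;   /* clobbers any value saved by an outer entry */
    gA4 = procA4;
    if (body != NULL) body();
    gA4 = gFixedSlot;
}

/* Entry/exit glue that saves the caller's A4 in its own stack frame. */
static void EnterStacked(long procA4, Body body)
{
    long saved = gA4;   /* one copy per entry, so nesting is safe */
    gA4 = procA4;
    if (body != NULL) body();
    gA4 = saved;
}

/* Code inside the PROC that lets another A4 world (300) call back in. */
static void ReenterFixed(void)
{
    long mine = gA4;
    gA4 = 300;
    EnterFixed(200, NULL);
    gA4 = mine;
}

static void ReenterStacked(void)
{
    long mine = gA4;
    gA4 = 300;
    EnterStacked(200, NULL);
    gA4 = mine;
}

/* The original caller has A4 == 100; report what it gets back. */
long A4AfterFixed(void)   { gA4 = 100; EnterFixed(200, ReenterFixed);     return gA4; }
long A4AfterStacked(void) { gA4 = 100; EnterStacked(200, ReenterStacked); return gA4; }
```

In this model A4AfterFixed() hands the original caller 300 instead of its own 100, because the inner entry overwrote the single slot, while A4AfterStacked() returns 100.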
The details of implementing these ideas mainly have to do with doctoring the stack
so that both the original caller and the target routine act exactly as if the usual JSR and
RTS instructions were used to call and return from the target routine. (Remember this
is all supposed to be invisible to the C code.) Care has to be taken to not attempt to
address globals while A4 is holding the “wrong” value. We also cannot molest D0 on
the way back since that is where any return value is stored.
There is a pair of routines which load and unload a PROC in the spirit of
LoadSeg and UnloadSeg. To count as loaded, the PROC resource must be in
memory, locked and marked unpurgeable. An unloaded PROC is unlocked and purgeable (we don’t
actually call ReleaseResource, in case we want it again). If these PROC’s are to be “owned”
by a desk accessory, then we have to calculate the resource’s actual ID from the
driver’s ID number and the sub-ID of the PROC. This is all handled by FLoadProc() and
UnloadProc().
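The ID calculation can be sketched in portable C. This is an illustration based on the owned-resource numbering convention in Inside Macintosh (bits 15 and 14 set, owner type in bits 11 to 13 with DRVR as type 0, owner ID in bits 5 to 10, sub-ID in bits 0 to 4); it is not the code from FLoadProc(), and the name OwnedResourceID is made up here.

```c
/* Owner-type codes occupy bits 11-13; DRVR is type 0. */
enum { kOwnerDRVR = 0 };

/* Returns the owned-resource ID, or 0 if an argument is out of range.
   (0 can never be a valid owned ID, since those are all negative.) */
int OwnedResourceID(int ownerType, int ownerID, int subID)
{
    int raw;

    if (ownerType < 0 || ownerType > 7 ||
        ownerID   < 0 || ownerID   > 63 ||
        subID     < 0 || subID     > 31)
        return 0;

    raw = 0xC000 | (ownerType << 11) | (ownerID << 5) | subID;
    return raw - 0x10000;   /* read the 16-bit pattern as a signed value */
}
```

For a DA installed as DRVR 12, sub-ID 0 gives -16000, which is the ID used for the example PROC in the build instructions.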
The following example is a stupid little DA which calls three routines in an
external PROC with differing numbers of arguments and return values. Before doing
so, it calls FLoadProc() which loads the PROC and stores a pointer to the PROC in a
location passed to the routine. FLoadProc() calls MoveHHi() on the resource handle to
reduce the possibility of causing fragmentation problems. When the DA is closed
(which is the only thing that can be done to or with it), it calls UnloadProc() to unlock
and free up the space used by the PROC resource. While debugging, it would be wise to
set the pointer to the PROC to -1 when it is unloaded, so that if someone tries to use the
unloaded PROC, an address error is immediately generated. This is really only an issue
if PROCs are to be loaded and unloaded while the program is running to conserve
memory (not an unlikely occurrence, but remember that under LSC, globals are
effectively re-initialized when a PROC is released from memory and read back in from
disk).
The second source is for the DA. It contains a bare bones DA with routines needed
to load and call three routines in a PROC. There are also defines for the procedure
selector numbers. These must begin at zero and occur in the same order as the jump
table in the PROC. The glue to call a routine in a PROC consists of a “call” to an
assembly language macro that does all of the work. Notice that the macro wants the
number of WORDs passed as arguments to the routine, not bytes. Note also that this
does not count the return address which also gets passed to the routine; the macro takes
that into account.
The first source file is for the PROC resource. It contains the entry glue for the
PROC. Every routine that is to be externally callable has to have an entry in the
JumpTable; see the note about routine selectors above.
To build the PROC, create a new LSC project, add UselessProc.c, set its project
type to “Code Resource,” its resource type to ‘PROC’ and its ID to -16000 (sub ID
zero owned by DRVR ID 12). Name the resulting resource file
UselessDAProj.rsrc so that it will be copied automatically into the DA file. In general,
the PROC resource would have to be copied into the DA’s resource file by hand. Note that you
don’t have to include MacTraps with this project (it does not call any routines defined
there).
After building the PROC, and giving its output file the specified name, create a
new project called “UselessDAProj”, add the DA source file and MacTraps to the
project, and build the DA. Now run the project (or open the DA file with Suitcase), and
open the DA. The DA will then put up a window (the first PROC call) and write something
funny in the window (the second PROC call). It will then wait until you close it, at
which time it will get rid of its window (the third call to the PROC) and go away. Note
that it does not handle even so much as activate or update events. Like I said, useless.
For the Non-Hackers
If assembly language is not your second language already, you’re likely
thoroughly lost. This should not prevent you from using the trickery here. Below is all
you need to know to use the segmenting code.
In the project that calls an external PROC, create a source module which contains
the macro definition and a routine for each external procedure you wish to call.
Pattern them after the routines OpenWindow(), DrawWindow() and KillWindow() in
the DA source file. Each just calls the horrid assembler macro. Be sure to count the
words of arguments correctly, or things will die horribly. If you’re not sure how to do
this, consult your local guru. Next set up a main() in the PROC project patterned after
the main() in UselessProc.c. Make one entry in the “JumpTable” for each routine you
want to access. Then build a set of defines for the routine selectors (in the same order
as they appear in the jump table).
Potential Pitfalls
The easiest thing to mess up is counting the words in the argument list.
Remember that Pointers, Handles, Points, pointers to structs and arrays, and longs all
count as 2 words. Ints, shorts, Booleans and chars all count as 1 word (a char is only half a
word, but gets passed as a word on the stack; this can be tricky and is probably
compiler-dependent). Counting words in a struct or union can be tricky since bytes
can get packed together; use sizeof() when in doubt. Don’t use sizeof(char), for the
reason above. The next easiest mistake is mismatching the order of the routine selectors; check that one
twice, too.
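The word count can be checked mechanically. The following portable C sketch rounds each argument up to whole words, which is one way to respect the point that a char still occupies a full word on the stack. The types ModelPtr and ModelPoint are stand-ins with 68000 sizes so the arithmetic comes out the same on any host, and DrawLabel is a hypothetical routine, not one from the example DA.

```c
#include <stddef.h>
#include <stdint.h>

/* Round a byte count up to whole 16-bit words; a lone char takes 1 word. */
size_t WordsForBytes(size_t bytes)
{
    return (bytes + 1) / 2;
}

/* Stand-ins with 68000 sizes. */
typedef int32_t ModelPtr;                    /* pointer or Handle: 4 bytes */
typedef struct { int16_t v, h; } ModelPoint; /* Point: 4 bytes */

/* Hypothetical routine:
   DrawLabel(WindowPtr w, Point where, short size, char c) */
size_t DrawLabelArgWords(void)
{
    return WordsForBytes(sizeof(ModelPtr))     /* 2 words */
         + WordsForBytes(sizeof(ModelPoint))   /* 2 words */
         + WordsForBytes(sizeof(int16_t))      /* 1 word  */
         + WordsForBytes(sizeof(char));        /* 1 word, not half */
}
```

Summing per argument matters: adding the raw byte sizes first and dividing by two at the end would undercount whenever a char sits in the list.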
Finally, the assembler macro assumes that the glue routine begins with a “LINK”
instruction. This is true for any routine that has arguments or local variables. If you
want to call an argument-less routine in a PROC, give its glue routine a dummy local
variable to force a LINK instruction (LSC talks about LINKs on page 9-2 of the original
manual). Alternatively, one could create another version of the macro with the UNLK
instruction removed for functions with no arguments, but this would leave two chunks of
assembler code to maintain.