68020 Programming
Volume Number: 2
Issue Number: 11
Column Tag: Technical Notes
68020 Programming Considerations 
By Our Readers
Dave McClain
Senior Engineer
The WSM Group
The WSM Group provides a Hyper-C and asm development system for the
Macintosh with the unique feature that the complete source code to the system is also
available at a very reasonable price. Contact them at (602) 298-7910. This tech note
is valuable as we anticipate the next Mac family of 68020-based systems.
As one of a fortunate class of Macintosh programmers, I have had the enjoyable
opportunity to run a 68020 MPU chip in my Macintosh. Initial investigations show,
however, that a number of incompatibilities exist between the 68000 and the 68020
in spite of the claim that the 68020 is "upward compatible" with its ancestors. This
paper addresses a few of these incompatibilities and offers some suggestions for the
future, as well as some patches for the present...
Welcome to the Future
The exception processing of the 68020 is considerably more complex than that
of the 68000. The stack frame produced by executing a TRAP or receiving an
interrupt includes a format word beneath the saved PC, that is, at the next higher
address (and possibly a good deal more information as well...). The RTE instruction
of the 68020 expects to see this format word.
Many of us have written code which saves the current status register contents on
the stack on entry to a called routine, and as a shortcut, we execute an RTE which
restores the old status register and performs an effective RTS all at once...well, it used
to... Now, with the requirement for the format word beneath the saved PC, this trick no
longer works - the stack can get out of sync and a format exception can be generated. So
unless you are responding to an actual exception condition, don't play this RTE trick
anymore!
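To make the difference concrete, here is a minimal C sketch of the two short frame layouts and of how the format word is split into its fields. It assumes the four-word format $0 frame that Motorola documents for the 68020 (SR, PC, then one word holding the frame format in its top four bits and the vector offset in its low twelve bits). The constant and function names are ours, chosen for illustration only; they do not come from any Apple or Motorola header.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes in bytes of the shortest exception frames. */
enum {
    FRAME_68000_BYTES      = 6, /* SR (2) + PC (4)                    */
    FRAME_68020_FMT0_BYTES = 8  /* SR (2) + PC (4) + format word (2)  */
};

/* Frame format number: the top four bits of the format word. */
static unsigned frame_format(uint16_t format_word)
{
    return (format_word >> 12) & 0xFu;
}

/* Vector offset: the low twelve bits of the format word. */
static unsigned frame_vector_offset(uint16_t format_word)
{
    return format_word & 0x0FFFu;
}

/* TRAP #n uses vector number 32 + n; each vector is four bytes wide. */
static unsigned trap_vector_offset(unsigned n)
{
    assert(n <= 15);
    return (32u + n) * 4u;
}
```

A routine that pushes only a status register and then executes RTE has built the six-byte layout, so the 68020 reads whatever word happens to lie above the return address as the format word.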
Along the same line, many programmers attempt to augment the instruction set
of the 68000 by making use of the TRAP instruction. On receipt of the exception, they
simply add 2 to the stack pointer to remove the saved status register, then proceed as
though the routine were simply called by a JSR. This does not work any longer because
the TRAP leaves a word of frame format information beneath the saved PC. MacWrite
4.5 is a particular offender here, as is MacFORTH. There may be others as well - we
have all used this technique at one time or another - in the future, be sure to take
account of the format of the frame, or don't use this technique.
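The bookkeeping behind this failure can be sketched in a few lines of C. The sketch assumes the simplest case, in which the handler discards the saved status register by adding 2 to the stack pointer and later returns with an RTS, which pops only the four-byte PC. The names are illustrative only.

```c
#include <assert.h>

enum { SR_BYTES = 2, PC_BYTES = 4 };

/*
 * Bytes of the exception frame still on the stack after the handler
 * has skipped the saved SR and returned with RTS.
 * frame_bytes is the total size of the frame the processor pushed.
 */
static int bytes_stranded_after_rts(int frame_bytes)
{
    assert(frame_bytes >= SR_BYTES + PC_BYTES);
    return frame_bytes - SR_BYTES - PC_BYTES;
}
```

For the six-byte 68000 frame the result is 0, so the trick is clean. For the eight-byte 68020 frame the result is 2: the routine returns to the right place, but every such TRAP leaves the format word behind and the caller's stack is off by one word from then on.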
Several popular programs (including the Mac ROMs) make explicit or implicit
assumptions about the speed of execution of various code patterns, either for time delay
loops or for SCC accesses, which have internal setup delay requirements. The 68020
has an internal cache of 128 words and a pipelined architecture. Both of these, coupled
with the varying overhead of memory accesses at different byte alignment boundaries,
make the timing of 68020 instructions impossible to predict. The 68020 is also much
faster than the older 68000 at the same clock frequency.
This means that you should use neither instruction sequences nor loops to
produce timing delays, especially if the delay has a critical lower bound as in SCC
accessing! Even if the internal cache is disabled, you still have a pipelined architecture
which overlaps instruction execution, thereby increasing the speed of execution. The
68020 also completes each bus cycle in 3 clocks, instead of the 4 clocks which were
characteristic of the 68000. MacinTalk and many sound generation programs behave
poorly here.
In general, you should not assume that the exception vectors for the system are
located in the page beginning at absolute address zero. They always were for a 68000,
but the 68020 allows them to be located anywhere since it maintains a vector base
register internally. If you need to intercept an exception, you should first locate the
vector page by reading the VBR with a MOVEC instruction. (But don't do this until you
conform to the aforementioned exception protocol requirements.) Read below about
current patches.
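The address arithmetic involved is simple, and a minimal C sketch may help. It assumes the VBR value has already been fetched (on the real chip, with a privileged MOVEC from supervisor mode) and only models the lookup; the function name is ours.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Address of the four-byte vector table entry for a given vector
 * number, relative to the vector base register. On a 68000 the base
 * is always zero; on a 68020 it is whatever the VBR holds.
 */
static uint32_t vector_address(uint32_t vbr, unsigned vector_number)
{
    assert(vector_number < 256);
    return vbr + 4u * vector_number;
}
```

With a VBR of zero, TRAP #0 (vector 32) is found at $80 and the 1010 emulator vector (vector 10) at $28, which are the familiar 68000 addresses. With any other VBR the same entries move by that amount, and code that pokes $80 or $28 directly is patching a table the processor no longer consults.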
Surviving with the Past
Because of their daily importance to many of us, the older programs such as
MacWrite 4.5 must be handled with care in a 68020 environment. MacPaint and
MacDraw appear to be OK. Here is the solution which we found to work for this one
example program. Other programs such as MacFORTH may have different
requirements.
MacWrite 4.5 uses TRAP instructions to enhance its instruction set. In
particular it uses TRAPs 0..4/6..9. In all except the case of TRAP 9, it assumes that it
can simply remove the saved status register by incrementing the stack pointer by 2,
then continuing as if called by a JSR. In the case of TRAP 9 it behaves properly by
executing an RTE at the end of a very short instruction sequence. (The rationale for
this one escapes us. Apparently they need a special instruction to increment a single
register and force an alignment of some sort.)
The 68020 Solution
Fortunately, the 68020 allows us to intercept these poorly handled TRAP calls
by allowing us to generate another vector page. MacWrite alters the TRAP vectors in
the original zero-based vector page. By creating a second vector page, and changing the
internal VBR of the 68020, we get a first shot at all exceptions - including TRAPs. For
the sake of MacWrite, our interception code for TRAPs 0..8 should adjust the stack
frame produced by the exception so that it looks just like the ones generated by the
68000. Once this is done, we can permit MacWrite to continue by vectoring through its
vectors in the page 0 table. (Fortunately, MacWrite does not alter the vector handling
on the fly from one which ignores the exception to one which returns with an RTE, or
vice versa).
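The frame adjustment itself can be modelled in C on a simulated stack. This is a sketch under stated assumptions, not our actual intercept code: it handles only the four-word format $0 frame, treats memory as a byte array, and uses a function name invented for this illustration. On the real machine the same move would be done with a couple of MOVE instructions before jumping through the page 0 vector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SHORT_FRAME_BYTES = 6, FORMAT_WORD_BYTES = 2 };

/*
 * stack: simulated memory; sp: offset of the top of stack, where the
 * saved SR lies. The frame is SR (2 bytes), PC (4 bytes), format word
 * (2 bytes), in ascending address order.
 * Slides SR and PC up over the format word and returns the new sp,
 * leaving a frame that looks like one pushed by a 68000.
 */
static size_t strip_format_word(uint8_t *stack, size_t sp)
{
    /* The frame format is the top four bits of the format word. */
    unsigned format = stack[sp + SHORT_FRAME_BYTES] >> 4;
    assert(format == 0); /* only the four-word frame is handled here */

    /* Source and destination overlap, so memmove rather than memcpy. */
    memmove(stack + sp + FORMAT_WORD_BYTES, stack + sp, SHORT_FRAME_BYTES);
    return sp + FORMAT_WORD_BYTES;
}
```

After this adjustment a handler that adds 2 to the stack pointer and returns with RTS leaves the stack exactly as it would on a 68000. A handler that returns with RTE, as the TRAP 9 code does, must be given the unmodified frame instead.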
All other exception vectors in this new vector page should point to intercept code
which in turn vectors through the vectors in the page 0 table. This allows any
alterations to the I/O interrupt vectors to be accommodated without requiring the
programs to know about our new table.
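The reason for forwarding through the page 0 table at the moment of the exception, rather than copying its entries once, can be shown with a small C model. Function pointers stand in for exception vectors, and the two sample handlers exist only so the sketch has something to call; all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { VECTOR_COUNT = 256 };

typedef void (*handler_t)(void);

/* Stands in for the original vector table at absolute address zero. */
static handler_t page0[VECTOR_COUNT];

/* Records which sample handler ran last (demo only). */
static int last_handler;
static void handler_a(void) { last_handler = 'A'; }
static void handler_b(void) { last_handler = 'B'; }

/*
 * What each entry of the new vector page does: look up the current
 * page 0 entry at the time of the exception and pass control to it.
 */
static void forward_exception(unsigned vector_number)
{
    assert(vector_number < VECTOR_COUNT);
    handler_t h = page0[vector_number];
    if (h != NULL) {
        h();
    }
}
```

If a program stores handler_a in a page 0 slot and later replaces it with handler_b, forward_exception reaches whichever one is installed at that moment. That late lookup is what lets old programs keep patching page 0 without knowing the second table exists.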
The one vector which should be handled directly is the 1010 exception vector
used to access all ROM routines. Direct handling of this one seems safe since it never
(?) changes and any additional vectoring indirection would cause undesirable runtime
overhead.
This solution works well for us. It is a tribute to the architecture of this superb
chip that a solution is even possible. As far as we can tell, all other instructions are
upward compatible with the older 68000.
We do not yet have a solution for programs which generate sound such as
MacinTalk and music synthesis programs. The voice and tones are very garbled and
gravelly. We have found, however, that they sound better (but not correct) when the
68020 internal cache is disabled.
Congratulations Apple
Apple is to be commended for their efforts to look to the future. The 128K ROM
code appears to be safe in all areas except for mouse SCC interrupt handling. It is
remarkable that Apple did not shortcut the ROM code with respect to other exception
types, especially 1010, in light of the need to maximize performance. We look
forward to a recoding of the ROM to take into account the bitfield instructions of the
68020, effectively the inner guts of Bill Atkinson's Blitter in the chip. Graphics
should really scream then.
In the case of the SCC handling, the Apple ROM code made some implicit
assumptions about the speed of execution of their ROM interrupt handler for the SCC.
As it happens, this code executes on the hairy edge of being too fast on some
Macintoshes, while on others, it is definitely too fast!
Our solution was to install a new interrupt handler during boot time which
assures that SCC accesses cannot happen closer in time than 2.2 usec. The way we did it
was to force a RAM data memory access with a "MOVE.L (SP),(SP)" instruction placed
at strategic spots in the interrupt handler. The internal cache of the 68020 caches
only instructions, not data, so this instruction always goes out to RAM. The RAM timing
of the Macintosh assures that its one read cycle and one write cycle together take at
least about 2.2 usec.
The MC68020 with its 68881 math coprocessor is a welcome addition to the
Macintosh. Initial benchmarks show that the internal instruction cache can make as
much as a 2:1 speed difference when switched on vs. when switched off. Even with it off,
the 68020 is definitely faster than the 68000.
We have run Smalltalk under the 68020 with quite favorable results, and it's
quite pleasant to use now. We have yet to take full advantage of the 68881 math
coprocessor. We expect a result several thousand-fold faster than SANE in software...
P.S. You should see Life run with this chip! Our thanks to Jeff Brooks of Spectra Corp.