Precise timing
Volume Number: 6
Issue Number: 6
Column Tag: Assembly Lab
Related Info: Time Manager
Mac II Timing
By Olivier Maquelin, Stephan Murer, Zurich, Switzerland
Note: Source code files accompanying article are located on MacTech CD-ROM or
source code disks.
Precise timing on the Macintosh II
[Olivier Maquelin and Stephan Murer are both researchers and teaching
assistants at the Swiss Federal Institute of Technology in Zurich, Switzerland.
Currently they are involved with a dataflow multiprocessor project and are working
towards their Ph.D. theses. The working environment at the institute consists of about
80 networked Mac IIs including five AppleShare file servers, some laser printers, a
scanner, two MicroVAXes and some communication hardware. They program in
Pascal and Modula-2 under MPW and make use of many other Mac applications.]
The Problem
Determining time or measuring the duration of some process from within a
program is a task most programmers have had to face at least once in their careers.
For that reason, most operating systems, including the Mac OS, offer services to
determine the current time and date. Unfortunately, in some cases the resolution or the
accuracy of the system clock is not sufficient for the task at hand. We ran into that
problem recently when we wanted to develop a profiler to test programs written in
P1-Modula-2 under MPW. The Time Manager provides delays with a resolution of only 1 ms,
which is much too coarse to measure the execution time of small procedures.
The timer we want to describe here is accurate to a couple of microseconds, depending
on how it is used.
The Idea
A straightforward way to measure time on a Mac is to use the global variable
Ticks, which is incremented during each Vertical Blanking interrupt, that is every
16.63 ms. A more complicated, but much more precise way to do it is to use one of the
hardware timers, which are decremented every 1.2766 µs. The Mac Plus and SE have
two such timers, which are used by the Sound Manager and the Disk Driver. The Mac II
has four of them, two being used by the Sound Manager and the Disk Driver as in the
older machines, one being used to generate the Vertical Blanking signal, and the last
one being currently unused by the Mac OS.
We could have used the fourth timer, but that would have meant installing an
interrupt routine in the VIA dispatch table and setting up the VIA, and there was the
risk of someone else using that timer. We decided instead to use the already set up
Vertical Blanking timer in conjunction with the global variable Ticks. Because we
don’t need to modify the configuration of that counter, multiple applications can use
our Timer module at the same time without interfering with one another. A minor
complication in doing so is that the timer does not directly generate an interrupt.
Instead, each time it reaches zero, bit 7 of VIA2 buffer B is inverted. This bit is used
as an output and drives the CA1-pin of VIA1, an interrupt being generated at each
transition from 0 to 1. For that reason, the state of VIA2 buffer B also has to be taken
into account.
Determining Time
To determine the current time, four different values must be read: the low and
high bytes of the Vertical Blanking timer, that must be read separately from the VIA
(vT1C and vT1CH), the state of the Vertical Blanking signal (vBufB bit 7) and the
global variable Ticks. The Vertical Blanking timer is set up to count repeatedly
downwards from hex $196E (= 6510) to zero. In fact, due to a peculiarity of the
6522 VIA, zero is first followed by hex $FFFF (= -1), and then only by hex $196E,
adding an extra step to the counting process. Each timing period thus lasts for
6512 cycles, which leads to the following formula to calculate the time in
microseconds since startup:
{1}
viaVal = (vBufB bit 7) * 6512 + vT1CH * 256 + vT1C
time = (2 * 6512 * (Ticks + 1) - viaVal) * 1.2766 µs
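As a plausibility check, the formula can be evaluated over simulated register values. The following Python sketch is not part of the Timer unit; the function names are hypothetical. It assumes that the counter value enters viaVal with a positive sign (the counter runs downwards, so a smaller count must give a later time), that $FFFF is treated as -1, and that bit 7 is high during the first half of each tick and changes only when the counter is reloaded.

```python
CYCLE_US = 1.2766          # duration of one timer cycle in microseconds
HALF_PERIOD = 6512         # cycles per half period ($196E down to 0, then $FFFF)

def signed16(value):
    """Interpret a 16-bit counter value as signed, so $FFFF counts as -1."""
    return value - 0x10000 if value >= 0x8000 else value

def cycles_since_startup(ticks, bit7, hib, lob):
    """Cycle count computed from Ticks, vBufB bit 7 and the two counter bytes."""
    via_val = bit7 * HALF_PERIOD + signed16(hib * 256 + lob)
    return 2 * HALF_PERIOD * (ticks + 1) - via_val

def simulate(n_ticks):
    """Yield (ticks, bit7, hib, lob) for every timer step of n_ticks ticks."""
    for ticks in range(n_ticks):
        for bit7 in (1, 0):                      # assumed: bit 7 is high first
            for count in list(range(0x196E, -1, -1)) + [0xFFFF]:
                yield ticks, bit7, count >> 8, count & 0xFF

values = [cycles_since_startup(*state) for state in simulate(3)]
steps = {b - a for a, b in zip(values, values[1:])}
print(steps)                                     # prints {1}
print(round(2 * HALF_PERIOD * CYCLE_US / 1000, 2), "ms per tick")
```

Under these assumptions the computed time advances by exactly one cycle per timer step, including across the half-period and tick boundaries, and one tick comes out at the 16.63 ms quoted earlier.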
Unfortunately, because all these values are constantly changing, it is not
sufficient to simply read these values and apply the formula. Consider the following
two examples, where the high byte of the counter is read first, then after about two
microseconds the low byte:
counter value (hex): $0228    value read from vT1CH (hex): $02
counter value (hex): $0226    value read from vT1C (hex):  $26

counter value (hex): $0200    value read from vT1CH (hex): $02
counter value (hex): $01FE    value read from vT1C (hex):  $FE
In the first example everything went well. The resulting hexadecimal value is
$0226, which corresponds to the last counter value. In the second example, however,
the resulting hexadecimal value is $02FE, which differs greatly from both $0200
or $01FE. Such errors always occur when the high byte of the counter changes
between the two reads.
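Both cases are easy to reproduce in a small simulation. The Python sketch below is purely illustrative: the function name is hypothetical, and the gap of two timer cycles between the reads is an assumption taken from the example values above.

```python
def naive_read(counter_at_first_read, cycles_between_reads=2):
    """Read the high byte first, then the low byte a few cycles later."""
    hib = (counter_at_first_read >> 8) & 0xFF
    counter_at_second_read = (counter_at_first_read - cycles_between_reads) & 0xFFFF
    lob = counter_at_second_read & 0xFF
    return (hib << 8) | lob

print(hex(naive_read(0x0228)))   # 0x226: both bytes belong together
print(hex(naive_read(0x0200)))   # 0x2fe: high byte changed between the reads
```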
Different solutions to that problem exist. Our solution, shown as Pascal code
below, relies on the fact that the time between two changes of the counter is relatively
long. The values needed for the subsequent computation are read once, and a test checks
whether the high byte of the counter changed during that time. If it did, all the values
are read a second time and should then be valid. The variable hib also has to be read once
more, in case the first read was from the special timer value hex $FFFF. Interrupts
are disabled to make sure that all these operations are done without interruption. The
variable Ticks can be read safely as long as interrupts are disabled, because it is
incremented only by the Vertical Blanking interrupt handler.
{2}
DisableInterrupts;
hib0 := vT1CH;   (* read the high byte a first time *)
buf := vBufB;    (* read all the values needed *)
lob := vT1C;
hib := vT1CH;    (* read the high byte a second time *)
(* if the high byte changed in between... *)
IF hib <> hib0 THEN
  BEGIN
    (* read all the values once more *)
    buf := vBufB; lob := vT1C;
    (* in case first read of hib was $FF *)
    hib := vT1CH;
  END;
(* the Ticks can be read safely here *)
myTicks := Ticks;
EnableInterrupts;
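The read sequence can be exercised against a toy model of the timer. The Python sketch below is a simulation under stated assumptions, not the original code: FakeVIA, read_timer and all_reads_consistent are hypothetical names, every register read is assumed to cost a fixed number of timer cycles, and bit 7 is assumed to toggle when the counter is reloaded after $FFFF. Interrupt masking and Ticks are left out.

```python
RELOAD = 0x196E                      # counter reload value (6510)

class FakeVIA:
    """Toy model of the Vertical Blanking timer; every register read costs
    `step` timer cycles (an assumption made for this simulation)."""

    def __init__(self, counter, bit7, step):
        self.counter, self.bit7, self.step = counter, bit7, step

    def _advance(self):
        for _ in range(self.step):
            if self.counter == 0:
                self.counter = 0xFFFF            # zero is followed by $FFFF
            elif self.counter == 0xFFFF:
                self.counter = RELOAD            # then by the reload value
                self.bit7 ^= 1                   # assumed: bit 7 toggles here
            else:
                self.counter -= 1

    def read(self, register):
        value = {"vT1CH": self.counter >> 8,
                 "vT1C": self.counter & 0xFF,
                 "vBufB": self.bit7}[register]
        snapshot = (self.counter, self.bit7)     # true state at read time
        self._advance()
        return value, snapshot

def read_timer(via):
    """Read sequence of the listing above; returns (hib, lob, bit7, truth)."""
    hib0, _ = via.read("vT1CH")
    buf, _ = via.read("vBufB")
    lob, truth = via.read("vT1C")
    hib, _ = via.read("vT1CH")
    if hib != hib0:                              # high byte changed: retry
        buf, _ = via.read("vBufB")
        lob, truth = via.read("vT1C")
        hib, _ = via.read("vT1CH")
    return hib, lob, buf, truth

def all_reads_consistent(step):
    """Check every start value: result must match the state at the low-byte read."""
    for start in list(range(RELOAD + 1)) + [0xFFFF]:
        hib, lob, buf, (counter, bit7) = read_timer(FakeVIA(start, 1, step))
        if (hib << 8 | lob, buf) != (counter, bit7):
            return False
    return True

print(all(all_reads_consistent(step) for step in (1, 2, 3)))   # True
```

In this model, the value assembled from hib and lob always equals the counter at the moment the low byte was read, and buf matches bit 7 at that moment, for read costs of one to three cycles. Much slower reads would need a different scheme.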
A last problem occurs when the Vertical Blanking signal becomes high after
interrupts have been disabled and before the timer has been read. In that case, the state
of the VIA reflects the beginning of the new timing interval, while the Ticks variable
still contains the old tick value. This can be handled by testing whether the value read from
the VIA is within a small number (e.g. 10) of cycles from the beginning of the
interval, and incrementing the number read from the Ticks variable by one if this is
the case. Values that small cannot be read once the Vertical Blanking interrupt has been
serviced, because the interrupt handler takes longer than that to execute.
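A minimal sketch of this correction in Python follows. The function names and the threshold constant are hypothetical, and the sketch assumes that the tick interval begins when bit 7 goes high and the counter restarts from $196E.

```python
HALF_PERIOD = 6512     # cycles per half period of the Vertical Blanking timer
RELOAD = 0x196E        # value the counter restarts from (6510)
THRESHOLD = 10         # "small number" of cycles; the exact value is an assumption

def cycles_into_interval(bit7, counter):
    """Cycles elapsed since bit 7 last went high (start of the tick interval)."""
    if counter == 0xFFFF:                 # the extra step after zero counts as -1
        counter = -1
    return (1 - bit7) * HALF_PERIOD + (RELOAD - counter)

def corrected_ticks(my_ticks, bit7, counter):
    """Add one to Ticks if the VIA was read just after the interval started."""
    if cycles_into_interval(bit7, counter) < THRESHOLD:
        return my_ticks + 1
    return my_ticks

print(corrected_ticks(1000, 1, 0x196B))   # 3 cycles into the interval: 1001
print(corrected_ticks(1000, 1, 0x1000))   # well into the interval: 1000
print(corrected_ticks(1000, 0, 0x196B))   # second half period: 1000
```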
The Unit Timer
The unit Timer exports procedures to initialize, start and stop software timers
and allows any number of them to be active (i.e. started but not yet stopped) at the
same time. When stopped, they contain the measured time as a 64-bit number of
cycles (32 bits would only allow measurements of up to about 1.5 hours). They can be started and
stopped repeatedly and will then contain the total time they have been running. A
constant to convert the 64-bit format into an extended real value in milliseconds is
provided for convenience.
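The range figures follow directly from the cycle time of 1.2766 µs. A short Python check is shown here; the helper name cycles_to_ms is hypothetical and merely stands in for the conversion constant of the unit.

```python
CYCLE_US = 1.2766                          # microseconds per timer cycle

def cycles_to_ms(cycles):
    """Convert a cycle count to milliseconds."""
    return cycles * CYCLE_US / 1000.0

hours_32 = cycles_to_ms(2**32) / 1000.0 / 3600.0
years_64 = cycles_to_ms(2**64) / 1000.0 / 3600.0 / 24.0 / 365.0
print(f"32 bits overflow after {hours_32:.2f} hours")
print(f"64 bits overflow after {years_64:.0f} years")
```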
Because we want to use these timing routines in a profiler, they should not only
be accurate, they also should not disturb the temporal behavior of the code they are
timing, even if many measurements are being done at the same time. Because the
execution time of a procedure can be very short, this is only possible if the routines
execute very fast (in a few microseconds) or if their execution time is compensated for. In our
case, the execution time of the routines is about 35 µs, so compensation is needed.
For that purpose, a counter tracking the total time spent in the routines StartTimer
and StopTimer is maintained. In addition, the processor cache is disabled during these
routines in order to keep the execution time as constant as possible and to reduce the
influence on other parts of the code.
It is interesting to note that a single number is sufficient to contain the state of a
timer during its whole existence. To implement the compensation, only a single global
counter is needed, which contains a running total of the time spent in the routines to be
compensated for. The algorithm used here is in fact very simple. As can be seen below,
StartTimer subtracts the current time from the timer value and adds the current
compensation value, while StopTimer adds the current time to the timer value and
subtracts the compensation value. Before doing that, both procedures add their
expected execution time to the compensation value. After calls to InitTimer,
StartTimer and StopTimer in sequence, and with the compensation value initially zero,
timer thus contains the value: 0 - Time1 + 35 µs + Time2 - (35 µs + 35 µs) =
Time2 - Time1 - 35 µs, which is the time difference between the two calls minus the
compensation.
{3}
InitTimer (timer):
  timer := 0;
StartTimer (timer):
  totalComp := totalComp + 35µs;
  timer := timer - ActualTime + totalComp;
StopTimer (timer):
  totalComp := totalComp + 35µs;
  timer := timer + ActualTime - totalComp;
Using the unit Timer
Consider the following example that shows the usage of the Timer unit. The main
program contains two FOR-loops that are both executed 100 times. The first loop does
nothing and the second repeatedly calls the empty procedure Dummy. Three timers are
used in this example. Timer t1 measures the execution time of the first loop, timer t2
does the same for the second loop and timer t3 measures the total execution time.
{4}
PROCEDURE Dummy; BEGIN END;
...
(* Initialize the three timers *)
InitTimer (t1); InitTimer (t2); InitTimer (t3);
StartTimer (t3);
StartTimer (t1);
(* First loop *)
FOR i := 1 TO 100 DO END;
StopTimer (t1);
StartTimer (t2);
(* Second loop *)
FOR i := 1 TO 100 DO Dummy END;
StopTimer (t2);
StopTimer (t3);
...
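The nesting of the three timers can be replayed on a simulated clock. The Python sketch below is a model only: run_example is a hypothetical name, the loop durations are invented placeholders rather than measurements, each timer routine is assumed to take exactly 35 µs, and the clock is assumed to be sampled 20 µs into each routine.

```python
OVERHEAD = 35.0    # assumed cost of one StartTimer/StopTimer call (microseconds)
READ_AT = 20.0     # assumed point within a call where the clock is sampled

def run_example(loop1, loop2, compensate):
    """Replay the call sequence of the listing on a simulated clock."""
    state = {"now": 0.0, "comp": 0.0}
    timers = {"t1": 0.0, "t2": 0.0, "t3": 0.0}      # InitTimer for all three

    def call(name, sign):
        """sign = -1 models StartTimer, sign = +1 models StopTimer."""
        if compensate:
            state["comp"] += OVERHEAD
        sample = state["now"] + READ_AT
        state["now"] += OVERHEAD
        timers[name] += sign * (sample - state["comp"])

    call("t3", -1)
    call("t1", -1)
    state["now"] += loop1                            # first loop
    call("t1", +1)
    call("t2", -1)
    state["now"] += loop2                            # second loop
    call("t2", +1)
    call("t3", +1)
    return timers

# Loop durations are invented placeholders, not measurements.
print(run_example(200.0, 900.0, compensate=True))
print(run_example(200.0, 900.0, compensate=False))
```

In this model the compensated run yields t1 = 200, t2 = 900 and t3 = 1100, so t3 equals t1 + t2. The uncompensated run yields 235, 935 and 1275: one routine execution time is added to t1 and to t2, and five to t3.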
The following table shows the resulting timer values with and without
compensation and with the processor cache enabled or disabled. In the compensated case
the value of timer t3 is roughly equal to the sum of t1 and t2, as would be expected
from an ideal timer. In the uncompensated case the execution time of StartTimer and
StopTimer is added once each to the values of t1 and t2 and five times to t3 (about 175 µs).
This example also shows that in this case using the processor cache leads to a speed