IO Routines
Volume Number: 1
Issue Number: 6
Column Tag: FILE SAVING, OPENING
I/O Completion Routines
By Steve Brecher
Application I/O requests are ultimately performed by low-level operating
system trap routines: _Read, _Write, _Control, _Status, _Open, _Close. The program
passes the address of an I/O parameter block to the trap routine. The parameter block
specifies the device driver (e.g., Sony floppy driver, serial driver) and the
particulars of the request such as (for _Read or _Write) the number of bytes and the
address of the program’s buffer.
The operating system passes the parameter block on to the device driver; or, if
the driver is currently busy, the parameter block is put onto the driver’s queue of
pending work.
Mac I/O operations can be either synchronous or asynchronous. A synchronous
operation will be completed before control is returned to the invoking program. An
asynchronous operation will be initiated -- started by the driver, or put into its
queue -- and then control will be returned to the program regardless of whether the
operation is yet complete.
Asynchronous I/O is the heart of multi-tasking; indeed, the only rationale for
multi-tasking is the overlapping of I/O and computation. Programs can implement a
primitive form of multi-tasking within themselves by using a multi-buffer I/O
scheme. For instance, a program can request that a buffer full of data be written, and
then -- before that write request is completed -- immediately start filling another
buffer with data.
Completion routines can be a convenient mechanism for a program to manage
multi-buffered I/O. The completion routine is called asynchronously with respect to
the rest of the program; it can mark the buffer associated with the just-completed I/O
request as free (written) or ready for processing (read). If another buffer is ready to
be read into or written from, the completion routine can initiate the next I/O request.
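The two-buffer scheme can be sketched in C as follows. This is a simulation, not Mac OS code: the names (Buffer, fill_buffer, write_done, complete_pending) are hypothetical, and the "driver" finishes a request only when complete_pending() is called.

```c
#include <string.h>

/* Hypothetical buffer states for a two-buffer output scheme. */
enum BufState { BUF_FREE, BUF_READY, BUF_WRITING };

#define BUF_SIZE 8
#define NUM_BUFS 2

typedef struct {
    enum BufState state;
    char data[BUF_SIZE];
} Buffer;

static Buffer bufs[NUM_BUFS];   /* zero-initialized, so both start BUF_FREE */
static int    pending = -1;     /* buffer held by the simulated driver, or -1 */
static int    writes_done = 0;  /* number of completed writes */

/* Hand a READY buffer to the simulated driver if the driver is idle. */
static void start_write_if_possible(void)
{
    if (pending != -1) return;              /* driver busy */
    for (int i = 0; i < NUM_BUFS; i++) {
        if (bufs[i].state == BUF_READY) {
            bufs[i].state = BUF_WRITING;
            pending = i;
            return;
        }
    }
}

/* Completion routine: mark the finished buffer free, start the next write. */
static void write_done(void)
{
    bufs[pending].state = BUF_FREE;
    pending = -1;
    writes_done++;
    start_write_if_possible();
}

/* Mainline: fill a free buffer. Returns 0 on success, -1 if none is free. */
static int fill_buffer(const char *src)
{
    for (int i = 0; i < NUM_BUFS; i++) {
        if (bufs[i].state == BUF_FREE) {
            strncpy(bufs[i].data, src, BUF_SIZE - 1);
            bufs[i].data[BUF_SIZE - 1] = '\0';
            bufs[i].state = BUF_READY;
            start_write_if_possible();
            return 0;
        }
    }
    return -1;
}

/* Simulate the driver finishing its current request. */
static void complete_pending(void)
{
    if (pending != -1) write_done();
}
```

The mainline only ever calls fill_buffer(); it can fill the second buffer while the first is still being written, and gets -1 back when both are busy.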
Completion routines can also be used to hide low-level I/O details from the
mainline logic of a program. I attempted to use the Mac’s completion routine facility
for that purpose in a program doing input from the modem port at high baud rates. The
program started off by issuing an asynchronous read request for one byte, and
specifying a completion routine. The completion routine looked like this:
-- Put the received byte into program’s input buffer;
-- Issue a _Status call to find out whether the serial input driver has more data in
its own buffer;
-- If the driver has more data available, issue a _Read to transfer it to the
program’s input buffer;
-- Re-issue the one-byte asynchronous _Read.
The idea was for the completion routine to handle the low-level details of
managing input into the program’s buffer. The mainline of the program would then be
freed from concern with those details, and could just take data from the buffer as it
became available.
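The intended logic of that completion routine, sketched in C against a simulated driver. Everything here is an assumption made for illustration: drv_status(), drv_read() and issue_async_read_one() merely stand in for the _Status and _Read calls, and receive() plays the part of incoming serial data. The sketch shows only the four steps; it does not model interrupt levels.

```c
#include <stddef.h>

#define DRV_CAP  64
#define PROG_CAP 256

/* Simulated serial driver state (hypothetical, not the real driver's layout). */
static unsigned char drv_buf[DRV_CAP];
static size_t        drv_count = 0;
static int           read_armed = 0;  /* a one-byte async read is outstanding */

/* The program's own input buffer, consumed by the mainline. */
static unsigned char prog_buf[PROG_CAP];
static size_t        prog_count = 0;

/* Stand-in for a _Status call: bytes waiting in the driver's buffer. */
static size_t drv_status(void) { return drv_count; }

/* Stand-in for a _Read of n bytes that the driver can satisfy at once. */
static void drv_read(unsigned char *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) dst[i] = drv_buf[i];
    for (size_t i = n; i < drv_count; i++) drv_buf[i - n] = drv_buf[i];
    drv_count -= n;
}

/* Stand-in for issuing the one-byte asynchronous _Read. */
static void issue_async_read_one(void) { read_armed = 1; }

/* The completion routine, following the four steps in the text. */
static void read_done(unsigned char byte)
{
    if (prog_count < PROG_CAP)
        prog_buf[prog_count++] = byte;        /* 1: store the received byte */

    size_t avail = drv_status();              /* 2: ask the driver for more */
    if (avail > PROG_CAP - prog_count)
        avail = PROG_CAP - prog_count;
    if (avail > 0) {                          /* 3: transfer what is waiting */
        drv_read(prog_buf + prog_count, avail);
        prog_count += avail;
    }
    issue_async_read_one();                   /* 4: re-issue the one-byte read */
}

/* Simulate bytes arriving from the line. */
static void receive(const unsigned char *bytes, size_t n)
{
    for (size_t i = 0; i < n && drv_count < DRV_CAP; i++)
        drv_buf[drv_count++] = bytes[i];
    if (read_armed && drv_count > 0) {
        unsigned char first;
        read_armed = 0;
        drv_read(&first, 1);   /* the outstanding one-byte request completes */
        read_done(first);
    }
}
```

After issue_async_read_one() has been called once, the mainline needs to look only at prog_buf and prog_count.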
But this scheme didn’t work. Following is an explanation of why it didn’t, which
relates to a severe limitation in the Mac’s implementation of completion routines.
Asynchronous I/O is requested by setting bit 10 ($0400) in the trap value.
There are two ways for a program to detect the completion of asynchronous I/O. When
an I/O operation is initiated, the ioResult field of the parameter block is set to 1; when
the operation completes, ioResult is set to 0 (if no error) or to a negative error code.
So the program may test ioResult to determine whether the operation is complete. Or,
the program may specify the address of an I/O completion routine in the ioCompletion
field of the parameter block. If ioCompletion is not nil (zero), then when the
operation is complete the OS will JSR to the address contained in ioCompletion.
If bit 10 of the trap word is clear -- a synchronous I/O -- the Device Manager
will clear ioCompletion before initiating the I/O; and the OS will test ioResult in a
tight loop, waiting for it to become zero or negative before returning from the trap.
Hence, you don’t need to clear ioCompletion prior to a synchronous I/O; but for
asynchronous I/O you must either clear ioCompletion or set it to the address of a
completion routine.
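A small C model of the two detection methods and of the asynchronous bit. The ParamBlock structure is a hypothetical, pared-down stand-in holding only the two fields discussed; it is not the real parameter block layout, and finish_io() only imitates what the OS does when an operation completes.

```c
#include <stddef.h>
#include <stdint.h>

#define ASYNC_BIT 0x0400u  /* bit 10 of the trap word, per the text */

/* Hypothetical parameter block: only the two fields discussed above. */
typedef struct ParamBlock {
    void  (*ioCompletion)(struct ParamBlock *pb);  /* NULL means none */
    int16_t ioResult;  /* 1 = in progress, 0 = done, negative = error */
} ParamBlock;

/* Turn a synchronous trap word into its asynchronous form. */
static uint16_t make_async(uint16_t trap_word)
{
    return (uint16_t)(trap_word | ASYNC_BIT);
}

static int is_async(uint16_t trap_word)
{
    return (trap_word & ASYNC_BIT) != 0;
}

enum IoState { IO_PENDING, IO_OK, IO_ERROR };

/* Method 1: the program tests ioResult itself. */
static enum IoState io_state(const ParamBlock *pb)
{
    if (pb->ioResult > 0)  return IO_PENDING;
    if (pb->ioResult == 0) return IO_OK;
    return IO_ERROR;
}

/* Method 2 (simulated): on completion, store the result and call the
   completion routine if one was supplied. */
static void finish_io(ParamBlock *pb, int16_t result)
{
    pb->ioResult = result;
    if (pb->ioCompletion != NULL)
        pb->ioCompletion(pb);
}
```

A parameter block with ioResult set to 1 reports IO_PENDING; after finish_io(&pb, 0) it reports IO_OK, and after a negative result it reports IO_ERROR.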
When a completion routine is entered, register D0.W contains the ioResult value
(and the condition codes reflect a TST of its value); A0 contains the parameter block
address; 4(SP) -- just above the return address -- contains a pointer to the driver’s
Device Control Entry (DCE); and registers D0-D3/A0-A3 may be altered by the
completion routine. The I/O queue element (a.k.a. the parameter block) will have been
unlinked from the driver’s queue. And -- here’s the catch -- interrupts may or may
not be disabled.
The OS initiates an I/O operation by calling the appropriate entry point of the
device driver. Let’s consider, for example, a read request to the serial input driver.
The OS calls the serial input driver’s “Prime” (read) routine. There are two
possibilities: either the request will be completed immediately by the Prime routine,
or it will be completed later as the result of an interrupt.
If the number of characters in the driver’s buffer is equal to or greater than the
number specified in the I/O request, the Prime routine will fulfill the request
immediately by transferring characters to the caller’s ioBuffer, and then jump to the
OS ioDone routine, which will call the completion routine. Interrupts are enabled at
entry to the completion routine, and it can therefore do anything that any other part of
an application can do. In this case, when the completion routine is entered the original
_Read trap has not yet returned and no other application code has been executed.
Therefore the effect is similar to that of the application executing the following:
_Read ;synchronous
JSR CompletionRoutine
If the completion routine initiates another I/O request to the same driver,
specifying the same ioCompletion (i.e., itself), that request may also complete
immediately, in which case the completion routine will be re-entered.
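The re-entry can be seen in a toy C model. The names are hypothetical and the driver is reduced to a count of buffered bytes: each one-byte read that can be satisfied at once calls the completion routine before returning, and the routine re-issues the read from inside itself.

```c
#include <stddef.h>

static size_t drv_count;  /* bytes waiting in the simulated driver buffer */
static int    depth;      /* current nesting of completion calls */
static int    max_depth;  /* deepest nesting observed */
static int    queued;     /* 1 if a request was left for a later interrupt */

static void completion(void);

/* Simulated one-byte asynchronous read (the immediate "Prime" path). */
static void read_one_async(void)
{
    if (drv_count > 0) {
        drv_count--;      /* request satisfied at once ... */
        completion();     /* ... so the routine runs before the trap returns */
    } else {
        queued = 1;       /* would complete later, at interrupt level */
    }
}

/* Completion routine that re-issues the same request. */
static void completion(void)
{
    depth++;
    if (depth > max_depth) max_depth = depth;
    read_one_async();     /* may re-enter completion() immediately */
    depth--;
}

/* Run the scenario with n bytes already buffered; returns the peak nesting. */
static int run(size_t n)
{
    drv_count = n;
    depth = 0;
    max_depth = 0;
    queued = 0;
    read_one_async();
    return max_depth;
}
```

In this model the nesting depth equals the number of bytes that were already buffered, so stack use grows with the driver's backlog.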
If there are insufficient characters in the driver’s buffer to satisfy a read
request, the Prime routine will return to the OS, which will return from the trap.
Later, the driver’s received-character interrupt service will determine that the
request is complete, and jump to the OS ioDone routine, which will call the completion
routine. Now, however, the completion routine is entered with interrupts disabled; in
effect the completion routine is part of the interrupt service.
The completion routine can do no significant processing, because that would hold
off interrupts for too long a time -- for example, subsequent incoming serial data
might be lost. This virtually excludes the possibility of initiating a new I/O operation
from within the completion routine. Also, the completion routine (at interrupt level)
cannot do anything that would alter the heap configuration -- the interrupt might have
occurred while a handle was being dereferenced.
Since there is no point to doing asynchronous I/O unless it is assumed that the
request will not complete immediately, the programmer must assume that a serial I/O
completion routine will be executed at interrupt level. This makes the completion
routine facility rather useless for serial I/O. The routine could set a flag indicating
the completion -- but such a flag is already available in ioResult.
Wish list item... To make completion routines useful, the Mac OS would have to be
enhanced to implement what the DEC PDP-11 operating systems refer to as “fork
processes.” Fork processes are serialized and executed synchronously after all
(possibly nested) interrupts have been dismissed, but before control is returned to
the point at which the first interrupt occurred. This enables completion routines to be
interruptible, and gives them time to do useful work.
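A rough C model of the idea, under stated assumptions: fork_request(), interrupt_enter() and interrupt_exit() are hypothetical names, the queue is a fixed array, and nothing here is taken from the DEC implementations. Deferred routines are queued at interrupt level and run, in order, only when the outermost interrupt is dismissed.

```c
#include <stddef.h>

#define MAX_FORKS 16

typedef void (*ForkProc)(void *arg);

static struct { ForkProc proc; void *arg; } fork_q[MAX_FORKS];
static size_t fork_head, fork_tail;  /* simple FIFO indices, no wraparound */
static int    int_level;             /* current interrupt nesting depth */
static int    fork_running;          /* keeps the deferred routines serialized */

/* Called from an interrupt handler to defer work. Returns 0 on success. */
static int fork_request(ForkProc proc, void *arg)
{
    if (fork_tail == MAX_FORKS) return -1;
    fork_q[fork_tail].proc = proc;
    fork_q[fork_tail].arg = arg;
    fork_tail++;
    return 0;
}

static void interrupt_enter(void) { int_level++; }

/* On leaving the outermost interrupt, run deferred routines in FIFO order. */
static void interrupt_exit(void)
{
    if (--int_level > 0 || fork_running) return;
    fork_running = 1;
    while (fork_head < fork_tail) {
        ForkProc p = fork_q[fork_head].proc;
        void *a = fork_q[fork_head].arg;
        fork_head++;
        p(a);             /* on real hardware, interrupts would be enabled here */
    }
    fork_head = fork_tail = 0;
    fork_running = 0;
}

/* Example deferred routine: logs its tag and the interrupt level it ran at. */
static int    log_tags[MAX_FORKS], log_levels[MAX_FORKS];
static size_t log_n;

static void record(void *arg)
{
    if (log_n < MAX_FORKS) {
        log_tags[log_n] = *(int *)arg;
        log_levels[log_n] = int_level;
        log_n++;
    }
}
```

Two routines queued from nested interrupts both run at level 0, in the order they were requested, after the outer interrupt_exit().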
MS BASIC programmers: Stuff it!
The QuickDraw StuffHex routine is a fast way to convert lengthy hex machine
language data to binary code in MS BASIC programs. The trick is to prefix each string
of hex digits with a character whose ASCII value is equal to the number of hex digits:
this character is thus a length byte, making the string a Pascal string which can be
passed to StuffHex. Note that the address of the string data is not given by the address
of the BASIC string variable. The string variable is actually a data structure
containing information about the string, including its address. The address can change
each time the string is assigned or read into. We get the address of the string data by
PEEKing into the string variable. Example:
'
' Machine language interface to StuffHex:
'
DIM SH%(2)
SH%(0)=&H245F : SH%(1)=&HA866
SH%(2)=&H4ED2
' Array that gets the binary machine
' language:
DIM CODE%(SomeAdequateSize)
' Declare all scalars before getting array
' addresses:
CODEPTR!=0! : HEXLINE$="" : STRINGPTR!=0!
P!=0!
' Get addresses of arrays and the string
' variable:
STUFFHEX!=VARPTR(SH%(0))
CODEPTR!=VARPTR(CODE%(0))
P!=VARPTR(HEXLINE$)
' Read lines of hex data, convert to binary:
READ HEXLINE$
WHILE HEXLINE$<>""
  ' Get address of first byte of string data
  ' (the length byte):
  STRINGPTR!=(PEEK(P!+2!)*65536!)+(PEEK(P!+3!)*256!)+PEEK(P!+4!)
  ' Convert the line's hex data to binary:
  CALL STUFFHEX!(CODEPTR!,STRINGPTR!)
  ' Adjust pointer into CODE% array for next
  ' data (if any):
  CODEPTR!=CODEPTR!+(ASC(HEXLINE$)/2)
  READ HEXLINE$
WEND
'
' The ASCII value of the first character
' of each string must be equal to the
' number of characters in the rest of the
' string. Thus the number of hex digits
' in each string must be at least 32
' (ASCII space) and no greater than
' 126 (ASCII ~) so that the length is a
' displayable ASCII character. But the
' length must be other than 34 -- 34 is
' the ASCII code for quotation mark!
' Note the following DATA line has 36 hex
' digits, and that the first character of the
' string is "$", i.e., CHR$(36).
'
DATA "$0123456789ABCDEF0123456789ABCDEF0123"
'
' More lines of hex data would go here
' Empty string marks end of data:
'
DATA ""
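For readers who want to check a DATA line off the Mac, here is a portable C stand-in for the conversion. It is not QuickDraw's StuffHex, only an imitation of the behavior described above: stuff_hex(), length_char_ok() and addr24() are hypothetical names, and rejecting odd digit counts or non-hex characters is a choice made for this sketch.

```c
/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Portable stand-in for StuffHex: pstr[0] is the digit count and the
   digits follow. Writes count/2 bytes to dst. Returns the number of
   bytes written, or -1 on bad input (a check added for this sketch). */
static int stuff_hex(unsigned char *dst, const unsigned char *pstr)
{
    int n = pstr[0];
    if (n % 2 != 0) return -1;
    for (int i = 0; i < n; i += 2) {
        int hi = hex_val((char)pstr[1 + i]);
        int lo = hex_val((char)pstr[2 + i]);
        if (hi < 0 || lo < 0) return -1;
        dst[i / 2] = (unsigned char)(hi * 16 + lo);
    }
    return n / 2;
}

/* Can n serve as the length character in a DATA line?
   Displayable ASCII (32..126), excluding 34, the quotation mark. */
static int length_char_ok(int n)
{
    return n >= 32 && n <= 126 && n != 34;
}

/* Assemble a 24-bit address from three bytes, high byte first, as the
   PEEK arithmetic in the listing does. */
static unsigned long addr24(unsigned char b2, unsigned char b3,
                            unsigned char b4)
{
    return (unsigned long)b2 * 65536UL + (unsigned long)b3 * 256UL + b4;
}
```

Passing the 36-digit example string, with its leading "$", to stuff_hex() yields 18 bytes beginning 01 23 45 67.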
Reports from Miss Elaine E.
Due to a QuickDraw bug, DrawText in srcCopy mode will erase four or five
character positions after the position of the last character in the string. It’s OK to use
srcXor if no character position will be overwritten; but if you need srcCopy, a
workaround is to set the right side of the port’s clipRegion to the right edge of the last
character before calling DrawText. If the font is not monospace, this implies calling
TextWidth first to find out the screen width of the string to be drawn.
The QuickDraw ScrollRect routine will be slowed by a factor of 3 or 4 if you
include the borders of the GrafPort in the rect that is scrolled. Make sure the rect
passed to ScrollRect is inset at least a couple of pixels from the port's border(s).
Thanks to Steve Hanna for this tip on the alternate screen buffer... To have
QuickDraw draw in the alternate screen buffer, change the BaseAddr field of the
GrafPort to $72700. When you use QD calls with that GrafPort as the current port,
the drawing will be done in the alternate screen buffer. To flip the display between the
main and alternate screen buffers, complement bit 6 of VIA buffer A, which is mapped
to address $EFFFFE. (Steve Hanna also notes that the low-order 3 bits of buffer A
control sound volume [0..7]. Bit 7 is tied to the SCC -- see below). The following
instruction will flip the display to the other screen buffer:
Bchg.B #6,$EFFFFE
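The same bit manipulations, written in C against an ordinary byte value rather than the hardware register. The macro and function names are hypothetical; only the bit assignments (bit 6 for the screen buffer, bit 7 for the SCC, the low three bits for volume) come from this column.

```c
#include <stdint.h>

#define VIA_SCREEN_BIT 6     /* selects main vs. alternate screen buffer */
#define VIA_SCC_BIT    7     /* tied to the SCC, per the text */
#define VIA_VOL_MASK   0x07  /* low-order 3 bits: sound volume 0..7 */

/* Equivalent of Bchg.B #6 applied to a byte: toggle the screen-select bit. */
static uint8_t flip_screen(uint8_t via_a)
{
    return (uint8_t)(via_a ^ (1u << VIA_SCREEN_BIT));
}

/* Extract the sound volume (0..7). */
static unsigned volume(uint8_t via_a)
{
    return via_a & VIA_VOL_MASK;
}

/* Replace the volume bits, leaving the other five bits untouched. */
static uint8_t set_volume(uint8_t via_a, unsigned vol)
{
    return (uint8_t)((via_a & ~VIA_VOL_MASK) | (vol & VIA_VOL_MASK));
}

/* Bit 7 clear means the byte, read as a signed value, is non-negative. */
static int bit7_clear(uint8_t via_a)
{
    return (via_a & (1u << VIA_SCC_BIT)) == 0;
}
```

Flipping twice returns the original value, which is why a single Bchg serves for switching in both directions.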
Thanks to Dennis Brothers for pointing out that the MacsBug HD (heap display)
command is a way to find your program in memory -- the first CODE resource in the
heap is most likely the first (or only) segment of your program.
The reason the ROM serial driver doesn’t support input flow control is that a D0
was coded where a D3 should have been; a two-bit error. And the reason mouse
interrupts will be lost if you close the ROM serial driver (without immediately
opening the RAM serial driver) is that the ROM serial driver neglects to finish its
cleanup by setting the master interrupt enable bit in the SCC chip -- a one-word
omission from a table. (Mouse movement signals come in through the DCD pins of the
SCC and generate SCC external status interrupts.)
If you want to program the SCC yourself, ask your local Zilog office for a copy of
the “Z8030/Z8530 SCC Serial Communications Controller Technical Manual.” The
memory-mapped addresses of the SCC are in the MDS equate file SysEquates.Txt.
The SCC WAIT/REQUEST (asserted low) pin state is brought over to bit 7 of VIA (6522
chip) buffer A, which is mapped to memory address $EFFFFE. The serial driver’s
Open routine configures the SCC to assert WAIT/REQUEST when an input character is
available in the SCC buffer. This enables the Sony floppy driver to feed incoming data
to the serial driver’s input buffer while the Sony driver has disabled interrupts.
When the byte at $EFFFFE is positive (bit 7 clear), the Sony driver fetches the
character waiting in the SCC and calls code in the serial driver interrupt service
routine.
Inside Macintosh says of the serial status ctsHold flag (Serial Driver, p. 13), “If
output has been suspended because the hardware handshake has been negated, ctsHold
will be nonzero.” The conditional clause could be misleading: provided that the output
driver has been opened, ctsHold always reflects the state of the CTS (asserted low) pin
of the SCC (pin 7 of the DB-9 connector) regardless of the status (or existence) of any
output request. If CTS is asserted, ctsHold is zero.
Need a quick (and dirty!) test of whether any OS events are pending? Address
$014C contains the OS event queue header (a pointer); if it’s nil (zero), the queue is
empty.
Think you have a good (homemade) backup of Macintosh Pascal? Make sure you
can click the mouse 101 times in the source edit window. On the 101st click, Mac
Pascal goes to the disk to check that the master is there; if it doesn’t like what it finds,
it abruptly quits to the Finder (trashing any of your unsaved work). In my book, this
qualifies as a “worm” and, since it’s not documented, is tasteless at best and unethical
at worst. If a publisher wants to frustrate users who attempt to back up a product or
use it on a hard disk without inserting the master, the program should either quit at
the outset or put up a dialog box demanding the master. The more experienced the
programmer, the more violent his aversion to being forced to use distribution media in
production work.
Consulair Mac C users who want all string constants to be compiled in Pascal
format (length-byte prefix) can include the following near the top of the source
file:
#asm
STRING_FORMAT 3
#endasm
If there’s no semantic difference between pre-incrementing and
post-incrementing a variable in your Mac C program (i.e., you can choose either ++i
or i++), use pre-increment -- it generates more efficient code. Same applies to
decrements. If you use post-inc(dec)rement, the generated code will inc(dec)rement
the variable in memory, then offset that operation by dec(inc)rementing it in a
register in preparation for its previous value being used in an expression -- even if
it’s not so used.
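The sequence just described can be mimicked in modern C. This is only an illustration of the description above, not actual Mac C output, and the function names are hypothetical.

```c
/* What the text says is generated for i++: bump the variable in memory,
   then undo the bump in a "register" to recover the previous value. */
static int post_inc_emulated(int *i)
{
    *i += 1;        /* increment the variable in memory */
    int reg = *i;   /* load it into a register */
    reg -= 1;       /* offset the increment to get the old value */
    return reg;
}

/* For ++i the new value is already what the expression needs. */
static int pre_inc_emulated(int *i)
{
    *i += 1;
    return *i;
}

/* When the expression's value is unused, both forms behave identically. */
static int sum_pre(int n)
{
    int s = 0;
    for (int k = 0; k < n; ++k) s += k;
    return s;
}

static int sum_post(int n)
{
    int s = 0;
    for (int k = 0; k < n; k++) s += k;
    return s;
}
```

In a loop header such as the two above, the value of the increment expression is discarded, so the extra register adjustment in the post-increment form is wasted work.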
A Mac C update is expected to be available this spring with floating point,
register variables, structure assignments, and slicker code generation. Look for the
Green Hills C compiler under the Apple name on the Mac later this year -- it's being
ported from the Lisa along with the Workshop. I haven’t used it, but it’s reported to
generate slick code.