Record Definitions
Volume Number: 5
Issue Number: 2
Column Tag: Forth Forum
By Jörg Langowski, MacTutor Editorial Staff
“Record definitions in Mach2”
Record structures and arrays are not part of standard Forth implementations.
More than two years ago, in V2#7, I gave an example of how to implement records.
Mach2 has evolved since then, and so have ways of implementing new data structures,
as you can see in the Object Forth project by Wayne Joerding that we recently
discussed. For those of you who do not want a full object-oriented system, but still
want an easy way to define data structures, I have found two examples on the GEnie
bulletin boards. They show two fundamentally different approaches to dealing
with record definitions.
‘Local’ field names - method 1
The problem in setting up the Forth compiler to deal with record definition in a
proper way is somewhat similar to implementing an object-oriented programming
system. That is, just like a message is local to an object, and the same message may
cause different effects on different objects, a field name should be local to a record. In
the Pascal record definitions
rec1 = record
x: real;
i: integer;
y: real;
end;
rec2 = record
y: real;
j: integer;
x: real;
end;
the field x lies at a different offset in a record of type rec2 than in one of type
rec1; and rec1.i, rec2.j would be valid while rec1.j, rec2.i would not. So if we define a
field name as some kind of Forth word, this word should be in some ‘local vocabulary’
that belongs to the record definition and is only visible while the field reference is
resolved.
The other requirement is that we should be able to pass a record as a parameter
to a routine, so that given the pointer to a record on the stack, a Forth definition would
know how to resolve the field reference. In a strongly typed language like Pascal this is
easy; field references into record formal parameters can be resolved at compile time
because the procedure arguments are of defined type. In Forth, typically the address of
a data structure would be passed on the stack. However, at compile time there is no way
we can restrict the type of argument that this address might later point to at run time!
This problem could only be solved by type checking built into the record definition and
deferring the resolution of the field reference to run time, some sort of ‘late binding’.
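The idea of a local vocabulary combined with late binding can be sketched in standard (ANS-style) Forth rather than Mach2. This is only an illustration under stated assumptions, not the method of either listing: each template is modelled as a wordlist of offset constants, each record stores its template's wordlist id in its first cell, and the hypothetical word field-addr looks the field name up at run time. The names rec1-fields, rec2-fields, new-record, field-addr, first-rec and second-rec are invented for this sketch; the offsets assume a ten-byte real and a two-byte integer.

```forth
\ Sketch only: templates as wordlists, field lookup deferred to run time.
wordlist constant rec1-fields
wordlist constant rec2-fields

get-current                       \ remember the compilation wordlist
rec1-fields set-current   0 constant x   10 constant i   12 constant y
rec2-fields set-current   0 constant y   10 constant j   12 constant x
set-current                       \ restore the compilation wordlist

22 constant /rec                  \ 10 + 2 + 10 bytes of field data

\ A record is one cell holding its template's wordlist id, then the data.
: new-record ( wid "name" -- )  create , /rec allot ;

\ Resolve a field name against the record's own template at run time.
: field-addr ( c-addr u rec -- addr )
   dup >r @ search-wordlist 0= abort" field not in this record type"
   execute                        \ field constant -> offset
   r> cell+ + ;                   \ skip the type cell, add the offset

rec1-fields new-record first-rec
rec2-fields new-record second-rec

s" x" first-rec  field-addr first-rec  cell+ - .   \ prints 0
s" x" second-rec field-addr second-rec cell+ - .   \ prints 12
```

The same name x resolves to different offsets depending on the record it is applied to, and asking first-rec for field j aborts with an error. The price is a dictionary search on every access, which is exactly the overhead one would like to avoid.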
The first method of record definition (Listing 1), written by Waymen Askey of
Palo Alto Shipping (I added some minor modifications, like floating point and array
support), creates a local dictionary for each record template in the Forth dictionary
space. When a record template is defined, using the syntax
template rec1
:real x
:word i
:byte c
tend
its field names x, i and c are compiled into the dictionary together with relevant
information for resolving the references. At the end of the template declaration, the
dictionary links are changed in such a way that the ‘local’ names are skipped when the
dictionary is searched. Let’s declare a record:
A field of this record is later accessed by using the structure fetch/store words,
s@ and s!.
myRec x s@ will put the value in field x of myRec on the floating point stack, and
myRec i s@ will put the word value of field i on the stack. The trick Waymen used was
to build some intelligence into the fetch/store words. When the record and field words,
myRec and x for example, are executed or compiled into a definition, field type and
offset are determined and kept in global variables. The s@ word will check these
variables and know how to access the field, whether - in immediate execution - to do a
byte, word or long word fetch, addressing into an array, or a ten-byte fetch onto the
floating point stack for a real number; or at compile time create code that will do these
things later.
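The principle of a fetch word that consults a global type variable can be shown with a much reduced sketch in standard Forth. It is not Waymen's code: it handles only byte and cell fields, works purely at run time, and has no local field names. The variable name op.type is borrowed from Listing 1; field:, byte-type, cell-type, amount, mark and the layout of myRec are assumptions made for the illustration.

```forth
\ Sketch only: field words record their type, s@ and s! dispatch on it.
variable op.type                  \ type of the field referenced last
1 constant byte-type
2 constant cell-type

\ Define a field word that adds its offset and notes its type.
: field: ( offset type "name" -- )
   create , ,                     \ store the type first, then the offset
   does> ( addr pfa -- addr' )
      dup @ op.type !             \ remember the field type
      cell+ @ + ;                 \ add the field offset

: s@ ( addr -- x )  op.type @ byte-type = if c@ else @ then ;
: s! ( x addr -- )  op.type @ byte-type = if c! else ! then ;

create myRec  1 cells 1 + allot   \ one cell plus one byte

0       cell-type field: amount
1 cells byte-type field: mark

1234 myRec amount s!
65   myRec mark   s!
myRec amount s@ .                 \ prints 1234
myRec mark   s@ .                 \ prints 65
```

Because the type is passed through a global variable, a field word must be executed immediately before every s@ or s!, which mirrors the restriction described above.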
The drawback of this approach is that field references can only be resolved at
compile or immediate execution time. If we wanted to write a word that operates on a
record whose address is passed on the stack, we couldn’t use the field names that were
defined in the record template - they are only valid right after a record name was
executed or compiled. Therefore, a definition like
: getX { myRec -- } myRec x s@ ;
must fail because myRec is a local variable, not a record name.
An example of how to use this method of record declaration with various field types
is given at the end of the listing. You see the drawback: even though the record fields
wavelength, temperature, and angle are all themselves structures of the same type,
parameter, there is no way to factor out the common code in
cr curve1 wavelength name s^ count type ." = "
curve1 wavelength value s@ f.
curve1 wavelength unit s^ count type
cr curve1 temperature name s^ count type ." = "
curve1 temperature value s@ f.
curve1 temperature unit s^ count type
cr curve1 angle name s^ count type ." = "
curve1 angle value s@ f.
curve1 angle unit s^ count type
by using a word that would just print the name, value and unit of any given
parameter. If this problem were resolved, the record compiler would be almost perfect.
‘Global’ field names - method 2
Listing 2 shows a much simpler approach to structure definitions that does not do
type checking. I downloaded this code from the Forth Roundtable on GEnie, and
unfortunately have not the slightest idea who the author is. All I could find out was that
the original code was probably posted on the East Coast Forth Board.
However, since this code solves one of our problems, record passing as formal
parameters, I’d like to print it here. Its strategy is much more like that of the
structure words built into MacForth Plus. Here, a record template is defined like
RECORD Rectangle
Global SHORT: Top
Global SHORT: Left
Global SHORT: Bottom
Global SHORT: Right
END.RECORD
Variable myRect Rectangle 4 - VALLOT
so the record name, when executed, simply leaves the record length on the stack for
later ALLOT or VALLOT. The field names are words which add the field offset to an
existing address on the stack, so they can be used in any context. We have to check
ourselves whether the address is a valid record address and whether the field
referenced actually exists in that record (if we care at all). All field names are global,
and therefore must be unique; no two different record declarations can have fields of
the same name at different offsets.
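For readers without Listing 2 at hand, the offset-adding idea looks roughly like this in standard Forth. This is a sketch under assumptions, not the downloaded code: the defining word field: and the words rect-height and rect-width are invented here, and the fields are cell-sized instead of 16-bit SHORTs so that plain @ and ! can be used.

```forth
\ Sketch only: global field words that add an offset to any address.
: field: ( offset size "name" -- offset' )
   create over , +                \ store this field's offset, advance
   does> ( addr pfa -- addr' ) @ + ;

0                                 \ running offset starts at zero
1 cells field: top
1 cells field: left
1 cells field: bottom
1 cells field: right
constant /rectangle               \ total record length

create myRect /rectangle allot

\ These words accept ANY rectangle address passed on the stack.
: rect-height ( rect -- n )  dup bottom @ swap top  @ - ;
: rect-width  ( rect -- n )  dup right  @ swap left @ - ;

10 myRect top !   20 myRect left !   110 myRect bottom !   220 myRect right !
myRect rect-height .              \ prints 100
myRect rect-width  .              \ prints 200
```

Since the field words know nothing about the record they are applied to, rect-height works on a local variable or a stack parameter just as well as on myRect; nothing stops you from applying top to an address that is not a rectangle at all.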
This approach is not so different from the very basic one that I used in most of
my examples, where I simply defined field names as constants and added the offset to
the record address.
What the Macintosh Forth world really needs is a combination of the two
approaches, with type checking at compile time and local field names for convenience,
and a way to resolve field references on record addresses at compile time
without too much overhead. If one knew the type of the record passed on the stack ahead
of time (which is usually the case), one could probably define some ‘field reference
resolution word’ which computes an offset given a template and a field name. I hope I
can show you an example in one of my next columns.
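To make the idea concrete, here is one way such a resolution word might look in standard Forth. It is a speculative sketch, not a Mach2 implementation: templates are again modelled as wordlists of offset constants, and the immediate word offset-of (a hypothetical name, like rec1-x and rec2-x) parses a template name and a field name and compiles the offset as a literal. The offsets assume a ten-byte real and a two-byte integer.

```forth
\ Sketch only: resolve "template field" to an offset at compile time.
wordlist constant rec1
wordlist constant rec2

get-current                       \ remember the compilation wordlist
rec1 set-current   0 constant x   10 constant i   12 constant y
rec2 set-current   0 constant y   10 constant j   12 constant x
set-current                       \ restore the compilation wordlist

: offset-of ( "template" "field" -- offset )
   ' execute                      \ template name -> wordlist id
   >r bl word count r>            \ parse the field name
   search-wordlist 0= abort" unknown field in this template"
   execute                        \ field constant -> offset
   state @ if postpone literal then ; immediate

\ The record address arrives on the stack; the type is stated in the source.
: rec1-x ( rec -- addr )  offset-of rec1 x + ;
: rec2-x ( rec -- addr )  offset-of rec2 x + ;

0 rec1-x .                        \ prints 0
0 rec2-x .                        \ prints 12
```

The dictionary search happens once, when the definition is compiled, so the run-time cost is a literal and an addition. A misspelled or foreign field name such as offset-of rec1 j is rejected at compile time.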
Upcoming: an update to Wayne Joerding’s Object Forth, and a review of
PocketForth, a public domain 16-bit Forth that comes as an application and a desk
accessory. Stay tuned.
Listing 1: Structure definitions with local field names
\ STRUCTUREs 2.5 for the Macintosh MACH2
\ Jan 3, 1987 by Waymen Askey
\ edited, floating point & array addition by
\ J. Langowski @ MacTutor
\ This MACH2 extension is released for the public good;
\ however, for those planning commercial use of this code,
\ please notify me so that I might know of its intended use.
\ Waymen Askey @ PASC
\ also GEnie MACH2 RoundTable.
only mac also sane also forth definitions
( VARIABLES used in STRUCTURE 2.5 )
decimal
variable current.template
variable op.type
variable A5offset ( holds the A5 offset to a structure )
( CODE word utilities used in STRUCTURE 2.5 )
code var.link ( -- a | variable link pointer )
lea $F7F8(A5),A0
move.l A0,-(A6)
rts
end-code
code a5@ ( -- a )
move.l A5,-(A6)
rts
end-code mach
code get.field ( a1 a2 -- a3 -1 or 0 | searches templates )
( a1=template, a2= pad, a3=field pointer, 0 if not found )
move.l (A6)+,D2