MacTech Vol 04-1988

Object Forth

Volume Number: 4

Issue Number: 8

Column Tag: Forth Forum

Object Forth Project

By Jörg Langowski, Wayne Joerding

Object-Oriented Extension to MACH2™ Forth

by Wayne Joerding

[This is the first of - hopefully - several upcoming articles dealing with Object

Forth extensions. I introduced you to some of the features of Wayne’s implementation

in my last column. Now, here comes the real stuff. I checked out Wayne’s code several

times and it seems to work very well. For those of you who play with the source code

disk before reading the article I should point out that you have to turn vocabulary

hashing OFF before you try out the code; this is also explained in the article.

Wayne Joerding is an associate professor of economics at Washington State

University in Pullman, WA. He uses the Mac - with Mach2 and his object extensions -

in his research on stock market efficiency issues. He can be reached on GEnie -

W.JOERDING.

Wayne went to great lengths in the implementation to make sure that the

code works with WORKSPACE, NEW-SEGMENT, and TURNKEY. Because of the way

dictionary links are changed by his code during class compilation, this is not at all

trivial. Read his notes carefully with this in mind. - JL]

Introduction

How could any programmer resist “object-oriented programming,” with its

promise to reduce program development time through reusable code and improve

revisability by the use of self-contained modules that encapsulate data structures and

the procedures that operate on them? Then to find out that the whole thing

works by passing messages to something called an object: it was too much to resist.

But, I didn’t want to give up my experience with Forth, its fast code, extensible

compiler and interactive programming style. Fortunately, it is possible and desirable

to build an object-oriented extension of the Forth language; NEON is perhaps the best

known example.

Brad Cox’s excellent book Object-oriented Programming: An Evolutionary

Approach promotes the advantages of using a hybrid language derived by extending a

traditional language to accommodate the object-oriented approach. Object-oriented

Forth by Dick Pountain actually shows how to add abstract data types (objects) to

Forth. Armed with these two books, I have developed an object-oriented extension to

MACH2 Forth. This article describes the results. My approach is not the only one;

NEON has already been mentioned and there are postings on GEnie from others who are

making object-oriented extensions to FORTH. However, each approach has unique

features and limitations; thus an additional objective for this article is to discuss some

of the problems and tradeoffs I faced in the process. Here is an abbreviated version of

a summary sheet like that used by Kurt Schmucker in Object-oriented Programming

for the Macintosh.

ObjectFORTH

Background Information

Programming Environment : MACH2 FORTH

Toolbox Access : Yes

Object-Oriented Information

Instance Variables and Methods : Yes

Class Variables and Methods : Yes

Multiple Inheritance : No

Unique Instance Methods : No

Number of Classes in library : 1

MacApp access : No

Dynamic Objects : No

First, let’s clarify some jargon used in the literature. Objects are an abstract

construct composed of a data structure and associated procedures. Objects are

organized into child-classes of a root class, each child-class having more specifically

defined methods and data structures than its parent-class. For example, “stacks” are

a child-class of “ordered lists” which are a child-class of “arrays”. Each class is

composed of instances of the class. For example, a particular stack can be an instance

of the class “stack”. An object’s data structure consists of named fields called

instance variables. The procedures are called methods and are like subroutines. A

program requests an object to execute a method by sending the object a message which

is used by a “selector” routine to select the appropriate method. The set of messages

to which an object responds is called a protocol.

There are two essential features of object-oriented programming: encapsulation

and inheritance. Encapsulation refers to the isolation of an object’s data and methods

from other parts of the program. Encapsulation is enforced by restricting access to an

object’s data or procedures to a standardized message interface used by each object.

Objects are presented a message to which they respond by selecting a method to

execute. The specific method is an internal matter for the object and depends on the

implementation. For example, a linked list can be implemented as an ordered array or

an unordered array with pointers. The importance of encapsulation is that a user of an

object does not need to know how the object is implemented. In a sense, encapsulation

is nothing more than a strong form of good modular programming practice. However,

object-oriented programming not only enforces modular code but also bundles the data

structures with the procedures.

Inheritance refers to the ability of objects in a child-class to inherit the

structure and procedures of the parent-class. Furthermore, objects in the same class,

called instance objects, share the same code. The advantage of inheritance is that data

structures and procedures can be reused and extended.

In addition to meeting the above requirements, I also had several design goals for

the extension. The most important influence on the extension described here is its

pedagogical nature. I was trying to learn about object-oriented programming from the

extensions and so I made the logical and physical data structures as similar as possible.

The logical structure adopted is that given in Cox’s book and is a natural choice for a

FORTH programmer because of its reliance on linked lists.

A second design goal was to enforce a strong degree of encapsulation by requiring

messages to be simple character strings, without any procedural capability of their

own, and by hiding all forth words which identified methods or instance variables.

Thus, trying to execute the name of a method or instance variable will only get you a

beep and a question mark. An alternate approach is to have method names put an

identifier on the stack and then have the object use that identifier to find the

appropriate method. A disadvantage in this approach is that forgetting to use the object

does not immediately result in an error. A disadvantage of my approach is that it

requires rather major surgery on the links of the dictionary, a risky process which

slows compilation.
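
The alternate approach can be sketched in a few lines of standard Forth. This is only an illustration, not code from the extension: size:, clear:, TABLE-OBJECT and a-list are names made up for the example, and each object is assumed to be a plain table of execution tokens indexed by selector number.

```forth
\ Hypothetical selectors: each method name just leaves a number on the stack.
0 CONSTANT size:
1 CONSTANT clear:

: list-size   ( -- )  ." list size" ;
: list-clear  ( -- )  ." list cleared" ;

\ An object made by TABLE-OBJECT uses the selector to index its method table.
: TABLE-OBJECT  ( "name" -- )
   CREATE
   DOES>  ( selector -- )  SWAP CELLS + @ EXECUTE ;

\ The two execution tokens are laid down right after the new header,
\ so they form the body (method table) of a-list.
TABLE-OBJECT a-list   ' list-size ,  ' list-clear ,

size:  a-list     \ a-list runs list-size
clear: a-list     \ a-list runs list-clear
size:             \ no object follows: no error, a stray 0 stays on the stack
DROP              \ remove the stray selector again
```

The last two lines show the weakness in question: a selector on its own is a legal Forth phrase, so the mistake only surfaces later as a stack imbalance.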

A third design goal was that every construct in the extension should be an object

in its own right, each object searching the input stream for a message. This means

that the creation of subclasses is a method of a parent-class object. As a consequence,

objects are divided into two groups, “class defining objects”, CDO’s, and “instance

objects”, IO’s. Instance objects are members of a class that is defined by a class

defining object. CDO’s are the factory objects of Brad Cox; their main purposes are to

(i) specify the data structure and methods that define the class and are used by

instance objects, (ii) make instance objects, and (iii) create new child-classes. I will

discuss each of these capabilities in turn.

The data structure and methods of an object are implemented as normal forth

words which are sealed from the rest of the forth dictionary by redirecting the links of

their headers (more on this later). Access to the data structure and method names is

controlled by the class defining object. A CDO gives a new instance object a pointer to

the linked list of instance methods and instance variables that define the class.

Only CDO’s can make instances of a class. Each instance can respond to a protocol

of messages as specified by its class definition. For example, if Forth’s “Pstack”, for

parameter stack, and “Rstack”, for return stack, were defined in an object-oriented

language then they would both be instances of the class “stack”, and as such would

respond to the message “pop” because that is a message in the protocol of all instances

of “stack”. The instance “Pstack” would contain all the characteristics peculiar to

the parameter stack, such as data, while the instance “Rstack” would do the same for

the return stack. The CDO “stack” defines the structure of the data contained in the

instances “Pstack” and “Rstack” and the methods to which these instances respond.

Only CDO’s can create new classes. It is the creation of child-classes that

implements the inheritance of methods. A child class inherits the methods and data

structure of its parent-class. For example, “stack” is a child-class of the class

“list”, thus “list” passes along its data structure and methods to its child-class

“stack”. Consequently, if instances of “list” respond to the message “size”, instances

of the class “stack” also respond to “size”. If the programmer wants instances of the

class “stack” to respond to “size” in a different manner than instances of the

parent-class “list”, then the “size” method must be overridden by specifying a new

method for the message “size”. This sounds great in theory, but practice reveals one

of life’s tradeoffs. In our example, one could code a “size” method which works,

if inefficiently, for all possible child classes, or one could code a method which is

optimized for a “list” type object and later override the “size” method with one which

is optimized for a stack type object. The class creation procedure also provides a way

to add additional methods to the child class, such as “pop” for the “stack” class.
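
One way to picture inheritance and overriding is a linked list of methods in which a child class starts out pointing at its parent's list and puts its own entries in front. The sketch below is standard Forth and assumes numeric selectors for brevity. The words entry, and lookup, the selectors size: and pop:, and the two list variables are hypothetical names; the extension itself works on dictionary header links rather than on a separate list like this one.

```forth
\ Entry layout: link cell, selector cell, execution-token cell.
: entry,  ( xt selector head -- )
   ALIGN HERE >R    \ address of the new entry
   DUP @ ,          \ link field: the previous first entry
   SWAP ,           \ selector
   SWAP ,           \ execution token
   R> SWAP ! ;      \ the new entry becomes the head of the list

: lookup  ( selector head -- xt | 0 )
   @
   BEGIN  DUP WHILE
      2DUP CELL+ @ = IF  NIP 2 CELLS + @ EXIT  THEN
      @             \ follow the link to the next (older) entry
   REPEAT
   NIP ;            \ end of list: leave 0

1 CONSTANT size:
2 CONSTANT pop:

VARIABLE list-methods   0 list-methods !
: list-size  ( -- )  ." generic size" ;
' list-size size: list-methods entry,

\ The child class begins with a pointer to the parent's first entry.
VARIABLE stack-methods  list-methods @ stack-methods !
: stack-pop  ( -- )  ." pop" ;
' stack-pop pop: stack-methods entry,    \ method added by the child

size: stack-methods lookup EXECUTE       \ inherited: prints generic size

: stack-size  ( -- )  ." fast stack size" ;
' stack-size size: stack-methods entry,  \ override placed in front

size: stack-methods lookup EXECUTE       \ prints fast stack size
size: list-methods  lookup EXECUTE       \ parent unchanged: prints generic size
pop:  list-methods  lookup .             \ parent has no pop: prints 0
```

Because a search always starts at the newest entry, an override needs no special mechanism: the child's entry is simply found before the parent's.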

There are important advantages to requiring all constructs to be objects. An

obvious advantage is the uniform handling of objects. Making instances and subclasses

is just the same as accessing data in an instance object. A much more important

advantage is that instance and child-class creation are methods, methods which can be

overridden by child-classes. In a previous example it was pointed out that the

“size” method of a parent-class can be overridden by the child-class; similarly, the

instance and child-class creation methods can be overridden. This ability to override

the methods of a class defining object is analogous to the extensibility of the forth

compiler, and has the same importance. One of the code examples shows the use of this

feature to alter the instance creation method so that it forms instances of variable size.

Another example uses this feature to implement a C-style data structure.

A fourth design goal was to make the compiled code as fast as possible. Slowness

has been a criticized characteristic of object oriented languages, although the speed

sacrifice can be quite small (see Cox’s book for a discussion and references regarding

this issue). The desire for speed led to a binding scheme like that used by Pountain.

Binding refers to the point in time when a label is associated with a value. For

example, the definition “ : myDUP ['] DUP EXECUTE ; ” would bind early the CFA of

DUP to the label myDUP. On the other hand, the definition “ : PLEASE ' EXECUTE ; ”

and used as PLEASE DUP would be an example of late binding of the label PLEASE to the

CFA of DUP. My extension uses early binding in the sense that methods are chosen at

compilation time rather than run time. It uses late binding in the sense that variable

names are associated with addresses at run time. The essential overhead involved with

this approach is that a reference to any object instance variable in a method definition

requires a number to be read off a special object stack and added to an offset number on

the regular FORTH parameter stack. Any implementation of a data structure as

complex as an array would require adding some offset value, so the extra fetch from

the object stack is the actual overhead. Please see Pountain’s book for a discussion of

whether this overhead is worthwhile.
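
The run-time cost described above can be made concrete with a small sketch in standard Forth. None of these words come from the extension: ostack, >obj, obj>, obj@, IVAR and the s.count and s.limit fields are hypothetical names chosen for illustration. An instance variable is compiled as a fixed offset, and the base address of the object is added only when the method runs.

```forth
\ Hypothetical object stack: room for 8 object addresses plus a pointer.
CREATE ostack  8 CELLS ALLOT
VARIABLE osp   ostack osp !

: >obj  ( addr -- )  osp @ !  1 CELLS osp +! ;   \ push an object address
: obj>  ( -- )       -1 CELLS osp +! ;           \ discard the top address
: obj@  ( -- addr )  osp @ 1 CELLS - @ ;         \ copy the top address

\ IVAR stores a fixed offset at definition time; the base address of the
\ current object is fetched from the object stack and added at run time.
: IVAR  ( offset "name" -- )  CREATE ,  DOES> ( -- addr )  @ obj@ + ;

0        IVAR s.count
1 CELLS  IVAR s.limit

\ Two instances that share the same method code.
CREATE stackA  2 CELLS ALLOT
CREATE stackB  2 CELLS ALLOT

: set-limit  ( n obj -- )  >obj  s.limit !  obj> ;
: get-limit  ( obj -- n )  >obj  s.limit @  obj> ;

10 stackA set-limit
20 stackB set-limit
stackA get-limit .    \ prints 10
stackB get-limit .    \ prints 20
```

The methods set-limit and get-limit are compiled once, yet act on whichever instance is on top of the object stack. The extra obj@ inside every instance variable reference is the overhead in question.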

Finally, a fifth design goal was to make the extension itself use abstract data

structures in order for the code to be as easy to change and maintain as possible. Thus,

the data structures in class defining objects are accessed by their names as much as

possible, not by calculating an offset which requires unchanging knowledge of the

format of the data structure layout. The only fixed address is the field used by a CDO to

store the LFA of its first method; this variable must be in the first long word of a

CDO’s data set. A consequence of this goal is that the code sometimes uses the trick of

temporarily pushing an object’s address on the object stack and then popping it off

when no longer needed.

Syntax and Usage

This section describes the syntax used to send an object a message, create a new

class, and make an instance of a class. Additionally, I describe some of the built-in

methods and tips on using ObjectFORTH.

Send any object a message:

Object.1 Msg

Messages cannot be larger than 32 characters. Capitalization matters! These

restrictions are due to the simplicity of the selector routine used to find methods.
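
To show why capitalization matters with a simple selector, here is a minimal sketch of an object that reads its message from the input stream and compares it, byte for byte, against stored method names. It is an assumption about the general technique, not the extension's selector routine. The words method-list, method:, find-method and send are hypothetical, and the sketch assumes a system that provides PARSE-NAME (Forth 2012) and COMPARE (String word set).

```forth
VARIABLE method-list   0 method-list !   \ head of the list of methods

\ Store a string in the dictionary as a counted string.
: ,name  ( c-addr u -- )  DUP C,  HERE SWAP DUP ALLOT MOVE  ALIGN ;

\ Entry layout: link cell, execution-token cell, counted-string name.
: method:  ( xt "name" -- )
   ALIGN HERE  method-list @ ,  method-list !   \ link in front of the list
   ,                                            \ execution token
   PARSE-NAME ,name ;                           \ message text

: entry>name  ( entry -- c-addr u )  2 CELLS + COUNT ;
: entry>xt    ( entry -- xt )        CELL+ @ ;

: find-method  ( c-addr u -- xt | 0 )
   method-list @
   BEGIN  DUP WHILE
      >R  2DUP R@ entry>name COMPARE 0=         \ exact, case-sensitive match
      IF  2DROP R> entry>xt EXIT  THEN
      R> @                                      \ follow the link
   REPEAT
   DROP 2DROP 0 ;

\ SEND plays the part of the object: it takes the next word as the message.
: send  ( "message" -- )
   PARSE-NAME find-method  ?DUP IF EXECUTE ELSE ." ?" THEN ;

: show-size  ( -- )  ." size is 0" ;
' show-size method: Size

send Size      \ match: runs show-size
send size      \ different capitalization: no match, prints ?
```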

Make an instance with the identifier INS.1:

CLASS.1 Make.Instance INS.1

Make.Instance is a method of the class defining object CLASS.1; it simply uses the

character string INS.1 from the input stream to create a dictionary header for the new

instance.

Create new class with the identifier CLASS.2:

CLASS.1 Define.Child.Class

:Class

I.Var

Hide

:M ;

;Class

k Constant

Variable

:Instance

Variable

I.Var

n Ins.Array

Hide

:M ;

;Instance

CLASS.1 Name.Child.Class CLASS.2

Each CDO has two sections, a class section which contains the methods and

variables of the object and an instance section which holds the methods and data

structure for instances of the class. The words :Class and ;Class start and end

definition of the class section of a child class; similarly for :Instance and ;Instance.

Each of these parts of the definition is optional. For instance, most CDO’s have no need to add

methods to the class section of the object. The order of the class and instance section

definitions does not matter.

The word I.Var creates a long word integer variable which is inherited by each

child. That is, an I.Var in the class section is inherited by any CDO which has CLASS.2

as an ancestor, and an I.Var in the instance section is inherited by each instance of the

class defined by CLASS.2 and by each instance that has CLASS.2 as an ancestor. The

word Ins.Array is similar but creates an array of n bytes. :M starts the definition of

an object method. In this example, :M defines a new class method named . The

method is used by sending the message to the object CLASS.2.

The word Hide prevents previous words in a section from being available as

methods. For example, use of Hide in the class section prevents the instance

variable from being available as a method name. In this way the user of an object

cannot gain direct access to the memory location of by sending as a

message. Hide is usually used in this way, to hide an object’s data structure, but it can

also hide methods. Thus, one could define a method before Hide and it would not be

available as a public method, only internally to the object. Hide is optional.

Note that normal forth constructs, such as Constant, can be used in the definition.

How these constructs are used depends on their placement. If placed outside the section

defining words (:Class and ;Class or :Instance and ;Instance), such as , then the

word is not available in any fashion after the object definition is completed. But the

word can be used in method definitions. This is a good way to define words used in

method definitions but which are not subsequently desired as methods. This should be

contrasted with the effect of Hide discussed above. For example, the variables

and both provide a “class variable” for CLASS.2. (Class variables are

variables shared by instance objects of the same class, which can be used for passing

data between instances.) The difference between the two is that is not

available to descendants of CLASS.2, while would be available to new methods

of CLASS.2 descendants. That is, methods defined in children of CLASS.2 could use

as a variable. However, is not passed on to a child class in the manner

of an instance variable. If forth definitions are placed inside the section defining words

but after Hide then the construct is available as a method by its name.
