November 92 - Comparing C-Based Object Systems
Comparing C-Based Object Systems
Gary Odom
Preface
This rather dry martini of an article presents a comparison of C-based object
systems. To provide a basis for that comparison, the articles begin with a perspective
about why OOP is important, what the important issues are with object orientation,
and a brief mention of other object systems from which designers of C-based object
systems have drawn their inspiration.
Software with Class?
Programming practices have evolved with one primary motivation: improve the
ability to maintain and reuse software. Object-oriented programming came about in an
attempt to make code more modular and easier to reuse. The ideas behind object
orientation are simple. To make code more modular, combine data structures and
functions together: create software modules that are inherently self-contained. In
other words, prefabricate software so that code functionality comes in one piece. Each
module becomes a class.
To improve the ability to reuse software, allow classes to inherit from one another.
This way, a new class can get all the functionality it needs from other classes, with the
exception of the behavior that makes the new class unique. So, the two big concepts
behind object orientation are 1) modular software components (by combining data
structure and behavior), and 2) inheritance. While modularity, achieved through the
use of classes, simplifies code maintenance, inheritance provides the sizzle of easy
reuse by letting a programmer incrementally modify and expand class behavior.
The continuing challenge of software development is managing complexity.
Object-oriented programming, with its inherent high degree of modularity, helps. But
class libraries by no means ensure that software complexity is well managed. Quite the
contrary, there is a disturbing trend in object-oriented software construction towards
class libraries with hundreds of classes, but without the integration between them to
simplify usage or maximize productivity. It may well turn out that the OOP
development tools that stand the test of time are not those that try to offer everything
through diversity, but those that integrate well the basics that most applications need.
But class library architecture is the subject another article. This one is about object
systems.
Object orientation is implemented by an object system. Most often, an object system is
built into a new language, or becomes an extension to an existing procedural
programming language. Object-oriented language extensions often require a new
compiler, though C-based object systems are usually implemented using a
preprocessor to a C compiler. The disadvantage to using a preprocessor is that
source-level debugging is no longer possible.
The success of any programming language is the ability to give a programmer range
and flexibility in implementing software designs in the most straight-forward
manner. C usurped FORTRAN and Pascal in popularity largely because C allows an
programmer greater range (with such features as permitting a variable number of
function arguments, and built-in bit twiddling), and flexibility in expression.
There has been heightened interest in recent years with visual programming
environments. Just as creative people have historically chosen between literary or
graphic artistic expression, perhaps we are beginning an age where software
developers will have a similar choice in their medium. While this article is focused on
written OOP languages, the same evaluation criteria may be applied to visual OOP
tools.
Inheritance
Because inheritance is one of the key concepts behind object orientation, one way to
judge the quality of an object system is how flexibly inheritance can be specified.
Multiple inheritance is the ability of a class to inherit from multiple classes. With
multiple inheritance, a new set of methods (behavioral functions) can easily be added
to an existing class. So, for example, you could attach a debugging class to another class
without introducing new sequential links in the inheritance chain; the debugging class
could verify the validity of data in objects of the target class. Most often, multiple
inheritance is used to create a subclass that combines a class with primary
functionality with another class that adds some ancillary characteristics. For example,
a text graphic class (cTextGraphic) would inherit from a text class (cText) for text
processing, plus inherit from a graphic class (cGraphic) to allow a user to treat the
text as a graphic object (as in an object-oriented drawing program). Because it is so
convenient, most current object systems support multiple inheritance.
An advanced object system allows inheritance and methods to be defined dynamically,
while a program runs. This is called dynamic definition.
Class-Object Schizophrenia
The theory of object orientation makes a clear distinction between classes and objects.
Classes are object factories, templates that exist only in source code. A class specifies
object data structure, while a class itself has no data. A class just has methods, so that
objects can take function calls. Objects alone exist as dynamic entities in memory as a
program runs. An object can't have methods separate from the class it inherits from,
and a class can't have its own data.
This fundamental distinction between classes and objects can be blurred to
considerable benefit. Smalltalk and Objective-C allow classes to have their own
methods (class methods), apart from the methods an object that inherits from the
class has.
In an advanced object system using flexible class-object construction, classes may
have their own data structure (class variables) and their own methods (class
methods), and objects may have their own methods (object methods), separate from
the class methods they inherit. These capabilities provide flexibility in software
design and implementation, as well as giving conceptual consistency to working with
objects. While theory may put a wall between classes and objects, eliminating
class-object distinctions gives a developer great practical flexibility in meeting
design requirements.
There is another aspect of object orientation that defines the quality of an object
system: dispatch control.
Dispatch Control
Object orientation introduces a rather strange concept: calling a function without
knowing exactly what function is going to be called. This happens because different
classes can use the same function name. For example, to draw an object, you might
write draw(self), where self is the object to be drawn. A dispatch mechanism is used
to find the right method to call based upon the class inheritance of the self object. The
technical term for this function-calling shell game is polymorphism (Greek (to me)
for "multiple shapes"). Polymorphism is great because it lets code become very
general: you can draw all objects on a page by calling draw(self) in a loop, where the
loop assigns self from a page object array.
Polymorphism necessitates finding the right class method to dispatch to at run time
(dynamic binding), rather than binding a function call at compile time (static
binding) (what linkers do for a living). Dynamic binding has two drawbacks. First, a
dispatch error can occur at run-time if a method wasn't defined for the function call
made. Second, it takes time to find the proper method to dispatch to based upon class
inheritance. Method dispatch is the overhead imposed by object orientation. These
drawbacks are the price paid for quicker development time, smaller code size,
flexibility in using prefabricated software, and easier maintenance. Hybrid languages,
such as C++, let a programmer go back to procedural programming for time-critical
code, whereas this is not an option with a pure object-oriented language such as
Smalltalk.
Just as flexibility in specifying inheritance is important, so is flexibility in dispatch.
Features of dispatch flexibility are being able to call multiple methods by a single
function call (multiple dispatch), controlling which methods are called and in what
order (dispatch control), being able to dispatch to a specific method, and dispatching
based upon multiple arguments (called multi-methods).
The essence of high-quality dispatch in an object system is being able to call multiple
methods in a single function call (multiple dispatch), and being able to control method
call order (dispatch control). Imagine a resource-based picture class (cPicture),
which inherits from a resource class (cResource). To draw a cPicture object
(draw(picture)), you want to first make sure the picture resource is in memory. The
cResource is used as a before-method, to check and load the resource if it has been
purged. The cPicture draw method, which draws the picture, is an after-method.
An advanced object system allows dispatch control using before- and after-methods. A
truly flexible object system lets dispatch control be altered dynamically, while a
program runs, as part of dynamic definition.
Another aspect of dispatch control is being able to dispatch to a specific class method,
rather than accepting the default dispatch. For example, you may want to draw just the
handles on a graphic object by calling dispatch_to(cGraphic, draw, self), rather than
calling draw(self), which draws an object and its handles. Almost all object systems
offer this capability.
Multi-methods, which is being able to dispatch to a method based upon multiple
arguments, is the object-oriented equivalent of a function being able to take variable
arguments.
Object Links
One of the problems with procedural programming is that it takes effort to build
self-contained, reusable software modules. But is it easy to link data structures
through functions.
In a role reversal to procedural programming, the modular, decentralized nature of
object orientation presents an interesting design decision: how best to link and
integrate related objects (and classes). A significant challenge with object-oriented
programming is designing and implementing the links between objects of different
classes. While object links are the basis for object-oriented databases (OODB), they
are also a necessary ingredient of any object-oriented application. Because object
links are a structural element of any object-oriented application, a good object system
should offer built-in support for object links.
Object System Survey
Smalltalk
Dating back to 1967, Simula was the first object-oriented language. But, because of
its looming influence, Smalltalk is the grandmother of object-oriented programming
languages. Smalltalk was designed as part of an object-oriented environment, with
hundreds of classes, where everything is object-oriented. There is no class-object
distinction with Smalltalk. Using the Smalltalk environment is a "deep immersion
experience in a land of objects, which is why it has been such an inspiration.
Smalltalk has a surprising limitation: it does not support multiple inheritance.
Because even the simplest message uses dynamic binding (even the + in C = A + B),
Smalltalk is slow.
The phraseology of "sending messages to objects" is a holdover from Smalltalk, where
the syntax is object-verb (such as 'thisOval draw' to draw thisOval), rather than the
more typical function-calling paradigm of verb-object (such as draw(thisOval)). As
with most languages, the verb-object function call model is used in this article.
CLOS
The Common Lisp Object System, known as CLOS, is the ANSI-standard language
extension to Lisp that adds object orientation. CLOS is noteworthy because, in a sea of
tug-boat object systems, CLOS is a luxury liner. CLOS supports multiple inheritance,
dispatch control, multi-methods, flexible class-object construction, dynamic binding
and dynamic definition. To simplify the application programming interface (API),
CLOS consistently uses generic functions. Generic functions are polymorphic
functions, such as draw and act. The object orientation that CLOS allows is
tremendously flexible and expressive, but because Lisp has a limited domain, namely
AI and list/language processing, CLOS, like Lisp, will never become a mainstream
language.
C Object Systems
Because of its simplicity, flexibility, efficiency and range, C has become the industry
choice for systems and application software development. It is natural to extend C into
the object-oriented realm. A few interesting attempts have been made.
Objective-C
An early attempt to make C object oriented was Objective-C. Objective-C adds
Smalltalk-like object orientation using a strict superset of C. Objective-C adds a class
definition mechanism, an object data type, and a message expression type. In
Objective-C, each class is defined by two files: an interface file, and an
implementation file. The interface file specifies the class programming interface:
class and superclass names, along with instance variable (object) declarations and
method declarations. The implementation file has class method code.
Objective-C supports multiple inheritance. Like Smalltalk, Objective-C permits class
methods. Like C++, Objective-C provides ways to enforce data hiding and restrict
method access. Objective-C lacks dispatch control or dynamic definition.
Included in the NeXT operating system environment is a set of classes written in
Objective-C for application development. Because these classes are native to the
platform, NeXT application development is relatively easy, especially compared to the
complex nightmare of the Macintosh Toolbox. Though there is little marketing of the
product, Objective-C is available on the Macintosh as an MPW C preprocessor.
Commercially, Objective-C was ahead of its time. Its corporate sponsor, Stepstone
Corporation, was near financial death before being resuscitated by adoption for the
NeXT line of workstations.
C++ is a another language extension to C. Only part of the C++ extensions have to do
with object orientation. Operator overloading, for example, adds flexibility to C, but
has nothing to do with object orientation per se.
The object-oriented part of C++ implements a limited version of object orientation.
Multiple inheritance is supported, as are class variables, class methods,
multi-methods and optional dynamic binding, but C++ lacks dispatch control or
dynamic definition. C++ classes have automatic initialization and deallocation methods.
C++ class constructs provide three levels of enforced information hiding. Access to
data or methods can be private, protected or public. Restrictions can be overridden
(by friend classes). The information-hiding features require new language syntax that
complicates what was (in C) a lean language definition. Further, this feature sits in
odd contrast to C's celebrated openness with typecasting, data manipulation and the free
use of function pointers.
MacApp, now written in C++ after previous incarnations in Object Pascal, currently
provides the only commercial set of C++ classes for Macintosh application
development. MacApp has become a sitting duck for its successor, Bedrock. Bedrock's
design is based upon the THINK Class Library (TCL), but with many more classes.
Bedrock development is taking the kitchen sink approach of providing over a hundred
classes, but, at the time of writing, it lacks a streamlined architecture or a high
degree of integration between classes. Bedrock is still under development, so it is
premature to assess its adequacy.
OOPC
OOPC (pronounced "oop-see") is an acronym for "Object Oriented Programming in C".
OOPC has an unusual implementation, in that it is not an extension to the C language,
but rather a set of functions that turns C into an object-oriented language.
Object-oriented programs written in OOPC look like standard C code, because they are
just that. This consistency with C simplifies learning and using OOPC.
The look and feel of OOPC, while simple, is deceiving. OOPC has all the features of
CLOS: multiple inheritance, dispatch control, flexible class-object construction, and
dynamic definition. Multi-methods can be simulated. Plus, OOPC comes with built-in
support for object links and object-oriented database (OODB) development.
Like C++, OOPC provides automatic initialization and deallocation methods. OOPC also
implements a form of garbage collection to prevent an object from being released
while it is still linked to any other object. Unlike Objective-C or C++, OOPC does not
enforce data hiding or restrict method access.
OOPC always uses dynamic binding, but this overhead can be mitigated by creating a
dispatch table, which essentially results in the same efficiency as static binding while
still allowing dispatch control options.
To simplify the programming interface, the OOPC consistently uses verb functions.
OOPC verb functions are the same as CLOS generic functions: verbs used as
polymorphic functions.
Although it has existed for years, OOPC has only recently been released to the public as
a commercial product. The OOPC object system is only part of the product currently
sold for Macintosh application development. OOPC includes a set of code libraries for
rapid application development. Some low-level function libraries exist for efficiency
and interface to the native operating system. OOPC also comes with a set of classes that
provide automatic document management, an application user interface,
object-oriented database construction, a graphics package, and rule-based artificial
intelligence. The consistent use of verb functions, streamlined class architecture and
integration between classes simplify learning and using the OOPC class library, thus
making OOPC suitable for novice and professional programmer alike.