July 93 - James Coplien’s Advanced C++
Eric Rose
Advanced C++ is an excellent book for the C++ programmer who is interested in
object-oriented programming.
It covers the traditional issues of software design and reuse in an object-oriented
programming language. But Coplien goes further, taking the model of more
dynamic languages and applying it to C++. He shows how one can design conventions in
C++ that support incremental development, garbage collection, runtime
polymorphism, and dynamic object loading. The book is filled with a diversity of ideas
and topics, programming techniques and idioms: food for thought for the interested
reader.
First Impressions
One is confronted with a myriad of C++ books today. They sit on the bookshelf in the
Computer Languages section of the bookstore, and most of them look the same. This
problem seems to plague all technical books, especially those on programming
languages. How can one evaluate one of many books about an unfamiliar language? This
is the dilemma I was confronted with last year at the bookstore. After selecting
Stroustrup's C++ book (the new edition), I wanted to have another reference book, so I
picked one called Advanced C++ by James Coplien. After all, most language books seem
to stop teaching the language just as it becomes interesting, so even if Coplien's book
was slightly advanced, it would be a good choice. Besides, my employer was buying.
Later, as I read the book, I found that I became completely lost during the fifth chapter.
Obviously this book would require a more thorough reading. Coplien's book is about
C++, but it teaches object-oriented programming technique as well. With C++,
Coplien ventures into many of the aspects of experimental, dynamic languages such as
Smalltalk and Self. By carefully reading and re-reading the chapters, often pausing to
ponder over the examples, I began to comprehend them. As I learned C++, chapters of
the book began to make more sense. This is one characteristic of a good book: it grows
with the reader.
TOPIC SUMMARY
Chapter 1, "Introduction," introduces the design philosophy of C++ as well as a short
history. Coplien explains how C++ was conceived and how it grew over the years as it
was used. Object-oriented programming has grown with C++, and there has always
been a tension between keeping the language simple and adding new features. For
example, rather than add new keywords, many of the existing ones, such as static or
virtual, were reused. This kept the number of keywords low, but added to the
complexity of the compiler and made the language harder for programmers to learn. Another
tension in C++ is that of providing for the programmer's needs with language features
without restricting the implementation of the language or the compiler.
Chapter 2, "Data Abstraction and Abstract Data Types," is a class-oriented
introduction to C++. Classes, constructors and destructors, initialization, scoping,
const, and pointers to members are introduced in this chapter. Many of the differences
between C and C++ that are not strictly object-oriented in nature are reserved for
Appendix A.
Chapter 3, "Concrete Data Types," describes how to write objects that behave as
built-in data types. Scoping and access control, the semantics of overloading,
operators, and type conversions are discussed. Coplien also introduces the idea of
reference counting along with several implementations: using handle classes or
counted pointer objects. Bootstrapping reference counting to an existing class is also
covered. The orthodox canonical form, introduced in the beginning of the chapter, is a
commonly used format for objects in C++.
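To make the reference-counting idea concrete, here is a minimal sketch of a handle class wrapped around a shared, counted representation. It is written in present-day C++ rather than the dialect of the book, and the names String, StringRep, and useCount are hypothetical, not Coplien's. The handle supplies a default constructor, copy constructor, assignment operator, and destructor, which is my understanding of what the orthodox canonical form asks for.

```cpp
#include <cstring>

// Hypothetical shared representation: owns the characters and a use count.
class StringRep {
public:
    explicit StringRep(const char* s) : text(new char[std::strlen(s) + 1]), count(1) {
        std::strcpy(text, s);
    }
    ~StringRep() { delete[] text; }
    StringRep(const StringRep&) = delete;             // never copied, only shared
    StringRep& operator=(const StringRep&) = delete;

    char* text;
    int count;   // number of handles currently sharing this representation
};

// Hypothetical handle class: copies share one StringRep and adjust its count.
class String {
public:
    String() : rep(new StringRep("")) {}
    String(const char* s) : rep(new StringRep(s)) {}
    String(const String& other) : rep(other.rep) { ++rep->count; }
    String& operator=(const String& other) {
        ++other.rep->count;                 // increment first: safe for self-assignment
        if (--rep->count == 0) delete rep;  // release the old representation
        rep = other.rep;
        return *this;
    }
    ~String() { if (--rep->count == 0) delete rep; }

    const char* c_str() const { return rep->text; }
    int useCount() const { return rep->count; }

private:
    StringRep* rep;
};
```

Copying a String never copies the characters; the last handle to go away deletes the shared representation.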
Chapter 4, "Inheritance," is an explanation of single inheritance. Some of the
semantics of object inheritance, such as the order of instantiation and initialization,
member access (public/protected/private), pointer conversion, and passing
parameters to base class constructors are considered. The concept of a virtual function
is reserved for chapter 5; instead Coplien implements a set of classes with type
selector fields. This provides a good contrast with the next chapter, where the same
example is rewritten using virtual functions to eliminate the type member variable
and corresponding case statements.
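The contrast between the two chapters can be sketched as follows. This is an illustration with hypothetical shape classes, not the example the book uses: the first style carries a type selector field and a switch, the second replaces both with a virtual function.

```cpp
// Style 1: a type selector field, with a switch wherever behavior differs.
struct TaggedShape {
    enum Kind { KindRectangle, KindTriangle };
    Kind kind;
    double width;
    double height;
};

double area(const TaggedShape& s) {
    switch (s.kind) {
    case TaggedShape::KindRectangle: return s.width * s.height;
    case TaggedShape::KindTriangle:  return 0.5 * s.width * s.height;
    }
    return 0.0;  // unknown tag
}

// Style 2: the type field and the switch disappear; each class answers for itself.
class Shape {
public:
    virtual ~Shape() {}
    virtual double area() const = 0;
};

class Rectangle : public Shape {
public:
    Rectangle(double w, double h) : width(w), height(h) {}
    double area() const { return width * height; }
private:
    double width, height;
};

class Triangle : public Shape {
public:
    Triangle(double b, double h) : base(b), height(h) {}
    double area() const { return 0.5 * base * height; }
private:
    double base, height;
};
```

Adding a new shape in the first style means editing every switch; in the second it means adding one class.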
Chapter 5, "Object-Oriented Programming," a long chapter, introduces many new
ideas. This is where the terminology becomes complicated. Coplien will introduce an
idea with its term, such as a virtual function, and will go on to build even more
abstract ideas from it. He spends several sections explaining the idea of a virtual
function and its purpose, including some explanation of the runtime penalty for
invoking one. Many other C++ books will gloss over this idea with a half-baked
example. Advanced C++ treats the topic thoroughly and is careful to explain the
implications and restrictions of virtual function calls.
Virtual destructors (why they are useful) and scoping are included. Coplien then uses
the idea of a virtual function to introduce the idea of a pure virtual function as a
mechanism to create an abstract base class. The Envelope/Letter idiom is reintroduced
as a powerful way to extend the idea of a single class. By using two classes, one that
contains and controls access to the other, more dynamic functionality can be achieved. It
is reimplemented using the new concepts of inheritance and delegation using virtual
functions.
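A minimal sketch of the Envelope/Letter arrangement follows, under simplifying assumptions. The names Number, NumberRep, RealRep, and ComplexRep are hypothetical, and the letter classes here have their own abstract base instead of deriving from the envelope, which may differ from the layout in the book. The envelope owns a letter, forwards calls to it through virtual functions, and picks the kind of letter at run time from its constructor arguments.

```cpp
#include <string>

// Letter side: an abstract base class with pure virtual functions.
class NumberRep {
public:
    virtual ~NumberRep() {}                      // virtual: deleted through a base pointer
    virtual NumberRep* clone() const = 0;
    virtual std::string kind() const = 0;
    virtual double real() const = 0;
};

class RealRep : public NumberRep {
public:
    explicit RealRep(double v) : value(v) {}
    NumberRep* clone() const { return new RealRep(*this); }
    std::string kind() const { return "real"; }
    double real() const { return value; }
private:
    double value;
};

class ComplexRep : public NumberRep {
public:
    ComplexRep(double r, double i) : re(r), im(i) {}
    NumberRep* clone() const { return new ComplexRep(*this); }
    std::string kind() const { return "complex"; }
    double real() const { return re; }
private:
    double re, im;
};

// Envelope side: the only class that client code names directly.
class Number {
public:
    Number(double re, double im = 0.0) : rep(choose(re, im)) {}
    Number(const Number& other) : rep(other.rep->clone()) {}
    Number& operator=(const Number& other) {
        NumberRep* copy = other.rep->clone();    // clone first: safe for self-assignment
        delete rep;
        rep = copy;
        return *this;
    }
    ~Number() { delete rep; }

    std::string kind() const { return rep->kind(); }   // delegation to the letter
    double real() const { return rep->real(); }

private:
    // The kind of letter is decided from the data, at run time.
    static NumberRep* choose(double re, double im) {
        if (im == 0.0) return new RealRep(re);
        return new ComplexRep(re, im);
    }
    NumberRep* rep;
};
```

Assigning one Number to another can change the kind of letter inside the envelope, something an ordinary C++ object cannot do to its own class.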
The chapter continues by giving several implementations of virtual constructors.
There is also an extension to the Envelope/Letter idiom to allow variable-sized
objects, and one for delegation by overloading the "->" operator. Functors, objects that
behave like functions (by overloading the "()" operator), are introduced with a good
example, and the chapter ends with a discussion of the uses and pitfalls of multiple
inheritance. This is a good introduction to some of the problems created by allowing
multiple inheritance in a flexible language such as C++. Casting a pointer to a
multiply-inherited object can, in some situations, change the value of the pointer.
This is a potentially nasty surprise for those native C programmers who are
accustomed to loose pointer casting conventions.
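The pointer surprise can be shown with a small sketch; the class names are hypothetical. The two base-class parts of a multiply-inherited object occupy different addresses, so at least one of the derived-to-base conversions must change the pointer value. The exact layout is left to the compiler.

```cpp
struct Reader { int readCount; };
struct Writer { int writeCount; };
struct ReadWriter : Reader, Writer {};   // multiple inheritance

// True when the two base-class views of one object have different addresses.
bool baseViewsDiffer(ReadWriter& rw) {
    Reader* r = &rw;   // implicit derived-to-base conversion
    Writer* w = &rw;   // the compiler adjusts the address to the Writer part
    return static_cast<void*>(r) != static_cast<void*>(w);
}

// static_cast applies the reverse adjustment when going back to the whole object.
ReadWriter* backToWhole(Writer* w) {
    return static_cast<ReadWriter*>(w);
}
```

A C-style round trip through void* skips this adjustment, which is where loose casting habits go wrong.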
Chapter 6, "Object-Oriented Design," introduces some of the theory involved in
object-oriented programming. Object relationships and iconic representations for
them are shown. There is an excellent discussion of some of the problems of using
inheritance to solve type problems. Some designers will use inheritance not because it
implies a type relationship, but to gain functionality for an unrelated object. Others
will use inheritance with homonymic types, types that share a common set of
operations but differ semantically (usually one has more restricted functionality than
the other). Coplien distinguishes between proper and improper uses of inheritance,
with good examples. Inheritance with addition and cancellation are considered. Coplien
also brings up the topic of public data in objects, and relates it to the problems of
inheritance with class independence. At the end of the chapter are rules of thumb for
subtyping and inheritance.
Chapter 7, "Reuse and Objects," covers some of the software issues surrounding code
reuse. Coplien stresses that object-oriented systems must be consciously designed for
reuse. Four code reuse methods, from preprocessor macros to templates, are proposed.
Often a system that is designed with reuse as its highest priority cannot fulfill the
basic goals that were intended, so it is important to consider reuse, but perhaps not as
the primary factor. At the end of the chapter are some code reuse generalizations.
Reuse is not solely an object-oriented programming issue. In fact, code reuse is more
dependent on good documentation and indexing facilities than on the language used. A CASE
system can enhance reusability for any large software project. While C++ provides
several reuse mechanisms, such as inheritance, templates, and macros, reuse must be
designed into the class hierarchy. Sometimes reuse and functionality are in conflict
with one another. Also, good software engineers should consider the level of generality
that the system will support. Class libraries are perhaps the best example of a
system designed with reuse as a principal goal. Stroustrup is quoted as saying "Code
must be usable before it can be reusable."
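As an illustration of the two ends of that range, here is the same small piece of reusable code written as a preprocessor macro and as a template. The example is mine, not one from the book, and the names are hypothetical. The macro reuses text, so an argument with a side effect can be evaluated twice; the template reuses a typed function, so each argument is evaluated once.

```cpp
// Reuse by textual substitution: the arguments are pasted into the expression.
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

// Reuse by template: a real function is generated for each argument type.
template <class T>
T maxOf(T a, T b) { return a > b ? a : b; }

// Returns the counter after the macro call: i++ is expanded into two places.
int counterAfterMacro() {
    int i = 1;
    int m = MAX_MACRO(i++, 0);
    (void)m;
    return i;
}

// Returns the counter after the template call: i++ is evaluated exactly once.
int counterAfterTemplate() {
    int i = 1;
    int m = maxOf(i++, 0);
    (void)m;
    return i;
}
```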
Chapter 8, "Programming with Exemplars in C++," introduces the idea of an
Exemplar. Exemplars allow more dynamic programming by making the class
constructor private (or protected) and providing a "make" virtual function with a
global "exemplar" object. To create a new object, one applies "make" to the exemplar.
By using a virtual member function that is not a constructor, one can invoke make on
any object to clone it, when only a pointer (possibly to a base class) is available. The
exemplar can be either a global pointer or a class variable (static member).
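A minimal sketch of an exemplar, assuming the static-member variant. The class names are hypothetical and the code is not taken from the book. The constructors are private, so the only way to obtain a new object is to apply make to an existing one, starting with the exemplar.

```cpp
#include <string>

class Shape {
public:
    virtual ~Shape() {}
    virtual Shape* make() const = 0;        // new object of the receiver's own class
    virtual std::string name() const = 0;
};

class Circle : public Shape {
public:
    static Circle exemplar;                 // the one object clients start from
    Shape* make() const { return new Circle; }
    std::string name() const { return "circle"; }
private:
    Circle() {}                             // clients cannot construct directly
};
Circle Circle::exemplar;

class Square : public Shape {
public:
    static Square exemplar;
    Shape* make() const { return new Square; }
    std::string name() const { return "square"; }
private:
    Square() {}
};
Square Square::exemplar;

// Works with only a base-class reference: the dynamic type picks the class.
Shape* makeSameKind(const Shape& s) { return s.make(); }
```

Because make is an ordinary virtual function, code that holds only a Shape pointer can still produce another object of the right class.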
Exemplars can be used to simulate virtual constructors by having an Abstract Base
Exemplar examine the data and return the appropriate new object. By moving all
member function signatures to the base class as virtual functions (Inclusion
Polymorphism) one can handle objects by using pointers to their base classes. This
simulates a more dynamic style of object referencing. Frame-based programming, in
which messages are dispatched to subclasses using a single "doit" function, is more
dynamic but is subject to performance problems and requires a good error recovery
subsystem.
Inclusion Polymorphism is a good idea, but requires more software support to be
feasible. Putting the member functions of subclasses in a common class makes a system
inflexible and requires substantial recompilation when an interface is changed,
making incremental development difficult. Coplien is aware of these problems and
gives some program administration tips to help. A large project would require either
strict conventions or a source preprocessor. It is important to remember that any
software system in C++ will require style and use conventions. As one pushes C++
into the realm of the dynamic, these conventions become important. Abstract base
exemplars need some information about their subclasses to implement virtual
constructors. The idea of an Autonomous Generic Constructor allows abstract base
exemplars to keep a list of their subclasses and return the first one that can
successfully construct itself from the input data.
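The registration scheme described above can be sketched as follows. This is an illustration under my own assumptions, not the code from the book: Token, NumberToken, WordToken, tryMake, and ExemplarTag are hypothetical names. Each subclass exemplar adds itself to a list kept by the base class, and the base class asks each exemplar in turn whether it can build an object from the input.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

class Token {
public:
    virtual ~Token() {}
    // Returns a new object if this class accepts the text, otherwise null.
    virtual Token* tryMake(const std::string& text) const = 0;
    virtual std::string kind() const = 0;

    // Asks each registered exemplar in order; the first success wins.
    static Token* make(const std::string& text) {
        const std::vector<const Token*>& list = exemplars();
        for (std::size_t i = 0; i < list.size(); ++i) {
            if (Token* t = list[i]->tryMake(text)) return t;
        }
        return 0;
    }
protected:
    struct ExemplarTag {};                 // selects the registering constructor
    Token() {}
    explicit Token(ExemplarTag) { exemplars().push_back(this); }
private:
    static std::vector<const Token*>& exemplars() {
        static std::vector<const Token*> list;   // built on first use
        return list;
    }
};

class NumberToken : public Token {
public:
    Token* tryMake(const std::string& text) const {
        if (text.empty()) return 0;
        for (std::size_t i = 0; i < text.size(); ++i) {
            if (!std::isdigit(static_cast<unsigned char>(text[i]))) return 0;
        }
        return new NumberToken;
    }
    std::string kind() const { return "number"; }
private:
    NumberToken() {}
    explicit NumberToken(ExemplarTag tag) : Token(tag) {}
    static NumberToken exemplar;
};
NumberToken NumberToken::exemplar{ExemplarTag{}};   // registers itself first

class WordToken : public Token {
public:
    Token* tryMake(const std::string& text) const {
        return text.empty() ? 0 : new WordToken;    // accepts any non-empty text
    }
    std::string kind() const { return "word"; }
private:
    WordToken() {}
    explicit WordToken(ExemplarTag tag) : Token(tag) {}
    static WordToken exemplar;
};
WordToken WordToken::exemplar{ExemplarTag{}};       // registers itself second
```

The base class never names its subclasses; adding a new kind of token means adding one class and its exemplar. The order of registration matters, since the first exemplar that accepts the input is the one used.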
Chapter 9, "Emulating Symbolic Language Styles with C++," describes how, by using
Envelope/Letter classes with Exemplars to create what Coplien calls the symbolic
canonical form, one can create a system that supports incremental development,
dynamic loading and reloading, and garbage collection. Coplien provides
system-dependent code that works on a Sun (SunOS 4.0, it seems) with the AT&T USL
C++ Release 3 compiler. To support dynamic loading in C++ with objects, one needs
procedures to load new virtual functions and to change object formats. All loadable
functions must be virtual, because when the new one is loaded, one edits the
virtual-function table (vtbl) and replaces the pointer to the old function with that of
the new. One also needs a way of indexing the vtbl, so that the slot for each virtual
member function is known. Loading an object with a new data format is more difficult.
It requires that one keep a list of all objects and apply a conversion function to them
(called cutover) when the new object is loaded. There is also some trickiness with the
order in which the operations are performed, and one must ensure that cutover is
applied only once for each object. Because the overhead to support these mechanisms
can be cumbersome, one would only want to implement a system this way if the
dynamic properties are important. Unfortunately, dynamic loading is implementation
dependent.
Coplien also shows how to implement garbage collection using a combined
mark-and-sweep and Baker's algorithm (semispace copying). The algorithm requires
each class to allocate from a memory pool and each object to have mark and in-use bits.
When free memory becomes low, first the objects in the exemplar's master list are
marked, then the memory pool is scanned and the unmarked objects in use are
reclaimed. The disadvantage of this system is the fixed-size memory pools. The chapter
ends with an implementation of multi-methods using the symbolic canonical form.
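The mark and sweep phases described above can be sketched in a few lines. This is a simplified illustration, not the collector from the book: it leaves out the Baker copying half entirely, uses a plain root list in place of the exemplar's master list, and gives each object a single outgoing reference. The names Cell and Pool are hypothetical.

```cpp
#include <cstddef>
#include <vector>

struct Cell {
    bool inUse;     // set while the slot is allocated
    bool marked;    // set during the mark phase if the cell is reachable
    Cell* next;     // a single outgoing reference, to keep the sketch small
    Cell() : inUse(false), marked(false), next(0) {}
};

class Pool {
public:
    explicit Pool(std::size_t n) : cells(n) {}   // fixed size, never resized
    Pool(const Pool&) = delete;
    Pool& operator=(const Pool&) = delete;

    // Hands out the first free slot, or null when the pool is exhausted.
    Cell* allocate() {
        for (std::size_t i = 0; i < cells.size(); ++i) {
            if (!cells[i].inUse) {
                cells[i].inUse = true;
                cells[i].next = 0;
                return &cells[i];
            }
        }
        return 0;
    }

    void addRoot(Cell* c) { roots.push_back(c); }

    // Marks everything reachable from the roots, then reclaims the rest.
    // Returns the number of cells reclaimed.
    std::size_t collect() {
        for (std::size_t i = 0; i < roots.size(); ++i) {
            for (Cell* c = roots[i]; c != 0 && !c->marked; c = c->next) {
                c->marked = true;
            }
        }
        std::size_t reclaimed = 0;
        for (std::size_t i = 0; i < cells.size(); ++i) {
            if (cells[i].inUse && !cells[i].marked) {
                cells[i].inUse = false;   // in use but unmarked: garbage
                cells[i].next = 0;
                ++reclaimed;
            }
            cells[i].marked = false;      // reset for the next collection
        }
        return reclaimed;
    }

private:
    std::vector<Cell> cells;
    std::vector<Cell*> roots;
};
```

The fixed-size pool shows up directly: allocate simply fails once every slot is in use and nothing can be reclaimed.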
Chapter 10, "Dynamic Multiple Inheritance," is a short chapter about implementing
multiple inheritance using pointers to base classes and delegation. Apparently this was
one of the methods used before language support for multiple inheritance was added to
C++.
Chapter 11, "Systemic Issues," is a collection of topics not directly related to the
other chapters. There is a short discussion of modules, frameworks, and software
libraries. "Dynamic System Design" is a section on the use of objects in a multithreaded
environment and the design issues this raises, such as error recovery, scheduling,
separate name spaces, and inter-object communication. This is a good summary of the
software design idioms that are useful in a multithreaded or multiprocessor context.
OVERALL IMPRESSIONS
Every object-oriented language has its own terminology. One of the difficulties in
learning C++ is the vocabulary. Coplien makes frequent use of the terms specific to
C++, but often mentions those used in other languages. It seems that he invents some of
his own as well. Understanding the terminology is critical to understanding the ideas
he presents. As a result, one quickly becomes lost by casually reading or "skimming"
this book. Each chapter builds on the ideas and idioms of the previous chapters, and
often will make reference to or reimplement idioms introduced earlier.
A major theme of Advanced C++ is the conflict of static typing with dynamic
programming. Coplien is interested in how a C++ programmer can write code to
simulate valuable features of dynamic object-oriented languages such as Self,
Smalltalk, or CLOS. For example, the idea of a virtual constructor is prevalent
throughout the book. A virtual constructor is an object constructor that evaluates the
data provided and returns the appropriate object. Thus what type of object is created is
not known until runtime. To accomplish this, a base class must have some information
about its derived classes. Coplien shows several ways to implement the idea of a
virtual constructor in C++, including Letter/Envelope classes and Abstract Base
Exemplars.
Examples
One aspect of the book I like is that Coplien does not lock himself into one mode of
solving problems. In the chapter on code reuse (chapter 7), he presents four different
mechanisms to achieve software reuse. Examples are used well in this book. They are
short enough to fit in a few pages, yet complex enough to warrant careful reading. The
text makes interesting comments on the examples. Some programming books do not go
into detail about the tradeoffs involved in the coding. Coplien usually explains what the
limitations of his examples are. In the beginning of the book Coplien writes C and its
equivalent C++ code side-by-side to show the usefulness of the C++ extensions. This
works well to demonstrate the advantages of classes over structs, as well as of constructors and
destructors over C init/destroy functions. Longer examples are relegated to the ends of
chapters. At the end of the book are several complete programs; however, none are too
long or complicated to type in and run.
Appendices
Coplien uses his appendices to introduce code samples and for short tangential topics
that are interesting but not related directly to object-oriented programming in C++.
For example, Appendix D demonstrates some of the problems with bitwise copy of
objects, and explains why member-by-member copying (sometimes called "deep
copying") is not always the correct solution.
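One familiar way a bitwise copy goes wrong is with a pointer member: copying only the pointer value leaves two objects sharing one buffer. The sketch below is my own illustration with hypothetical names, and it does not attempt to reproduce the appendix's argument. It contrasts that with a class whose copy operations duplicate the buffer.

```cpp
#include <cstring>

// The implicit copy of this struct copies the pointer value only,
// so the copy and the original refer to the same characters.
struct RawName {
    char* text;
};

// This class defines its own copy operations and duplicates the buffer.
class OwnedName {
public:
    explicit OwnedName(const char* s) : text(duplicate(s)) {}
    OwnedName(const OwnedName& other) : text(duplicate(other.text)) {}
    OwnedName& operator=(const OwnedName& other) {
        char* copy = duplicate(other.text);   // copy first: safe for self-assignment
        delete[] text;
        text = copy;
        return *this;
    }
    ~OwnedName() { delete[] text; }

    char* data() { return text; }
    const char* data() const { return text; }

private:
    static char* duplicate(const char* s) {
        char* p = new char[std::strlen(s) + 1];
        std::strcpy(p, s);
        return p;
    }
    char* text;
};
```

If RawName also had a destructor that freed its buffer, the shared pointer would be freed twice, which is the usual way this problem surfaces.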
Appendix A, "C in a C++ Environment" covers converting C programs to a style more
like C++. Some of it involves the conversions necessary to take traditional Kernighan
and Ritchie C to ANSI-C. (C++ is almost ANSI-C compliant.) It also covers how to use
const, interfacing with C libraries, sharing header files between the two languages,
and how names and object data formats are represented in a C environment. This
appendix is a good summary of some of the non-object-oriented aspects of C++.
Appendix C, "Reference Return Values from Operators" clarifies the concept of a
Reference, especially as a return value from an operator. References often confuse
novice C++ programmers who are accustomed to pointers in C. Appendix F,
"Block-Structured Programming in C++" explains how to write a C++ program using
blocks, or scopes, such as in a Pascal or Modula-2 program.
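The topic of Appendix C can be illustrated with a short sketch; Triple is a hypothetical class, not one from the book. An operator that returns a reference hands back the object or element itself instead of a copy, which is what allows a subscript expression to appear on the left of an assignment and lets compound assignments chain.

```cpp
class Triple {
public:
    Triple() { v[0] = v[1] = v[2] = 0; }

    // Returns a reference to the element, so t[i] = x modifies the object.
    int& operator[](int i) { return v[i]; }
    const int& operator[](int i) const { return v[i]; }

    // Returns a reference to the object itself, so calls can be chained.
    Triple& operator+=(int d) {
        for (int i = 0; i < 3; ++i) v[i] += d;
        return *this;
    }

private:
    int v[3];
};
```

Had operator[] returned int by value, an assignment through it would not compile, and a class type returned by value would be modified only as a temporary copy. A C programmer would have to return a pointer and dereference it at every use.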
Conclusions
A few weeks ago I found myself again in the computer science section of the bookstore.
As I browsed through a software magazine, I noticed a book review, which stated that
"Advanced C++ is a classic, a must-have on the shelf next to Stroustrup's C++ Primer
or the Annotated C++ Reference Manual (ARM)".
I concur. For the serious C++ programmer the book is a must-have. It is full of
interesting ideas and clever techniques that extend the power of the language. The only
problem I see is that using any of these techniques for a large system will require
some sort of preprocessing to aid in the generation of the support for each class. The
section on writing dynamically loadable (and reloadable) objects demonstrates the
programming skill and depth of understanding of the author. James Coplien is
well-versed not only in C++ and its implementation, but in object-oriented theory
and practice. I highly recommend this book.