Iterating for Profit and Pleasure
Volume Number: 16
Issue Number: 5
Column Tag: Programming
By Neil Mayhew, Calgary, AB
Using iterators to tap into the power of the
Standard Template Library
A Brave New world
Iterators are not a new idea. These lightweight objects provide a convenient way to step
through the elements of a collection. Likewise, the Standard Template Library has been
around for a while now, although many people have still to explore its depths.
Iterators are a fundamental part of STL, and an understanding of their role is vital for
effective use of the Library. Many of the data structures that we work with every day
are 'virtual' collections, and by writing our own STL-compatible iterators for them
we can make use of many powerful algorithms from STL.
As an example of custom iterators, this third and final article presents a set of classes
that iterate through the items in a MacOS folder. These mesh with algorithms from STL
to produce an application that checks the System Folder for additions and changes to its
extensions and control panels since the last time the application was run.
Developers already familiar with STL and its use of iterators may wish to skip to the
first implementation section, "Walking a Folder".
In search of elegance
In the most general sense, an iterator is an object that records an index into a
collection of other objects, and allows those objects to be retrieved for processing. Of
course, it must also be possible to advance the index forwards, and possibly
backwards, to reference other members of the collection. These actions are invoked by
calling methods of the iterator object.
An example of the effective use of iterators is in the reading of run-length encoded
pixel data. It is possible to define an iterator that presents the illusion of a rectangular
array of pixels, whereas this array is a virtual collection that is computed on the fly
from the run-length-encoded data. The iterator's internal storage (instance
variables) record a pointer to the current run and a count of the number of pixels still
to be read from that run.
There are of course many ways to design a set of methods for an iterator. For example,
next() and previous() methods can be used for advancing forwards and backwards
through the associated collection, and atEnd() can be used to test for the availability of
further elements. Unfortunately, a lack of standards in this area is a great hindrance
to code reuse. Code that uses iterators from a given source cannot easily be adapted to
use iterators from a different source. Likewise, programmers often find it easier to
develop their own, simple-minded solutions to a problem than learn how to use yet
another set of iterator conventions.
Yet C has some very powerful iterators of its own that many other languages lack:
pointers. These are iterators in the sense that they step through a collection (an array
or linked-list) without additional support. There is no need for a separate reference to
a collection that must be used with a subscripting operation when retrieving elements.
Countless algorithms owe their elegance to the use of pointers instead of arrays and
subscripts. Learning to use pointers effectively is therefore one of the key
requirements for becoming a skilled C programmer.
A standardized approach
Since C already has a well-established paradigm for the use of iterators, it makes
sense to standardize on this model in C++. We discussed smart pointers in an earlier
article. These are objects that have the syntax and semantics of a pointer, through
overloading the *, -> and ++ operators. A standardized iterator is therefore a special
case of a smart pointer.
The advantage of this approach is that code may be written in a style that is very
similar to the one that would be used with raw pointers. This makes the code easy to
write and easy to understand, and yet considerable sophistication may be present
'under the surface' within the iterators themselves. There need not be any hidden
pitfalls, however: if the semantics of the iterator are clearly defined, and consistent
with the usual behavior of pointers, then client code will work as expected.
In fact, code using smart pointers for iterators may often be safer than equivalent code
using raw pointers. For example, a smart pointer is usually capable of performing
bounds checking, in a debug build at least. Or, awkward and error-prone pointer
manipulations can be taken care of in one place, inside the iterator, and client code is
free from having to worry about such details.
Mastering new idioms
A subtle characteristic of the smart-pointer approach to iterators is that they will be
passed around by value, since this is how regular pointers are treated. These iterators
therefore need to be small, usually just 4 or 8 bytes; any bigger than this and client