All Databases MacTech Vol 15-1999

Arrays Iterators Comparators

Volume Number: 15

Issue Number: 2

Column Tag: PowerPlant Workshop

Arrays, Iterators, and Comparators... Oh my!

by John C. Daub, Austin, Texas USA

A look at PowerPlant's Array Classes

No matter what you do in life, sooner or later you have to deal with data - it's

unavoidable. This is certainly true in software development. We need ways to read and

write data, make it persistent, add and remove, examine, and otherwise manipulate

information. Without data, there's not much point nor ability to do what we're doing.

Since the manipulation of data is so vital to software, it would stand to reason that

providing reusable means to work with data would be welcomed by software

developers.

There are many different forms of data manipulators available. There are end-user

solutions like FileMaker, database engines such as NeoAccess and OOFILE, and simpler

solutions such as the C++ Standard Library containers and PowerPlant's Array

Classes. If you haven't guessed, that's what this article is about - PowerPlant's Array

Classes. These classes provide PowerPlant and you with a Mac-savvy means of working

with lists of data. If you look through the PowerPlant sources themselves, you'll see

the Array classes used in many key places; chances are good you'll find yourself using

them throughout your own code as well. The Array Classes certainly don't fit the needs

of every situation, but in order to locate the place for them in your "Programmer's

Toolchest" read on to see what they have to offer.

Array Overview

The main purpose of the Array Classes is to provide a means for storing and working

with lists of data. The Array Classes are comprised of three components: Arrays,

Iterators, and Comparators. Arrays are an ordered collection of items. Iterators are a

means of sequentially traversing Arrays. Comparators are a means for comparing

Array elements. Used together, these components provide you with a powerful and easy

to use data management system - again, it's what PowerPlant uses itself (I'll provide

some examples of this throughout the article). Figure 1 shows the classes and class

hierarchy for the Array Classes.

Figure 1. The PowerPlant Array Classes hierarchy.

One item to note is that a couple classes listed above, LSharableArray and

LFastArrayIterator, are not found in the Array Classes folder within the PowerPlant

folder of the Metrowerks CodeWarrior Professional installation. These classes came

from the Metrowerks Constructor sourcebase, and consequently are found within the

Constructor Additions folder in the PowerPlant folder (at least as of this writing,

using CodeWarrior Pro 4). Nevertheless, they are still available for your use and

worth discussing in this article.

Arrays

As you might guess, Arrays are the core of the Array Classes, and LArray is the central

base class. LArray is an ordered sequence of fixed-sized items, not unlike a regular

C/C++ array. However unlike a regular C/C++ array whose size is static and

determined at compile time, an LArray is dynamic - you can add and remove items at

runtime. This is accomplished by storing the Array data within a Handle, which is

resized as data is added or removed using only as much memory as is required at the

time. An LArray is also unlike a regular array in that an LArray is one-based and not

zero-based; that is, the first item is at index one and not index zero. Other differences

between regular arrays and LArray is that an LArray can be iterated and sorted. We'll

discuss iteration and sorting later in the article.

One additional item of note is what can be stored within an LArray. The answer is:

almost anything. When you add an item to an LArray you do not store that item itself

but rather a copy of that item (deep in the bowels of LArray the internal storage

Handle is resized and BlockMoveData is called to actually add/copy the item into the

LArray). Due to this storage mechanism, you can store just about any type of data:

built in data types like char, int, long; other simple data types like Handle, Ptr,

OSType; simple data structures like Point, Rect, SPaneInfo; and pointers from

NewPtr, malloc, or new/new[]. What cannot be stored are more complex data such as

C++ objects. Objects cannot be stored for a few reasons.

First, recall that the internal storage of an LArray is a Handle, which is a relocatable

block of memory. C++ objects have some invisible structures attached to it to take

care of housekeeping details, such as vtables. These structures contain pointers to

specific areas in memory where your object is. If you store your object within a

Handle and the Memory Manager moves that Handle, your vtables do not move and

update with the object. The vtables are now invalid and chances are good that next time

you try to access members of that object you'll crash or have some other sort of

unpredictable behavior. Second, it's generally considered evil to perform bitwise

copies of objects (Cline 1995, FAQ 208), and that is exactly what BlockMoveData

does. This is why we have copy constructors and assignment operators, but

unfortunately there is no way to invoke those within LArray's copy mechanisms. But

even if they could be invoked, or if there was a case where the bitwise copy would be

ok (there certainly are instances of this), you still have problem one to contend with.

If you need to store objects in an LArray, you should instead store pointers to the

objects. LArray's can only store plain old datatypes (PODs).

Basic Functionality

If you haven't yet, and you have the ability to do so, open a copy of LArray.cp (I'm

using the version from CodeWarrior Pro 4). You'll notice the class is fairly sizable,

but it's certainly not unmanagable. Most of the functionality can broken down into a

few logical groups, and the ones you'll use most frequently will be the manipulators

and the inspectors.

The manipulators allow you to add, remove, and change Array items. To add an item to

an Array you can use AddItem() or InsertItemsAt(). AddItem() allows you to add one

item to the Array. InsertItemsAt() allows you to insert one or more items into the

Array at a specified index; all inserted items are set to the same value, and are

inserted contiguously. The difference between AddItem() and InsertItemsAt(), aside

from the ability to insert one item versus multiple items, is if the Array is unsorted,

AddItem() adds to the end of the Array (if the Array is sorted, AddItem() actually calls

through to InsertItemsAt(), and the item is added wherever the sort places it). The

bonus of AddItem() is a speed gain as it does not have to contend with index range

checking nor the possible overhead of maintaining a sorted Array. If your Array

doesn't bother with sorting and/or adding to the end is acceptable, you'll probably want

to use AddItem() over InsertItemsAt() for the speed gain.

Since you can add items, you also need to remove items. LArray provides the

RemoveItemsAt(), RemoveAllItemsAfter(), and Remove() methods for just this

purpose. RemoveItemsAt() removes a specified number of items starting at a given

index. RemoveAllItemsAfter() is a variation on RemoveItemsAt() that will remove all

items located after the specified index. Finally, given a particular item, Remove()

removes that item from the Array.

The remaining manipulators allow you to modify existing Array items.

AssignItemsAt() allows you to assign a (new) value to the item(s) you specify.

SwapItems() allows you to exchange the values of two items within an unsorted Array.

MoveItem() moves an item within an unsorted Array from one position to another. It

is important to note that MoveItem() works as if you first removed the item and then

inserted it - the index is relative to the Array, not the items within the Array. To

illustrate, suppose you have an Array containing five items and you wish to move the

item at index two to index four:

theArray->MoveItem(2, 4);

The action is simple to do, but the results might not be quite what you expect. Before

the move your Array might have looked like this:

Index: 1 2 3 4 5

Item: A B C D E

You might expect after the move for your Array to look like this:

Index: 1 2 3 4 5

Item: A C B D E

If MoveItem() was item-based, this would be true. It would have taken the second

item, "B", and placed it where the fourth item, "D", was, bumping up those after, "D

and "E". But notice a resulting effect: "B" is not at index four but rather index three!

This could certainly lead to unexpected behavior as the result doesn't quite match the

original intent. Instead, MoveItem() is index based so after the move your Array would

actually look like this:

Index: 1 2 3 4 5

Item: A C D B E

Now item two has been moved to become item four. It was as if item two was first

removed and the Array adjusted from that removal, the itermediate step leaving the

Array looking like this:

Index: 1 2 3 4

Item: A C D E

And then the item inserted at the new index (four). MoveItem() doesn't actually

remove then insert the item, but that's the essence of how the method works.

The other group of methods you'll commonly use will be the inspectors. The inspectors

enable you to obtain information about items in the Array or the Array itself.

ValidIndex() returns if the given index is valid for the Array or not. GetItemSize()

returns the size of the requested item. GetCount() returns the number of items in the

Array. FetchIndexOf() returns the index of the given item in the Array, if the item is

in the Array. And probably the most commonly used inspector is FetchItemAt(), which

returns to you a copy of the item at the given index.

LArray also contains many other methods. The majority of these methods are protected

methods used internally by LArray to do the voodoo that it do so well, but there are a

few other public methods such as: Sort(), to sort the Array; and GetComparator() and

SetComparator(), to access the Array's Comparator object; and of course the various

constructors and destructor for the class. Although you do not need to worry yourself

with these internal methods to begin using LArray, you should eventually familiarize

yourself with how all of the methods work so you can gain a complete and thorough

understanding of how LArray operates.

LArray Subclasses

Looking back at Figure 1, you see there are five subclasses of LArray provided by

Metrowerks in the PowerPlant distribution: TArray, LVariableArray, LRunArray,

TRunArray, and LSharableArray.

TArray is a simple template wrapper class that inherits from LArray. Looking in

TArray.h, you'll notice only a handful of LArray methods are overridden, and then only

those public methods that take items/data as an argument. This is done to provide you

with a degree of typesafety in using the Array classes. Notice the LArray methods that

take items/data as an argument pass the argument as a void*; this is a necessary evil

to allow LArray to store various types of data and be a general-purpose class.

However, the use of void* not typesafe and prone to errors. By using TArray you are

able to specify the type of data stored within the Array and can avoid errors that can

stem from using mismatched types. Furthermore, since the type is known you can pass

the item/data by reference instead of by address, which some would contend is a "more

C++" way to pass arguments. Moreover, since type information is known you do not

need to worry about passing the size of the data since TArray handles that for you. Look

again at the overrides in TArray and you'll notice they're actually overloads. There is

one side-effect to TArray caused by these overloads/overrides, but it's nothing to

worry about. Strictly speaking the TArray methods are hiding the inherited virtual

functions from LArray, and if your compiler can notify you of such situations (the

Metrowerks C/C++ compilers can do this) then you will receive some compiler

warnings to that effect. There is no need to worry about these warnings as the design of

TArray will work correctly. You can suppress the warnings by turning off that

compiler warning globally (e.g. turn it off in the compiler's preference panel) or on a

case by case basis, which is what PowerPlant itself does. Look at the top of any

PowerPlant header file for a class that uses a TArray, such as LView.h. You should see

a declaration similar to this:

#pragma warn_hidevirtual off

template class TArray;

#pragma warn_hidevirtual reset

The #pragma will temporarily turn off the compiler warning as you explicitly

instantiate the template class allowing your code to compile cleanly and quietly.

A feature that TArray has that LArray does not is operator[]. operator[] allows your

Array to function, in terms of notation, just like a regular C/C++ array. Don't forget

that PowerPlant Array's are one-based, so although the notation might make it look

like a C/C++ array, it's still a PowerPlant Array. The benefit of operator[] is a huge

speed gain over inspectors like FetchItemAt(), but in exchange for speed you are

returned a pointer to the data within the internal storage buffer (instead of a copy)

and there is no range checking performed on the given index. These trade-offs can be

viewed as positives and/or negatives for using operator[]; it just depends upon the

context for use. But regardless of how you view TArray::operator[], tread carefully to

avoid any problems.

LVariableArray is similar to LArray in almost every way except instead of storing

fixed-sized items, LVariableArray stores items of varying size (but all items must

still be of the same type). Probably one of the most popular uses for LVariableArray is

Referenced by (5):