Paralation
Volume Number: 8
Issue Number: 7
Column Tag: Lisp Listener
The Paralation Model 
A simple model consisting of only one data structure and three
operators
By Paul Snively, Sunnyvale, California
Note: Source code files accompanying article are located on MacTech CD-ROM or source code disks.
We’re rapidly approaching perhaps the most interesting time in the history of
personal computing. The winds of change are upon us, and this much is clear: we
programmers are going to have to adopt new programming paradigms if we expect to
survive into the future.
In terms of the hardware that we use, old classifications and distinctions are
being blurred or obliterated on a daily basis. The line between high-end PC and
low-end workstation is becoming artificial, a matter of marketspeak only, as time
marches on. The machine that this magazine is devoted to has always had the capability
that is probably second most often used to distinguish a workstation from a
PC: networking. The most often used designator, “performance,” is
something that can be achieved principally by changing processors. Nearly
everyone in the industry has heard about the architectures vying for
attention: Motorola’s 88000 family, MIPS, Sun’s SPARC, and IBM’s RS/6000, to name
only the ones that come immediately to mind. In addition, Apple has been investigating
RISC for some time.
It only stands to reason that another revolution will ensue as the average power
of even the most affordable machines goes from the order of single-digit MIPS
(with the “high-end” Macintosh IIfx being around 8.5, if memory serves me correctly)
to the order of low-three-digit MIPS, say 120 on the “low” end of the
spectrum. If the new revolution doesn’t begin, that is, if all we do when we go from 8.5 MIPS to
120 MIPS is recalculate our spreadsheets faster, we’ll have only ourselves to blame.
Regardless, something that should be fairly apparent in all of this is that the chips are
approaching their limits. I’m not referring to the limits of the state of the art; those
change daily. Rather, I’m referring to the pace at which the capabilities of our silicon
foundries are approaching the limits of the laws of physics. We can only go so far with
smaller/faster/cheaper until Einstein, Bohr, and others step in to muddy the waters.
Mechanical engineering gives way to quantum mechanics, as it were.
This may sound pretty dismal, but there is an obvious solution to the problem:
start building machines with more than one processor. If one blazing processor is good,
wouldn’t more be better? The answer, of course, is “yes, depending upon the nature of
the problem being solved,” and that opens a can of worms all its own. If you have a
parallel architecture, the way you must go about creating software for it changes
accordingly, and most of us are not accustomed to programming for parallelism. God
knows I’m not. And it’s not as though there were a consistent mental model to apply to
the notion of parallel programming, either. For one thing, most of the effort that I’m
familiar with regarding parallel programming has been done in order to adapt an
already largely outmoded programming methodology, “structured programming” (the
holy grail of the ’70s), to parallelism. Research to integrate what I’ll call the holy grail
of the ’80s, “object-oriented programming” (which seems better suited to
programming in the large than structured programming alone), with parallel
programming still seems to be largely open, with few real-world implementations
available.
Parallelism has its own language, of course, with terms like SIMD, MIMD,
shared memory, coarse- and fine-grained, “massively,” data flow, and the like, and
has systems with names like Transputer, Sequent, Butterfly, Hypercube, Cray, and
Connection Machine. In terms of generally applicable software to allow the
implementation of parallel systems, the ones that I’ve heard the most about are Linda
(as a model and environment) and Occam (as a language). Unfortunately for me, I could
never wrap my brain around either of them, although I’d like to give Linda another shot
some day.
In the meantime, there is another model that I think deserves a lot of attention,
because it gives you explicit control over the costs of both computation (work done
by the processors) and communication (how much wire and how many intermediate
stops a signal must take to get from one processor to another). It’s architecture
independent: it doesn’t care whether it’s running on a MIMD, SIMD, shared-memory, or other
machine. Most importantly to me, it’s simple. The whole model consists of one,
count ’em, one data structure and three, count ’em, three operators. The model is also
base-language independent: there are at least two implementations in Lisp, and an
experimental implementation in C.
The model is called the Paralation Model, and was developed by Gary Sabot, a
senior research scientist at Thinking Machines, Inc., creators of the Connection
Machine, a massively parallel computer capable of supporting up to 65,536
processors. It’s described in the book entitled The Paralation Model, published by The
MIT Press, ISBN 0-262-19277-2.
The principal data structure of the Paralation Model is the field. A field is simply
a collection of data that can be referred to by an ordinal index. Related fields together
form a paralation, which is short for “parallel relation.” A paralation always includes an
index field. A paralation consists of sites, which are similar to the elements of an
array; the nth site of a paralation contains the nth elements of each of the fields of the
paralation. The index field at that site has the value n. When you create a paralation,
you create its index field. Additional fields can then be added to the paralation, and each
of them contains a pointer to the index field.
Figure 1
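The relationship just described (an index field created with the paralation, plus same-length data fields added to it) can be sketched in a few lines of ordinary Python. This is purely a hypothetical illustration of the structure; the class and method names are mine, not part of any paralation implementation:

```python
class Paralation:
    """A minimal sketch: a group of equal-length fields sharing one index field."""

    def __init__(self, length):
        # Creating a paralation creates its index field: site n holds the value n.
        self.index = list(range(length))
        self.fields = [self.index]

    def add_field(self, values):
        # Every field in a paralation has exactly one element per site.
        if len(values) != len(self.index):
            raise ValueError("field length must match the paralation's site count")
        field = list(values)
        self.fields.append(field)
        return field

p = Paralation(4)
names = p.add_field(["a", "b", "c", "d"])
# Site 2 pairs the index value 2 with the 2nd element of every field.
print(p.index[2], names[2])  # -> 2 c
```

In a real paralation system the point is that the nth elements of all these fields live at the same site, i.e., on (or near) the same processor; a flat Python list can only hint at that locality.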
Elements of fields in a given paralation that have a particular index are
guaranteed to be near each other; that is, the 173rd elements of two fields in the same
paralation are near, the 2nd elements are near, and so on. This means that intra-site
communication is guaranteed to be cheap. Inter-site communication, on the other hand,
can be arbitrarily cheap or costly. By default, paralations are “shapeless”; there’s no