August 92 - THE NETWORK PROJECT: DISTRIBUTED COMPUTING ON THE MACINTOSH
GÜNTHER SAWITZKI
Distributed computing is the wave of the future, soon to come rolling onto the shores
of programming. Programmers should be prepared for the possibilities and challenges
that distributed computing will offer. The NetWork model proposes a design strategy
and provides a testbed implementation that enables you to explore and experiment with
distributed computing on the Macintosh. While this article may not help you write a
better application today, it will help familiarize you with the idea of distributed
computing so that when system support for it comes along, you'll be ready to take
advantage of it.
As computing evolves, we're rapidly moving from a reliance on discrete personal
computers and workstations to a new type of computing infrastructure--a computing
environment. In a computing environment, applications will make massive use of
many partially coordinated or uncoordinated autonomous computing devices. That is,
one device won't necessarily know which application subtask any other device is
working on or when and how any other device is completing its particular subtask.
These autonomous devices will be connected by multiple threads of communication.
What's more, the computing environment of tomorrow will be continually changing,
with portable devices moving in and out and with new capabilities added dynamically.
Devices will change in time and will have varying availability. In short, distributed
computing in an environment with no guaranteed stability will become the order of the
day.
Visions like Apple's Personal Digital Assistant and the TRON Project give some idea of
what we'll see. The Personal Digital Assistant will be a small intelligent device that
will help you with some aspect of living and working; for example, it might be a smart
map leading you around in a town you're visiting, or a dietary assistant helping you
plan a week's meals, or a TV viewer helping you trace back a thread of interesting
news you've just become aware of. TRON will work the other way, making your
environment smart on its own; for example, the washing machine itself will place
orders for more detergent and will tell the warm water supply to diminish for a
moment because there will be hot wastewater that will feed a heat exchanger. Both
these visions will soon become reality in a distributed computing environment. What
distributed computing will mean for users is that they'll have access to the
considerable computing power that's typically left unused in today's computing setup.
Implementing a system for distributed computing is easy if you reduce or restrict the
availability of personal workstations to their users. The challenge addressed by the
NetWork Project is to make access to idle workstations possible while still
guaranteeing users immediate access to their personal workstations. NetWork is a
minimal communication and management model designed to operate in this
environment. By handling communication and managing computing resources, it frees
the programmer to think about how to split up a task so that it can be done by multiple
workstations working on small pieces in an uncoordinated and asynchronous way.
NetWork is available on the current Developer CD Series disc and via the Internet for those
who want to try it out. This article describes the NetWork Project itself, considers the
types of applications that are most amenable to a distributed computing approach,
thoroughly examines the NetWork model, and then suggests how to implement a
NetWork program on the Macintosh. Because I'm a statistician I've included some
discussion of statistical underpinnings. I've presented this discussion separately,
though, so that if you don't find mathematics fascinating, you can skip it.
HISTORY OF THE NETWORK PROJECT
NetWork is a project of StatLab, the statistical laboratory at the University of
Heidelberg. StatLab was founded in 1984 to complement the existing mathematical
statistics research group by studying practical applications of advanced statistical
methods. We took a look at what was available as the hardware base for our work and
chose the Macintosh, but since no Macintosh was on the German market at that time,
we bought a Lisa. We've been developing our statistical software on Lisa and Macintosh
ever since. This eventually brought us into contact with Larry Taylor, representing
Apple's Advanced Technology Group in Europe.
During a November 1988 meeting, we discussed future perspectives in computing
with Larry. We tried to identify current gaps and obvious next steps. One thing we
could point to was the discrepancy between the amount of computing power we had
installed and the return it gave us. At that time, we were running an installation of
Macintosh Plus and Macintosh II computers, and the usual turnaround time for a
statistical simulation was one night. This was better than the turnaround time for the
same job on the IBM mainframe time-sharing system (about a week), but still it was
frustrating to have to wait so long while other computer resources lay idle. Just the
same, given the Macintosh's character as an absolutely devoted servant of one master,
how in the world could we find a way to share its computing power while still
guaranteeing reliable and efficient service for the Macintosh owner?
In December 1988 we had a visit from Bill Eddy, then head of the statistics
department at Carnegie Mellon University. In a lecture he mentioned that the CMU
people were annoyed at the discrepancy between installed computing power and the
return it gave them and were doing research on executing iterations asynchronously
(in an uncontrolled way) to make use of aggregated computing power. Until then, I'd
been thinking of the solution only in terms of distributed computing in a controlled
environment. Bill emphasized that in the computing environment of the future,
computing time per se won't be expensive. In fact, in a network consisting of thousands
of CPUs, computing power will be free--if you can access it. This started me thinking
about how we could possibly make a distributed system work under these
circumstances--that is, in a large heterogeneous environment.
When we next met with Larry Taylor in February 1989, I claimed that we could build
a system for distributed computing based on the Macintosh philosophy of the absolute
priority of the user and at the same time able to cope with a large environment. Larry
agreed to support the project, and we formed a team consisting initially of Larry, me,
Reimer Kühn and Leo van Hemmen of the Heidelberg Neural Network Research Group,
and Joachim Lindenberg, then a computer science student at Karlsruhe University.
The project started in May 1989. We called it the NetWork Project, a reference to the
fact that in the future the only measure of performance that will matter will be the net
work done per unit of time, not cumulative computing time or other measures of
resource utilization. We gave ourselves six months to decide on the specifications and
build a working prototype of a distributed system that would fit a Macintosh
environment and be scalable up to some thousands of CPUs. Although Macintosh was the
original development target, we did make sure that the system would run in any other
decent environment (DEC™, UNIX®, what have you). We finished our first release
one week late in November 1989. As they say, the rest is history.
Worth mentioning is the fact that with NetWork's accelerated development schedule,
we didn't spend a lot of time on planning and administration. That's the nature of
progress sometimes. Fortunately, Apple's Advanced Technology External Research
Group had resources available to allocate to the project on the spot. Without this kind
of flexible support, the NetWork Project could not have succeeded.
CANDIDATES FOR DISTRIBUTED COMPUTING
Distributed computing will be a great boon to applications where computing power is
critical and where the computing task can be split into discrete subtasks. Such
applications include the following:
• compiling a new product using a superoptimizing compiler
• solving an optimization problem like placing chips on a board
• generating computer graphics, especially ray tracing
• performing optical character recognition
In these cases, processing may take too long on one particular machine, but if the
application can tap into the computing power available by sending out subtasks, the
processing can be completed in a much more timely manner.
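To make the idea of splitting concrete, here's a minimal sketch (in Python, for brevity; the names Subtask-style tuples and split_scanlines are illustrative, not part of NetWork's actual API) of how a ray-tracing job might be carved into independent subtasks that idle machines could render in any order:

```python
def split_scanlines(height, chunk):
    """Partition image rows 0..height-1 into independent (first, last) subtasks.

    Each half-open row range can be rendered on any machine, in any order;
    the final image is assembled from whichever results come back.
    """
    return [(start, min(start + chunk, height))
            for start in range(0, height, chunk)]

# A 480-row image split into chunks of 64 rows yields 8 subtasks,
# the last one covering the remaining 32 rows.
tasks = split_scanlines(480, 64)
```

Because no subtask depends on another's result, losing a worker costs only that worker's rows, which can simply be reissued.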
Many applications that involve working on large data sets can benefit from additional
computing power, even in an environment where completion of a subtask is not
guaranteed. Such tasks include sorting with some appropriate merge/sort algorithm:
the global sort can benefit if a subset has already been sorted by another machine but
need not be affected if the result of the presorting is not available. The same applies to
searching and practically all major accounting tasks. Any statistical analysis based on
exponential families, like normal (Gaussian) distributions, can also benefit from
distributed computing: in these analyses you can calculate global sufficient statistics
from those of partial data sets, if available. Problems of this type are completely
splittable into subtasks and clearly are fine candidates for distributed computing.
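The sufficient-statistics idea can be sketched as follows (hypothetical helper names; NetWork itself defined no such API). For a normal model, each worker reports (n, Σx, Σx²) for its slice of the data; partials that never arrive are simply skipped, and the combined estimate is still valid for the data that did arrive:

```python
def partial_stats(xs):
    """Sufficient statistics for a normal model: (count, sum, sum of squares)."""
    return (len(xs), sum(xs), sum(x * x for x in xs))

def combine(partials):
    """Merge whichever partial statistics are available (None = never arrived)."""
    avail = [p for p in partials if p is not None]
    n = sum(p[0] for p in avail)
    s = sum(p[1] for p in avail)
    ss = sum(p[2] for p in avail)
    mean = s / n
    var = ss / n - mean * mean   # population variance of the available data
    return n, mean, var
```

For example, combining partials for [1, 2, 3] and [4, 5] while a third partial is missing yields exactly the mean and variance of the five values that arrived, which is why such analyses tolerate a nonguaranteed environment so gracefully.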
But what about problems that have a stronger internal structure than those that are
completely splittable? What about iterative and recursive problems, or problems that
lead to pipeline processing or networks of data flow? We can't automatically assume
that these can take advantage of additional computing power in a distributed
environment where the completion of a subtask isn't guaranteed. Still, mathematical
theory can help us identify problems of this type that are good candidates for
distributed computing.
A SPECIAL CLASS: ASYNCHRONOUS ITERATIONS
As an example of problems with a stronger internal structure than those that are
completely splittable, we'll focus on
iterative algorithms. The trouble with running an iterative algorithm in a
nonguaranteed distributed environment is this: the outcome of iterations in one part of
the problem might critically depend on results from iterations in other parts, and the
result of a previous iteration may or may not be available for the next round. Even if
the original iteration converges to a correct result, we don't know whether the same
will hold true if the iterations are done asynchronously.
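A small sketch (mine, not from the NetWork sources) shows the favorable case: for a contraction mapping x → Ax + b with small A, the iteration still reaches the fixed point even when some component updates are arbitrarily skipped each round, as if their results never arrived. Whether convergence survives asynchrony in general is exactly the question at issue.

```python
# A 3-dimensional contraction mapping x -> A x + b (row sums of A are 0.3,
# so the fixed point is (10/7, 10/7, 10/7)).
A = [[0.0, 0.2, 0.1],
     [0.1, 0.0, 0.2],
     [0.2, 0.1, 0.0]]
b = [1.0, 1.0, 1.0]

def step(x, i):
    """One componentwise update, using whatever values are currently in x."""
    return b[i] + sum(A[i][j] * x[j] for j in range(3))

x = [0.0, 0.0, 0.0]
for round_no in range(200):
    for i in range(3):
        if (round_no + i) % 3 != 0:   # arbitrarily skip ~1/3 of all updates
            x[i] = step(x, i)
# Despite the missed updates, x converges to the fixed point of the mapping.
```

The contraction property is what rescues us here; a mapping without it might cycle or diverge under the same skipping pattern.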
Suppose, for instance, we have a mapping to be iterated that operates on some
high-dimensional vector or matrix. To prepare for a distributed version, we restrict
the mapping to a subset by providing the full input but allowing the mapping to operate