All Databases MacTech Vol 07-1991

db_Vista III

Volume Number: 7

Issue Number: 2

Column Tag: HyperChat

db_VISTA III and HyperCard

By Joseph S. Terry, Jr., Adam R. Joyner, Ajalon Corporation

Note: Source code files accompanying article are located on MacTech CD-ROM or

source code disks.

Introduction

This article is about database design using the network database model. This is

not an exhaustive explanation of the pro’s and con’s of the different database models. We

believe that the network model with relational database extensions is the best

combination for “serious” database systems. Serious database systems in this context

would be those systems that required maximum performance while handling very large

amounts of information.

Also, this article is about the need to have sophisticated database engines on the

Macintosh1 for developers using many different tools. And lastly, this article is about

standards and about the developer community choosing wisely among all technologies,

even those developed on other platforms. Someone, who writes an application should

have the confidence that another application can read certain “export” files they may

wish to create, without resorting to the “anyone can read ASCII” cop-out. Look at the

dbase2 file format in the DOS world. If your application can write dbase files almost

any other application can not only read, but probably make sense, of the files content

and find that information quickly. The Macintosh needs a general file interchange

format (GIFF). And in the future, with System 7.0 around the corner, We

developers-You and us-need to help Apple define more AppleEvents and useful ones too.

When our applications begin talking to each other let’s hope that its’ not a tower of

babel.

Figure 1. Database Levels

Database Models

There are three main database models: Relational, Network and Hierarchical.

The relational model is based on the theory of sets and derives power and

simplicity from this related notion. Much of the activity surrounding the relational

model is due to the ability to analyze relational structures readily with mathematical

tools. This is e specially true in query optimization, a field of study which has given us

very flexible ubiquitous SQL3 servers and SQL based databases.

Performance is not one of the variables in all those equations that can be

optimized so neatly. In the real world of daily computing the flexibility of the relational

model is proven, but it’s performance is guaranteed to be less than a network model

defined for the same problem definition. Of course, that flexibility can come in handy if

your users will continue to come up with queries you never imagined they would!

The network model is based on the association. First, you define independent

entities or record types, then the possible connections between them is defined through

Parent/Child associations or sets. All “links” of interest are predefined so that the

operations on the database stress navigation between bound data elements. Of course

some operations have to do with reading and writing data.

One thing to keep in mind is that you can define many different record types, but

they don’t actually come into existence (read - use disk space) until you create an

instance of a record type. This has very important implications (for performance and

space usage) and the “class-like” nature of a network view should be familiar to those

who have experience with Object-oriented languages. That doesn’t mean that a network

model database is Object-oriented. True Object-oriented databases (OODB) have other

features that characterize them, not the least of which is that certain operations or

transformations occur just by storing data into the database, without the program doing

the storing knowing anything about it. OODB’s do different things based on the

content/type of the information stored in them.

The hierarchical model is a specialization of the network model with the

additional constraint that each collection of specific parents and specific children are

distinct hierarchies. Children cannot be the parents of themselves or their parents

(although they can be parents to their grandparents).

The network model uses direct addresses of data to “relate” one record to

another. This “address” usually includes the file and the record within the file. The

network model rarely duplicates information, except when it will enhance

performance. As contrasted with the relational database model which duplicates key

field information (Sometimes this is good), and keys are the “key”, so to speak, to the

utility of the relational model. Keys are found in the record to be related, the key, and

the record that is the object of the relation. (see figures 2 and 3)

Figure 2 Relational Model

Figure 3 Combined Relational/Network Model

The shaded areas in figures 2 and 3 point out the overhead inherent in each

technology for a given problem. There are three record types: the customer record,

invoice record and line item record. For the relational model, Each customer will have

one customer record and several invoices, with several line items per invoice. To

relate invoices to a customer, a unique customer code (Account Number) is stored in

each invoice record as a key. To find the invoices for a customer, you would search the

index for this customer code. In the combined relational/network model, the customer

record contains a key field, but no other record need contain that field. One customer

has many invoices; one invoice has many line items. No redundant data is required

because record addresses relate one record to another. Access is direct and therefore

fast. If you need to change a record, you make only the one change, as no other records

are affected. This means that referential integrity is assured (the same logical data is

actually identical throughout the database).

Database technology is very complex. We don’t pretend to have given the best or

even a totally adequate overview of the issues involved in database design. Design,

whether in programming, architecture, dance, graphic arts or whatever is always the

hardest part, implementation is just a matter of fulfilling the promise of the design (If

it was promising at all).

Our Database

The example database which we are developing is called Technical Information

Management System or TIMS for short. TIMS was designed to maintain a database of

technical information contained in books, magazines, and journal articles. The original

TIMS design was for the text interface only of some other kinds of computers. We took

the exact same database and gave it a face lift and now we call it Hyper_TIMS.

Please note: the descriptions of the records/sets and other database definitions

used in this article about Hyper_TIMS are not in some C language pseudo-code. They are

written in the data description language (DDL) for db_VISTA III. These descriptions are

input to the db_VISTA schema compiler and that spits out a database dictionary that the

Referenced by (5):