db_Vista III
Volume Number: 7
Issue Number: 2
Column Tag: HyperChat
db_VISTA III and HyperCard
By Joseph S. Terry, Jr., Adam R. Joyner, Ajalon Corporation
Note: Source code files accompanying article are located on MacTech CD-ROM orsource code disks.
This article is about database design using the network database model. This is
not an exhaustive explanation of the pro’s and con’s of the different database models. We
believe that the network model with relational database extensions is the best
combination for “serious” database systems. Serious database systems in this context
would be those systems that required maximum performance while handling very large
amounts of information.
Also, this article is about the need to have sophisticated database engines on the
Macintosh1 for developers using many different tools. And lastly, this article is about
standards and about the developer community choosing wisely among all technologies,
even those developed on other platforms. Someone, who writes an application should
have the confidence that another application can read certain “export” files they may
wish to create, without resorting to the “anyone can read ASCII” cop-out. Look at the
dbase2 file format in the DOS world. If your application can write dbase files almost
any other application can not only read, but probably make sense, of the files content
and find that information quickly. The Macintosh needs a general file interchange
format (GIFF). And in the future, with System 7.0 around the corner, We
developers-You and us-need to help Apple define more AppleEvents and useful ones too.
When our applications begin talking to each other let’s hope that its’ not a tower of
babel.
Figure 1. Database Levels
Database Models
There are three main database models: Relational, Network and Hierarchical.
The relational model is based on the theory of sets and derives power and
simplicity from this related notion. Much of the activity surrounding the relational
model is due to the ability to analyze relational structures readily with mathematical
tools. This is e specially true in query optimization, a field of study which has given us
very flexible ubiquitous SQL3 servers and SQL based databases.
Performance is not one of the variables in all those equations that can be
optimized so neatly. In the real world of daily computing the flexibility of the relational
model is proven, but it’s performance is guaranteed to be less than a network model
defined for the same problem definition. Of course, that flexibility can come in handy if
your users will continue to come up with queries you never imagined they would!
The network model is based on the association. First, you define independent
entities or record types, then the possible connections between them is defined through
Parent/Child associations or sets. All “links” of interest are predefined so that the
operations on the database stress navigation between bound data elements. Of course
some operations have to do with reading and writing data.
One thing to keep in mind is that you can define many different record types, but
they don’t actually come into existence (read - use disk space) until you create an
instance of a record type. This has very important implications (for performance and
space usage) and the “class-like” nature of a network view should be familiar to those
who have experience with Object-oriented languages. That doesn’t mean that a network
model database is Object-oriented. True Object-oriented databases (OODB) have other
features that characterize them, not the least of which is that certain operations or
transformations occur just by storing data into the database, without the program doing
the storing knowing anything about it. OODB’s do different things based on the
content/type of the information stored in them.
The hierarchical model is a specialization of the network model with the
additional constraint that each collection of specific parents and specific children are
distinct hierarchies. Children cannot be the parents of themselves or their parents
(although they can be parents to their grandparents).
The network model uses direct addresses of data to “relate” one record to
another. This “address” usually includes the file and the record within the file. The
network model rarely duplicates information, except when it will enhance
performance. As contrasted with the relational database model which duplicates key
field information (Sometimes this is good), and keys are the “key”, so to speak, to the
utility of the relational model. Keys are found in the record to be related, the key, and
the record that is the object of the relation. (see figures 2 and 3)
Figure 2 Relational Model
Figure 3 Combined Relational/Network Model
The shaded areas in figures 2 and 3 point out the overhead inherent in each
technology for a given problem. There are three record types: the customer record,
invoice record and line item record. For the relational model, Each customer will have
one customer record and several invoices, with several line items per invoice. To
relate invoices to a customer, a unique customer code (Account Number) is stored in
each invoice record as a key. To find the invoices for a customer, you would search the
index for this customer code. In the combined relational/network model, the customer
record contains a key field, but no other record need contain that field. One customer
has many invoices; one invoice has many line items. No redundant data is required
because record addresses relate one record to another. Access is direct and therefore
fast. If you need to change a record, you make only the one change, as no other records
are affected. This means that referential integrity is assured (the same logical data is
actually identical throughout the database).
Database technology is very complex. We don’t pretend to have given the best or
even a totally adequate overview of the issues involved in database design. Design,
whether in programming, architecture, dance, graphic arts or whatever is always the
hardest part, implementation is just a matter of fulfilling the promise of the design (If
it was promising at all).
Our Database
The example database which we are developing is called Technical Information
Management System or TIMS for short. TIMS was designed to maintain a database of
technical information contained in books, magazines, and journal articles. The original
TIMS design was for the text interface only of some other kinds of computers. We took
the exact same database and gave it a face lift and now we call it Hyper_TIMS.
Please note: the descriptions of the records/sets and other database definitions
used in this article about Hyper_TIMS are not in some C language pseudo-code. They are
written in the data description language (DDL) for db_VISTA III. These descriptions are
input to the db_VISTA schema compiler and that spits out a database dictionary that the