develop - 1995

December 95 - Getting Started With OpenDoc Storage

Getting Started With OpenDoc Storage

Vincent Lo

OpenDoc's structured storage model is an innovative departure from the traditional

storage scheme. As you make the move into OpenDoc development, you need to

understand the new storage model and its implications for the way data is stored and

retrieved. This article introduces the new concepts and policies you'll need to know in

order to use OpenDoc storage effectively.

In the traditional Macintosh user model, each application creates and maintains its own

documents, storing each document in a separate file. A file has one creator signature

and one file type, identifying the application it belongs to and the kind of document it

contains. In OpenDoc, by contrast, a document can have multiple parts, created and

maintained by different part editors (called part handlers in earlier versions of

OpenDoc), which are analogous to the standalone applications of the traditional model.

Because all of a document's parts are stored together in the same container (usually

corresponding to a file), there has to be a way for separate part editors to share access

to the same container without interfering with each other.

OpenDoc meets this need by providing a structured model for persistent storage (that

is, for storing data from one session to the next). Each part is given its own storage

unit in which to store and retrieve data. The part can thus operate as a standalone

entity, independent of other parts and their storage. OpenDoc maintains all of the

storage units and notifies each part when to read or write its data.

The same techniques that are used in dealing with persistent storage also apply to the

various forms of data interchange between part editors, such as the Clipboard, drag

and drop, and linking. Because all of these mechanisms use the same data storage

medium (the storage unit), they all work essentially the same way from the part

editor's point of view. For example, a part uses the same API calls to copy data to the

Clipboard that it would use in writing the data to a file container. The same is true for

drag and drop and for linking. Thus, once you learn how to work with OpenDoc storage

units for file storage, you can use the same techniques to implement data interchange

as well.

This article assumes that you're already familiar with basic OpenDoc concepts and

terminology. If you need a quick introduction or refresher, see the article "The

OpenDoc User Experience" in develop Issue 22. You can find additional information on

some of OpenDoc's technical basics in the articles "Building an OpenDoc Part Handler"

in Issue 19 and "Getting Started With OpenDoc Graphics" in Issue 21. Developer

releases of OpenDoc include the definitive documentation, the OpenDoc Programmer's

Guide and OpenDoc Class Reference. Developer releases are available through a number

of different sources, or you can request the latest release at AppleLink OPENDOC or at

opendoc@applelink.apple.com on the Internet. The source code in this article is

excerpted from a sample part included with the developer release.

Because OpenDoc was developed jointly by a consortium of companies including Apple,

IBM, and Novell, its interfaces are designed for cross-platform compatibility, using

IBM's platform-independent System Object Model (SOM). OpenDoc method

definitions, including the ones in this article, are commonly written in a

language-neutral Interface Definition Language (IDL). The SOM compiler converts

these into equivalent language-specific declarations for whatever source language you

happen to be using. The method definitions shown in this article, for instance, are

taken from the OpenDoc interface file StorageU.idl. To use these methods in your

program, you must include the corresponding language-specific binding file (such as

StorageU.xh for a C++ program).

DRAFTS, DOCUMENTS, AND CONTAINERS

The OpenDoc classes responsible for providing storage capabilities are ODContainer,

ODDocument, ODDraft, and ODStorageUnit. Collectively, a set of subclasses derived

from these four is known as a container suite. A container represents the physical

storage medium in which a document is stored, such as a disk file. Different container

suites share the same API, but may use different low-level storage mechanisms and

operate on different physical storage media. For example, the Bento container suite,

which will be shipped with OpenDoc 1.0, supports both file containers and in-memory

containers. A part editor can thus use the same code to store a part's data either to a

file or in memory.

A single container may contain one or more documents, each of which in turn can

include one or more drafts. A part ordinarily works with a draft, rather than directly

with a document or its container. Each draft is a "snapshot" representing the state of

the document at a particular point in its development. Together, the drafts embody the

history of the document over time.

A part may need to interact with its draft for a variety of reasons:

• Persistent objects -- Every persistent object (such as a part, a frame,

or a link) is created by a draft.

• Data interchange -- A part asks its draft to copy transferred objects to

and from a data-interchange container, such as the Clipboard or a

drag-and-drop container.

• Linking -- A part uses its draft to create link specifications and copy data

to and from link objects.

• Permissions -- A part may need to find out whether it's allowed to write

to the draft.

• Scripting -- A part gets its scripting-specific identifier through its

draft.

STORAGE UNITS

The basic entity of a container suite is the storage unit. Every persistent OpenDoc

object has a storage unit in which to store and retrieve its data. Figure 1 shows a

typical example.

Figure 1. Structure of a storage unit

A storage unit consists of one or more properties, each of which in turn is associated

with one or more values containing the data itself. The storage unit shown in Figure 1,

for instance, has properties named kODPropContents, kODPropPreferredKind, and

kODPropDisplayFrames; the kODPropContents property has values of types

kTextEditorKind and kODMacIText.

Using multiple values allows a property to represent the same data in different forms.

For example, a property holding a drawing may have three values representing the

same data: one as a Macintosh PICT, one as a Windows metafile, and one in TIFF format.

Although OpenDoc cannot enforce the principle, part developers are urged to use

multiple values within a property only for multiple representations of the same data,

not for storing unrelated data items.
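
The property-and-value structure described above can be modeled in a few lines of C++. This is only a sketch for illustration: ToyValue, ToyProperty, and ToyStorageUnit are hypothetical names, not OpenDoc classes, and the property and type strings are invented. It shows one property holding three representations of the same drawing.

```cpp
#include <string>
#include <vector>

// Toy model only: these types are hypothetical stand-ins, not OpenDoc classes.
struct ToyValue {
    std::string type;                 // value type, e.g. a PICT or TIFF kind name
    std::vector<unsigned char> data;  // the bytes stored under that type
};

struct ToyProperty {
    std::string name;                 // property name
    std::vector<ToyValue> values;     // kept in the order they were added
};

struct ToyStorageUnit {
    std::vector<ToyProperty> properties;

    // Return the named property, creating it at the end if it doesn't exist.
    ToyProperty& property(const std::string& name) {
        for (ToyProperty& p : properties)
            if (p.name == name) return p;
        properties.push_back(ToyProperty{name, {}});
        return properties.back();
    }

    // Add one more representation of the property's data.
    void addValue(const std::string& propName, const std::string& type,
                  const std::string& bytes) {
        ToyValue v;
        v.type = type;
        v.data.assign(bytes.begin(), bytes.end());
        property(propName).values.push_back(v);
    }

    // Find a representation by type; returns nullptr if it is absent.
    const ToyValue* find(const std::string& propName,
                         const std::string& type) const {
        for (const ToyProperty& p : properties) {
            if (p.name != propName) continue;
            for (const ToyValue& v : p.values)
                if (v.type == type) return &v;
        }
        return nullptr;
    }
};

// Build a unit whose single property holds three forms of the same drawing.
// The names and byte strings are placeholders invented for this example.
ToyStorageUnit makeDrawingUnit() {
    ToyStorageUnit su;
    su.addValue("Example:Property:Contents", "Example:Type:PICT", "pict-bytes");
    su.addValue("Example:Property:Contents", "Example:Type:Metafile", "wmf-bytes");
    su.addValue("Example:Property:Contents", "Example:Type:TIFF", "tiff-bytes");
    return su;
}
```

A reader of this toy unit would look up whichever representation it understands with find and ignore the rest, which is why unrelated data shouldn't share a property.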

The property names and value types shown in Figure 1 represent string constants of

type ODPropertyName and ODValueType, respectively. For cross-platform

extensibility, both of these types are defined as equivalent to an ISO string instead of a

traditional Macintosh OSType: that is, they're 7-bit ASCII null-terminated strings, as

specified by the International Organization for Standardization (ISO). The string values

themselves are expected to follow a standard naming convention: for instance, the

constants kODPropDisplayFrames and kODWeakStorageUnitRefs stand for the strings

"OpenDoc:Property:DisplayFrames" and "OpenDoc:Type:StorageUnitRefs",

respectively. The OpenDoc interface files StdProps.idl and StdTypes.idl define name

constants for standard properties and value types; any property and type names that

you define for yourself should follow the same naming conventions.
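
As a rough illustration of the naming convention, the following C++ sketch builds a colon-separated name and checks that it is a 7-bit ASCII string. The helper functions makeISOName and isISOString are hypothetical and are not part of the OpenDoc headers; the three-part owner:category:name layout is inferred from the two example strings above.

```cpp
#include <string>

// Hypothetical helper: true if every character is nonzero 7-bit ASCII,
// so the text can be stored as a null-terminated ISO string.
bool isISOString(const std::string& s) {
    for (unsigned char c : s)
        if (c == 0 || c > 127) return false;
    return true;
}

// Hypothetical helper: join the three parts of a name with colons,
// following the pattern of "OpenDoc:Property:DisplayFrames".
std::string makeISOName(const std::string& owner,
                        const std::string& category,
                        const std::string& leaf) {
    return owner + ":" + category + ":" + leaf;
}
```

For names you define yourself, the owner part would presumably be your own company or product name rather than "OpenDoc"; that choice is an assumption here, not something the interface files enforce.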

FOCUSING A STORAGE UNIT

The OpenDoc operations for manipulating values don't explicitly identify the value to

operate on. Instead, you have to focus the storage unit on the desired property or value

before invoking the operation. The method for setting the focus is defined in class

ODStorageUnit as follows:

ODStorageUnit Focus(in ODPropertyName propertyName,
                    in ODPositionCode propertyPosCode,
                    in ODValueType valueType,
                    in ODValueIndex valueIndex,
                    in ODPositionCode valuePosCode);

This allows you to set the storage unit's focus in a variety of ways:

• to a property by name

• to a property by position relative to the current property

• to a value by type within a property

• to a value by position within a property

• to a value by position relative to the current value

Properties and values are ordered within the storage unit according to the sequence in

which they were added. Values within a property are indexed from 1: that is, the first

value has index 1, the second index 2, and so on. Positions relative to the current focus

are specified with a position code. The same position code can refer to either a

property or a value, depending on the current focus. For instance, if the storage unit is

currently focused on a property, the position code kODPosNextSib designates the next

property; if the current focus is on a value, kODPosNextSib designates the next value.
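
The focusing rules above can be imitated with a small C++ model. This is a sketch only: ToyProp and ToyFocusUnit are hypothetical, the single Focus method is split into separate calls for readability, and the value types given for the second and third properties are placeholders, since Figure 1 isn't reproduced here. The model keeps values 1-indexed and lets "next sibling" mean either the next property or the next value, depending on the current focus.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy types; they only imitate the focusing rules in the text.
struct ToyProp {
    std::string name;
    std::vector<std::string> valueTypes;  // one entry per value, in order added
};

class ToyFocusUnit {
public:
    explicit ToyFocusUnit(const std::vector<ToyProp>& props)
        : props_(props), prop_(0), value_(0), focused_(false) {}

    // Focus on a property by name; any value focus is cleared.
    bool focusProperty(const std::string& name) {
        for (std::size_t i = 0; i < props_.size(); ++i) {
            if (props_[i].name == name) {
                prop_ = i; value_ = 0; focused_ = true;
                return true;
            }
        }
        return false;
    }

    // Focus on a value by type within the current property.
    bool focusValueByType(const std::string& type) {
        if (!focused_) return false;
        const std::vector<std::string>& types = props_[prop_].valueTypes;
        for (std::size_t i = 0; i < types.size(); ++i) {
            if (types[i] == type) { value_ = i + 1; return true; }
        }
        return false;
    }

    // Focus on a value by 1-based index within the current property.
    bool focusValueByIndex(std::size_t index) {
        if (!focused_ || index < 1 ||
            index > props_[prop_].valueTypes.size()) return false;
        value_ = index;
        return true;
    }

    // Next value if a value has the focus, otherwise next property.
    bool focusNextSibling() {
        if (!focused_) return false;
        if (value_ != 0) {
            if (value_ >= props_[prop_].valueTypes.size()) return false;
            ++value_;
            return true;
        }
        if (prop_ + 1 >= props_.size()) return false;
        ++prop_;
        return true;
    }

    std::string propertyName() const {
        return focused_ ? props_[prop_].name : std::string();
    }
    std::size_t valueIndex() const { return value_; }  // 0: property has focus

private:
    std::vector<ToyProp> props_;
    std::size_t prop_;
    std::size_t value_;   // 1-based; 0 means the focus is on the property
    bool focused_;
};

// The unit from Figure 1, using constant names as plain strings.
// "placeholderKind" and "placeholderFrames" are invented value types.
ToyFocusUnit makeFigure1Unit() {
    std::vector<ToyProp> props;
    props.push_back(ToyProp{"kODPropContents", {"kTextEditorKind", "kODMacIText"}});
    props.push_back(ToyProp{"kODPropPreferredKind", {"placeholderKind"}});
    props.push_back(ToyProp{"kODPropDisplayFrames", {"placeholderFrames"}});
    return ToyFocusUnit(props);
}
```

In this model, calling focusNextSibling right after focusProperty("kODPropContents") moves to kODPropPreferredKind, while calling it after focusValueByIndex(1) moves to the second value of the same property.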

Another way to set the focus of a storage unit is with a storage unit cursor:

ODStorageUnit FocusWithCursor(in ODStorageUnitCursor cursor);

The cursor identifies a property by name or a value by its property name and its index

or value type. Once created (with method CreateCursor or CreateCursorWithFocus of

class ODStorageUnit), the same cursor can be reused multiple times to refer to

properties or values within the storage unit.

Once you've focused a storage unit, you can create a storage unit view to refer to the

same property or value again later without having to reset the focus:

ODStorageUnitView CreateView();

The view responds to all the same access methods as the storage unit itself, but applies

them to the property or value that had the focus at the time the view was created,

rather than at the time the method is invoked. It does this by automatically resetting

the underlying storage unit to the original focus, then forwarding the method call to

the storage unit for processing.
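
The reset-then-forward behavior of a view can be sketched as follows. ToyUnit and ToyView are hypothetical classes written for this illustration; they model a focus as nothing more than a property name and don't reproduce the real ODStorageUnitView interface.

```cpp
#include <map>
#include <string>

// Hypothetical toy unit: each property holds a single string of data.
class ToyUnit {
public:
    void put(const std::string& prop, const std::string& data) {
        data_[prop] = data;
    }

    // Focus on an existing property; fails if the property is absent.
    bool focus(const std::string& prop) {
        if (data_.count(prop) == 0) return false;
        focus_ = prop;
        return true;
    }

    const std::string& focusName() const { return focus_; }

    // Read the data of whatever property currently has the focus.
    std::string read() const {
        std::map<std::string, std::string>::const_iterator it = data_.find(focus_);
        return it == data_.end() ? std::string() : it->second;
    }

private:
    std::map<std::string, std::string> data_;
    std::string focus_;
};

// Hypothetical toy view: remembers the focus in effect when it was created.
class ToyView {
public:
    explicit ToyView(ToyUnit& unit) : unit_(unit), saved_(unit.focusName()) {}

    std::string read() {
        unit_.focus(saved_);   // reset the unit to the focus captured at creation
        return unit_.read();   // then forward the call to the unit
    }

private:
    ToyUnit& unit_;
    std::string saved_;
};
```

Note that in this toy the unit stays focused where the view left it after the call; whether the real storage unit restores the caller's previous focus is not something this sketch tries to answer.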

MANIPULATING VALUE DATA

The operations for manipulating data within a storage value are stream-based, very

much like reading or writing to a sequential file. Each value has a current offset

position that controls where the next operation will take place, similar to the file

mark in the Macintosh file system. In addition to reading and writing data sequentially,

you can also insert or delete data at the current offset position.

Class ODStorageUnit defines the following methods for manipulating value data:

void SetOffset(in ODULong offset);

ODULong GetOffset();

void SetValue(in ODByteArray value);

ODULong GetValue(in ODULong length, out ODByteArray value);

void InsertValue(in ODByteArray value);

void DeleteValue(in ODULong length);

The ODByteArray structure is used to pass data to or from a storage unit.

typedef struct {
    unsigned long _maximum;  /* size of buffer */
    unsigned long _length;   /* number of bytes of actual data */
    octet* _buffer;          /* pointer to buffer containing the data */
} _IDL_SEQUENCE_octet;

typedef _IDL_SEQUENCE_octet ODByteArray;

(An octet is simply the SOM term for an 8-bit byte.) Listing 1 shows how to

manipulate one of the values shown in Figure 1.

Listing 1. Adding data to a value

/* Focus the storage unit, using property name and value type. */

storageUnit->Focus(ev, kODPropContents, kODPosUndefined,

kTextEditorKind, 0, kODPosUndefined);

/* Set up the byte array. */

ODByteArray ba;

ba._length = size;

ba._maximum = size;

ba._buffer = buffer;

/* Set the offset. (This step isn't really needed here, since the

Focus operation automatically sets the offset to 0. It's included

for illustrative purposes only.) */

storageUnit->SetOffset(ev, 0);

/* Add the value. */

storageUnit->SetValue(ev, &ba);
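
To make the offset semantics concrete, here is a small C++ model of a single value treated as a byte stream. ToyValueStream is a hypothetical class, not the ODStorageUnit interface, and it assumes that reading, writing, and inserting each advance the offset past the bytes they touch, in the manner of a file mark.

```cpp
#include <cstddef>
#include <string>

// Hypothetical toy model of one value's data and its current offset.
class ToyValueStream {
public:
    ToyValueStream() : offset_(0) {}

    void setOffset(std::size_t offset) { offset_ = offset; }
    std::size_t getOffset() const { return offset_; }

    // Overwrite bytes at the current offset, growing the value if needed.
    void setValue(const std::string& bytes) {
        if (offset_ > data_.size()) data_.resize(offset_, '\0');
        data_.replace(offset_, bytes.size(), bytes);
        offset_ += bytes.size();
    }

    // Read up to length bytes starting at the current offset.
    std::string getValue(std::size_t length) {
        if (offset_ >= data_.size()) return std::string();
        std::string out = data_.substr(offset_, length);
        offset_ += out.size();
        return out;
    }

    // Insert bytes at the current offset, shifting later data toward the end.
    void insertValue(const std::string& bytes) {
        if (offset_ > data_.size()) data_.resize(offset_, '\0');
        data_.insert(offset_, bytes);
        offset_ += bytes.size();
    }

    // Remove up to length bytes starting at the current offset.
    void deleteValue(std::size_t length) {
        if (offset_ < data_.size()) data_.erase(offset_, length);
    }

    const std::string& bytes() const { return data_; }

private:
    std::string data_;
    std::size_t offset_;
};
```

Writing "Hello World", moving the offset to 5, and inserting "," leaves "Hello, World" in this model; moving the offset back to 0 and deleting 7 bytes then leaves "World".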

STORAGE UNIT REFERENCES

Storage unit references allow one storage unit to refer persistently to another. A part

can use this mechanism to access information stored in a storage unit (which may or

may not belong to it) across multiple sessions. A draft thus consists essentially of a

network of storage units connected to each other with persistent references.

When a storage unit is cloned (copied to a data-interchange container), any other

storage units it references are cloned along with it. Since all storage units in a draft

are interconnected, cloning any one of them may cause the whole draft to be cloned.

Because this may be an expensive and unnecessary operation, OpenDoc provides two

levels of storage unit reference: strong and weak. Only strongly referenced storage

units are copied when the unit that refers to them is cloned.

In Figure 2, frame A refers strongly to part A, which refers strongly to frame B,

which refers strongly to part B. Thus if frame A's storage unit is cloned, all four

storage units will be copied. On the other hand, cloning frame B's storage unit will

copy those for frame B and part B only, since frame B's reference to frame A is weak

rather than strong.

Figure 2. Strong and weak storage unit references

An object can use strong storage unit references to refer to other objects that are

essential to its functioning, such as embedded frames. Weak references are mainly for

informational or secondary purposes: a part might use them, for instance, to refer to

its display frames.
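
The cloning rule amounts to a graph traversal that follows strong references and ignores weak ones. The C++ sketch below reproduces the Figure 2 example under that reading; ToyRef, ToyDraft, and cloneClosure are hypothetical names, and the traversal is a model of the rule rather than OpenDoc's actual cloning code.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy types: a reference names its target and its strength.
struct ToyRef {
    std::string target;
    bool strong;
};

// A draft maps each storage unit's name to the references it holds.
typedef std::map<std::string, std::vector<ToyRef> > ToyDraft;

// Return the set of storage units copied when root is cloned:
// root itself plus everything reachable through strong references.
std::set<std::string> cloneClosure(const ToyDraft& draft,
                                   const std::string& root) {
    std::set<std::string> copied;
    std::vector<std::string> pending(1, root);
    while (!pending.empty()) {
        std::string id = pending.back();
        pending.pop_back();
        if (!copied.insert(id).second) continue;  // already copied
        ToyDraft::const_iterator it = draft.find(id);
        if (it == draft.end()) continue;
        for (std::size_t i = 0; i < it->second.size(); ++i)
            if (it->second[i].strong)
                pending.push_back(it->second[i].target);
    }
    return copied;
}

// The four storage units of Figure 2 and the references described in the text.
ToyDraft makeFigure2Draft() {
    ToyDraft d;
    d["frameA"].push_back(ToyRef{"partA", true});
    d["partA"].push_back(ToyRef{"frameB", true});
    d["frameB"].push_back(ToyRef{"partB", true});
    d["frameB"].push_back(ToyRef{"frameA", false});  // weak: not followed
    d["partB"];  // part B holds no references in this example
    return d;
}
```

With this draft, cloneClosure starting at "frameA" yields all four units, while starting at "frameB" yields only "frameB" and "partB".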

LIFE CYCLE OF A PART

Figure 3 shows the life cycle of a part and its associated storage unit. Because the

part's lifetime may span multiple editing sessions, it must be able to externalize its

internal state (save it to persistent storage) in order to reconstruct itself from one

session to the next. The part's InitPart method, called when the part is first created,

receives a storage unit as a parameter. The Externalize method can then use this

storage unit to save the part's state. Once externalized, the part can be released from

memory and later reconstituted from external storage by a method named

InitPartFromStorage. Unlike InitPart, InitPartFromStorage can be called multiple

times during a part's lifetime, whenever the part needs to be reconstructed from

external storage.

Figure 3. Life cycle of a part

Notice that externalizing a part is not the same as cloning it. Externalizing means

writing the part's data to persistent storage, using a storage unit associated with the

draft in which the part resides; cloning is transferring the part's data to a

data-interchange container such as the Clipboard, using a storage unit associated with

the container. Although the two operations are different, they're both based on the

same ODStorageUnit API and can share much of the same code.
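
The externalize-and-reconstruct cycle can be sketched with a toy part that keeps a single string of text. ToyStorage and ToyTextPart are hypothetical: the storage unit is reduced to a map from property name to bytes, and the method names only echo InitPart, Externalize, and InitPartFromStorage without matching their real signatures.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a part's storage unit: property name -> bytes.
typedef std::map<std::string, std::string> ToyStorage;

class ToyTextPart {
public:
    ToyTextPart() : storage_(0), dirty_(false) {}

    // Called once, when the part is first created.
    void initPart(ToyStorage* storage) {
        storage_ = storage;
        text_.clear();
        dirty_ = true;   // nothing has been written to storage yet
    }

    // Called whenever the part is rebuilt from previously saved data.
    void initPartFromStorage(ToyStorage* storage) {
        storage_ = storage;
        ToyStorage::const_iterator it = storage->find("contents");
        text_ = (it == storage->end()) ? std::string() : it->second;
        dirty_ = false;  // in-memory state matches storage
    }

    void setText(const std::string& text) { text_ = text; dirty_ = true; }

    // Save the in-memory state to the storage unit if it has changed.
    void externalize() {
        if (storage_ == 0 || !dirty_) return;
        (*storage_)["contents"] = text_;
        dirty_ = false;
    }

    const std::string& text() const { return text_; }
    bool isDirty() const { return dirty_; }

private:
    ToyStorage* storage_;  // not owned; outlives the in-memory part
    std::string text_;
    bool dirty_;
};
```

In this model the storage outlives the part object: one ToyTextPart can be initialized, edited, externalized, and destroyed, and a second one constructed later can recover the same text through initPartFromStorage.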

Another related operation is purging, which reclaims memory space by eliminating

unnecessary runtime data structures such as caches. Because such structures can

usually be reconstructed from persistent data, many OpenDoc programmers believe

that a part's Purge method should always begin by externalizing the part's data before

deleting unused or unnecessary memory. While this might sound plausible in

principle, the externalization operation itself requires additional memory -- the very

thing that's in short supply during purging. As a general rule, the Purge method should

avoid invoking externalization unless it's absolutely necessary.

All persistent objects carry a reference count, enabling OpenDoc to identify unused

objects and reclaim the memory they occupy. The Acquire method, which creates a

reference to a specified object, increments the object's reference count; the Release

method destroys a reference and decrements the reference count. When the reference

count goes down to 0, OpenDoc can safely delete the object from memory.
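
A minimal model of this bookkeeping is sketched below. ToyObjectTable is hypothetical, and it assumes that a newly created object starts with a count of 1 held by its creator and that unused objects are reclaimed in a separate collection pass rather than at the moment the count reaches 0.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical toy table of named objects and their reference counts.
class ToyObjectTable {
public:
    // Create an object whose creator holds the first reference (assumption).
    void create(const std::string& name) { counts_[name] = 1; }

    // Add a reference to an existing object.
    void acquire(const std::string& name) { ++counts_.at(name); }

    // Drop a reference; the count never goes below 0.
    void release(const std::string& name) {
        unsigned long& count = counts_.at(name);
        if (count > 0) --count;
    }

    unsigned long refCount(const std::string& name) const {
        return counts_.at(name);
    }

    bool exists(const std::string& name) const {
        return counts_.count(name) != 0;
    }

    // Delete every object with a count of 0; return how many were deleted.
    std::size_t collect() {
        std::size_t removed = 0;
        std::map<std::string, unsigned long>::iterator it = counts_.begin();
        while (it != counts_.end()) {
            if (it->second == 0) {
                counts_.erase(it++);
                ++removed;
            } else {
                ++it;
            }
        }
        return removed;
    }

private:
    std::map<std::string, unsigned long> counts_;
};
```

Every acquire in this model must eventually be balanced by a release; an object whose count never returns to 0 is never collected.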

INITIALIZATION

The initialization method InitPart is called only once, to set up a part's initial state. It

should take the following actions:

1. Call the parent class's InitPart method to perform any initialization

required at the parent level.

2. Save the incoming part wrapper object (discussed below) in an internal

field.

3. Set up an internal permissions field to indicate that writing to the draft is

allowed.

4. Set up the part's runtime data structures.

5. Set the part's internal dirty flag to true.
