December 95 - Getting Started With OpenDoc Storage
Vincent Lo
OpenDoc's structured storage model is an innovative departure from the traditional
storage scheme. As you make the move into OpenDoc development, you need to
understand the new storage model and its implications for the way data is stored and
retrieved. This article introduces the new concepts and policies you'll need to know in
order to use OpenDoc storage effectively.
In the traditional Macintosh user model, each application creates and maintains its own
documents, storing each document in a separate file. A file has one creator signature
and one file type, identifying the application it belongs to and the kind of document it
contains. In OpenDoc, by contrast, a document can have multiple parts, created and
maintained by different part editors (called part handlers in earlier versions of
OpenDoc), which are analogous to the standalone applications of the traditional model.
Because all of a document's parts are stored together in the same container (usually
corresponding to a file), there has to be a way for separate part editors to share access
to the same container without interfering with each other.
OpenDoc meets this need by providing a structured model for persistent storage (that
is, for storing data from one session to the next). Each part is given its own storage
unit in which to store and retrieve data. The part can thus operate as a standalone
entity, independent of other parts and their storage. OpenDoc maintains all of the
storage units and notifies each part when to read or write its data.
The same techniques that are used in dealing with persistent storage also apply to the
various forms of data interchange between part editors, such as the Clipboard, drag
and drop, and linking. Because all of these mechanisms use the same data storage
medium (the storage unit), they all work essentially the same way from the part
editor's point of view. For example, a part uses the same API calls to copy data to the
Clipboard that it would use in writing the data to a file container. The same is true for
drag and drop and for linking. Thus, once you learn how to work with OpenDoc storage
units for file storage, you can use the same techniques to implement data interchange
as well.
This article assumes that you're already familiar with basic OpenDoc concepts and
terminology. If you need a quick introduction or refresher, see the article "The
OpenDoc User Experience" in develop Issue 22. You can find additional information on
some of OpenDoc's technical basics in the articles "Building an OpenDoc Part Handler"
in Issue 19 and "Getting Started With OpenDoc Graphics" in Issue 21. Developer
releases of OpenDoc include the definitive documentation, the OpenDoc Programmer's
Guide and OpenDoc Class Reference. Developer releases are available through a number
of different sources, or you can request the latest release at AppleLink OPENDOC or at
opendoc@applelink.apple.com on the Internet. The source code in this article is
excerpted from a sample part included with the developer release.
Because OpenDoc was developed jointly by a consortium of companies including Apple,
IBM, and Novell, its interfaces are designed for cross-platform compatibility, using
IBM's platform-independent System Object Model (SOM). OpenDoc method
definitions, including the ones in this article, are commonly written in a
language-neutral Interface Definition Language (IDL). The SOM compiler converts
these into equivalent language-specific declarations for whatever source language you
happen to be using. The method definitions shown in this article, for instance, are
taken from the OpenDoc interface file StorageU.idl. To use these methods in your
program, you must include the corresponding language-specific binding file (such as
StorageU.xh for a C++ program).
DRAFTS, DOCUMENTS, AND CONTAINERS
The OpenDoc classes responsible for providing storage capabilities are ODContainer,
ODDocument, ODDraft, and ODStorageUnit. Collectively, a set of subclasses derived
from these four is known as a container suite. A container represents the physical
storage medium in which a document is stored, such as a disk file. Different container
suites share the same API, but may use different low-level storage mechanisms and
operate on different physical storage media. For example, the Bento container suite,
which will be shipped with OpenDoc 1.0, supports both file containers and in-memory
containers. A part editor can thus use the same code to store a part's data either to a
file or in memory.
A single container may contain one or more documents, each of which in turn can
include one or more drafts. A part ordinarily works with a draft, rather than directly
with a document or its container. Each draft is a "snapshot" representing the state of
the document at a particular point in its development. Together, the drafts embody the
history of the document over time.
A part may need to interact with its draft for a variety of reasons:
• Persistent objects -- Every persistent object (such as a part, a frame,
or a link) is created by a draft.
• Data interchange -- A part asks its draft to copy transferred objects to
and from a data-interchange container, such as the Clipboard or a
drag-and-drop container.
• Linking -- A part uses its draft to create link specifications and copy data
to and from link objects.
• Permissions -- A part may need to find out whether it's allowed to write
to the draft.
• Scripting -- A part gets its scripting-specific identifier through its
draft.
STORAGE UNITS
The basic entity of a container suite is the storage unit. Every persistent OpenDoc
object has a storage unit in which to store and retrieve its data. Figure 1 shows a
typical example.
Figure 1. Structure of a storage unit
A storage unit consists of one or more properties, each of which in turn is associated
with one or more values containing the data itself. The storage unit shown in Figure 1,
for instance, has properties named kODPropContents, kODPropPreferredKind, and
kODPropDisplayFrames; the kODPropContents property has values of types
kTextEditorKind and kODMacIText.
Using multiple values allows a property to represent the same data in different forms.
For example, a property holding a drawing may have three values representing the
same data: one as a Macintosh PICT, one as a Windows metafile, and one in TIFF format.
Although OpenDoc cannot enforce the principle, part developers are urged to use
multiple values within a property only for multiple representations of the same data,
not for storing unrelated data items.
The property names and value types shown in Figure 1 represent string constants of
type ODPropertyName and ODValueType, respectively. For cross-platform
extensibility, both of these types are defined as equivalent to an ISO string instead of a
traditional Macintosh OSType: that is, they're 7-bit ASCII null-terminated strings, as
specified by the International Organization for Standardization (ISO). The string values
themselves are expected to follow a standard naming convention: for instance, the
constants kODPropDisplayFrames and kODWeakStorageUnitRefs stand for the strings
"OpenDoc:Property:DisplayFrames" and "OpenDoc:Type:StorageUnitRefs",
respectively. The OpenDoc interface files StdProps.idl and StdTypes.idl define name
constants for standard properties and value types; any property and type names that
you define for yourself should follow the same naming conventions.
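To make the property/value layout concrete, here is a minimal sketch of a storage unit as an ordered list of named properties, each holding an ordered list of typed values. Everything in it (ToyStorageUnit, ToyProperty, ToyValue, and the "Toy:..." strings used with it) is hypothetical and written for illustration only; it is not the OpenDoc API and says nothing about how a real container suite lays out its data.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy model, not the OpenDoc API.
using Bytes = std::vector<std::uint8_t>;

struct ToyValue {
    std::string type;  // value type name, stored as a plain ASCII string
    Bytes       data;  // the bytes of this representation
};

struct ToyProperty {
    std::string           name;    // property name, stored as a plain ASCII string
    std::vector<ToyValue> values;  // kept in the order they were added
};

struct ToyStorageUnit {
    std::vector<ToyProperty> properties;  // kept in the order they were added

    // Return the named property, appending a new empty one if it is missing.
    ToyProperty& AddProperty(const std::string& name) {
        for (ToyProperty& p : properties)
            if (p.name == name) return p;
        properties.push_back(ToyProperty{name, {}});
        return properties.back();
    }

    // Append one more representation of the property's data.
    void AddValue(const std::string& name, const std::string& type, Bytes data) {
        assert(!name.empty() && !type.empty());
        AddProperty(name).values.push_back(ToyValue{type, std::move(data)});
    }

    // Look up a value by property name and value type; nullptr if absent.
    const ToyValue* Find(const std::string& name, const std::string& type) const {
        for (const ToyProperty& p : properties) {
            if (p.name != name) continue;
            for (const ToyValue& v : p.values)
                if (v.type == type) return &v;
        }
        return nullptr;
    }
};
```

A drawing stored under one contents property would then carry one ToyValue per format, all describing the same picture.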
FOCUSING A STORAGE UNIT
The OpenDoc operations for manipulating values don't explicitly identify the value to
operate on. Instead, you have to focus the storage unit on the desired property or value
before invoking the operation. The method for setting the focus is defined in class
ODStorageUnit as follows:
ODStorageUnit Focus(in ODPropertyName propertyName,
                    in ODPositionCode propertyPosCode,
                    in ODValueType valueType,
                    in ODValueIndex valueIndex,
                    in ODPositionCode valuePosCode);
This allows you to set the storage unit's focus in a variety of ways:
• to a property by name
• to a property by position relative to the current property
• to a value by type within a property
• to a value by position within a property
• to a value by position relative to the current value
Properties and values are ordered within the storage unit according to the sequence in
which they were added. Values within a property are indexed from 1: that is, the first
value has index 1, the second index 2, and so on. Positions relative to the current focus
are specified with a position code. The same position code can refer to either a
property or a value, depending on the current focus. For instance, if the storage unit is
currently focused on a property, the position code kODPosNextSib designates the next
property; if the current focus is on a value, kODPosNextSib designates the next value.
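The focusing rules can be sketched with a small stand-in class. This is a toy under stated assumptions, not the real Focus method: ToyStorageUnit, ToyProperty, ToyPos, and the method names are all invented here, and the single five-parameter call is split into separate methods for readability. It shows focusing by name, by 1-based index, by type, and the way one position code means "next property" or "next value" depending on the current focus.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy model of focusing; none of these names are OpenDoc API.
struct ToyProperty {
    std::string              name;
    std::vector<std::string> valueTypes;  // one entry per value, in order added
};

enum class ToyPos { Same, NextSib };  // stand-ins for position codes

class ToyStorageUnit {
public:
    explicit ToyStorageUnit(std::vector<ToyProperty> props)
        : props_(std::move(props)) {}

    // Focus on a property by name; the focus is then on the property itself.
    bool FocusProperty(const std::string& name) {
        for (std::size_t i = 0; i < props_.size(); ++i) {
            if (props_[i].name == name) {
                prop_ = i + 1;
                value_ = 0;
                return true;
            }
        }
        return false;
    }

    // Focus on a value of the current property by its 1-based index.
    bool FocusValueIndex(std::size_t index) {
        if (prop_ == 0 || index == 0 || index > Current().valueTypes.size())
            return false;
        value_ = index;
        return true;
    }

    // Focus on a value of the current property by its type.
    bool FocusValueType(const std::string& type) {
        if (prop_ == 0) return false;
        const std::vector<std::string>& types = Current().valueTypes;
        for (std::size_t i = 0; i < types.size(); ++i)
            if (types[i] == type) return FocusValueIndex(i + 1);
        return false;
    }

    // Move relative to the current focus: the same code selects the next
    // value when a value has the focus, and the next property otherwise.
    bool FocusRelative(ToyPos pos) {
        if (prop_ == 0) return false;
        if (pos == ToyPos::Same) return true;
        if (value_ != 0) return FocusValueIndex(value_ + 1);
        if (prop_ == props_.size()) return false;
        ++prop_;
        return true;
    }

    // Describe the focus as "name" (property) or "name[index]" (value).
    std::string Where() const {
        if (prop_ == 0) return "(no focus)";
        const ToyProperty& p = props_[prop_ - 1];
        return value_ == 0 ? p.name : p.name + "[" + std::to_string(value_) + "]";
    }

private:
    const ToyProperty& Current() const {
        assert(prop_ != 0);
        return props_[prop_ - 1];
    }

    std::vector<ToyProperty> props_;
    std::size_t prop_ = 0;   // 1-based; 0 means nothing has the focus
    std::size_t value_ = 0;  // 1-based; 0 means the focus is on the property
};
```

In this toy, a failed move leaves the focus where it was and returns false. How the real method reports an out-of-range focus is not covered here.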
Another way to set the focus of a storage unit is with a storage unit cursor:
ODStorageUnit FocusWithCursor(in ODStorageUnitCursor cursor);
The cursor identifies a property by name or a value by its property name and its index
or value type. Once created (with method CreateCursor or CreateCursorWithFocus of
class ODStorageUnit), the same cursor can be reused multiple times to refer to
properties or values within the storage unit.
Once you've focused a storage unit, you can create a storage unit view to refer to the
same property or value again later without having to reset the focus:
ODStorageUnitView CreateView();
The view responds to all the same access methods as the storage unit itself, but applies
them to the property or value that had the focus at the time the view was created,
rather than at the time the method is invoked. It does this by automatically resetting
the underlying storage unit to the original focus, then forwarding the method call to
the storage unit for processing.
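The forwarding behavior of a view can be illustrated with a toy pair of classes. ToyStorageUnit and ToyView are hypothetical names, and the string-keyed map stands in for real properties and values; the only point is that the view captures the focus at creation time and re-applies it before each forwarded call.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical toy classes; they only mimic the view behavior described above.
class ToyStorageUnit {
public:
    void Put(const std::string& property, const std::string& data) {
        data_[property] = data;
    }
    void Focus(const std::string& property) {
        assert(data_.count(property) == 1);  // toy precondition: property exists
        focus_ = property;
    }
    const std::string& CurrentFocus() const { return focus_; }
    const std::string& Read() const { return data_.at(focus_); }

private:
    std::map<std::string, std::string> data_;
    std::string focus_;
};

class ToyView {
public:
    // Remember whatever the unit is focused on at the moment of creation.
    explicit ToyView(ToyStorageUnit& unit)
        : unit_(unit), focus_(unit.CurrentFocus()) {}

    // Reset the unit to the remembered focus, then forward the call.
    const std::string& Read() {
        unit_.Focus(focus_);
        return unit_.Read();
    }

private:
    ToyStorageUnit& unit_;
    std::string     focus_;
};
```

Note that in this sketch a call through the view leaves the underlying unit focused where the view put it. Whether a real view restores the previous focus afterward is not something this example tries to answer.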
MANIPULATING VALUE DATA
The operations for manipulating data within a storage value are stream-based, very
much like reading or writing to a sequential file. Each value has a current offset
position that controls where the next operation will take place, similar to the file
mark in the Macintosh file system. In addition to reading and writing data sequentially,
you can also insert or delete data at the current offset position.
Class ODStorageUnit defines the following methods for manipulating value data:
void SetOffset(in ODULong offset);
ODULong GetOffset();
void SetValue(in ODByteArray value);
ODULong GetValue(in ODULong length, out ODByteArray value);
void InsertValue(in ODByteArray value);
void DeleteValue(in ODULong length);
The ODByteArray structure is used to pass data to or from a storage unit.
typedef struct {
    unsigned long _maximum;   /* size of buffer */
    unsigned long _length;    /* number of bytes of actual data */
    octet*        _buffer;    /* pointer to buffer containing the data */
} _IDL_SEQUENCE_octet;
typedef _IDL_SEQUENCE_octet ODByteArray;
(An octet is simply the SOM term for an 8-bit byte.) Listing 1 shows how to
manipulate one of the values shown in Figure 1.
Listing 1. Adding data to a value
/* Focus the storage unit, using property name and value type. */
storageUnit->Focus(ev, kODPropContents, kODPosUndefined,
                   kTextEditorKind, 0, kODPosUndefined);
/* Set up the byte array. */
ODByteArray ba;
ba._length = size;
ba._maximum = size;
ba._buffer = buffer;
/* Set the offset. (This step isn't really needed here, since the
   Focus operation automatically sets the offset to 0. It's included
   for illustrative purposes only.) */
storageUnit->SetOffset(ev, 0);
/* Add the value. */
storageUnit->SetValue(ev, &ba);
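If the stream-like behavior of a value is unfamiliar, the following toy may help. ToyValue is a hypothetical class built on std::vector; it borrows the method names listed earlier but is not the OpenDoc implementation. It assumes that writing, reading, and inserting advance the offset past the bytes handled, and that deleting leaves the offset in place. Those details are assumptions of this sketch, not statements about ODStorageUnit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy model of one value; not the OpenDoc API.
using Bytes = std::vector<std::uint8_t>;

class ToyValue {
public:
    void SetOffset(std::size_t offset) {
        assert(offset <= data_.size());  // toy rule: offset stays within the data
        offset_ = offset;
    }
    std::size_t GetOffset() const { return offset_; }
    std::size_t Size() const { return data_.size(); }

    // Overwrite bytes starting at the offset, growing the value if needed.
    void SetValue(const Bytes& in) {
        if (offset_ + in.size() > data_.size()) data_.resize(offset_ + in.size());
        std::copy(in.begin(), in.end(), At(offset_));
        offset_ += in.size();
    }

    // Read up to `length` bytes from the offset; return the number read.
    std::size_t GetValue(std::size_t length, Bytes& out) {
        const std::size_t n = std::min(length, data_.size() - offset_);
        out.assign(At(offset_), At(offset_ + n));
        offset_ += n;
        return n;
    }

    // Insert bytes at the offset, shifting the remaining data toward the end.
    void InsertValue(const Bytes& in) {
        data_.insert(At(offset_), in.begin(), in.end());
        offset_ += in.size();
    }

    // Remove up to `length` bytes starting at the offset.
    void DeleteValue(std::size_t length) {
        const std::size_t n = std::min(length, data_.size() - offset_);
        data_.erase(At(offset_), At(offset_ + n));
    }

private:
    Bytes::iterator At(std::size_t i) {
        return data_.begin() + static_cast<std::ptrdiff_t>(i);
    }

    Bytes       data_;
    std::size_t offset_ = 0;
};
```

The difference between SetValue and InsertValue is the one to notice: the first overwrites in place, while the second makes room and pushes the existing bytes along.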
STORAGE UNIT REFERENCES
Storage unit references allow one storage unit to refer persistently to another. A part
can use this mechanism to access information stored in a storage unit (which may or
may not belong to it) across multiple sessions. A draft thus consists essentially of a
network of storage units connected to each other with persistent references.
When a storage unit is cloned (copied to a data-interchange container), any other
storage units it references are cloned along with it. Since all storage units in a draft
are interconnected, cloning any one of them may cause the whole draft to be cloned.
Because this may be an expensive and unnecessary operation, OpenDoc provides two
levels of storage unit reference: strong and weak. Only strongly referenced storage
units are copied when the unit that refers to them is cloned.
In Figure 2, frame A refers strongly to part A, which refers strongly to frame B,
which refers strongly to part B. Thus if frame A's storage unit is cloned, all four
storage units will be copied. On the other hand, cloning frame B's storage unit will
copy those for frame B and part B only, since frame B's reference to frame A is weak
rather than strong.
Figure 2. Strong and weak storage unit references
An object can use strong storage unit references to refer to other objects that are
essential to its functioning, such as embedded frames. Weak references are mainly for
informational or secondary purposes: a part might use them, for instance, to refer to
its display frames.
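The effect of strong and weak references on cloning amounts to a graph traversal that follows strong edges only. The sketch below models that idea with hypothetical types (ToyRef, ToyDraft) and a hypothetical function (CloneSet); the unit names used with it are just labels, and the real cloning machinery does considerably more than collect names.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model: each storage unit is a name plus its outgoing
// references, each marked strong or weak.
struct ToyRef {
    std::string target;
    bool        strong;
};
using ToyDraft = std::map<std::string, std::vector<ToyRef>>;

// Collect the units copied when `root` is cloned: the root itself plus
// everything reachable from it through strong references only.
std::set<std::string> CloneSet(const ToyDraft& draft, const std::string& root) {
    std::set<std::string>    copied;
    std::vector<std::string> pending{root};
    while (!pending.empty()) {
        const std::string unit = pending.back();
        pending.pop_back();
        if (!copied.insert(unit).second) continue;  // already visited
        const auto it = draft.find(unit);
        if (it == draft.end()) continue;            // unit has no references
        for (const ToyRef& ref : it->second)
            if (ref.strong) pending.push_back(ref.target);
    }
    return copied;
}
```

Fed the four units of Figure 2, CloneSet returns all four when started from frame A, and only frame B and part B when started from frame B, because the weak reference back to frame A is never followed.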
LIFE CYCLE OF A PART
Figure 3 shows the life cycle of a part and its associated storage unit. Because the
part's lifetime may span multiple editing sessions, it must be able to externalize its
internal state (save it to persistent storage) in order to reconstruct itself from one
session to the next. The part's InitPart method, called when the part is first created,
receives a storage unit as a parameter. The Externalize method can then use this
storage unit to save the part's state. Once externalized, the part can be released from
memory and later reconstituted from external storage by a method named
InitPartFromStorage. Unlike InitPart, InitPartFromStorage can be called multiple
times during a part's lifetime, whenever the part needs to be reconstructed from
external storage.
Figure 3. Life cycle of a part
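The life cycle can be walked through with a toy part. ToyPart and ToyStorage are hypothetical; the method names follow the ones in the text, but the signatures are simplified (no environment parameter, no part wrapper) and a string map stands in for the storage unit. The "Contents" key and the dirty-flag handling are assumptions of the sketch.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a storage unit: property name -> data.
using ToyStorage = std::map<std::string, std::string>;

class ToyPart {
public:
    // Called once, when the part is first created.
    void InitPart(ToyStorage* storage) {
        assert(storage != nullptr);
        storage_ = storage;
        text_.clear();
        dirty_ = true;  // nothing has been saved yet
    }

    // Called whenever the part is rebuilt from previously saved state.
    void InitPartFromStorage(ToyStorage* storage) {
        assert(storage != nullptr);
        storage_ = storage;
        const auto it = storage->find("Contents");
        text_ = (it != storage->end()) ? it->second : std::string();
        dirty_ = false;  // in-memory state now matches storage
    }

    // Save the internal state if it has changed since the last save.
    void Externalize() {
        if (!dirty_) return;
        (*storage_)["Contents"] = text_;
        dirty_ = false;
    }

    void SetText(const std::string& text) {
        text_ = text;
        dirty_ = true;
    }
    const std::string& Text() const { return text_; }
    bool IsDirty() const { return dirty_; }

private:
    ToyStorage* storage_ = nullptr;
    std::string text_;
    bool        dirty_ = false;
};
```

One ToyPart can be initialized, edited, and externalized; a second ToyPart given the same storage through InitPartFromStorage then comes up with the same text, which is the round trip Figure 3 describes.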
Notice that externalizing a part is not the same as cloning it. Externalizing means
writing the part's data to persistent storage, using a storage unit associated with the
draft in which the part resides; cloning is transferring the part's data to a
data-interchange container such as the Clipboard, using a storage unit associated with
the container. Although the two operations are different, they're both based on the
same ODStorageUnit API and can share much of the same code.
Another related operation is purging, which reclaims memory space by eliminating
unnecessary runtime data structures such as caches. Because such structures can
usually be reconstructed from persistent data, many OpenDoc programmers believe
that a part's Purge method should always begin by externalizing the part's data before
deleting unused or unnecessary memory. While this might sound plausible in
principle, the externalization operation itself requires additional memory -- the very
thing that's in short supply during purging. As a general rule, the Purge method should
avoid invoking externalization unless it's absolutely necessary.
All persistent objects carry a reference count, enabling OpenDoc to identify unused
objects and reclaim the memory they occupy. The Acquire method, which creates a
reference to a specified object, increments the object's reference count; the Release
method destroys a reference and decrements the reference count. When the reference
count goes down to 0, OpenDoc can safely delete the object from memory.
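Reference counting of this kind is easy to model. The class below is a hypothetical sketch, not an OpenDoc base class: it reuses the names Acquire and Release from the text, assumes a new object starts with a count of 1 held by its creator, and has Release report whether the count reached 0 instead of deleting anything.

```cpp
#include <cassert>

// Hypothetical toy; the starting count of 1 is an assumption of this sketch.
class ToyRefCounted {
public:
    // Record one more reference to this object.
    void Acquire() { ++refCount_; }

    // Drop one reference; returns true when no references remain, which is
    // the point at which the owner could safely delete the object.
    bool Release() {
        assert(refCount_ > 0);  // releasing an unreferenced object is a bug
        --refCount_;
        return refCount_ == 0;
    }

    unsigned long RefCount() const { return refCount_; }

private:
    unsigned long refCount_ = 1;  // the creator holds the first reference
};
```

The discipline that matters is pairing: every Acquire needs exactly one matching Release, or the count never returns to 0 and the object is never reclaimed.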
INITIALIZATION
The initialization method InitPart is called only once, to set up a part's initial state. It
should take the following actions:
1. Call the parent class's InitPart method to perform any initialization
required at the parent level.
2. Save the incoming part wrapper object (discussed below) in an internal
field.
3. Set up an internal permissions field to indicate that writing to the draft is
allowed.
4. Set up the part's runtime data structures.
5. Set the part's internal dirty flag to true.