HFS
Volume Number: 2
Issue Number: 1
Column Tag: C Workshop
Programming for HFS Compatibility 
By Mike Schuster, Consulair Corp.
Foreword
The Apple “Hierarchical File System” is a major advance in Macintosh
technology. From the programmer's standpoint, HFS is an an extension to the current
file manager. Many calls are the same, some calls have been extended, and there are
some new HFS-specific extensions.
Internally, however, HFS is a brand-new file system. The Apple developers have
done a wonderful job of making it compatible. HFS must be a part of any serious
programmer's understanding.
I had planned for Mike to do two articles, to cover for me during the peak of
start-up activity on our VAX AppleTalk contract and preparations for the DECUS
Symposium and DEXPO show. His first article, showing how to do “pop-up” menus,
was very good.
However, this month's article is so important and so full of useful information,
that I have asked him to follow up with a second article, this time covering HFS
internals. Next month, Mike will cover the “on-disk structures” of HFS and details of
HFS internal data structures. Now onward.
Bob Denny
November, 1985
Programming for HFS Compatibility
Apple's new Hierarchical File System (HFS), shipped with Apple's Hard Disk
HD20 and used by Finder 5.0, provides a much more effective mechanism for managing
large volumes than the original Macintosh File System (MFS), represented by Finder
4.1 and below. In MFS, all of the files on a volume are indexed in a single directory
organized as an unsorted, linear list of files names.
While adequate for small volumes, this structure proved inefficient for handling
larger storage devices containing hundreds of files. When a call is made to open a file,
an exhaustive, linear search must be performed to find that file's directory entry. As
the number of files on the volume increases, these searches become relatively time
consuming.
Fig. 1 HFS Directories Explained
The Hierarchical File System abandons this flat, unsorted approach and instead
employs a hierarchical file directory, known as the File Catalog. The file catalog
organizes and maintains the user's perceived desktop hierarchy of folders (directories)
which contain other folders and files. The catalog is organized as a B*-Tree for quick
searching and enumeration of files and folders in the hierarchy.
In addition to providing fast access to a large number of files, the catalog
provides the Finder with the information it needs to render the desktop, without the
substantial overhead of recreating in memory a tree describing the relationships
between folders and files.
MFS performs volume space management and file extent mapping using a
Allocation Block Map (ABM). In this scheme, space on the volume is allocated in equal
sized units called allocation blocks. The ABM contains an entry for each allocation block
on the volume. An entry is zero if the corresponding allocation block is free, and
otherwise equals the index of the next allocation block in the associated file.
To guarantee fast space management and file access, MFS keeps the entire ABM
for each mounted volume in memory. Thus, the ABM must be kept to a reasonable size,
constraining the size of an allocation block. For larger volumes, this requirement
results in severe storage fragmentation and wastage (in which the size of an allocation
block often exceeds the median file size), and forces the partitioning of a storage device
into several smaller, independent volumes, only a few of which are mountable at any
one time.
The Hierarchical File System abandons the ABM scheme and employs instead a
volume space map (VSM) and a hierarchical file block allocation structure. The VSM is
used only for space management and contains one bit per allocation block on the volume,
requiring an order of magnitude less space than a corresponding ABM. A bit in the VSM
is one if the corresponding allocation block is in use, and otherwise is zero. File
mapping is based on a B*-Tree structure which provides efficient sequential and
random file mapping.
Apple designed the Hierarchical File System to be upwards compatible with the
MFS. Apple was quite successful. Most applications written for MFS work correctly
under HFS. The Hierarchical File System supports the existing MFS disk volumes and
allows all existing MFS calls to access HFS volumes.
File Catalogs and Pathnames
A file catalog organizes the folders and files (both called nodes ) on a volume into
a hierarchical tree structure. Folders are equivalent to directories in traditional file
systems; they contain other folders and files. Figure 1 shows an example of HFS file
and folder layout.
The node at the base of the catalog, named StartUp, is the root folder. By
convention, its name is also the name of the volume. Internal nodes are folders that
contain other folders and files. Folders and the nodes they contain are connected by
branches in the figure.
A folder at the top of a branch is the parent of the folder or file offspring at the
bottom. The offspring of StartUp are the folders System Folder, Document Folder, and
Application Folder. No two offspring of the same parent may have the same name.
Leaf nodes do not have branches beneath them; they are either empty folders or
files. The files Products, Logo, and MacPaint are all leaves.
Every folder in the catalog has a unique directory ID. On an HFS volume, the root
folder always has a directory ID of 2; other folders have positive integers as directory
ID's. In the figure the directory ID of the node System Folder is 10. Since offspring
have unique names, every node in the catalog can be uniquely identified by its name and
the directory ID of its parent, which is called the node's parent ID. The node
specification pair {parent ID, node name} in fact constitutes the "key" used to search
through the B*-Tree for a particular node. It corresponds to the quickest way for HFS
to find a file or folder.
Another way to identify a node in a catalog is by a pathname. A pathname is a