Voodoo Review
Volume Number: 12
Issue Number: 6
Column Tag: Version Control
Keeping Things Straight, Orthogonally
Do do that VOODOO
By Christoph Reichenberger
Organizing Variants and Revisions
The importance of software version management increases with the size of a software
project. The implementation of large projects produces program families (programs
existing in many different variants and revisions). Therefore, all components of a
project (source texts, documentation, pictures, etc.) must be stored in many different
versions, and have to be retrievable at any time.
A drawback of many version control tools (SCCS and RCS, as well as most of their
derivatives like DSEE, ClearCase, Projector/SourceServer, etc.) is the failure to
distinguish variants and revisions. This is called intermixed version organization.
Here, variants and revisions of components are organized together in version trees.
Variants are generated by branching off new revisions from certain points and
developing them in parallel. The component forms the center of attention, and the
different versions of the components are managed by maintaining a single version tree
for each component. This results in two main shortcomings:
• Revisions and variants are identified by means of a multi-part number. When a
variant is split off, two more numbers are appended to the version number.
In projects with more than a few variants, this numbering scheme becomes
unwieldy.
• The tree’s structure represents the chronological order of branching of variants.
As the components’ histories of development will in general be different, distinct
tree structures arise within the project, all representing the same variant
information. The user of the software version control tool must know each of
these structures in order to retrieve the specific variants and revisions of the
components to be worked with. Moreover, these trees soon become difficult to
survey when the number of variants of the project increases.
Let’s take an example. Suppose there is a project consisting of two components A
and B. At the beginning, this project is implemented in Modula-2 on a Macintosh
computer. Some revisions of A and B arise. Figure 1 shows the version trees for A and
B that would evolve when managing this project with SCCS at instant 0.
Figure 1. The project’s version trees at instant 0
Next we need to implement a variant of the project in Pascal and branch off a new
variant of the component A for the Pascal implementation, where the programmers
expect that component B will not differ for the Pascal and Modula-2 implementations.
Next we branch off a variant of component B for a Sun workstation, where
(conversely) component A is believed identical for Macintosh and Sun (instant 1).
At instant 1, the project exists in three different implementations:
1. in Modula-2 for Macintosh, consisting of A 2.1 and B 2.1
2. in Pascal for Macintosh, consisting of A 1.3.1.1 and B 2.1
3. in Modula-2 for Sun, consisting of A 2.1 and B 1.2.1.1
This information, however, cannot be inferred from the version trees in Figure
2.
Figure 2. The project’s version trees at instant 1
After development of some revisions of the different variants it turns out that
component A must be implemented differently on both machines, and component B has
to be developed in two variants according to the programming languages (instant 2).
The resulting version trees are presented in Figure 3.
Figure 3. The project’s version trees at instant 2
This small example shows that the chronological order of branching is reflected
by the naming of the different variants of the project’s components. The trees of
components A and B have different shapes, although both have the same variant
structure. As the tree structure is used for naming software objects, the different
structures must be known to the user in order to enable retrieval of a certain revision
of a given variant. A fixed revision i of the variant Sun/Modula of component A is
referred to by 2.i, whereas the same version of component B is called 1.2.1.2.1.i.
Moreover, each new branch appends two more numbers. If a project varies along
four distinguishing characteristics, the software objects in SCCS-like systems have
revision numbers like 1.2.4.2.3.6.2.5.3.4, which border on the incomprehensible.
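The growth of these numbers can be illustrated with a minimal sketch of the numbering scheme as the example describes it. The function names are hypothetical and are not taken from SCCS or RCS. The sketch only assumes that splitting off a variant appends a branch number and a starting revision number.

```python
def branch_off(version: str, branch_no: int = 1) -> str:
    """Return the first revision on a new branch split off from `version`.

    Splitting off a branch appends two numbers: the branch number and the
    revision number within that branch (starting at 1).
    """
    return f"{version}.{branch_no}.1"


def next_revision(version: str) -> str:
    """Return the successor of `version` on the same line of development."""
    *prefix, last = version.split(".")
    return ".".join(prefix + [str(int(last) + 1)])


def depth(version: str) -> int:
    """Number of branch points between the trunk and `version`."""
    return (len(version.split(".")) - 2) // 2


a_pascal = branch_off("1.3")        # component A, Pascal variant
b_sun = branch_off("1.2")           # component B, Sun variant
b_sun_next = next_revision(b_sun)   # a later revision of B for Sun
b_deeper = branch_off(b_sun_next)   # a further variant split off from it
```

Every branch point adds two parts to the identifier, so the name of an object encodes the order in which variants were split off rather than the variant itself.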
Orthogonal Organization
These drawbacks can be avoided by using orthogonal organization of variants and
revisions. Two main characteristics distinguish orthogonal from intermixed version
organization.
• Instead of managing versions of atomic objects only, orthogonal version
organization also deals with variants and revisions of the whole software project.
In fact, the management of versions of the whole software project is the basic
idea of orthogonal version management.
• Variants and revisions span the whole project and are considered to be orthogonal
to each other.
Using the orthogonal organization model, a software project consists of a set of
objects which we call the object pool, and a set of project structure trees. The
following sections explain these terms.
The Object Pool. As mentioned above, our task is to manage a project consisting
of a set of fundamental components. Each of these components may exist in different
variants and revisions. We use the term object for an instance of a component, which
is uniquely defined by a certain variant and a certain revision. An object could be the
source text of a module within a software project, i.e., a particular implementation of
a component (e.g., revision 7.3 of module XY belonging to variant A). We represent an
object graphically as a small cube (Figure 4).
The object pool of a project is the collection of all these different variants and
revisions of all of the project’s components. We can envision the object pool as a
three-dimensional space whose three dimensions are component, variant, and
revision, as shown in Figure 5.
Figure 4. Object
Figure 5. Object pool
The object pool can be variously “projected” [4]. For instance, by cutting off a
vertical slice, we get a project revision (Figure 6), comprising all variants of all
components of the project at a given time. Restricting a project revision to a
particular variant gives a component group (Figure 7). Since the component group
consists of all objects for a specific implementation of the project at a given time, it
may constitute the basis for automatic build management.
Figure 6. Project revision
Figure 7. Component group
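The projections of the object pool can be sketched with a dictionary keyed by the three dimensions. This is only an illustration under assumed names: the type `ObjectPool`, the helper functions, and the sample variant labels are hypothetical and do not describe how any tool stores its objects.

```python
from typing import Dict, Tuple

# Key: (component, variant, revision); value: the stored object's content.
ObjectPool = Dict[Tuple[str, str, int], str]

pool: ObjectPool = {
    (component, variant, revision): f"{component}/{variant}/r{revision}"
    for component in ("A", "B")
    for variant in ("Mac/Modula-2", "Mac/Pascal", "Sun/Modula-2")
    for revision in (1, 2)
}


def project_revision(pool: ObjectPool, revision: int) -> ObjectPool:
    """All variants of all components at one project revision."""
    return {key: obj for key, obj in pool.items() if key[2] == revision}


def component_group(pool: ObjectPool, revision: int, variant: str) -> ObjectPool:
    """A project revision restricted to one variant."""
    return {
        key: obj
        for key, obj in project_revision(pool, revision).items()
        if key[1] == variant
    }


def project_component(pool: ObjectPool, component: str) -> ObjectPool:
    """All variants and revisions of a single component."""
    return {key: obj for key, obj in pool.items() if key[0] == component}


sun_build_inputs = component_group(pool, 2, "Sun/Modula-2")
```

Because an object is addressed by the same (variant, revision) pair for every component, retrieving the inputs for one build needs no knowledge of per-component branching history.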
Hierarchy
There is also, however, a hierarchical structure connecting the components of a
project. Representing this is the project structure tree, consisting of structure nodes
and component nodes. Component nodes are the leaves of the project structure tree
and stand for the indivisible components of the project (e.g., a module of a
program system, a chapter of a manual, etc.). Note that a component node does not
correspond to a physical file. It is a logical element of the project and can exist in
multiple versions. Structure nodes are the inner nodes of the project structure tree.
They comprise several structure and/or component nodes and provide a means of
structuring the project. The relation between parent and child in the
structure tree can be defined as “contains”, as in a hierarchical file system. (Note
that the project structure tree has nothing to do with relations among the objects; it
cannot, for example, represent an import relationship.)
Figure 8 shows a project structure tree together with the object pool. In this
example, the project consists of five components (c1, c2, m1, m2 and m3). Figure 8
shows further that each horizontal slice of the object pool (called a project
component) corresponds to one component node.
Figure 8. Connection between project structure tree and object pool
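A project structure tree of this kind can be sketched with two node types. The class names are hypothetical, and so are the structure nodes `project` and `c`: the text names only the five components and the structure node m, so the grouping of c1 and c2 is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class ComponentNode:
    """Leaf of the tree: a logical component, not a physical file."""
    name: str


@dataclass
class StructureNode:
    """Inner node: contains structure nodes and/or component nodes."""
    name: str
    children: List[Union["StructureNode", ComponentNode]] = field(default_factory=list)


def components(node: Union[StructureNode, ComponentNode]) -> List[str]:
    """Names of all component nodes below `node`, in tree order."""
    if isinstance(node, ComponentNode):
        return [node.name]
    names: List[str] = []
    for child in node.children:
        names.extend(components(child))
    return names


# Assumed grouping: c1 and c2 under a node "c"; m1, m2 and m3 under "m".
tree = StructureNode("project", [
    StructureNode("c", [ComponentNode("c1"), ComponentNode("c2")]),
    StructureNode("m", [ComponentNode("m1"), ComponentNode("m2"), ComponentNode("m3")]),
])
```

Each name returned by `components(tree)` corresponds to one horizontal slice of the object pool.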
Revisions and Variants of the Project Structure. Since the project structure may
change in the course of time, more than one project structure tree may exist within
one project, each describing the structure of the project for a certain period of time.
Thus, a particular project structure tree does not describe the structure of a whole
project, but rather the structure of a project revision. In the extreme case, every
project revision can have its own structure tree.
Figure 9 shows an object pool consisting of five project revisions together with
two revisions of the structure tree. Revision 1 of the structure tree was valid for
project revisions 1 and 2, when the structure node m contained only components m1
and m2. Starting with project revision 3, the structure node m additionally contains
component m3.
Figure 9. Revisions of the project structure tree
Even for a given project revision, we cannot speak of the project structure. A
certain project variant may not use some of the components, so that the structures of
various project variants may differ. However, we can always imagine a total project
structure tree, which is a project structure tree with all single trees overlaid.
Figure 10 shows possible variants of the total project structure tree shown in
Figure 8. We can see that project variant x does not use components c1 and c2.
Project variant y does not use component m1, and component m3 is not used in project
variant z.
Figure 10. Variants of the project structure tree
This detailed project structure tells the user exactly which components must be
chosen when building a special variant of the project. (Note that there is no statement
about how they are to be assembled; that is the responsibility of configuration
management.)
Delta Storage
In every software version control tool, delta storage is used to store all the versions
space-efficiently. Only one version is stored in full; the others are stored as delta
scripts. A delta script (or delta) is a sequence of edit commands transforming one
version of a document into another. One of the first version control tools, SCCS [7],
was designed to store different versions of files in a UNIX environment. All source
documents in this environment were text documents with an inherent line structure.
The delta algorithm made use of this structure information and regarded lines as
atomic elements of the file. The same holds for most other, even more
sophisticated, version control tools such as RCS [8] and DSEE [3].
But today’s software projects do not consist exclusively of text files.
Programming environments may store source programs in the form of an intermediate
language; the documentation may be written using a word processor which stores data
in a special file format; different versions of drawings have to be stored. A
particularly nasty example is the files on the MacOS, which consist of a data fork
and a resource fork.
Thus, modern version control systems must make no assumptions about the
structure of the files to be stored, but have to supply delta storage for arbitrary files.
This means that they must be able to generate delta scripts between any two byte
streams.
In [5] we introduced an algorithm for generating deltas between arbitrary files.
Besides its applicability to arbitrary files, the calculated deltas are smaller, and are
calculated faster, than those of other algorithms [1, 2, 9].
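The idea of a delta script between two arbitrary byte streams can be illustrated with a short sketch. It is not the algorithm of [5]. It uses the general-purpose matcher from Python's standard `difflib` module as a stand-in, and the command names `copy` and `insert` as well as both function names are assumptions made for this example.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# A delta is a list of commands:
#   ("copy", start, end)  reuse old[start:end]
#   ("insert", data)      emit literal bytes that are not taken from old
Delta = List[Tuple]


def make_delta(old: bytes, new: bytes) -> Delta:
    """Build a delta that transforms `old` into `new`, byte by byte."""
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    delta: Delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            delta.append(("insert", new[j1:j2]))
        # "delete": the bytes of old are simply not copied
    return delta


def apply_delta(old: bytes, delta: Delta) -> bytes:
    """Reconstruct the new version from the old one and a delta."""
    out = bytearray()
    for command in delta:
        if command[0] == "copy":
            _, start, end = command
            out += old[start:end]
        else:
            out += command[1]
    return bytes(out)


old_version = bytes(range(256)) * 4
new_version = old_version[:100] + b"\x00\xffNEW" + old_version[300:]
delta = make_delta(old_version, new_version)
restored = apply_delta(old_version, delta)
```

No line structure is assumed anywhere, so the same two functions work for text, resource data, or any other file content. The matcher used here is slow on large inputs and serves only to show the shape of a delta.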
We generated deltas between a large number of files and compared the results
with those generated by SCCS (on a Sun SPARCstation 2). For non-text files we used
two versions each of about 30 files of different types. The total size of the 30 files was
about 4 MB. Half of the files could not be managed by SCCS. Table 1 shows the results
of the delta generation between the remaining files.
Table 1. Delta generation between non-text files
The test suite for text files consisted of two versions each of about 300 plain text
files. The total size of the 300 files was about 20 MB. Table 2 summarizes the
results.
Table 2. Delta generation between text files
VOODOO
VOODOO implements the above techniques, and is usable not only for the organization of
software development projects in a narrow sense (program development), but also for
CAD, technical documentation, desktop publishing, etc. Even the writing of a book, for
example, is a project in which multiple elementary building blocks (the individual
chapters, illustrations, etc.) evolve in various revisions.
VOODOO models the structure of the software project as a project tree, an
extension of the structure tree. The project tree represents the logical associations of
the individual components and gives insights into their association with project
variants. Figure 11 shows the complete project tree of a small sample project.
Figure 11. Project tree consisting of four kinds of nodes
Let us examine the meaning of the individual kinds of nodes:
Structure node
Structure nodes provide the logical structure of the project. Each structure node
can contain structure nodes and component nodes.
Component node
Component nodes represent the elementary building blocks of a project (e.g., a
module of a program system, a chapter of the documentation, etc.). A component
node does not represent a physical file, however. It represents a project
component that can exist in multiple variants and revisions. Each component
node is the child of a structure node and can contain any number of version group
nodes.
Version group node
A version group node represents a certain version of a component. Each version
group node is the child of a component node and can contain any number of variant
nodes.
Variant node
A variant node always carries the name of a project variant. Variant nodes
identify in which project variants a certain version group is used. If a version
group node contains a variant node x, this means that this version group is used
in the project variant x. Each variant node is the child of a version group node
and cannot have children itself.
In Figure 11, the example composite project (a compiler) consists of two parts,
implementation and documentation. The implementation is subdivided into lexical
analysis and syntax analysis (including code generation), of which each part again
consists of two components. The further structure of the documentation is not yet
defined.
The project is being developed in two variants, one with optimization and one
without. The component Parser is shared by both variants (a version
group node that contains both variant nodes). The components Scanner and CodeGen are
being developed differently for each of the two project variants (each variant node has
its own version group node). The component Switches is used only in the variant
Optimizing (no version group node for the variant Standard).
The connection between the project tree and the object pool can be illustrated by
turning the version group nodes 90 degrees, i.e., by visualizing the third dimension
(see Figure 12). To keep the illustration clear and simple, the connections between
the component nodes and the object pool have been drawn only for the components
Scanner and Parser.
Figure 12. Connection between project tree and object pool
Within a project managed with VOODOO, the project tree not only defines the
logical connections between the project’s components, but also constitutes the basis of
the user interface. It can be manipulated with a browser, and can be filtered by
particular components, variants and/or revisions.
VOODOO presents the software project in two windows. In Figure 13, the front
window shows the project tree. The user is about to check in a new version of the
component TCLTools. The other window shows the project history (see below).
Both windows can be filtered. The variant information that applies to the project
is not mixed with the information about the revisions of the individual components.
The two are managed in a strictly orthogonal fashion. VOODOO is thus able to filter
even the project tree according to variants of the project, i.e., to display only those
parts of the project tree that are associated with a certain variant (or a set of
variants). The filtering of the project tree is applied at two levels.
Figure 13. Screen snapshot during a typical VOODOO session
Figure 14. Different view menus for different users
First of all, the project tree shows nodes of only those variants to which the user
has at least read-only privileges. The names of variants to which he lacks access are
not displayed in the View menu either. The user is thus not even aware that these
variants exist (Figure 14).
Then, within those variants that are visible to a user, he can set variant filters
that further restrict the view of the project. Figure 15 shows the unfiltered project
tree for the sample project. The user has set the variant filter to the variant
Standard, but has not yet activated the filter mechanism (Use Filter is not checked).
The project tree is thus displayed unfiltered for both variants, Optimizing and
Standard.
By activating the filter mechanism, the nodes associated with the variant
Optimizing are hidden. Since the component Switches is used only in the variant
Optimizing, this component node is not visible in the filtered display of the project
tree (Figure 16).
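The effect of the variant filter on the sample project can be sketched as follows. The data layout and the function `filter_tree` are hypothetical and say nothing about VOODOO's internal representation. Each component maps to its version groups, and each version group is the set of project variants that use it.

```python
from typing import Dict, FrozenSet, List, Set

# component -> list of version groups; a version group is represented by
# the set of variant names found in its variant nodes.
ProjectTree = Dict[str, List[FrozenSet[str]]]

project_tree: ProjectTree = {
    "Scanner": [frozenset({"Standard"}), frozenset({"Optimizing"})],
    "Parser": [frozenset({"Standard", "Optimizing"})],
    "CodeGen": [frozenset({"Standard"}), frozenset({"Optimizing"})],
    "Switches": [frozenset({"Optimizing"})],
}


def filter_tree(tree: ProjectTree, variants: Set[str], use_filter: bool = True) -> ProjectTree:
    """Keep only version groups (and components) used by `variants`."""
    if not use_filter:
        return {name: list(groups) for name, groups in tree.items()}
    filtered: ProjectTree = {}
    for name, groups in tree.items():
        kept = [group & variants for group in groups if group & variants]
        if kept:  # components without a matching version group are hidden
            filtered[name] = kept
    return filtered


unfiltered = filter_tree(project_tree, {"Standard"}, use_filter=False)
standard_only = filter_tree(project_tree, {"Standard"})
```

The two filtering levels compose naturally in this model: the set passed in can be the intersection of the variants a user may read and the variants checked in the filter.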
Support for drag and drop simplifies the creation of the project tree as well as
the task of checking in files. The user drags the files he wants to store from any other
application to the VOODOO project window. VOODOO looks up the corresponding
components and brings up a dialog that shows which files will be checked in to which
version group nodes. Figure 17 shows an example of how files are stored to the
VOODOO object pool by dragging them from a Symantec Project Manager window.
Figure 15. Unfiltered project tree
Figure 16. Filtered project tree
Figure 17. Archiving objects using Macintosh Drag and Drop
Project History
Each time a new object is archived or modifications are made to the project
structure or variant information, VOODOO generates an entry in the project history.
The entry consists of the date and time of the modification, the affected components and
variants, the name of the user having made the modifications, and a comment.
Figure 18 shows the history window with its two types of entries. Modifications
to the project structure, as well as named configurations, are displayed in bold, all
other entries in plain type.
The small boxes in front of the names indicate whether changes were made
to the variant information and/or to the project structure, or whether a line
represents a named configuration.
To provide a general view of the software project, the entries in the history
window can also be filtered according to various criteria. Setting the variant filters
affects the history window as it does the project tree: only the entries for checked
variants are displayed. If one or more nodes in the project tree are selected, then only
the history records of these nodes are displayed in the history window. If no node is
selected, all entries are shown. To set the viewing time of the entire software project
(project tree, variants, revisions, etc.), you select the line in the history window
with the desired date/time and press the Turn Back button. The entire project will
then appear as it looked at the selected time.
Figure 18. History window
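The filtering rules for the history window can be summarized in a small sketch. The record type, the function `visible_entries`, and the sample entries (dates, user names, comments) are all hypothetical. The `as_of` parameter models the Turn Back button under the assumption that turning back simply hides every entry made after the selected time.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, List, Optional, Set


@dataclass(frozen=True)
class HistoryEntry:
    when: datetime
    components: FrozenSet[str]
    variants: FrozenSet[str]
    user: str
    comment: str


def visible_entries(
    history: List[HistoryEntry],
    variants: Set[str],
    selected: Optional[Set[str]] = None,
    as_of: Optional[datetime] = None,
) -> List[HistoryEntry]:
    """Entries shown for the checked variants, selected nodes and time."""
    shown = []
    for entry in history:
        if as_of is not None and entry.when > as_of:
            continue  # made after the selected viewing time
        if not entry.variants & variants:
            continue  # belongs to no checked variant
        if selected and not entry.components & selected:
            continue  # nodes are selected, but none of them is affected
        shown.append(entry)
    return shown


# Invented sample data for illustration only.
history = [
    HistoryEntry(datetime(1996, 1, 10, 9, 0), frozenset({"Scanner"}),
                 frozenset({"Standard"}), "user1", "initial check-in"),
    HistoryEntry(datetime(1996, 1, 12, 14, 30), frozenset({"Switches"}),
                 frozenset({"Optimizing"}), "user2", "add switches"),
    HistoryEntry(datetime(1996, 1, 15, 11, 0), frozenset({"Parser"}),
                 frozenset({"Standard", "Optimizing"}), "user1", "fix parser"),
]
```

Passing no selection shows every entry of the checked variants, which mirrors the rule that all entries are shown when no node is selected.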
Access Control
A problem that often arises in software projects with multiple programmers is
that of simultaneous access of several programmers to a given component. Version
control tools can help to solve this problem by providing a locking mechanism.
Figure 19. Project tree with locked nodes
VOODOO provides its locking mechanism at the version-group level rather than at
the component level. Since various version groups represent different parallel
development branches, two version groups of a component can be worked on
simultaneously by two team members without causing any problems. Version groups
can be locked to prevent other team members from overwriting them until they are
unlocked again, either explicitly or as a side effect of retrieving or archiving objects.
The current status of a version group is displayed in the project tree with
corresponding icons (Figure 19).
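Locking at the version-group level can be sketched with a table keyed by component and version group. The class and method names are hypothetical, version groups are identified here by an assumed label, and the user names are invented. The sketch covers only explicit locking and unlocking, not the side effects of retrieving or archiving objects.

```python
from typing import Dict, Optional, Tuple


class LockError(Exception):
    """Raised when a version group is locked by someone else."""


class LockTable:
    def __init__(self) -> None:
        # (component, version group label) -> user holding the lock
        self._owners: Dict[Tuple[str, str], str] = {}

    def lock(self, component: str, version_group: str, user: str) -> None:
        key = (component, version_group)
        owner = self._owners.get(key)
        if owner is not None and owner != user:
            raise LockError(f"{component}/{version_group} is locked by {owner}")
        self._owners[key] = user

    def unlock(self, component: str, version_group: str, user: str) -> None:
        key = (component, version_group)
        if self._owners.get(key) != user:
            raise LockError(f"{component}/{version_group} is not locked by {user}")
        del self._owners[key]

    def owner(self, component: str, version_group: str) -> Optional[str]:
        return self._owners.get((component, version_group))


locks = LockTable()
locks.lock("Scanner", "Standard", "alice")
locks.lock("Scanner", "Optimizing", "bob")   # same component, other branch

try:
    locks.lock("Scanner", "Standard", "bob")
    conflict = False
except LockError:
    conflict = True
```

Two team members hold locks on the same component at the same time without conflict, because each lock covers only one parallel development branch.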
VOODOO also supports locking of files within the local workspace, using the
Finder’s “locked” flag (instead of the 'ckid' resource) to lock local files.
A working demo of VOODOO can be found at:
ftp:///dev/voodoo-lite-17.hqx
References
1. Heckel, P. “A Technique for Isolating Differences Between Files.”
Communications of the ACM 21:4 (April 1978).
2. Hunt, J. W., and T. G. Szymanski. “A Fast Algorithm for Computing Longest
Common Subsequences.” Communications of the ACM 20:5 (May 1977).
3. Leblang, D. B., and R. P. Chase, Jr. “Computer-Aided Software Engineering in a
Distributed Workstation Environment.” Proceedings of the ACM SIGSOFT
/SIGPLAN Software Engineering Symposium on Practical Software Development
Environments, Pittsburgh 84, ACM Software Engineering Notes 9:3 (1984).
4. Reichenberger, C. “Orthogonal Version Management.” Proceedings of the 2nd
International Workshop on Software Configuration Management, ACM SIGSOFT
Software Engineering Notes 17:7 (November 1989).
5. Reichenberger, C. “Delta Storage for Arbitrary Non-Text-Files.” Proceedings
of the 3rd International Workshop on Software Configuration Management,
Trondheim, Norway (June 12-14, 1991). ACM Press (Order Number:
594910).
6. Reichenberger, C. “VOODOO: A Tool for Orthogonal Version Management.”
Proceedings of the 4th International Workshop on Software Configuration
Management, Baltimore, Maryland, USA (May 21-22, 1993).
7. Rochkind, M. J. “The Source Code Control System.” IEEE Transactions on
Software Engineering SE-1:4 (December 1975).
8. Tichy, W. F. “Design, Implementation, and Evaluation of a Revision Control
System.” Proceedings of the 6th International Conference on Software
Engineering, ACM, IEEE, IPS, NBS (September 1982).
9. Tichy, W. F. “The String-to-String Correction Problem with Block Moves.”
ACM Transactions on Computer Systems 2:4 (November 1984).
10. Tichy, W. F. “RCS - A System for Version Control.” Software - Practice and
Experience 15:7 (July 1985).