Symbol Tables
Volume Number: 6
Issue Number: 9
Column Tag: Language Translation
By Clifford Story, Mount Prospect, IL
Note: Source code files accompanying article are located on MacTech CD-ROM or
source code disks.
A. Introduction
This month, my series on Language Translation returns to lexical analysis, and I
present the amazing new, improved Canon tool.
Parts of this tool are identical (or nearly so) to code presented in my third,
fourth and fifth installments, and I will not repeat these parts this month (although
they are, of course, included on the code disk).
Specifically, the tool is a filter program; I developed a skeleton filter program in
my third installment. It uses no fewer than six state machines for lexical analysis and
parsing; lexical analysis and state machines were the subject of my fourth part. And it
uses the balanced binary tree routines I developed in my fifth part to implement a
symbol table.
B. What the Tool Should Do
The Canon tool functions as follows: the program reads in a dictionary of
substitutions, then reads input files, performs the substitutions as required, and
writes the result. The difference between this Canon tool and the standard MPW Canon
is that it will not perform substitutions within comments or strings.
The tool is controlled by the MPW command line. It takes several possible
options, which may be in any order.
B(1). The Dictionary File
The dictionary file must be named on the command line, with the “-d
<name>” option. If no dictionary is named, the tool will abort.
The dictionary file’s format is simple: each substitution is specified on a
separate line, with the identifier (according to the language’s definition of identifier)
to be replaced first, followed by its replacement (which must also be an identifier).
For example:
blip blop
specifies that the identifier “blip” should be replaced by the identifier “blop”
wherever it occurs.
There is a second form of substitution, which consists of only one identifier. All
identifiers in the input that match the dictionary identifier will be replaced by the
dictionary identifier. This can be used to force canonical capitalization.
Finally, the dictionary can include line comments. The tool will ignore
everything between a ‘#’ sign and the end of the line. It also ignores blank lines.
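The dictionary rules above can be sketched in C as a single line parser. This is a minimal illustration, not the tool's actual code: the names parse_dict_line, read_ident and MAX_IDENT are hypothetical, the identifier rule is a simplified one (a letter or underscore followed by letters, digits or underscores), and treating a line with three or more identifiers as an error is an assumption.

```c
#include <ctype.h>
#include <string.h>

#define MAX_IDENT 64   /* assumed buffer size, not taken from the article */

/* Simplified identifier rule (an assumption): letter or '_' first,
   then letters, digits or '_'. */
static int is_ident_start(int c) { return isalpha(c) || c == '_'; }
static int is_ident_char(int c)  { return isalnum(c) || c == '_'; }

/* Skip blanks, then copy one identifier from *p into out.
   Returns 1 if an identifier was read, 0 otherwise.  An identifier
   too long for the buffer is rejected, leaving *p inside it so the
   caller sees unparsed text. */
static int read_ident(const char **p, char *out)
{
    const char *s = *p;
    size_t n = 0;

    while (*s == ' ' || *s == '\t')
        s++;
    if (is_ident_start((unsigned char)*s)) {
        while (is_ident_char((unsigned char)*s) && n < MAX_IDENT - 1)
            out[n++] = *s++;
        if (is_ident_char((unsigned char)*s))
            n = 0;                     /* too long: reject */
    }
    out[n] = '\0';
    *p = s;
    return n > 0;
}

/* Parse one dictionary line into from/to (each MAX_IDENT bytes).
   Returns 0 for a blank or comment-only line,
           1 for the one-identifier form (to is set equal to from),
           2 for the two-identifier form,
          -1 for a malformed line. */
int parse_dict_line(const char *line, char *from, char *to)
{
    char buf[256];
    size_t len = strcspn(line, "#\r\n");  /* stop at comment or line end */
    const char *p = buf;
    int count = 0;

    from[0] = to[0] = '\0';
    if (len >= sizeof buf)
        return -1;
    memcpy(buf, line, len);
    buf[len] = '\0';

    if (read_ident(&p, from)) {
        count = 1;
        if (read_ident(&p, to))
            count = 2;
        else
            strcpy(to, from);          /* canonical spelling replaces itself */
    }
    while (*p == ' ' || *p == '\t')
        p++;
    return *p == '\0' ? count : -1;    /* leftover text means a bad line */
}
```

With this sketch, the line "blip blop" yields the pair (blip, blop), while a line holding only "MyName" yields (MyName, MyName), which is what the capitalization-forcing form needs once the symbol table lookup is done without regard to case.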
B(2). The Input Files
Input files may be specified by simply naming them on the command line.
The input files should be either Pascal or C source files. The tool will read them
according to their filename extensions: if the file name ends in “.p”, it will be treated
as a Pascal file, and as a C file if it ends in “.c”.
If there are several input files, some “.p” and some “.c”, the first one named on
the command line controls. If no input file has either a “.p” or a “.c” extension, then
Pascal is the default.
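As a rough sketch of the language selection rule just described, the following C fragment scans the input file names in command-line order. The names Language, has_extension and choose_language are hypothetical. Two points are assumptions rather than anything stated here: "the first one named" is read as the first file that carries a recognized extension, and the comparison is case-sensitive.

```c
#include <string.h>

typedef enum { LANG_PASCAL, LANG_C } Language;

/* Return 1 if name ends with ext (for example ".p"), else 0. */
static int has_extension(const char *name, const char *ext)
{
    size_t n = strlen(name);
    size_t e = strlen(ext);

    return n >= e && strcmp(name + n - e, ext) == 0;
}

/* Pick the language for the whole run: the first file with a ".p" or
   ".c" extension decides; with no such file, Pascal is the default. */
Language choose_language(int nfiles, const char *files[])
{
    int i;

    for (i = 0; i < nfiles; i++) {
        if (has_extension(files[i], ".p"))
            return LANG_PASCAL;
        if (has_extension(files[i], ".c"))
            return LANG_C;
    }
    return LANG_PASCAL;
}
```

Under this reading, the inputs "main.c util.p" are all treated as C, and "notes.txt util.p main.c" are all treated as Pascal.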
If there are no input files named on the command line, the tool will read from