Matrix Expressions
Volume Number: 8
Issue Number: 3
Column Tag: Pascal Workshop
Parsing Matrix Expressions
A parser for evaluating matrix expressions
By Bill Murray, Annandale, Virginia
Note: Source code files accompanying article are located on MacTech CD-ROM or
source code disks.
About the author
Bill Murray is retired from NASA, having worked at the Goddard Space Flight
Center in Greenbelt, MD for 22 years as an applied mathematician. In the early days of
the Apollo program he was involved with orbital calculations and statistical studies.
Later he worked in the area of mathematical modelling of satellite imagery data from
passive microwave radiometers. He has a BA in Mathematics from Duke University
(1954) and a MS in Applied Mathematics from Catholic University (1962). His main
“outside” activity over the past 18 years has been long distance running, (presently
running shorter distances), swimming, and hiking. He also enjoys playing some of the
old favorites on the piano for the elderly.
Introduction
This article describes a parser for evaluating matrix expressions. A parser for
calculating real numbers has been described in [1]. The present discussion extends the
ideas introduced in [1], using a matrix as a token. Since a real number is defined as a
matrix with one row and column, real number calculations can be made with the present
algorithm. Similar to [1], a Pascal function, eval, is calculated which takes as its only
input a str255 variable containing the matrix expression, and returns the value of a
real number, a pointer to a matrix, or an error message. In addition, the Pascal code
for eval can be incorporated in software code to make an existing program more
flexible.
Equally as important as developing the tools for implementing a matrix parser is
finding a way to efficiently allocate space and store a matrix array. A good bit of the
following discussion, therefore, will be devoted to arrays, pointers, handles, and
arrays of handles.
In order to conserve space on the heap, a handle is used to point to a single (as
opposed to double) subscripted array of reals dynamically allocated with the NewHandle
function. Using NewHandle and long integers for array indices, the maximum number of
elements in an array in the type declaration can be made quite large, and space need
only be allocated for the actual number of elements in a particular array. If the matrix
is large, there is the option to store the elements in an external disk file. In addition, in
order to work with large numbers of matrices, arrays of handles are used, with each
handle pointing to either a single subscripted array or an external disk file. Space for a
matrix array, therefore, is allocated only as the need arises.
The value of a matrix parser can be seen in many scientific and engineering
areas, e specially where linear algebra is involved - solving systems of linear
equations, obtaining the solution to a least squares problem, making vector
computations, and a myriad of other applications. Linear algebra is an exciting field in
its own right and working with matrices can be fun.
The Pascal program listed in the appendix was written using THINK Pascal 2.0.
It is a stand-alone interactive program allowing a user to enter and evaluate matrix
expressions at the keyboard. The listing can be obtained by writing the author or Xplain
Corporation (the publishers of MacTutor).
The typical matrix operations of transpose, multiplication, addition, inverse,
etc., are discussed, as well as array operations and using functions for elements of a
matrix. Examples are given and it will be shown how easily the solution to a least
squares problem can be obtained.
Enjoy! Read on!
Matrix
Visually we represent a matrix as a rectangular array of reals [arranged in
rows and columns] which possesses certain properties. An m x n matrix A [m rows and
n columns] will look like the following
where the ijth element (ith row, jth column) is (aij).
The transpose of a matrix is a matrix with the rows and columns interchanged.
The (i,j)th element of the transpose of matrix A is the (j,i)th element of A. We can
write the transpose of A [ or A'] as
A vector is a matrix having one column (or row). A column vector with m rows
can be written
And, V' will be a row vector with m columns. In the following, when we talk
about a vector we will consider it to be a column vector.
We have become so accustomed to this two-dimensional representation that we
tend to impose the structure in our code by double dimensioning matrix arrays. The
computer, however, works with and stores the elements of a matrix sequentially. In the
following section, we will show how valuable RAM can be saved by using single
subscripted arrays.
Representation of a matrix
We represent a matrix with a handle as this allows us to dynamically allocate and
store the elements in a relocatable block of memory on the heap (and not the stack)
using NewHandle. The stack is a good place for allocating a few objects at a time, using
them, and then removing them. The heap, however, is the best place for a data
structure such as a matrix array. Since we don’t know in advance how many matrices
will be required in our computations, it seems logical to use the NewHandle function to
allocate space for our arrays, creating and disposing of relocatable blocks of memory as
needed.
Let us compare the difference between allocating space for a singly subscripted
array and allocating space for a doubly subscripted array using the NewHandle function.
not the stack) using NewHandle. The stack is a good place for allocating a few objects at
a time, using them, and then removing them. The heap, however, is the best place for a
data structure such as a matrix array. Since we don’t know in advance how many
matrices will be required in our computations, it seems logical to use the NewHandle
function to allocate space for our arrays, creating and disposing of relocatable blocks of
memory as needed.
In allocating space for a double subscripted matrix array, the allocation must be
done in blocks equal to the number of columns. For example, let us define the following
variable types,
matrixdouble = array[1..maxrows, 1..maxcols] of extended;
ptrmatrixdouble = ^matrixdouble;
hdlmatrixdouble = ^ptrmatrixdouble;
where maxrows and maxcols are global constants. Then, if ‘matrix’ is an (m x n)
matrix variable of type ‘hdlmatrixdouble’, space [in number of bytes] is dynamically
allocated by the following
blocksize:= m * maxcols * 10;
matrix:=hdlmatdouble(NewHandle(blocksize));
where blocksize is the Pascal type, Size [longint], and is equal to the number of
required bytes, and m and n are long integers. The elements, matrix^^[i,j], are
extended real numbers [10 bytes for an extended real data type].
Now, let us allocate space for a singly dimensioned matrix variable, and define
the following variable types
matrixsingle = array[1..maxelements] of extended;
ptrmatrixsingle = ^ matrixsingle;
hdlmatrixsingle = ^ptrmatrixsingle;
where ‘maxelements’ is a global constant. Then, if ‘matrix’ is an (m x n) matrix
variable of type ‘hdlmatrixsingle’, space is allocated by the following
blocksize:= m * n * 10;
matrix:=hdlmatrixsingle(NewHandle(blocksize));
where m*n <= maxelements.
It can be seen that no matter how large the dimension of the argument in the type
declaration (maxelements), space need only be allocated for the actual number of
elements in a particular array. (Actually, we allocate an additional 20 bytes, since the
number of rows and columns are embedded in the first two elements of the array). The
percent savings using single instead of double subscripted arrays is 100(1-
(n/maxcols)). For maxcols large and n small, the savings can be considerable. If n =
maxcols, there is no savings.
We note that in declaring a variable, we pre-allocate on the stack that amount of
storage that that type of variable demands. If it is an array of extended reals, for
example, we have to allocate 10 times the number of elements in the array (10 bytes
for an extended real). If our variable, however, is a handle to an array of extended
reals, we only allocate 4 bytes of storage on the stack (4 bytes for a handle). It is the
NewHandle function which then allows us to allocate just the right amount of space
needed for a particular array. We must, however, make the transformation from double
subscripting to single subscripting.
Thus, we are making provision to work with large matrices, but at the same time
are not being penalized for working with smaller ones. Also, we are not held to a
prescribed number of rows and columns. The limiting factor is the total number of
elements in an array. Using NewHandle and long integers to index an array, the global
constant ‘maxelements’ can be made quite large. [In Globals, this is set equal to
2,000,000].
Allocating space for a matrix using the NewHandle function also helps us when we
wish to work with a large number of matrices. In this case we define an array of handles
each of which points to either a singly subscripted array of reals or to an external disk
file of reals. We then define a handle to this array of handles. (This may be confusing,
but there’s a method to our madness!). For N handles in an array, only 4*N bytes of
storage need be allocated on the heap [a handle is worth 4 bytes]. If, instead of a handle,
we had used an array of extended reals as an element of the array, the memory
requirements might easily exceed that needed to run the program. Thus, we can store
our matrices in an indexed array, allocating space for each new one as the need arises.
And, we don’t pay a penalty in the process.
Summing up, we conserve on RAM by: (1) representing a matrix with a handle
which points to a single [as opposed to double] subscripted array of reals dynamically
allocated on the heap using the NewHandle function; (2) storing the matrix [if large] in
an external disk file; (3) using an array of handles [to single dimensioned arrays of
reals or to external disk files of reals] in order to work with [large numbers] of
matrices.
Housekeeping
Since the elements of a matrix are stored either in RAM or in an external file,
there are some housekeeping chores associated with each. The following may appear a
bit cumbersome, but it does the job and keeps track of things pretty well.
Associated with each matrix are a name and a pointer [plus other ancillary
information]. The variable, strvar, is a handle to an array of handles, the ith one,
strvar^^[i], pointing to the string [20] variable, strvar^^[i]^^, the name of the ith
matrix. The long integer, i, is a pointer to either the ith handle to an external disk file
of extended reals, matfile^^[i], or to the ith handle to a singly subscripted array of
extended reals, storematrix^^[i], depending upon the value of the boolean variable,
matrixstoredinfile^^[i].
If the number of elements in, strvar^^[i]^^, is greater than or equal to,
‘bignumber’, (input long integer with a default of 100), matrixstoredinfile^^[i] is
set to true, and the elements are stored in, matfile^^[i]^^. If the number is less than
‘bignumber’, matrixstoredinfile^^[i] is set to false, and the elements are stored in,
storematrix^^[i]^^[j], a relocatable block of memory on the heap [j runs from 1 to the
number of elements plus two, with the number of rows and number of columns being
stored in the first two elements of a matrix array].
If the matrix elements are stored in an external disk file, there are two
additional booleans to contend with- mfilenew^^[i], and mfileopen^^[i]. Mfilenew^^[i]
is true if NewHandle has been called to allocate space for matfile^^[i], and,
mfileopen^^[i] is true if matfile^^[i]^^ is open.
If the matrix elements are stored in RAM [on the heap], there is only one boolean
to contend with - matrixnew^^[i]. Matrixnew^^[i] is true if NewHandle has been called
to allocate space for, storematrix^^[i].
It will be seen later [in the discussion of EvaluateNodes] that node matrices are
stored on the heap, as well as matrices [amat, bmat, and cmat] which are used in
binary and function calculations. Once we are through with these matrices, their
handles are disposed of, and RAM space is restored. It is only when we wish to save a
matrix that we have to be concerned with “where it goes” - in RAM or out of RAM.
And that’s all there is to the housekeeping!
Matrix multiplication and code.
Since a number of published algorithms for matrix operations use doubly
subscripted variables for matrix arrays, we will show how the Pascal code for matrix
multiplication can be modified to use singly subscripted arrays, and will give the linear