Linear Equations
Volume Number: 1
Issue Number: 11
Column Tag: Forth Forum
Solving Systems of Linear Equations
By Jörg Langowski, Chemical Engineer, Fed. Rep. of Germany, MacTutor
Editorial Board
This is the first of a series of columns that will deal with the general problem of
doing numerical calculations in MacForth. Forth's philosophy is to use integer
arithmetic in many cases that would be handled with floating point in other languages.
The reason for this has to be seen historically in the development of Forth, which first
was used almost exclusively as a language to do process control. It was desirable not to
have the ballast of a floating point package in implementations that used 8-bit
processors with a limited amount of memory, and there is, of course, a great speed
advantage in using integer arithmetic.
When used in 'custom-designed' routines for one particular problem, integer
arithmetic can do as well as floating point. However, one has to scale all the numbers
involved so that they fit into the range that is given by the 4 bytes of the Mac's integer
arithmetic (or the 2 bytes of some other system). On the other hand, numbers
shouldn't get too small, either, because accuracy is lost very quickly. The constant need
to haul scaling factors around between parts of the program then makes the code rather
hard to read and bug-prone.
Again this is the old tradeoff between speed and low memory requirement on one
side and flexibility and readability on the other. If we want to write a set of
mathematical routines in Forth that will be useful no matter what the particular
problem is (whether distances are in nanometers or lightyears, weights in tons or
micrograms) the easiest way to do this is to use floating point arithmetic. This is
e specially true on the Macintosh, since we have an excellent floating point package
with 80-bit accuracy built in.
This FP package, also called SANE (Standard Apple Numeric Environment)
conforms to the proposed IEEE standard on floating point arithmetic (see my article in
MacTutor V1#1). MacForth 2.0 offers Forth code for a very slick interface to the
SANE routines, using its own floating point stack and even modifying the interpreter so
that real numbers are accepted as input. There are two problems with this code,
though: First, we cannot print it here for obvious reasons and therefore our program
would run only under MacForth 2.0, which would be a little too restricted for this
Forum. Second, according to my own tests the floating point interface adds so much
overhead that actual calculations are slowed down by a factor of 2 to 3.
The code that we write here uses a more 'direct' approach to FP arithmetic,
which is variable-oriented rather than stack-oriented (see V1#1). It looks a little
more clumsy and is definitely harder to read, but since we want to generate a set of
Forth words for general use which should be fast more than anything, this is justified.
Definition of the problem - fitting experimental data to a theoretical
equation
Enough of the preliminaries, I should tell you now what exactly we want to do.
One of the bread-and-butter problems in experimental science is to extract
theoretical parameters from a set of experimental data points, given a theoretical
equation that can predict those data points from the parameters.
Example: You measure the time response of a physical system, for instance the
voltage across a capacitor C as it is discharged through a resistor R. The time behavior
of the voltage versus time looks like:
U(t) = Uo exp(-t/RC)
or, if the voltage does not drop all the way down to zero (e.g. some bias applied),
U(t) = Uo exp(-t/RC) + U1 .
In practice, we may have measured a series of points Ui at times ti. Our problem
is to get Uo, U1 and RC from that data. Fig. 1 shows how the data and the 'exact'
theoretical curve might look like.
Fig. 1: Fitting a theoretical curve to experimental data
Of course, for all U(t) curves with different U0, U1 and RC, there is only one
that fits the data points best. The quality of the fit is usually checked by summing the
squared differences (the 'residuals') between the data points and the theoretical curve.
We have to vary the parameters U0, U1 and RC in such a way that this sum-of-squares
becomes a minimum.
Iterative least-squares fitting
Let's state the problem in a more general way. We have a function y =
f(t,a1,a2,a3...am) that, given certain values for the parameters a1,a2,a3...am, tells
us the time dependence of some quantity y that can be measured. Furthermore we have
a set of n data points (ti,yi), the y-values that are actually measured at times ti. The
residual for data point i is then
There exists a variety of techniques that one can use to minimize the sum of the
squared residuals in such a case. All of them require that one first estimates initial
values for the parameters that are not too far away from reality; this is usually
possible. From these initial values one can then compute a better estimate of the
parameters, and iterate this process until the fit does not improve anymore.
One rather simple algorithm that solves the fitting problem in such an interative
way is given by T.R.McCalla in his book 'Introduction to Numerical Methods and
FORTRAN Programming' (Wiley & Sons, New York 1967). I won't give the details
here, since we are mainly interested in how to program such an algorithm in Forth.
The only thing we need to know is the final result: a set of linear equations whose
solution gives correction terms ∂ak. These ∂ak have to be added to the initial ak to get
the new estimate.
The linear equations that one gets look like this:
Fig.2: System of linear equations
where the cij are coefficients that one calculates from the set of n data points (ti,yi)
and the derivatives of the function f(ti, a1,a2,a3...am ) at each data point with respect
to the parameters ak. The bi contain the residuals.
So the first problem that we have to solve - and this will be plenty for this
column - is to solve a system of linear equations like the one given above. In later
columns we will build on the basics of floating-point arithmetic that we develop here
and end up with a functional curve-fitting program.
The Gauss Algorithm
A linear equation system like the one above is often solved using the Gauss
algorithm. One starts writing the coefficients on the left and right hand sides of the
equations as a m*m+1 matrix:
(3 by 4 in this example).
The algorithm then converts this matrix into a triangular matrix:
where the bottom left 'triangle' is equal to zero: multiples of the first row are
subtracted from the rows below it until the first column is all zeroes except for the
first row, then multiples of the second row are subtracted from the rows below it until
the second column is all zeroes except for the first two rows, and so on.
After that procedure is completed, the bottom row has become a simple equation
of one variable:
from which ∂a3 can easily be calculated. ∂a3 is then substituted into the equation above
it and ∂a2 obtained, and from ∂a3 and ∂a2 finally ∂a1. This procedure can, of course, be
expanded to be used on any number of equations.
The Gauss algorithm is given as a Pascal program (to improve readability) in
Listing 1. To code it in Forth we first have to give the problem of data representation a
little thought, namely: how are we going to store a matrix?
Data representation for arrays of floating point numbers
The SANE routines work on 80-bit numbers. This is ideal for accurate
calculations, but a little expensive as far as storage goes; a 100 * 100 matrix would
already occupy 80K. If high precision is not needed, large arrays may be stored as
lower precision FP numbers. Single precision uses only 32 bits, less than half of the
standard SANE length. Therefore we are going to store matrices as two-dimensional
arrays of 32-bit long words that contain single precision real numbers. The MATRIX
definition (in the example program in listing 2) is modified from the example released
by Creative Solution on the Forth disk. We have separated the DOES> part that
calculates the address of a matrix element from its indices and defined it as a separate
word, CALC.OFFSET. This was done so that our routine works with any matrix variable
whose address is passed on the stack.
You define a matrix with r rows and c columns by
r c MATRIX X .
When you later execute
i j X ,
the address of the element in row i and column j of matrix x will be on the stack.
When you execute 0 0 X (all rows and columns start with 0), the address of the first
element in the matrix will be on the stack. If we want to write a Gauss algorithm
routine that works with any matrix of any size, we have to be able to calculate the
offset into the matrix from the row and column indices just as the DOES> part of the
MATRIX defining word does. In our definitions, i j addr CALC.OFFSET leaves on the
stack the address of the element at row i and column j of the matrix whose first
element is at addr.
The solution of the linear equation system will be stored in an array z. For this
array we do not need a DOES> part because it is one-dimensional, no need to keep track
of row and column lengths here.
Strategy for floating point calculations using the SANE package
The SANE routines expect addresses of floating point numbers on the stack as
their parameters (see V1#1). All arithmetic operators are two-address operators,
where the first parameter is added to, subtracted from, divided or multiplied into the
second parameter. The second parameter is always 80-bit extended precision, while
the first one may be any precision. So for any calculation we will transfer the
numbers out of the 32-bit variables into 80-bit variables (or add them in etc., if it is
convenient), then do all intermediate calculations in 80-bit precision and at the end
store the 80-bit result into a 32-bit single precision variable again.
The Gauss Algorithm Routine
Listing 2 shows the example program containing the GAUSS routine for solution
of linear equation systems of any size. The routine expects on the stack, from bottom to
top: the address of a solution vector z, which for n unknowns has n 32-bit words
allocated; the address of the n (rows) by n+1 (columns) matrix X that contains the
coefficients of the linear equation system; and n, the number of equations (or
unknowns, respectively). The routine first converts the X matrix into its triangular
form (so X is changed upon exit), then proceeds to calculate the values of the
unknowns, starting in the bottom row of the matrix and working its way up.
The K function: extracting the loop index 2 levels up
The first part of the algorithm has DO..LOOP constructs nested 3 levels deep. The
inner loop needs the outermost loop index, and there is no standard word in MacForth
that handles this. Therefore we define : k rp@ 20 + @ ; which does this job. (There is
also a k defined in machine code; see V1#9).
The example program
Our example calculates the solution of the system of equations
The solution is x1 = 1.2308, x2 = -1.0769, x3 = -0.1538. The word gbm
calculates and prints this solution (it actually calculates n times, with n on top of the
stack, for benchmark purposes).
Listing 1: Gaussian algorithm - Pascal example
program LinEqu;
type matrix = array[1..10, 1..11] of real;
column = array[1..10] of real;
var x : matrix; z : column; n, i : integer;
procedure gaussalg (var x : matrix;
var z : column; n : integer);
var dg, fk, ee : real; i, j, k : integer;
begin
for i := 1 to n - 1 do
begin dg := x[i, i];
for j := i + 1 to n do
begin fk := x[j, i] / dg;
for k := i to n + 1 do
x[j, k] := x[j, k] - fk * x[i, k]
end
end;
for i := 1 to n do z[i] := x[i, n + 1];
for i := n downto 2 do
begin dg := x[i, i]; ee := z[i];
for j := i - 1 downto 1 do
z[j] := z[j] - ee * x[j, i] / dg
end;
for i := 1 to n do z[i] := z[i] / x[i, i]
end;
begin n := 3;
x[1, 1] := 1; x[1, 2] := 1; x[1, 3] := 1; x[1, 4] := 0;
x[2, 1] := 1; x[2, 2] := -1; x[2, 3] := 2; x[2, 4] := 2;
x[3, 1] := 4; x[3, 2] := 1; x[3, 3] := -1; x[3, 4] := 4;
gaussalg(x, z, n);
for i := 1 to 3 do writeln('z[', i : 1, ']= ', z[i] : 7 : 4)
end.
Listing 2: Gaussian algorithm, FORTH example
( Floating point primitives )
( This is part of the SANE interface given in MT V1#1; not all of it
is needed here)
hex a9eb w>mt fp68k ( package 4 )
a9ec w>mt elems68k ( package 5 )
( extended precision operations )
: f+ 0 fp68k ; : f- 2 fp68k ; : f* 4 fp68k ; : f/ 6 fp68k ;
: x2x e fp68k ; : fneg d fp68k ;
( single to extended operations )
: s+ 1000 fp68k ; : s- 1002 fp68k ; : s2x 100e fp68k ;
: s* 1004 fp68k ; : s/ 1006 fp68k ; : x2s 1010 fp68k ;
( long integer to extended operations )
: in+ 2800 fp68k ; : in- 2802 fp68k ;
: in2x 280e fp68k ; : in* 2804 fp68k ;
: in/ 2806 fp68k ; : x2in 2810 fp68k ;
: d2b 9 fp68k ; : b2d b fp68k ;
( decimal <--> binary conversions )
: float create 10 allot ; : integer create 4 allot ;
: wvar create 2 allot ; ( type declarations )
( floating point i/o )
decimal
: numstring create 24 allot ; ( decimal display string )
hex 1000000 constant fixdec decimal
( format style control )
variable zzformat
( internal format for conversion routine )
numstring zzs1 ( internal conversion string )
: dec. ( float\format# -- )
zzformat ! zzformat swap zzs1 b2d
zzs1 dup w@ 255 > if ." -" else ." " then
dup 4+ count type ( mantissa )
2+ w@ ( get exponent )
1 w* ( convert to 32 bit integer )
." E" . ;
( floating point initialization )
: fclear 0 over ! 0 over 4+ ! 0 over 8+ w! drop ;
: sclear 0 swap ! ;
( Matrix Operators )
: calc.offset ( row\col\addr -- addr )
dup>r 4+ @ ( #cols) 4* ( 32-bit )
rot * ( offset to row) swap 4* ( 32-bit )
+ ( offset to element ) r> 8+ + ( add base addr) ;
: matrix ( #rows\#cols -- )
create over , ( #rows ) dup , ( #cols )
* 4* allot ( allot the space for the matrix )
does> calc.offset ;
( Gauss algorithm for linear equations, definitions)
: k rp@ 20 + @ ;
variable nv variable coeff variable solution
( addresses for storing actual parameters)
float one float -one float zero float two float four
1 one ! -1 -one ! 0 zero ! 2 two ! 4 four !
one one in2x two two in2x -one -one in2x
zero zero in2x four four in2x
float fa1 float fa2 float fa3 float fa4
( define some floating accumulators)
float dg float fk float ee
create z 12 allot 3 4 matrix x
: ztest
3 0 do i 4* solution @ + fa1 s2x fa1 5 dec. loop cr ;
( setup coefficient matrix for example)
one 0 0 x x2s one 0 1 x x2s one 0 2 x x2s
zero 0 3 x
x2s
one 1 0 x x2s -one 1 1 x x2s two 1 2 x x2s
two 1 3 x
x2s
four 2 0 x x2s one 2 1 x x2s -one 2 2 x x2s
four 2 3 x
x2s
( Gauss algorithm for linear equations)
: gauss ( z\x\n | --) nv ! 8- coeff ! solution !
nv @ 1- 0 do ( i-loop)
i dup coeff @ calc.offset dg s2x ( diag elem)
nv @ i 1+ do ( j-loop)
i j coeff @ calc.offset fk s2x dg fk f/
nv @ 1+ j do ( k-loop)
k i coeff @ calc.offset fa1 s2x
fk fa1 f* fa1 fneg ( -fk*x[i,k])
j i coeff @ calc.offset dup fa1 s+
fa1 swap x2s
loop
loop
loop
nv @ dup 0 do i over coeff @ calc.offset fa1 s2x
fa1 solution @ i 4* + x2s loop
1 nv @ 1- do
i dup coeff @ calc.offset dg s2x
solution @ i 4* + ee s2x dg ee f/
0 i 1- do i j coeff @ calc.offset fa1 s2x
ee fa1 f* fa1 fneg
solution @ i 4* + dup fa1 s+ fa1 swap x2s
-1 +loop
-1 +loop
nv @ 0 do solution @ i 4* + fa1 s2x
i dup coeff @ calc.offset fa1 s/
fa1 solution @ i 4* + x2s
loop ;
: soln ." The solution is: " ztest ;
: gbm 0 do z 0 0 x 3 gauss loop soln ;