Random Access Files

Volume Number: 2

Issue Number: 9

Column Tag: Basic School

By Dave Kelly, MacTutor Editorial Board

There are two types of data files that can be created and used by your MS Basic

program: sequential files and random access files. Sequential files are used more often

because they are easy to create, but random access files are more flexible and data can

be located faster. A discussion of sequential file I/O operation begins on page 45 of your

MS Basic manual (ver. 2.0 or greater). Random Access File I/O starts on page 48.

Before we begin our discussion of random access file I/O, I suggest that you refer to

those pages.

The purpose of this column is to help you develop an understanding of random

access I/O and how to use it in your own programs. It is very easy to understand how

data is structured in sequential files. It requires more work to organize a random

access file. The organization of the random access file is up to you. I'll try to outline

some steps you can use to help organize your file.

First, you should decide just what data you have to store. For example, if you

were setting up a mail list database you would need one field each for name, address,

city-state, and zipcode. Next decide how many characters will be allowed for each field

(25 for name, 30 for address, 25 for city-state, and 5 for zipcode). The total length of

an individual record would then be 85 characters.
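
As a rough illustration of this layout, the sketch below opens a random access file with an 85-byte record and maps the four fields onto it. The file name "Mail List" and the variable names N$, A$, C$, and Z$ are made up for this example; use whatever names suit your program.

```basic
' Minimal sketch: an 85-byte mail list record.
' "Mail List" and the field variable names are hypothetical.
RecLen=25+30+25+5                  ' name + address + city-state + zipcode = 85
OPEN "Mail List" AS #1 LEN=RecLen
FIELD #1, 25 AS N$, 30 AS A$, 25 AS C$, 5 AS Z$
PRINT "Record length is";RecLen;"bytes"
CLOSE #1
```

The widths in the FIELD statement must add up to no more than the LEN value given in the OPEN statement.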

Now decide how many individual records you expect to have in the file. If you

don't require too many records and don't expect to ever expand the file, a sequential file

may be suitable. This is especially true if you have a lot of RAM to work with and a

comparatively small data file. There are some advantages and disadvantages to using a

sequential file this way. With a sequential file, all records are read into memory so the

disk is only accessed once. The program can then operate on the data much faster than if

it had to access the disk for each record. However, if the data had been changed at all,

the entire file would have to be stored back to the disk or the changes would be lost. In

the event of a power failure or some other system crash, a random access file would

contain all the changes, but a sequential file would not. Generally as files get larger,

they are better handled by random access methods. A large sequential file could take

quite a bit of time just to read and write to the disk.

Next you should consider how you want to access each record of your random

access file. You may want to be able to search for a name or sort the file by zip code. A

long and tedious way to do this would be to read through each and every record until the

desired record is found. If the user knows exactly which record to read then the access

time may be reduced significantly. One way to do this would be to create an index file.

For example, if you wanted to find a specific record and you know the contents of one of

the fields, you could look in the index file to find the matching field and record number.

For a mail list database you might set up an index file containing all of the names and

the record numbers corresponding to the names. Index files may be sequential or

random access (for relational databases) but should contain as few fields as possible to

optimize data access time. If the index is sequential it should be kept in memory and

updated as the random file is updated.
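
One possible way to use a sequential index held in memory is sketched below. It assumes a hypothetical index file named "Mail Index" in which each entry is a name followed by a record number (written with WRITE # so that names containing commas read back correctly), a limit of 500 entries, and the 85-byte mail list record described earlier. None of these names or sizes come from the sample programs.

```basic
' Sketch only: file names, array sizes, and variable names are assumptions.
DIM IxName$(500), IxRec(500)       ' room for up to 500 index entries
Count%=0
OPEN "Mail Index" FOR INPUT AS #2
WHILE NOT EOF(2)
  Count%=Count%+1
  INPUT #2, IxName$(Count%), IxRec(Count%)
WEND
CLOSE #2

OPEN "Mail List" AS #1 LEN=85
FIELD #1, 25 AS N$, 30 AS A$, 25 AS C$, 5 AS Z$
INPUT "Name to find";Wanted$
Found=0
FOR I%=1 TO Count%                 ' search the index in memory, not the disk
  IF IxName$(I%)=Wanted$ THEN Found=IxRec(I%)
NEXT I%
IF Found>0 THEN GET #1,Found: PRINT N$: PRINT A$: PRINT C$;" ";Z$ ELSE PRINT "Not found"
CLOSE #1
```

Only one disk read is needed for the data record once the index has been loaded. A sorted index would allow a faster search than the simple loop shown here.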

Figure 1

If indexes are used, some thought must be given to updating and changing the

index file. If a record is to be deleted, you might want to delete its index entry, thus

removing any reference to the random access file record. This leaves an available

record for later addition of a new record (if you keep track of which records have been

deleted). If your file isn't expected to change very much you may not mind the wasted

space taken up by the deleted record. Ideally, you should keep track of the locations of

deleted records so that they can be reused when new records are added. Another way to

get rid of the wasted records (if you don't want to go to the trouble of keeping track of

the deleted records) is to write a program to do "Garbage Collection".

Fig. 2 Garbage Collection

A "Garbage Collection" program reads all undeleted records and writes them to a

new file. You only have to do "Garbage Collection" when a lot of records have been

deleted and you need more space to add new records. "Garbage Collection" might be ok to

use if it is automatically performed (with no user intervention). It is NOT desirable

for the user of your program to have to keep track of this kind of file handling (when to

collect garbage and when not to).
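
A "Garbage Collection" pass might look something like the sketch below. How a deleted record is marked is up to your program; this example assumes (purely for illustration) that a deleted record has CHR$(0) in its first byte. The file names and the 85-byte record length are also assumptions carried over from the mail list example.

```basic
' Sketch only: assumes a deleted record starts with CHR$(0).
' File names and record length are hypothetical.
OPEN "Mail List" AS #1 LEN=85
OPEN "Mail List New" AS #2 LEN=85
FIELD #1, 85 AS Old$               ' treat the whole record as one field
FIELD #2, 85 AS New$
LastRec=INT(LOF(1)/85)             ' number of records in the old file
NewRec=0
FOR R=1 TO LastRec
  GET #1,R
  IF LEFT$(Old$,1)<>CHR$(0) THEN NewRec=NewRec+1: LSET New$=Old$: PUT #2,NewRec
NEXT R
CLOSE #1,#2
PRINT NewRec;"records kept out of";LastRec
```

Because the surviving records are renumbered, any index file would have to be rebuilt after a pass like this.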

When a record is added to the datafile, a new index entry should be created and

the new record should be added to the random access file (either as a new record or

replacing a previously deleted record). If an existing record is edited and changed, the

index file should be updated accordingly. You may want to sort the index file before

writing it to the disk. Be sure to save the index file before quitting the program.

Now let's take a look at how the random access file is structured. When you open

a file in Basic, a buffer is allocated for each file opened. For random access files the

buffer should be set equal to the length of one record (the default buffer size is 128

bytes). It is through this buffer that Basic reads and writes to the disk. To help you

understand what a random access file "looks like", let's create a sample file to examine.

The Random Access File program included with this column will create a sample random

access file that we can analyze. It creates a random file named "Sample RA File" with a

record length of 64 bytes. One advantage of MS Basic random access files is that they

require less room on the disk for numeric data, since Basic stores numbers in a packed binary format.

Sequential files store everything, numbers included, as a series of ASCII characters.

To facilitate the conversion of numbers to the packed binary format we must use

the MKI$, MKS$, and MKD$ functions. To convert the strings back to numbers we must use

the CVI, CVS, and CVD functions. These are somewhat easy to remember if you think of the MK

as MaKe and the CV as ConVert. Thus if we want to store an integer number we use

MKI$ to MaKe an Integer string and use CVI to ConVert the Integer back again. The

sample file shows an example of how to use these MaKe and ConVert commands for

integers, single precision and double precision numbers.
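
To see the MaKe and ConVert functions at work without a file, you could try a short experiment like the one below. The variable names are made up for this sketch. It prints the length of each packed string, the two byte values of the packed integer (the utility described later shows them as '0, 5' for the integer 5), and the number recovered by CVI.

```basic
' Sketch: pack numbers into strings and unpack an integer again.
N%=5
P$=MKI$(N%)                        ' MaKe a 2-byte Integer string
PRINT "Lengths:";LEN(P$);LEN(MKS$(32769!));LEN(MKD$(123456789#))
PRINT "Bytes of packed integer:";ASC(LEFT$(P$,1));ASC(RIGHT$(P$,1))
PRINT "Converted back:";CVI(P$)    ' ConVert the string to an Integer
```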

As I already mentioned, when the file is opened a buffer is allocated (in this case

the length of all the fields is 64). The fields that we want to use must be memory

mapped to the buffer area. This is accomplished with the FIELD statement. You may use

as many FIELD statements as you like; however, each FIELD statement starts defining its

fields at the beginning of the buffer. If you define all your fields on one line

(one FIELD statement) then you won't have any problem, but if you have more fields

than you want to put in one statement then you will want to use a second FIELD

statement. The trick (which the manual does not show you how to do) is to define a

dummy variable with the cumulative length of all the previous FIELD statements

before defining your next field. In the sample program the first FIELD statement

defines three number fields with a total of 14 bytes. (Integer fields are converted to 2

bytes, single precision to 4 bytes, and double precision to 8 bytes). In the second

FIELD statement a dummy string is marking the first part of the buffer which has

already been defined so that the next field will begin after the previously defined fields.

If you didn't know to do this you could have some strange effects when you read your file

back as the field definitions would overlap.

The next important thing that the program must do is to put our data into the

buffer so it can be written to a record on the disk. This is accomplished with the LSET

or RSET statements. LSET will left justify the string within the defined field length (a

variable might actually be shorter than the field has available). The RSET statement

will right justify the string within the field. Every field must be set into the buffer

with one of these commands. You should use a different variable in defining the fields

and setting into the buffer than you use to manipulate your data. Be sure that you don't

use a defined field in an INPUT or LET type statement. This will redefine the location

that the variable points to (we want it to point to the buffer area). If a record is read

from the disk, all the fields defined in the buffer area will contain the data stored on

the disk for that record. You only have to reset those fields that you want to change. All

the rest of the fields will be left untouched until you read another record into the

buffer or set a new value into the field.
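
The difference between setting a field and assigning to it can be sketched as follows. The example reuses the file name and field layout of the sample program, with Title$ as the working variable; the new title text is only a placeholder.

```basic
' Sketch: keep working variables separate from FIELD variables.
OPEN "Sample RA File" AS #1 LEN=64
FIELD #1, 14 AS Dummy$, 50 AS T$
Title$="New title"                 ' working variable: LET and INPUT are fine here
LSET T$=Title$                     ' copies the text into the buffer, padded with spaces
' T$="New title"                   ' WRONG: T$ would no longer point into the buffer
PRINT "Field is still";LEN(T$);"characters long"
CLOSE #1
```

Nothing is written to the disk here, because no PUT is executed.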

To store a record to disk use PUT [# ]filenumber [, record-number ]. To read a

record from the disk use GET [# ]filenumber [, record-number ]. The PUT and GET

statements write and read the entire record in the buffer. You use PUT after you use

the MaKe string statements and use ConVert statements after using GET. You can find

more information on PUT and GET in your Basic manual (pages 220 and 146). Run the

sample program to create a random access file we can examine.
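
Once the sample file exists, changing a single field of a record follows a GET, LSET, PUT pattern. The sketch below assumes the sample file and its field layout; it adds one to the stored integer and leaves the other fields as they were read.

```basic
' Sketch: update one field of record 1 in the sample file.
OPEN "Sample RA File" AS #1 LEN=64
FIELD #1,2 AS I$,4 AS S$,8 AS D$,50 AS T$
GET #1,1                           ' read the whole record into the buffer
PRINT "Old integer:";CVI(I$)
LSET I$=MKI$(CVI(I$)+1)            ' change only the integer field
PUT #1,1                           ' write the whole record back
CLOSE #1
```

Note that this changes the contents of "Sample RA File", so rerun the sample program if you want the original values back.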

The second program included with this column is a random access utility that I

developed to analyze the data stored in a random access file. Programs like this have saved

me from a lot of problems in the past. I have been able to repair damaged

random access files and determine what buggy random programs were doing with

utilities like this one.

The utility program opens with a menu which will allow you to open your file.

Choosing open from the File menu brings up the standard getfile dialog box from which

you can choose the file you want to examine. (You should choose "Sample RA File" for

this example). Next, the program asks for the length of the random access file record.

If you wrote the program you should have this available; however, if you don't know

what it is you can guess. The sample file uses 64-byte records, so enter 64 for the length (then

click OK).

The File menu now has an active item named Edit (this

may be confusing: it is NOT the Edit menu). Selecting Edit from the File menu will

bring up a prompt for the record number you want to read. Enter a '1' and click OK to read record

number one (the sample file has only one record). Next the record is read

into the buffer and displayed on the screen. The first EDIT FIELD shown displays the

file as it looks. Note that some of the ASCII characters are invisible and can't be seen

in the EDIT FIELD. The second EDIT FIELD shows the equivalent ASCII representation of

the record. Here invisible characters can be seen (for example, a '0' is a null character).

Either of these two fields can be modified or examined as you like.

The hardest thing to analyze is the numbers which have been converted to strings

with the MaKe statements. To make this somewhat easier (though not foolproof) the

program provides a way to convert your numbers from strings to numbers and

numbers to strings to see how these ConVert/MaKe statements work. The third EDIT

FIELD provides the way to enter the number or string to be converted. For example,

enter a 5 in the field and select MKI$(integer) Convert from the Convert menu. The

integer 5 will be converted to the packed binary format string. Note that the first field

stored by our sample file is '0, 5' which was the two byte string made from the integer

5 (see the sample program if you don't follow this). The converted string has been

placed in the third EDIT FIELD. The characters there are invisible (0 and 5 ASCII do

not print). If you select CVI(string) Convert (2-bytes) from the Convert menu, the

string will be converted back to the integer equivalent and displayed.

The rest is up to you as to what you want to do with the utility. It is possible to

modify data in the random record by typing the change in one of the first two EDIT

FIELDs. Then select the button at the top of the window to write the record. When you

select 'OK', the EDIT FIELD which is active (the EDIT FIELD in which the cursor is

blinking) will be stored in place of the record. It is possible to convert a number in the

Convert EDIT FIELD then COPY the contents of the EDIT FIELD and PASTE it into the text

in the first EDIT FIELD. It may be somewhat difficult to COPY/PASTE invisible

characters (because you can't see them to select them) although it is possible. I

recommend that you display the converted ASCII equivalent and enter the ASCII

characters into the second EDIT FIELD and save the record to the disk.

That's all there is on random access files. Hopefully the utility will help you to

learn some things by experimentation about random access. Any questions may be

directed to me via MacTutor.

' Random Access File

' This program creates a sample Random Access File

Integer%=5: Single!=32769!: Double#=123456789#

Title$="MacTutor, The Macintosh Programming Journal"

OPEN "Sample RA File" AS #1 LEN=64

FIELD #1,2 AS I$,4 AS S$,8 AS D$

FIELD #1,14 AS Dummy$,50 AS T$

TEXTFACE(1)

PRINT "Our Variables are: Integer%=";Integer%;"Single!=";Single!

PRINT "Double#=";Double#

PRINT "Title$=";Title$

TEXTFACE(0)

WRIT: PRINT "We will now save them to record 1 (record length=64)."

LSET I$=MKI$(Integer%)

LSET S$=MKS$(Single!)

LSET D$=MKD$(Double#)

LSET T$=Title$

PUT #1,1

CLOSE #1

PRINT "Now clear all variables... and print them:"

Integer%=0:Single!=0:Double#=0:Title$=""

TEXTFACE(1)

PRINT "Our Variables are: Integer%=";Integer%;"Single!=";Single!

PRINT "Double#=";Double#

PRINT "Title$=";Title$

TEXTFACE(0)

PRINT "Now read them back again..."

OPEN "Sample RA File" AS #1 LEN=64

FIELD #1,2 AS I$,4 AS S$,8 AS D$ , 50 AS T$

GET #1,1

LET Integer%=CVI(I$)

LET Single!=CVS(S$)

LET Double#=CVD(D$)

LET Title$=T$

PRINT "Now close the file and print them all..."

CLOSE #1

TEXTFACE(1)

PRINT "Our Variables are: Integer%=";Integer%;"Single!=";Single!

PRINT"Double#=";Double#

PRINT "Title$=";Title$

TEXTFACE(0)

END

' Professor Mac's Random Access Utility

' By Dave Kelly

OPTION BASE 1

DEFINT a-z

WINDOW 1,"",(2,25)-(510,335),3

GOSUB WindowHeader

Recordnumber=1

MENU 1,0,1,"File"

MENU 1,1,1,"Open"

MENU 1,2,0,"Close"

MENU 1,3,0,"Edit"

MENU 1,4,1,"Quit"

MENU 3,0,0,""

MENU 4,0,0,""

MENU 5,0,0,""

False=0: True= NOT False

Fileopen = False

ON MENU GOSUB MenuEvent

MENU ON

WaitForEvent: GOTO WaitForEvent

MenuEvent:

MenuNumber = MENU(0)

MenuItem = MENU(1):MENU

ON MenuNumber GOSUB Filemenu,Editmenu,Convertmenu

RETURN

Filemenu:

ON MenuItem GOSUB OpenFile,CloseFile,FindRecord,Quititem

RETURN
