Data Conversion
Volume Number: 4
Issue Number: 5
Column Tag: Basic School
Quick & Dirty Data Conversion
By Dave Kelly, MacTutor Editorial Board
FILE IMPORT/EXPORT with BASIC
Over the years, one of the most beneficial uses for Basic is the ability to quickly
write programs to import/export data to and from data bases. This also provides a
“quick and dirty” way to convert data or suppress unwanted characters. I’d like to
explain/complain about the process and some of the problems that I have encountered.
My first major data conversion came a couple of years ago when I converted
30,000 names in a mail list from my Apple //e to the Macintosh. Yes, this was quite a
feat, mostly because of the bottle neck which the dinky 143K floppy drives allowed on
the Apple //. And you thought the Mac 400K drives were small! Anyway, on the
Apple //e I was using DB MASTER, a now orphaned product, with some small support
still available from a consulting firm in Northern California. There may be some of
you that are familiar with DB MASTER (sometimes it was referred to as DB
MONSTER). DB MASTER was great for mail lists and still would be today except for
it’s slow speed (that’s partly the Apple //s fault).
The process went something like this: First the data needed to be converted from
DB MASTER file format to ASCII so that I could direct connect the Mac and the //e
together. DB MASTER utilities includes an export utility which would convert files to
either DIF or a “One Field per Line” format. The utilities converted my mail list from
my Sider hard disk to about 50 143K floppy disks (UGH!). Next the ASCII needed to be
converted from Apple // ASCII to ”the rest of the world” ASCII format. ASCII EXPRESS
telecommunications software provides a conversion utility for doing this (If they
didn’t I’d have to use my own Basic program to convert it myself. Next each disk was
converted via direct null modem connection to the Mac. Whew!
To make a long story (and many hours) short, the mail list was now on the Mac,
but somehow during the process some control characters were intermixed in the files.
Omnis 3 would not read the files with these control characters so it was BASIC to the
rescue. There are 128 standard ASCII characters of which the first 31 characters are
control characters, traditionally used for teletype terminals. The problem was easily
solved by writing a Basic routine to read in the file, filter out the bad characters and
rewrite the file. Compile the program to speed it up!
Control Character Filter
‘ ©MacTutor, 1988
‘ By Dave Kelly
‘ MSBASIC
x$=FILES$(1,”TEXT”)
IF x$=”” THEN END
OPEN x$ FOR INPUT AS #1
OPEN “Filter Output” FOR OUTPUT AS #2
WHILE NOT EOF(1)
Char$=INPUT$(1,1)
IF ASC(Char$)>=32 THEN PRINT #2,Char$;
IF Char$=CHR$(13) THEN PRINT #2,Char$;
WEND
CLOSE #1
CLOSE #2
END
That’s ok for simple filtering. Things may need to be more complex if the
original file contains unexpected characters in it. Recently, I came across the need to
convert data stored in a FileMaker Plus file in order to do some simple, but unusual
calculations. As drafting is done for test equipment the progress is logged into a
FileMaker Plus file. Each drawing number is entered with its own charge number and
the date is entered indicating the start or completion of stages of the drawing process.
Occasionally the boss (who is always right because he is the boss) wants a drawing
status report according to charge number (there are multiple drawings per charge
number). FileMaker Plus will give the total number of drawings but only after
searching for each of the charge numbers and dates individually. BASIC to the rescue.
The Drawing Count program was written to count progress of each of the charge
numbers. A file is opened for input and another file for output. Each field was
previously exported to a BASIC formatted file and is read by the Basic program in the
same order which it was written. The charge number is saved, but each of the date
fields is replaced by a flag (0 or 1) which indicated if there was anything in the date
field. These 0s and 1s are added together to get the totals after the data is sorted. The
sort routine could have been beefed up but, for a “quick and dirty” program, it does
the job. The program was compiled with the MS Basic compiler which provided
adequate speed results.
If you are wondering why ZBasic wasn’t used here, so am I. Actually, I ran into a
small problem which appears to be a bug with ZBasic 4.01 while reading the
FileMaker Plus exported text files. Although ZBasic’s INPUT# statement is supposed
to read data from the file until a carriage return, , End-Of-File, or 255
characters are encountered, when ZBasic saw a comma followed by a carriage return,
the null between the comma and carriage return was skipped and the next field (from
the next record) was read in. The data looks like this:
“ITEM1”,”ITEM2",”ITEM3",
“ITEM4”,”ITEM5",”ITEM6",”ITEM7
The null field after item3 was skipped and ITEM4 is read in. This is how data is
formatted when dumped in comma delimited format as with FileMaker Plus. Sorry,
Zedcor.
COORDINATE WINDOW
DEF OPEN “TEXTDAVE”
BREAK ON
filename$=FILES$(1,”TEXT”,,vol%)
IF filename$=”” THEN END
OPEN “I”,#1,filename$,,vol%
INPUT #1,A$,B$,C$,D$
PRINT @(1,10)A$,B$,C$,D$
CLOSE 1
Another file conversion problem to deal with involves the conversion of text
which has been dumped from text fields. FileMaker Plus, for example, inserts an end
of line mark at the end of each line when it exports data to a file. When the data is read
into another program which doesn’t use the end of line character to mark the end of
lines (because the text is word wrapped and reformattable). If you read the FileMaker
text field into Double Helix II for example, the end of line character shows up in the
text field as an undefined character (a box). The filter method explained above can find
these characters if you know what character it is.
Fortunately life is much easier for converting data these days than it was even
five years ago. Now I use file formats like DIF and SYLK whenever I can when
converting between programs that support both formats. The transfer goes a lot less
painfully when the format is well defined as the DIF format is. For those not familiar
with DIF, it is called the “Data Interchange Format” and was originally designed to by
Software Arts (are they still around?) to transfer data from Visicalc (the father of all
spreadsheet programs). The file is completely ASCII but has header information added
to help maintain data integrity. (This saved me a few times). This means that the file
will undoubtedly be larger than without DIF formatting so be sure that you only use it
for file transport rather than data storage e specially if your disk free space is critical.
SYLK is Microsoft’s format. Most Microsoft products and a few data base programs
such as Omnis 3 and FileMaker Plus support SYLK format. The best way to figure out
how these work is to analyze some sample data from a program that exports in one of
these formats.
‘ Drawing Count Program
‘ Converts Drawing Log Data created by FileMaker Plus
‘ and computes total number of drawings under each date
‘ By Dave Kelly
‘ ©1988
‘MS BASIC
DIM Charge$(1000),Date1(1000),Date2(1000),Date3(1000), Date4(1000)
DIM tCharge$(20),tDate1(20),tDate2(20),tDate3(20),tDate4(20)
filename$=FILES$(1,”TEXT”)
IF filename$=”” THEN END
OPEN “I”,1,filename$
OPEN “O”,2,”Drawing Count.DATA”
PRINT
count=0
CALL MOVETO(10,30)
PRINT “Now Reading in Records...”
WHILE NOT EOF(1)
count=count+1
CALL MOVETO (10,50)
PRINT count
Charge$(count)=””
i$=””
A$=””
C$=””
S$=””
INPUT#1,Charge$(count),i$,A$,C$,S$
IF LEN(Charge$(count))>0 THEN
IF RIGHT$(Charge$(count),1)=”)” THEN

Charge$(count)=MID$(Charge$(count),LEN(Charge$(count))-4,4)
ELSE
Charge$(count)=RIGHT$(Charge$(count),4)
END IF
END IF
IF LEN(i$)>0 THEN Date1(count)=1 ELSE Date1(count)=0
IF LEN(A$)>0 THEN Date2(count)=1 ELSE Date2(count)=0
IF LEN(C$)>0 THEN Date3(count)=1 ELSE Date3(count)=0
IF LEN(S$)>0 THEN Date4(count)=1 ELSE Date4(count)=0
WEND
CALL MOVETO(10,70)
PRINT “Now Sorting Data...”
CALL Sort(count,Charge$(),Date1(),Date2(),Date3(),Date4())
k=0
FOR j=1 TO count
IF tCharge$(k)<>Charge$(j) THEN k=k+1
tCharge$(k)=Charge$(j)
tDate1(k)=tDate1(k)+Date1(j)
tDate2(k)=tDate2(k)+Date2(j)
tDate3(k)=tDate3(k)+Date3(j)
tDate4(k)=tDate4(k)+Date4(j)
NEXT j
CALL MOVETO(10,90)
PRINT “Now Writing to File...”
PRINT#2,”Charge# Date1 Date2 Date3 Date4
FOR i=1 TO k
tDate1(i)=tDate1(i)-tDate2(i)
tDate2(i)=tDate2(i)-tDate3(i)
tDate3(i)=tDate3(i)-tDate4(i)
PRINT #2,tCharge$(i),tDate1(i),tDate2(i),tDate3(i),tDate4(i)
PRINT tCharge$(i),tDate1(i),tDate2(i),tDate3(i),tDate4(i)
NEXT i
CLOSE #1
CLOSE #2
PRINT”Click to continue...”
WHILE MOUSE(0)<>1:WEND
END
SUB Sort(count,Charge$(),Date1(),Date2(),Date3(),Date4()) STATIC
flips=1
WHILE flips
flips=0
FOR i=2 TO count
IF Charge$(i-1)> Charge$(i) THEN
flips=1
SWAP Charge$(i-1),Charge$(i)
SWAP Date1(i-1),Date1(i)
SWAP Date2(i-1),Date2(i)
SWAP Date3(i-1),Date3(i)
SWAP Date4(i-1),Date4(i)
END IF
NEXT
WEND
END SUB
HYPER-CORNER
The new update disk for HyperCard version 1.1 contains a read me stack with
three new buttons pertaining to importing and exporting data. If you don’t have the
new update you can get it from your Apple dealer. Version 1.1 is supposed to fix some
bugs that “bugged” some of us.
The read me stack gives directions for using the new buttons, but I’ll try to
explain them a little bit here. To try them out, create a new stack with a few fields in
it such as the one I created; see fig. 1.
Fig 1. Hyper Dialog
HyperCard Import/Export
The arrow keys were added from the Button Ideas stack to make it easy to check
out the data, but you will need to remember the command 1,2,3,4 keys that move you
from card to card or you may use the arrow (or option-arrow) keys to move from
field to field.
Fig. 2 Export Dialog
The most useful of the buttons is the Export Data button. It allows you to select
the background fields you wish to export and the delimiters separating the fields and
records. Most database programs will read the tab delimited records (with carriage
return between each record). NO, the export button will not export card fields, only
background fields. That is understandable since the output needs to be consistent.
I found that the best way to see how the buttons work is to try them out. Clicking
on the Export Data (or Export Text) button will produce this dialog which will let you
select the background field you desire to output. (See figure 2)
Selecting the Fields for Export
After selecting the All button or after selecting which of the background fields
you want to export, you specify the file name to export to. A dialog comes up to ask you
for the name of the text file to export to:
Fig. 3 Editable Dialog
Selecting a Filename for Export
There is an XCMD around someplace for calling the Standard File Dialog which
would be better to substitute for this dialog. The problem with XCMD routines for
general use is that not everyone has them. It would be best to set up your own custom
library of XCMDs to use in your own stacks.
I ran into no difficulty getting my data to export to a file. You can examine your
data with a word processor to see how the text file looks when the data has been
exported. I like using MS Word for examining this kind of file because you can view all
the tabs and carriage returns using the Show ¶ menu item.
The Import Button works well too, except that you can’t import data into existing
background fields. It creates the background fields on the fly. Perhaps someone out
there could create a button script that will allow you to specify which field you would
like to be the first, second and so on field to be read in. AT LEAST, now these buttons
are included with HyperCard. At first, it may appear that the import didn’t work, but
if you read the instructions in the read me stack, it explains that the new background
fields are created one on top of another. To see the fields you will have to peel them
apart. (see figure below)
Imported Fields
Ideally, the importing will be done first thing before the stack layout is designed
if possible. Importing/Exporting of text with HyperCard has been one of the weak
links. With all the interest in HyperCard these days, there should be more XCMD
routines and tools we can use to improve the usefulness of HyperCard.