All Databases MacTech Vol 04-1988

Data Conversion

Volume Number: 4

Issue Number: 5

Column Tag: Basic School

Quick & Dirty Data Conversion

By Dave Kelly, MacTutor Editorial Board

FILE IMPORT/EXPORT with BASIC

Over the years, one of the most beneficial uses for Basic is the ability to quickly

write programs to import/export data to and from data bases. This also provides a

“quick and dirty” way to convert data or suppress unwanted characters. I’d like to

explain/complain about the process and some of the problems that I have encountered.

My first major data conversion came a couple of years ago when I converted

30,000 names in a mail list from my Apple //e to the Macintosh. Yes, this was quite a

feat, mostly because of the bottle neck which the dinky 143K floppy drives allowed on

the Apple //. And you thought the Mac 400K drives were small! Anyway, on the

Apple //e I was using DB MASTER, a now orphaned product, with some small support

still available from a consulting firm in Northern California. There may be some of

you that are familiar with DB MASTER (sometimes it was referred to as DB

MONSTER). DB MASTER was great for mail lists and still would be today except for

it’s slow speed (that’s partly the Apple //s fault).

The process went something like this: First the data needed to be converted from

DB MASTER file format to ASCII so that I could direct connect the Mac and the //e

together. DB MASTER utilities includes an export utility which would convert files to

either DIF or a “One Field per Line” format. The utilities converted my mail list from

my Sider hard disk to about 50 143K floppy disks (UGH!). Next the ASCII needed to be

converted from Apple // ASCII to ”the rest of the world” ASCII format. ASCII EXPRESS

telecommunications software provides a conversion utility for doing this (If they

didn’t I’d have to use my own Basic program to convert it myself. Next each disk was

converted via direct null modem connection to the Mac. Whew!

To make a long story (and many hours) short, the mail list was now on the Mac,

but somehow during the process some control characters were intermixed in the files.

Omnis 3 would not read the files with these control characters so it was BASIC to the

rescue. There are 128 standard ASCII characters of which the first 31 characters are

control characters, traditionally used for teletype terminals. The problem was easily

solved by writing a Basic routine to read in the file, filter out the bad characters and

rewrite the file. Compile the program to speed it up!

Control Character Filter

‘ ©MacTutor, 1988

‘ By Dave Kelly

‘ MSBASIC

x$=FILES$(1,”TEXT”)

IF x$=”” THEN END

OPEN x$ FOR INPUT AS #1

OPEN “Filter Output” FOR OUTPUT AS #2

WHILE NOT EOF(1)

Char$=INPUT$(1,1)

IF ASC(Char$)>=32 THEN PRINT #2,Char$;

IF Char$=CHR$(13) THEN PRINT #2,Char$;

WEND

CLOSE #1

CLOSE #2

END

That’s ok for simple filtering. Things may need to be more complex if the

original file contains unexpected characters in it. Recently, I came across the need to

convert data stored in a FileMaker Plus file in order to do some simple, but unusual

calculations. As drafting is done for test equipment the progress is logged into a

FileMaker Plus file. Each drawing number is entered with its own charge number and

the date is entered indicating the start or completion of stages of the drawing process.

Occasionally the boss (who is always right because he is the boss) wants a drawing

status report according to charge number (there are multiple drawings per charge

number). FileMaker Plus will give the total number of drawings but only after

searching for each of the charge numbers and dates individually. BASIC to the rescue.

The Drawing Count program was written to count progress of each of the charge

numbers. A file is opened for input and another file for output. Each field was

previously exported to a BASIC formatted file and is read by the Basic program in the

same order which it was written. The charge number is saved, but each of the date

fields is replaced by a flag (0 or 1) which indicated if there was anything in the date

field. These 0s and 1s are added together to get the totals after the data is sorted. The

sort routine could have been beefed up but, for a “quick and dirty” program, it does

the job. The program was compiled with the MS Basic compiler which provided

adequate speed results.

If you are wondering why ZBasic wasn’t used here, so am I. Actually, I ran into a

small problem which appears to be a bug with ZBasic 4.01 while reading the

FileMaker Plus exported text files. Although ZBasic’s INPUT# statement is supposed

to read data from the file until a carriage return, , End-Of-File, or 255

characters are encountered, when ZBasic saw a comma followed by a carriage return,

the null between the comma and carriage return was skipped and the next field (from

the next record) was read in. The data looks like this:

“ITEM1”,”ITEM2",”ITEM3",

“ITEM4”,”ITEM5",”ITEM6",”ITEM7

The null field after item3 was skipped and ITEM4 is read in. This is how data is

formatted when dumped in comma delimited format as with FileMaker Plus. Sorry,

Zedcor.

COORDINATE WINDOW

DEF OPEN “TEXTDAVE”

BREAK ON

filename$=FILES$(1,”TEXT”,,vol%)

IF filename$=”” THEN END

OPEN “I”,#1,filename$,,vol%

INPUT #1,A$,B$,C$,D$

PRINT @(1,10)A$,B$,C$,D$

CLOSE 1

Another file conversion problem to deal with involves the conversion of text

which has been dumped from text fields. FileMaker Plus, for example, inserts an end

of line mark at the end of each line when it exports data to a file. When the data is read

into another program which doesn’t use the end of line character to mark the end of

lines (because the text is word wrapped and reformattable). If you read the FileMaker

text field into Double Helix II for example, the end of line character shows up in the

text field as an undefined character (a box). The filter method explained above can find

these characters if you know what character it is.

Fortunately life is much easier for converting data these days than it was even

five years ago. Now I use file formats like DIF and SYLK whenever I can when

converting between programs that support both formats. The transfer goes a lot less

painfully when the format is well defined as the DIF format is. For those not familiar

with DIF, it is called the “Data Interchange Format” and was originally designed to by

Software Arts (are they still around?) to transfer data from Visicalc (the father of all

spreadsheet programs). The file is completely ASCII but has header information added

to help maintain data integrity. (This saved me a few times). This means that the file

will undoubtedly be larger than without DIF formatting so be sure that you only use it

for file transport rather than data storage e specially if your disk free space is critical.

SYLK is Microsoft’s format. Most Microsoft products and a few data base programs

such as Omnis 3 and FileMaker Plus support SYLK format. The best way to figure out

how these work is to analyze some sample data from a program that exports in one of

these formats.

‘ Drawing Count Program

‘ Converts Drawing Log Data created by FileMaker Plus

‘ and computes total number of drawings under each date

‘ By Dave Kelly

‘MS BASIC

DIM Charge$(1000),Date1(1000),Date2(1000),Date3(1000), Date4(1000)

DIM tCharge$(20),tDate1(20),tDate2(20),tDate3(20),tDate4(20)

filename$=FILES$(1,”TEXT”)

IF filename$=”” THEN END

OPEN “I”,1,filename$

OPEN “O”,2,”Drawing Count.DATA”

count=0

CALL MOVETO(10,30)

PRINT “Now Reading in Records...”

WHILE NOT EOF(1)

count=count+1

CALL MOVETO (10,50)

PRINT count

Charge$(count)=””

i$=””

A$=””

C$=””

S$=””

INPUT#1,Charge$(count),i$,A$,C$,S$

IF LEN(Charge$(count))>0 THEN

IF RIGHT$(Charge$(count),1)=”)” THEN

Charge$(count)=MID$(Charge$(count),LEN(Charge$(count))-4,4)

ELSE

Charge$(count)=RIGHT$(Charge$(count),4)

END IF

IF LEN(i$)>0 THEN Date1(count)=1 ELSE Date1(count)=0

IF LEN(A$)>0 THEN Date2(count)=1 ELSE Date2(count)=0

IF LEN(C$)>0 THEN Date3(count)=1 ELSE Date3(count)=0

IF LEN(S$)>0 THEN Date4(count)=1 ELSE Date4(count)=0

WEND

CALL MOVETO(10,70)

PRINT “Now Sorting Data...”

CALL Sort(count,Charge$(),Date1(),Date2(),Date3(),Date4())

k=0

FOR j=1 TO count

IF tCharge$(k)<>Charge$(j) THEN k=k+1

tCharge$(k)=Charge$(j)

tDate1(k)=tDate1(k)+Date1(j)

tDate2(k)=tDate2(k)+Date2(j)

tDate3(k)=tDate3(k)+Date3(j)

tDate4(k)=tDate4(k)+Date4(j)

NEXT j

CALL MOVETO(10,90)

PRINT “Now Writing to File...”

PRINT#2,”Charge# Date1 Date2 Date3 Date4

FOR i=1 TO k

tDate1(i)=tDate1(i)-tDate2(i)

tDate2(i)=tDate2(i)-tDate3(i)

tDate3(i)=tDate3(i)-tDate4(i)

PRINT #2,tCharge$(i),tDate1(i),tDate2(i),tDate3(i),tDate4(i)

PRINT tCharge$(i),tDate1(i),tDate2(i),tDate3(i),tDate4(i)

NEXT i

CLOSE #1

CLOSE #2

PRINT”Click to continue...”

WHILE MOUSE(0)<>1:WEND

END

SUB Sort(count,Charge$(),Date1(),Date2(),Date3(),Date4()) STATIC

flips=1

WHILE flips

flips=0

FOR i=2 TO count

IF Charge$(i-1)> Charge$(i) THEN

flips=1

SWAP Charge$(i-1),Charge$(i)

SWAP Date1(i-1),Date1(i)

SWAP Date2(i-1),Date2(i)

SWAP Date3(i-1),Date3(i)

SWAP Date4(i-1),Date4(i)

END IF

WEND

END SUB

HYPER-CORNER

The new update disk for HyperCard version 1.1 contains a read me stack with

three new buttons pertaining to importing and exporting data. If you don’t have the

new update you can get it from your Apple dealer. Version 1.1 is supposed to fix some

bugs that “bugged” some of us.

The read me stack gives directions for using the new buttons, but I’ll try to

explain them a little bit here. To try them out, create a new stack with a few fields in

it such as the one I created; see fig. 1.

Fig 1. Hyper Dialog

HyperCard Import/Export

The arrow keys were added from the Button Ideas stack to make it easy to check

out the data, but you will need to remember the command 1,2,3,4 keys that move you

from card to card or you may use the arrow (or option-arrow) keys to move from

field to field.

Fig. 2 Export Dialog

The most useful of the buttons is the Export Data button. It allows you to select

the background fields you wish to export and the delimiters separating the fields and

records. Most database programs will read the tab delimited records (with carriage

return between each record). NO, the export button will not export card fields, only

background fields. That is understandable since the output needs to be consistent.

I found that the best way to see how the buttons work is to try them out. Clicking

on the Export Data (or Export Text) button will produce this dialog which will let you

select the background field you desire to output. (See figure 2)

Selecting the Fields for Export

After selecting the All button or after selecting which of the background fields

you want to export, you specify the file name to export to. A dialog comes up to ask you

for the name of the text file to export to:

Fig. 3 Editable Dialog

Selecting a Filename for Export

There is an XCMD around someplace for calling the Standard File Dialog which

would be better to substitute for this dialog. The problem with XCMD routines for

general use is that not everyone has them. It would be best to set up your own custom

library of XCMDs to use in your own stacks.

I ran into no difficulty getting my data to export to a file. You can examine your

data with a word processor to see how the text file looks when the data has been

exported. I like using MS Word for examining this kind of file because you can view all

the tabs and carriage returns using the Show ¶ menu item.

The Import Button works well too, except that you can’t import data into existing

background fields. It creates the background fields on the fly. Perhaps someone out

there could create a button script that will allow you to specify which field you would

like to be the first, second and so on field to be read in. AT LEAST, now these buttons

are included with HyperCard. At first, it may appear that the import didn’t work, but

if you read the instructions in the read me stack, it explains that the new background

fields are created one on top of another. To see the fields you will have to peel them

apart. (see figure below)

Imported Fields

Ideally, the importing will be done first thing before the stack layout is designed

if possible. Importing/Exporting of text with HyperCard has been one of the weak

links. With all the interest in HyperCard these days, there should be more XCMD

routines and tools we can use to improve the usefulness of HyperCard.

Referenced by (4):