Mar 93 Challenge
Volume Number: 9
Issue Number: 3
Column Tag: Programmers’ Challenge
Programmers’ Challenge
By Mike Scanlin, MacTech Magazine Regular Contributing Author
Note: Source code files accompanying article are located on MacTech CD-ROM or
source code disks.
Count Unique Words
Most word processors these days have a Count Words command. The quality in
terms of accuracy and speed of these commands varies quite a bit. I tested three leading
word processors with a document containing 124,829 characters and got three
different answers ranging from 18446 words to 18886 words and times ranging from
4 seconds to 11 seconds. I’m not sure what the correct answer was for that document;
it depends on how you define what a word is.
For purposes of this month’s challenge, a word is defined as an unbroken set of
one or more letters. The input text will only contain upper and lower case letters a to
z, spaces, carriage returns, periods and commas (for a total of 56 possible byte
values). No digits, hyphens, tabs, other punctuation, etc. Since counting words using
this simplified definition is rather trivial, you’re going to count the number of unique
words instead.
The prototype of the function you write is:
unsigned short CountUniqueWords(textPtr, byteCount)
Ptr textPtr;
unsigned short byteCount;
Your function should return the number of unique words (case insensitive) in
the input text. The maximum word length for individual words in the input text is 255
characters.
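As a correctness baseline (not a tuned contest entry), the definition above can be sketched in C as follows. The ANSI-style signature, the use of `const char *` in place of the Toolbox `Ptr` type, the fixed-size hash table, and the helper names `IsLetter` and `ToLower` are all assumptions made for this illustration. It also assumes ASCII, where the letters a to z are contiguous.

```c
#include <string.h>

#define TABLE_SIZE 65536UL  /* power of two; at most 32768 words fit in 65535 bytes */

static int IsLetter(int c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

static int ToLower(int c)
{
    return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

unsigned short CountUniqueWords(const char *textPtr, unsigned short byteCount)
{
    static unsigned short wordStart[TABLE_SIZE];  /* offset of the stored word */
    static unsigned short wordLen[TABLE_SIZE];    /* 0 marks an empty slot */
    unsigned long pos = 0, unique = 0;

    memset(wordLen, 0, sizeof wordLen);
    while (pos < byteCount) {
        unsigned long start, len, slot, k;
        unsigned long hash = 2166136261UL;        /* FNV-1a offset basis */

        /* Skip separators (spaces, returns, periods, commas). */
        while (pos < byteCount && !IsLetter((unsigned char)textPtr[pos]))
            pos++;
        if (pos == byteCount)
            break;
        /* Scan one word, hashing its lower-cased letters. */
        start = pos;
        while (pos < byteCount && IsLetter((unsigned char)textPtr[pos])) {
            hash ^= (unsigned long)ToLower((unsigned char)textPtr[pos]);
            hash *= 16777619UL;
            pos++;
        }
        len = pos - start;
        /* Linear probing: stop at an empty slot or at an equal word. */
        slot = hash & (TABLE_SIZE - 1);
        for (;;) {
            if (wordLen[slot] == 0) {             /* first time we see this word */
                wordStart[slot] = (unsigned short)start;
                wordLen[slot] = (unsigned short)len;
                unique++;
                break;
            }
            if (wordLen[slot] == len) {
                for (k = 0; k < len; k++)
                    if (ToLower((unsigned char)textPtr[wordStart[slot] + k]) !=
                        ToLower((unsigned char)textPtr[start + k]))
                        break;
                if (k == len)
                    break;                        /* word already counted */
            }
            slot = (slot + 1) & (TABLE_SIZE - 1);
        }
    }
    return (unsigned short)unique;
}
```

For the text "The cat. the CAT, a dog" this sketch returns 4 (the, cat, a, dog). It trades memory for simplicity; a competitive entry would likely want a smaller table and a cheaper hash.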
This is the 7th Programmers’ Challenge that I’ve posed to MacTech readers. I
have received very little feedback as to what you think of these
challenges. Are they too easy, too hard, too uninteresting, or what? Do you want hard
core numerical analysis puzzles (like write a fast sqrt function) or do you want
Mac-specific problems (like write a fast TileAndStackWindows function) or are things
okay as they are? If you have any ideas for future challenges, please send them in
(credit will be given in this column if I use one of your ideas). Thanks.
Two Months Ago Winner
The winner of the “Travelling Salesman” challenge is Ronald Nepsund
(Northridge, CA) whose solution was the only one of the five I received which gave
correct results. The time intensive part of solutions to this class of problems is the
distance between two points calculation, which involves a square root. Ronald uses a
precomputed sqrt table for values 0 to 25 to eliminate much of this time.
A couple of people chose the algorithm of “find the closest city to where we
currently are and move to that city; repeat until all cities have been visited,” which
does not always produce the shortest path. An example set of input data that broke
every solution but Ronald’s is:
numCities = 8, startCityIndex = 5, *citiesPtr = {1,1}, {2,1}, {3,1}, {2,2}, {1,3},
{2,3}, {3,3}, {2,4}. If you draw it and work it out by hand (through trial and error)
you can see that the minimum path distance is 8.66. There is more than one correct
ordering for the optimal path but all of the optimal paths will have that same length.
Here is Ronald’s winning solution to the January Challenge:
//***********************************
// Travelling Salesman
// by Ronald M. Nepsund
#include <math.h>
#define fracBase 0x20000000
//There are two 32 by 20 arrays of longs
//which together give the distance between
//any two cities.
//Instead of using Array[i,j] to access
//the array Array[(i<<5)+j] is used
//and two longs are needed to accurately
//measure the distance between cities
//so two arrays of longs are used.
//gDistanceFrac is used to hold the
//fractional part of distance in 1/0x20000000
//of a unit.
long gDistanceInt[640],
     gDistanceFrac[640];
//these are used to represent a path between
//the cities.
Byte gNextCity[20],gOptPath[20];
//how long is the currently selected best
//path so far.
long gBestPathLength,gFracBestPathLength;
unsigned short gNumCities;
unsigned short gStartCityIndex;
//precalculated square root for zero to 25
long qSquTableInt[] =
    {0,1,1,1,2,2,2,2,2,3,3,3,3,3,3,3,
     4,4,4,4,4,4,4,4,4,
     5,5,5,5,5,5,5,5,5,5,5};
//precalculated square root - fractional part
//in 1/0x20000000 of a whole unit
long qSquTableFrac[] =
    {0,0,222379212,393016784,0,126738030,
     241317968,346685095,444758425,0,
     87122155,169986639,249162657,
     325102865,398174277,468679365,0,
     66091829,130266726,192682403,
     253476060,312767944,370664138,
     427258795,482635936,0 };
void DoPath(short cityIndex, long InttPathLength,
            long fracPathLength);
//The recursive routine that actually finds
//the shortest path.
void DoPath(register short cityIndex,
            long InttPathLength,
            long fracPathLength)
{
    register short i;
    Boolean lastCity;
    long offset;
    if (fracPathLength >= fracBase) {
        //the fractional value variable has
        //reached the value of one whole
        //unit
        InttPathLength += 1;
        fracPathLength -= fracBase;
    }
    //Has the path become longer than the
    //shortest path we have already found?
    if (InttPathLength > gBestPathLength ||
        ((InttPathLength == gBestPathLength) &&
         (fracPathLength >= gFracBestPathLength)))
        return;
    //lastCity is used to tell if all the
    //cities have been visited
    lastCity = TRUE;
    //for each city
    for(i = 0; i < gNumCities; i++)
        //if not to same city or already
        //visited city
        if ( i != cityIndex &&
             gNextCity[i] == 0xFF) {
            //not at the end of the path
            lastCity = FALSE;
            //path from city ‘cityIndex’ to ‘i’
            gNextCity[cityIndex] = i;
            //offset into distance arrays
            offset = (cityIndex << 5) + i;
            //go to next city adding the
            //distance to that city to the
            //path length
            DoPath(i,
                   InttPathLength+gDistanceInt[offset],
                   fracPathLength+gDistanceFrac[offset]);
        } //end if and for
    // if this is the last city in the chain and
    // is a shorter path than the previous best
    if ((lastCity) &&
        ((InttPathLength < gBestPathLength) ||
         ((InttPathLength == gBestPathLength) &&
          (fracPathLength < gFracBestPathLength))
        ) ) {
        // make this the current best path
        register long *LPnt1,*LPnt2;
        //this is the current shortest path
        //length now
        gBestPathLength = InttPathLength;
        gFracBestPathLength = fracPathLength;
        //copy path to ‘optPath’
        LPnt1 = (long *)&gNextCity;
        LPnt2 = (long *)&gOptPath;
        for (i= ((3+gNumCities) >> 2); i>0; i--)
            *LPnt2++ = *LPnt1++;
    } else
        //this city is no longer connected to
        //the next city
        gNextCity[cityIndex] = 0xFF;
}
void InitDistances(unsigned short numCities,
                   Point *citiesPtr);
//initialize two arrays which will give the
//distance between any two cities.
void InitDistances(
    unsigned short numCities,
    Point *citiesPtr)
{
    short i,j,offset;
    register long *LPntl1,*LPntF1,
                  *LPntI2,*LPntF2;
    long dist;
    short deltax,deltay;
    double X;
    //The distance from city i to j is the same
    //as from city j to i.
    //Use pointers into the arrays
    //We will add a constant to the pointers to
    //step through the array
    //instead of doing a multiplication to find
    //the wanted entries in the array
    //how far is it between any two cities
    for (i=0; i<numCities; i++) {
        LPntl1 = gDistanceInt + i;
        LPntF1 = gDistanceFrac + i;
        offset = i << 5;
        LPntI2 = gDistanceInt + offset;
        LPntF2 = gDistanceFrac + offset;
        for (j=0; j<=i; j++)
            if (i==j) {
                //both pointers are pointing
                //to the same locations in the array
                //distance to the same city is zero
                *LPntI2++ = 0;
                *LPntl1 = 0; LPntl1 += 32;
                *LPntF2++ = 0;
                LPntF1 += 32;
            } else {
                //calculate horizontal and vertical
                //distance between city ‘i’ and ‘j’
                deltax = citiesPtr[i].h-
                         citiesPtr[j].h;
                deltay = citiesPtr[i].v-
                         citiesPtr[j].v;
                //The distance between the cities is
                // squareRoot( deltax*deltax +
                // deltay*deltay)
                //Where you can, do multiplications
                //using shorts instead of longs -
                //They are faster.
                if (-255< deltax && deltax<256)
                    if (-255< deltay && deltay<256)
                        dist = ((long)(deltax*deltax) +
                                (long)(deltay*deltay));
                    else
                        dist = ((long)(deltax*deltax) +
                                (long)deltay*deltay);
                else
                    if (-255< deltay && deltay<256)
                        dist = ((long)deltax*deltax +
                                (long)(deltay*deltay));
                    else
                        dist = ((long)deltax*deltax +
                                (long)deltay*deltay);
                //do squareRoot
                if (dist <= 25) {
                    //use sqrt lookup tables for
                    //0 to 25
                    *LPntI2++ = *LPntl1 =
                        qSquTableInt[dist];
                    LPntl1 += 32;
                    *LPntF2++ = *LPntF1 =
                        qSquTableFrac[dist];
                    LPntF1 += 32;
                } else {
                    X = sqrt(dist);
                    //gDistanceInt[(i<<5) + j] = X;
                    //gDistanceInt[(j<<5) + i] = X;
                    //integer part of distance
                    // between points
                    dist = X;
                    *LPntl1 = *LPntI2++ = dist;
                    LPntl1 += 32;
                    //gDistanceFrac[(i<<5) + j] =
                    // (X - dist) * fracBase;
                    //gDistanceFrac[(j<<5) + i] =
                    // (X - dist) * fracBase;
                    // fractional part
                    dist = (X - dist) * fracBase;
                    *LPntF2++ = *LPntF1 = dist;
                    LPntF1 += 32;
                }
            }
    }
}
void OptimalPath(unsigned short numCities,
                 unsigned short startCityIndex,
                 Point *citiesPtr,Point *optimalPathPtr);
void OptimalPath(numCities,startCityIndex,citiesPtr,
                 optimalPathPtr)
unsigned short numCities;
unsigned short startCityIndex;
Point *citiesPtr;
Point *optimalPathPtr;
{
    register short i,j;
    long time,index;
    double X;
    //generates the tables for the distances
    //between any two cities.
    //This routine takes up most of the time.
    InitDistances(numCities,citiesPtr);
    //0xFF means that there is no path from
    //this city to another
    for (i=0; i<numCities; i++)
        //no paths between cities
        gNextCity[i] = 0xFF;
    gNumCities = numCities;
    gStartCityIndex = startCityIndex;
    //any path done by DoPath will be shorter
    //than this
    gBestPathLength = 0x7FFFFFFF;
    gFracBestPathLength = 0;
    //find the best path
    DoPath(startCityIndex,0,0);
    //put the best path into the form
    //desired for ‘optimalPath’
    j=startCityIndex;
    for(i=0; i<numCities; i++) {
        optimalPathPtr[i] = citiesPtr[j];
        j = gOptPath[j];
    }
}
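The listing above carries each path length as two longs: whole units plus a fraction counted in 1/0x20000000 of a unit. The following sketch isolates that idea so the carry and the comparison are easier to follow. The `PathLen` struct and the helper names `AddLen` and `ShorterThan` are hypothetical and do not appear in Ronald’s code.

```c
#define FRAC_BASE 0x20000000L   /* fraction units per whole unit */

typedef struct {
    long whole;   /* integer part of the length */
    long frac;    /* fractional part, 0 <= frac < FRAC_BASE */
} PathLen;

/* Add two lengths, carrying into the integer part when the fraction
   reaches one whole unit. */
PathLen AddLen(PathLen a, PathLen b)
{
    PathLen sum;

    sum.whole = a.whole + b.whole;
    sum.frac = a.frac + b.frac;     /* below 2 * FRAC_BASE, so it fits in a long */
    if (sum.frac >= FRAC_BASE) {
        sum.whole += 1;
        sum.frac -= FRAC_BASE;
    }
    return sum;
}

/* Returns nonzero when a is strictly shorter than b. */
int ShorterThan(PathLen a, PathLen b)
{
    return a.whole < b.whole ||
           (a.whole == b.whole && a.frac < b.frac);
}
```

Because one whole unit is 0x20000000 rather than the full range of a long, the sum of two fractions stays below 0x40000000 and cannot overflow a signed 32-bit value before the carry is applied.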
Rules
Here’s how it works: Each month there will be a different programming challenge
presented here. First, you must write some code that solves the challenge. Second, you
must optimize your code (a lot). Then, submit your solution to MacTech Magazine
(formerly MacTutor). A winner will be chosen based on code correctness, speed, size
and elegance (in that order of importance) as well as the postmark of the answer. In
the event of multiple equally desirable solutions, one winner will be chosen at random
(with honorable mention, but no prize, given to the runners up). The prize for the
best solution each month is $50 and a limited edition “The Winner! MacTech Magazine
Programming Challenge” T-shirt (not to be found in stores).
In order to make fair comparisons between solutions, all solutions must be in
ANSI compatible C. All entries will be tested with the FPU and 68020 flags turned off
in THINK C. When timing routines, the latest version of THINK C will be used (with
ANSI Settings plus “Honor ‘register’ first” and “Use Global Optimizer” turned on) so
beware if you optimize for a different C compiler.
The solution and winners for this month’s Programmers’ Challenge will be
published in the issue two months later. All submissions must be received by the 10th
day of the month printed on the front of this issue.
All solutions should be marked “Attn: Programmers’ Challenge Solution” and
sent to Xplain Corporation (the publishers of MacTech Magazine) via “snail mail” or
preferably, e-mail - AppleLink: MT.PROGCHAL, Internet: progchallenge@xplain.com,
and CompuServe: 71552,174. If you send via snail mail, please include a disk with
the solution and all related files (including contact information). See page 2 for
information on “How to Contact Xplain Corporation.”
MacTech Magazine reserves the right to publish any solution entered in the
Programming Challenge of the Month and all entries are the property of MacTech
Magazine upon submission. The submission falls under all the same conventions of an
article submission.