All Databases MacTech Vol 13-1997

Oct 97 Challenge

Volume Number: 13

Issue Number: 10

Column Tag: Programmer's Challenge

Programmer's Challenge

by Bob Boonstra, Westford, MA

Who Owns the Zebra?

I'm finishing up this column while at the beach on vacation. The place we're staying is

kind of interesting, in that all of the condominiums look exactly alike, except that each

has a different color door. Even more interesting, each condominium is occupied by a

person of a different nationality, and each person owns a different kind of pet. There

are three condominiums, with red, green, and blue doors. They are occupied by an

American, a Canadian, and an Australian (but not necessarily in that order). The

American vacationer lives in the house with the red door. The person in the house with

the blue door owns a dog. The person who owns the cat lives in the middle house. And

the house with the green door is immediately to the right of the house with the blue

door. So, who owns the zebra?

Well, I really am on vacation, but pets aren't allowed, the doors are all the same color,

and I don't know the nationalities of my neighbors. But I did run across a zebra puzzle

recently, and it seemed like a good logic problem for the Challenge. In the above

example, you can reason through the four clues to rule out most of the 216 possible

combinations and conclude that the American owns the zebra. Problem complexity

grows rapidly with the number of variables - in a problem with 5 variables, there

are more than 24 thousand million (that's 24 billion for Americans) combinations,

but the zebra can be found with as few as 14 clues.

Your Challenge this month is to write a program that will reason through a set of clues

and provide a solution consistent with all of the clues. The problem will be provided in

a stilted syntax - for example, the sample problem above would be given as follows:

American ISA person

Canadian ISA person

Australian ISA person

redDoor ISA house

greenDoor ISA house

blueDoor ISA house

dog ISA pet

cat ISA pet

zebra ISA pet

person lives_in house

person owns pet

American lives_in redDoor

blueDoor owns dog

cat IS_LOCATED IN_MIDDLE

greenDoor IMMED_RIGHT_OF blueDoor

SOLVE person owns zebra

ANSWER person house pet

The nine ISA relations define the variables (person, house, pet) and the values those

variables assume in the problem. The next two statements define the relations

(lives_in, owns) between selected pairs of variables. The next four statements are the

clues describing relations between values of variables, discussed further below. The

SOLVE statement defines the question that you are to answer, and the ANSWER

statement defines the format that your solution should take.

The prototype of the code you should write is:

void WhoOwnsZebra(

long problemDimension, /* number of problem variables */

long numClues, /* number of clues provided */

CStr255 clues[], /* the clues */

CStr255 solution[] /* storage for problemDimension result

strings */

);

The problemDimension parameter describes the number of variables in the problem

you are to solve (in the example above, problemDimension was 3). The number of

clues provided is given as numClues (17 in the example). The solution is to be

provided as a sequence of problemDimension n-tuples that form a solution to the

problem, where each n-tuple is a sequence of values in the order described by the

ANSWER clue. In the example given above, one solution would be:

Australian blueDoor dog

Canadian greenDoor cat

American redDoor zebra

The clues will consist of a sequence of case-sensitive tokens separated by spaces. The

clues will take one of the following forms, where tokens in all caps are reserved

words:

value ISA variable

variable relation variable

value relation value

value IS_LOCATED [AT_LEFT | IN_MIDDLE | AT_RIGHT]

value [NEXT_TO | IMMED_RIGHT_OF | IMMED_LEFT_OF] value

SOLVE variable relation value

ANSWER variable ... variable

The ISA reserved word is used to define the variables in the problem and associate legal

values with those variables. The relation statement takes two forms, one that defines a

relationship between two variables, and one that associates a value taken by one

variable with a value taken by another. These associations are transitive (e.g., if the

American lives_in the redDoor house, and the person in the redDoor house owns the

zebra, then the American owns the zebra). The relations associate values, and the

specific words used to define a relation have no meaning except to make the problem

more readable. In addition to the relations defined by the problem, there is a

left-to-right ordering of the n-tuples in the solution. The special predefined

NEXT_TO, IMMED_RIGHT_OF, and IMMED_LEFT_OF relations provide information

about the relative left-to-right ordering of values. The predefined IS_LOCATED

relation associates values with three fixed points in the left-to-right ordering:

AT_LEFT | IN_MIDDLE (middle position, meaningful only for odd values of

problemDimension), and AT_RIGHT (rightmost position).

There may be more than one set of n-tuples that solve the problem, so the solution you

report need not be unique, as long as it is consistent with all the clues. Enough clues

will be provided to uniquely answer the question that you are asked to SOLVE, and you

may use this fact in directing your search. The n-tuples provided in the solution

should be provided in left to right order.

There are no memory restrictions on this problem, except that it must run on my

96MB 8500/200. You should deallocate any dynamically allocated memory before

returning. This will be a native PowerPC Challenge, using the latest CodeWarrior

environment. Solutions may be coded in C, C++, or Pascal.

Now, back to the beach to find that zebra....

Three Months Ago Winner

Congratulations once again to Ernst Munter (Kanata, Ontario) for submitting the

fastest solution to the July Disambiguator Challenge. The problem this month was to

implement a partial string matching algorithm similar to that used in Apple's

QuickView utility, with the problem complicated by the addition of wildcards. Eight

people / teams submitted solutions, and five of those worked correctly. Ernst's

winning solution was more than 15% faster than the second place solution by the team

of Peter Lewis and Eric Gundrum, whose solution was in turn 50% faster than the next

solution.

My test cases used two dictionaries of more than 25000 words each, using more than

800 strings to be matched between the two cases. Some of the strings resulted in a

single match, while others generated several dozen, and a few as many as 2500.

Overall, the test cases generated more than 170000 matches.

There are two key elements to the speed of the winning solution. First, Ernst creates a

digest bitmap for each word in the dictionary. The digest indicates whether the word

contains a single occurrence of particular characters, multiple occurrences, or no

occurrences. The second key feature is the aggregation of these digests into pages of up

to 32 words of equal length. The page digests allow groups of words to be eliminated

from detailed consideration if the words do not contain the correct characters. The

very detailed commentary provides more insight into other optimizations and special

cases in the solution.

The table below lists the execution times in seconds for the combined test cases

required by each correct entry. The number in parentheses after the entrant's name is

the total number of Challenge points earned in all Challenges to date prior to this one.

Name Time Code Data Language

Ernst Munter (266) 1.86 4924 496 C++

Peter Lewis (32) 2.21 2788 455 C++

Eric Gundrum (10) 2.21 2788 455 C++

Ludovoc Nicolle (21) 3.36 2948 72 C

Jonathan Kleid 13.92 3876 16 C++

Randy Boring (37) 22.96 4072 242 C

A. F. errors 2428 220 C++

D. L. errors 4772 32 Pascal

D. H. errors 1068 24 C++

Top 20 Contestants

Here are the Top Contestants for the Programmer's Challenge. The numbers below

include points awarded over the 24 most recent contests, including points earned by

this month's entrants.

You might notice that one of the entries this month was a team solution. One previous

winner was from a team, and in that case I gave the full point award to both members

of the team. I've decided that this approach is inappropriate, so I divided the points for

second place this month between the two members of the team. I also retroactively

split the point award for the previous winning team entry.

You might also notice that our Editor-in-Chief participated in this month's Challenge.

Eric participates for the enjoyment. (In fact, during MacHack '97, Peter and Eric

worked on their Disambiguator entry as well as on their hacks for the Hack Contest.)

To avoid any appearance of conflict of interest, the prize for any Challenge that Eric

might win will be awarded to the author of the second-place entry.

Rank Name Points

1. Munter, Ernst 216

2. Gregg, Xan 63

3. Cooper, Greg 54

4. Lengyel, Eric 40

5. Boring, Randy 39

6. Lewis, Peter 37

7. Mallett, Jeff 30

8. Murphy, ACC 30

9. Nicolle, Ludovic 28

10. Larsson, Gustav 27

11. Antoniewicz, Andy 24

12. Picao, Miguel Cruz 21

13. Day, Mark 20

14. Higgins, Charles 20

15. Slezak, Ken 20

16. Studer, Thomas 20

17. Gundrum, Eric 15

18. Hart, Alan 14

19. O'Connor, Turlough 14

20. Karsh, Bill 12

There are three ways to earn points: (1) scoring in the top 5 of any Challenge, (2)

being the first person to find a bug in a published winning solution or, (3) being the

first person to suggest a Challenge that I use. The points you can win are:

1st place ›20 points

2nd place ›10 points

3rd place ›7 points

Referenced by (4):