Jun 94 Challenge
Volume Number: 10
Issue Number: 6
Column Tag: Programmers’ Challenge
Programmers’ Challenge
By Mike Scanlin, MacTech Magazine Regular Contributing Author
Note: Source code files accompanying article are located on MacTech CD-ROM or
source code disks.
The rules
Here’s how it works: Each month there will be a different programming challenge
presented here. First, you must write some code that solves the challenge. Second, you
must optimize your code (a lot). Then, submit your solution to MacTech Magazine
(formerly MacTutor). A winner will be chosen based on code correctness, speed, size
and elegance (in that order of importance) as well as the postmark of the answer. In
the event of multiple equally desirable solutions, one winner will be chosen at random
(with honorable mention, but no prize, given to the runners up). The prize for the
best solution each month is $50 and a limited edition “The Winner! MacTech Magazine
Programming Challenge” T-shirt (not to be found in stores).
In order to make fair comparisons between solutions, all solutions must be in
ANSI compatible C (i.e., don’t use Think’s Object extensions). Only pure C code can be
used. Any entries with any assembly in them will be disqualified (except for those
challenges specifically stated to be in assembly). However, you may call any routine
in the Macintosh toolbox you want (i.e., it doesn’t matter if you use NewPtr instead of
malloc). All entries will be tested with the FPU and 68020 flags turned off in THINK C.
When timing routines, the latest version of THINK C will be used (with ANSI Settings
plus “Honor ‘register’ first” and “Use Global Optimizer” turned on) so beware if you
optimize for a different C compiler. All code should be limited to 60 characters wide.
This will aid us in dealing with e-mail gateways and page layout.
The solution and winners for this month’s Programmers’ Challenge will be
published in the issue two months later. All submissions must be received by the 10th
day of the month printed on the front of this issue.
All solutions should be marked “Attn: Programmers’ Challenge Solution” and
sent to Xplain Corporation (the publishers of MacTech Magazine) via “snail mail” or
preferably, e-mail - AppleLink: MT.PROGCHAL, Internet: progchallenge@xplain.com,
CompuServe: 71552,174 and America Online: MT PRGCHAL. If you send via snail
mail, please include a disk with the solution and all related files (including contact
information). See page 2 for information on “How to Contact Xplain Corporation.”
MacTech Magazine reserves the right to publish any solution entered in the
Programming Challenge of the Month. Authors grant MacTech Magazine the
non-exclusive right to publish entries without limitation upon submission of each
entry. Copyrights for the code are retained by the author.
FACTORING
Being able to factor quickly is an important part of breaking secret codes, I
mean, writing cool Mac games. This month’s challenge, therefore, is to factor a 64-bit
number into the two primes that were multiplied together to produce it.
The prototype of the function you write is:
/* 1 */
void Factor64(lowHalf, highHalf,
              prime1Ptr, prime2Ptr)
unsigned long lowHalf;
unsigned long highHalf;
unsigned long *prime1Ptr;
unsigned long *prime2Ptr;
highHalf and lowHalf are the 64-bit input number split into two pieces (bit zero
of lowHalf is bit 0 of the input number and bit 31 of highHalf is bit 63 of the input
number). The input number is guaranteed to be the product of two primes, each of
which is 32 bits or less. Your routine will store one prime at *prime1Ptr and the
other one at *prime2Ptr (in either order).
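To make the interface concrete, here is a minimal reference sketch, not a competitive entry. It assumes a C99 compiler with uint64_t (which the ANSI C required by the rules does not provide), uses uint32_t in place of unsigned long so that each half stays 32 bits wide on current machines, and relies on plain trial division. The name Factor64Naive is hypothetical. At best it is a way to check a faster routine's answers on inputs that have one small factor.

```c
#include <stdint.h>

/* Reference-only sketch: trial division, not contest code. */
void Factor64Naive(uint32_t lowHalf, uint32_t highHalf,
                   uint32_t *prime1Ptr,
                   uint32_t *prime2Ptr)
{
    /* Reassemble the 64-bit input from its two halves. */
    uint64_t n = ((uint64_t)highHalf << 32) | lowHalf;
    uint64_t d;

    if ((n & 1) == 0) {         /* even: one prime is 2 */
        *prime1Ptr = 2;
        *prime2Ptr = (uint32_t)(n >> 1);
        return;
    }
    /* Odd trial divisors. d is capped at 32 bits, so
       d * d cannot overflow 64 bits. */
    for (d = 3; d <= 0xFFFFFFFFu && d * d <= n; d += 2) {
        if (n % d == 0) {
            *prime1Ptr = (uint32_t)d;
            *prime2Ptr = (uint32_t)(n / d);
            return;
        }
    }
    /* Only reached if the input breaks the guarantee. */
    *prime1Ptr = 1;
    *prime2Ptr = (uint32_t)n;
}
```

When both primes are close to 32 bits this loop needs on the order of two billion divisions, which is exactly why the challenge asks for something smarter.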
Remember, solutions must be in C to qualify for entry into the Challenge but
assembly versions might get mentioned if they’re wicked fast. Also, if anyone has a
nice routine for factoring even larger numbers (like, say, 256-bit numbers) into
their prime factors and wouldn’t mind sharing it with MacTech readers, then send it on
in. The best one might get published along with the winning solution.
TWO MONTHS AGO WINNER
The competition for the Swap Blocks challenge was unusually tough. There were
several very high quality entries. Congratulations to Bill Karsh (Chicago, IL) for
winning with the fastest entry. It was only last month that I declared Bob Boonstra
(Westford, MA) the Programmer Challenge Champion for having the most
first place showings, but now he and Bill are tied for that elusive title (with three wins
each). Jorg Brown (San Francisco, CA) deserves praise for his second place showing.
His code size was just over half of Bill’s winning solution and was nearly as fast.
Here are the code sizes and times for two different tests. The first time test was
for random size inputs (according to the distribution stated in the problem). The
second time test was for blocks that were roughly, but not exactly, equal in size
(again, with the given distributions but with both sizes coming from the same size
category). Numbers in parens after a person’s name indicate how many times that
person has finished in the top 5 places of all previous Programmer Challenges, not
including this one:
Name                      time 1   time 2   code size
Bill Karsh (3)               170      219         642
Jorg Brown                   174      242         366
Jim Lloyd                    209      408        1642
Lorn Olsen                   239      350         670
Ted Krovetz                  243      247          88
Stepan Riha (6)              243      347         452
Bob Boonstra (8)             247      443         480
Jeffry Spain                 248      397         234
Greg Landweber (1)           264      491         300
Martin Weiss                 281      601         210
Christopher Suley            299      321         110
Dave Darrah                  299      681         284
Ernst Munter                 315      414         632
Xan Gregg                    340     1260         484
Michael Anderson             359      942         156
Allen Stenger (5)            393      436         156
Michael Panchenko            409      465          82
Danny Stevenson              449      583         424
Eric Bennett                 493     1478         284
Arnold Woodworth             595      729         206
Bob Boonstra (assembly)      212      418         400
The SwapBlocks problem is really a multi-byte rotate problem. Think about it
this way: If you had a 32-bit register and you wanted to swap the low 7 bits with the
upper 25 bits you could just rotate it 7 bit positions to the right. The rotate
instruction is like a SwapBits operation where size1 + size2 always equals 32.
Almost everyone who entered used a variant of this observation. The fifth place
entry by Ted Krovetz (Santa Cruz, CA) illustrates it nicely:
/* 2 */
void SwapBlocks (void *p1, void *p2,
                 void *swapPtr, ulong size1,
                 ulong size2, ulong swapSize)
{
    /* swapPtr and swapSize are not used */
    long  *lp1 = (long *)p1;
    long  *lp2 = (long *)p2;
    ulong s1 = size1 >> 2;   /* block sizes in longs */
    ulong s2 = size2 >> 2;
    ulong count;
    long  temp, *tempp1, *tempp2;

    do {
        if (s1 < s2) {
            /* swap block 1 with the tail of block 2 */
            count = s1;
            tempp1 = lp1;
            s2 -= s1;
            tempp2 = lp2 + s2;
        }
        else {
            /* swap the head of block 1 with block 2 */
            count = s2;
            tempp1 = lp1;
            tempp2 = lp2;
            lp1 += s2;
            s1 -= s2;
        }
        do {
            temp = *tempp1;
            *(tempp1++) = *tempp2;
            *(tempp2++) = temp;
        } while (--count);
    } while (s1);
}
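The register analogy above can be checked directly. The sketch below is an illustration only: RotateRight32 is a hypothetical helper (C has no rotate operator), and uint32_t is assumed from C99's <stdint.h>. Rotating right by 7 moves the low 7 bits to the top and the upper 25 bits to the bottom, which is a swap of two adjacent "blocks" of bits.

```c
#include <stdint.h>

/* Hypothetical helper: rotate x right by n bit positions. */
uint32_t RotateRight32(uint32_t x, unsigned n)
{
    n &= 31;              /* keep the shift count in 0..31 */
    if (n == 0)
        return x;         /* avoid an undefined 32-bit shift */
    return (x >> n) | (x << (32 - n));
}
```

With the low 7 bits set, RotateRight32(0x0000007F, 7) gives 0xFE000000; with the upper 25 bits set, RotateRight32(0xFFFFFF80, 7) gives 0x01FFFFFF. SwapBlocks does the same thing with longs in memory instead of bits in a register.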
Because Bill’s winning solution is so general purpose and macro-ized it is not the
easiest code to read (although I commend his generality in making a useful piece of
reusable and portable code). He has compile-time flags that let you build a large fast
version (over 600 bytes, which was the version timed) or a small slower version
(less than 100 bytes). And you can optionally change the 4 byte alignment assumption
into a 2 byte or 1 byte alignment assumption (by redefining AtomSize).
I used Think C’s preprocessor command to see what all those #defines would boil
down to. The core swap code for those cases where you can’t use the temporary swap
space (because it’s too small) ends up looking like this:
/* 3 */
switch( (short)q ) {
  case 0:
    while( --nS ) {
        q = *pL;
        *pL++ = *pR;
        *pR++ = q;
  case 7:
        q = *pL;
        *pL++ = *pR;
        *pR++ = q;
  case 6:
        q = *pL;
        *pL++ = *pR;
        *pR++ = q;
  case 5:
        q = *pL;
        *pL++ = *pR;
        *pR++ = q;
  case 4:
        q = *pL;
        *pL++ = *pR;
        *pR++ = q;
  case 3:
        q = *pL;
        *pL++ = *pR;
        *pR++ = q;
  case 2:
        q = *pL;
        *pL++ = *pR;
        *pR++ = q;
  case 1:
        q = *pL;
        *pL++ = *pR;
        *pR++ = q;
    } /* end while */
}; /* end switch */
This illustrates some interesting loop unrolling syntax that’s possible in C. As
the code shows, it’s legal to spread a while statement over several case labels in a
switch statement, which nicely solves the problem of “How do you handle the
remainder?” when you unroll a loop 8 times. In this example q is numTimesToSwap mod 8
and nS is the number of passes through the loop body. Because the while test
decrements nS before checking it, nS has to start at numTimesToSwap divided by 8, plus
one. So if numTimesToSwap is 10 then q is 2 and nS is 2. When the switch statement is
executed it will branch to case 2, which does 2 swaps and then loops back to the top of
the while loop. It runs through one set of 8 swaps and then stops. Pretty cool syntax.
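For readers who want to try the construct in isolation, here is a minimal sketch unrolled 4 times instead of 8. It is not taken from Bill's entry: SwapLongs, passes and t are hypothetical names, and the pass counter is set up on the assumption that a pre-decrementing while test needs to start one higher than the number of full passes.

```c
#include <stddef.h>

/* Hypothetical helper: swap n longs between pL and pR. */
void SwapLongs(long *pL, long *pR, size_t n)
{
    /* Full passes of 4, plus 1 because the while test
       decrements before it checks. */
    size_t passes = (n >> 2) + 1;
    long t;

    switch (n & 3) {           /* remainder picks the entry */
    case 0:
        while (--passes) {
            t = *pL; *pL++ = *pR; *pR++ = t;
    case 3:
            t = *pL; *pL++ = *pR; *pR++ = t;
    case 2:
            t = *pL; *pL++ = *pR; *pR++ = t;
    case 1:
            t = *pL; *pL++ = *pR; *pR++ = t;
        }
    }
}
```

With n equal to 10 the switch enters at case 2, does 2 swaps, and then the loop makes two full passes of 4. With n equal to 0 the while test fails immediately and nothing is swapped.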
Here’s Bill’s winning solution:
SwapBlocks
Response to Apr 94 MacTech Programmer's Challenge.
by Bill Karsh
Object: Exchange contents of two adjacent memory blocks.
Redirection: This is an interesting problem, but what would make this guy really
useful? As stated, the blocks for the challenge are 4i bytes long and start on 4j aligned
addresses. These are special circumstances which apply to Memory Manager blocks,
and then, only on 68020 or later CPUs. Memory blocks on the 68000 are merely
even aligned and even length. Further, this could be a word processor tool for swapping
runs of bytes, but we would have to relax the alignment and size restrictions even
further to arbitrary address and length since we would almost always be pointing to
characters interior to a handle.
I have written the routine to give its best performance, subject to a specified
minimum enforced alignment and atom size (smallest unit to move). This is controlled
at compile time by:
/* 4 */
typedef long Atom, for len = 4i, addr = 4j,
typedef short Atom, for len = 2i, addr = 2j,
typedef Byte Atom, for len = any, addr = any.
Note - due to an ancient law of portability, preprocessor directives are not
allowed to compare enums, types, sizeof()s or anything else that has machine
dependency hidden in it. This means you have to #define the AtomSize manually. This is
needed to select the proper performance crossover points for that type.
But wait there’s more... You might not tolerate a 644 byte dedicated word
swapper in your text editor, but a 96 byte one might fit. We handle that.
You can tailor the routine to your requirements for execution speed vs. code size
by setting the JobMode constant according to this table:
JobMode    Buffers   MonsterCopies   MonsterSwaps
Smallest   No        No              No
Small      No        No              Yes
Fast       Yes       No              Yes
Fastest    Yes       Yes             Yes
- billKarsh
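The AtomSize note above comes down to this: #if cannot evaluate sizeof(Atom), so the size has to be restated by hand, and the two can drift apart. Below is a minimal sketch of one way to catch a mismatch at compile time. Atom and AtomSize are the names used in the note; AtomSizeCheck, kCrossoverBytes and its values are hypothetical placeholders and are not taken from Bill's source. The short variant is shown.

```c
typedef short Atom;      /* len = 2i, addr = 2j */
#define AtomSize 2       /* keep equal to sizeof(Atom) */

/* Array size becomes -1, a compile error, on mismatch. */
typedef char AtomSizeCheck[
    (sizeof(Atom) == AtomSize) ? 1 : -1];

/* #if can test AtomSize, but not sizeof(Atom). */
#if AtomSize == 4
#define kCrossoverBytes 64   /* placeholder value */
#elif AtomSize == 2
#define kCrossoverBytes 32   /* placeholder value */
#else
#define kCrossoverBytes 16   /* placeholder value */
#endif
```

Changing the typedef to long without touching the #define would then stop the build instead of silently picking the wrong crossover point.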
/* 5 */