December 94 - Newton Q & A: Ask the Llama
Newton Developer Technical Support
Q I could really use some help with speeding up my Newton application. Have you got
any tips on performance?
A You're not the only one who wants this; my llama senses have recently been
overwhelmed by a call for information on performance. All the questions in this
issue's column will relate to performance in some way. Take a look and see if there's
something here that will help you.
There are two important points to remember:
1. None of these tips will work by themselves; you must measure your
code. Use Ticks, use the trace global (see below), use Print. Find out where
your code is slow, or where your application is bloated.
2. There is no silver bullet for a problem; you must experiment with
different solutions.
In the words of my wise programming master: "When is a llama not a llama? . . .
When it is a guanaco." Or, "When you can snatch these coconuts from my hand, then it
will be time for me to leave."
Q I'm building an application that has a large set of static data. I search on a key term
(a string) and get all the data associated with that string. Mike Engber's "Lost In
Space" article (in the May 1994 issue of PIE Developers magazine) says that I should
include this data in my package and things will be fast. But this doesn't seem to be the
case. I have thousands of frames of data. Each frame contains one or more slots with
strings that contain the key terms. I use FindStringInFrame to find all references to a
key term but this takes a long time. Am I doing something wrong?
A This may seem like a simple question, but it isn't. The root of the problem is that
you've made an assumption that functions provided in the ROM are fast, so they'll solve
your problem. In this case, you assumed that FindStringInFrame would be fast. You're
both right and wrong.
FindStringInFrame is fast, but it still has to linearly search every slot in every
frame recursively. That means that if you have thousands of entries, it's checking
thousands of frames. You can talk about how long something will take by calculating the
worst case. FindStringInFrame has to search all your data frames (thousands of
items), and for each frame it has to check each slot to see if it's a string. If so, it then
has to check to see if the string you gave it matches the string it's looking for (step by
step down the string). So if you had n strings (not just data items), and the average
length of a string was m characters, that's n*m checks. In computer science terms,
you would say that FindStringInFrame is an O(n*m) operation; this is called Big-Oh
notation and, in its simplest form, refers to the worst-case time.
This means you should think about other data structures and methods of accessing
them. In your case, a simple change of data representation would result in a massive
speedup. The idea is to make the expression in the Big-Oh notation have the smallest
possible value. One way to do this is to reduce the search time for your key phrases.
Since you have a fixed set of data, you can sort them and use a binary search algorithm.
You can store the actual data in arrays and store indexes along with the key items.
The nice thing about a binary search is that you're always cutting your search space
in half. At worst, you have to check only about log to the base 2 of the number of
items. In Big-Oh notation, that's O(log n). Of course you still have to do the
individual string comparisons, so you end up with O(m log n). So for 1000 strings
averaging 1000 characters each (n = m = 1000), FindStringInFrame takes on the order
of 1,000,000 time units, but the modified method takes about 10,000 (log to the base 2
of 1000 is roughly 10), a speedup of about 100 times! It's unlikely that a function
implemented at a low level performs 100 times faster than custom NewtonScript code.
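As a rough sketch of the binary search idea (this is not code from the CD), the function below assumes a hypothetical array, sortedEntries, that was sorted by key at build time in the same order StrCompare uses. Each entry is assumed to be a frame with a key slot holding the search string and an index slot pointing into your data arrays. The names FindEntry, sortedEntries, key, and index are illustrative only.

```newtonscript
// Hypothetical helper: returns the entry whose key matches target, or nil.
// Assumes sortedEntries is sorted by key in StrCompare order.
FindEntry := func(sortedEntries, target)
begin
    local low := 0;
    local high := Length(sortedEntries) - 1;
    local mid, order;
    while low <= high do
        begin
            // look at the middle of the remaining range
            mid := (low + high) div 2;
            order := StrCompare(target, sortedEntries[mid].key);
            if order = 0 then
                return sortedEntries[mid];
            // discard the half that cannot contain target
            if order < 0 then
                high := mid - 1
            else
                low := mid + 1;
        end;
    nil;
end;
```

Each pass through the loop halves the remaining range, so 1000 entries need at most about 10 string comparisons. A caller would then use the index slot of the returned entry to fetch the actual data.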
This excursion into computer science should make you think about your data
structures and how you access them. Of course an academic exercise can take you only
so far. You also have to get your feet wet and test the code. You can use Ticks to get
rough estimates of time, and Stats (after a GC) to get estimates of memory.
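A minimal sketch of that kind of measurement, suitable for the Inspector, might look like the following. TimeIt is a hypothetical helper, not a ROM function; it relies only on Ticks (which counts in sixtieths of a second), Apply, GC, and Stats.

```newtonscript
// Hypothetical helper: calls fn with the argument array args, count times,
// and returns the elapsed time in ticks.
TimeIt := func(fn, args, count)
begin
    local i;
    local start := Ticks();
    for i := 1 to count do
        Apply(fn, args);
    Ticks() - start;
end;

// Example use from the Inspector: time 100 calls of an empty function
Print(call TimeIt with (func() nil, [], 100));

// Rough memory estimate: collect garbage first, then report free heap
GC();
Stats();
```

Timing an empty function first gives you the overhead of the loop itself, which you can subtract from later measurements. Repeating the call many times helps because a single tick is a coarse unit.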
Q The following is a viewClickScript from a pickList button in my application. Why
does it take so long to execute?
viewClickScript.func(unit)
begin
    currentPickItems := [];
    for i := 0 to Length(defaultPickItems) - 1 do
        if i = currentSelectedItem then
            AddArraySlot(currentPickItems,
                {item: defaultPickItems[i], mark: kCheckMarkChar});
        else
            AddArraySlot(currentPickItems, defaultPickItems[i]);
    if :TrackHilite(unit) then
        DoPopUp(currentPickItems, :LocalBox().right+3,
            :LocalBox().top, self);
end
A There are several possible reasons why your code would execute slowly. Since they
potentially apply to lots of code out there, I'll go through each one separately. At the
end is a rewritten function that should execute considerably faster.
• Lookup costs. Assuming that currentPickItems, currentSelectedItem, and
defaultPickItems are slots somewhere in your view hierarchy, at best they're
slots in the pick button, at worst they're in your base application view.
Remember that each access to a variable requires an inheritance lookup: check
locals, then globals, then current context, then the _proto chain, then the
_parent chain. This cost isn't high for single references but can be deadly in
loops. Every cycle through your loop, you're doing three lookups; that's a lot
of overhead. The solution is to use local variables for faster access.
• Unnecessary object creation. The AddArraySlot call will grow, and
potentially copy, the array on the NewtonScript heap, resulting in a lot of
unnecessary memory movement. Since you know the length of the
currentPickItems array in advance, you should preallocate the array and use
the array accessor (that is, [n]) to add array elements. You can use the Array
function call to allocate the array:
local pickItems := Array(Length(defaultPickItems), nil);
• Unnecessary execution. You need to create a new pick list only if the call
to TrackHilite succeeds. You should make the TrackHilite conditional be the
outer conditional:
if :TrackHilite(unit) then
    begin
        // construct pick list and DoPopUp
        ...
    end;
• Inefficient variable initialization. It's inefficient to use a loop for
initializing currentPickItems from defaultPickItems, because
currentPickItems has only minor differences. It's better to use Clone for
initialization. This way you get a new array whose elements are references
back to the array items in defaultPickItems. All you need to do is replace the
individual references in currentPickItems with their new or modified values.
In interpreted code, it's the difference between an O(n) operation (looping over
all the array items in defaultPickItems) and an O(1) operation (touching only the
changed item); Clone still copies every reference, but it does so in a single ROM
call. In other words, expect about an order of magnitude difference.
• Unnecessary slot. In this case you don't need to have a currentPickItems
slot since its value is recreated each time the viewClickScript is executed.
You're better off using a local variable.
The modified code is shown below. To illustrate the savings, I ran a brief test using a
defaultPickItems array of ten elements, calling each function 100 times (note that
TrackHilite was always true). I found the following code to be over six times faster
than the original code.
viewClickScript.func(unit)
begin
    if :TrackHilite(unit) then
        begin
            local pickItems := Clone(defaultPickItems);
            local selectedItem := currentSelectedItem;
            local l := :LocalBox();
            if selectedItem then
                pickItems[selectedItem] :=
                    {item: pickItems[selectedItem], mark: kCheckMarkChar};
            DoPopUp(pickItems, l.right+3, l.top, self);
        end;
end
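Clone fits this case because only one element changes. If most of the elements had to be rebuilt, the preallocation and local-variable tips would matter more. The variant below is only a sketch of those two tips applied to the same script; it was not part of the timing test, and the local names defaults and count are illustrative.

```newtonscript
viewClickScript.func(unit)
begin
    if :TrackHilite(unit) then
        begin
            // copy the slots into locals once, so the loop avoids inheritance lookups
            local defaults := defaultPickItems;
            local selectedItem := currentSelectedItem;
            local count := Length(defaults);
            // allocate the array at its final size instead of growing it
            local pickItems := Array(count, nil);
            local l := :LocalBox();
            local i;
            for i := 0 to count - 1 do
                if i = selectedItem then
                    pickItems[i] := {item: defaults[i], mark: kCheckMarkChar}
                else
                    pickItems[i] := defaults[i];
            DoPopUp(pickItems, l.right+3, l.top, self);
        end;
end
```

As with any of these tips, time both versions with your own data before deciding which one to keep.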
Q I've written my own IsASCIIAlpha, IsASCIINumeric, etc. functions. They seem to be
really slow. Why is that? Here's my IsASCIIAlpha:
// returns true if s is an alpha string (i.e., between a..z or A..Z)
IsASCIIAlpha.func(s)
begin
    local c := Upcase(Clone(s));
    local i;
    for i := 0 to StrLen(c) - 1 do
        if (StrCompare(SubStr(c, i, 1), "A") < 0) or
           (StrCompare(SubStr(c, i, 1), "Z") > 0) then
            return nil;
    true;
end;
A The main source of the slowness is that you're using string functions when
character functions would be faster. The distinction is subtle but important. In the
code above, you loop through each one-character substring of the target string to
determine whether it's an alpha character. All this takes time. The Upcase call is
O(n), as are the SubStr and StrCompare calls. Of course, the StrCompare isn't really
that slow, but it's still slower than you need.
The SubStr call is returning a single character at a time, but in the form of a string.
That means there is a memory allocation for at least two characters (the content and
the null terminator) for each call to SubStr. A better way is to compare each character
of the string. In certain circumstances you can access a character at a time with the
array accessor (that is, []). An example of a function that does this is IsASCIIAlpha3
(see the code on this issue's CD). In general, when you need either a single character
from a string or character-by-character access, the array-like syntax is faster.
Note that the final version of the code doesn't do any preprocessing of the string;
instead it uses a lookup in a pregenerated array of valid alphabetic ASCII characters.
That gives it a significant speed advantage. Since timing in the Inspector is a useful
technique, the code to do the timings and print results is included on the CD. Also note
that this function is specifically for ASCII characters, so characters like É and ß
would fail. Something else to note: Newton is a Unicode-based device. ASCII is a subset
of Unicode (from 0x0000 to 0x007F), but Unicode characters up to 0xFFFD are
documented. Your routine is checking only some of the characters on page 0 (that is,
characters of the form 0x00nn), but it must be prepared to deal with any character
it is given.
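To make the character-by-character idea concrete, here is a minimal sketch. It is not the IsASCIIAlpha3 function from the CD: instead of a pregenerated lookup array it uses a plain range test on the value returned by Ord, and the name IsASCIIAlphaSketch is hypothetical.

```newtonscript
// Hypothetical sketch, not the IsASCIIAlpha3 from the CD.
// Returns true if every character of s is in A..Z or a..z, nil otherwise.
IsASCIIAlphaSketch := func(s)
begin
    local i, code;
    for i := 0 to StrLen(s) - 1 do
        begin
            // s[i] is a character; Ord gives its Unicode value
            code := Ord(s[i]);
            if not ((code >= 65 and code <= 90) or
                    (code >= 97 and code <= 122)) then
                return nil;
        end;
    true;
end;
```

This version allocates no strings inside the loop and needs no Upcase or Clone of the argument. Any character outside the two ASCII ranges, including everything above 0x007F, makes it return nil.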
Q I'm trying to use the trace global to get information on what methods are called.
But I get lots of output that doesn't start or end where I want. What can I do?
A There are really two questions here: how to use trace effectively, and how to use
the output. Usually you would turn tracing on inside a method, then turn it off later on
in the code. Unfortunately, you need to do more than just set the value of trace; you
also have to force the interpreter to notice that trace has changed. The PIE Developer
Technical Support NewtonScript Q&A on debugging (on this issue's CD, among other
places) tells you how to do this.
// to turn tracing on for functions
trace := 'functions;
// force interpreter to notice change in state of trace variable
Apply(func () nil, []);
// to turn tracing off
trace := nil;
Apply(func () nil, []);
Once you have the trace output, you should cut and paste it into a text processor. There
are three main bits of information you can get from a trace:
• You can look at how many messages are generated from an apparently
simple call. You can use trace in conjunction with function call timings made
using Ticks to see why a particular call takes so long. Using the find feature of
your text processor, you can jump to the function call you're looking at.
• You can look at the values passed in and returned by function calls.
• Perhaps most useful of all, you can use the text processor to strip away
all the extraneous information (things like the lines specifying return values
-- that is, lines that contain the string "=>" as the first non-whitespace
entry) so that you're left with the messages sent. Then you can sort the
messages and get a histogram of the results. This process is easier if you have
a text processor that supports grep-like text substitution (regular
expressions) and sorts.
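One way to keep the trace output confined to the call you care about is to wrap the on and off steps around that call. The helper below is only a sketch built from the trace statements shown earlier; TraceCall is a hypothetical name, and the trace output will still include the Apply calls the helper itself makes.

```newtonscript
// Hypothetical helper: traces only the call to fn, then returns its result.
TraceCall := func(fn, args)
begin
    local result;
    trace := 'functions;
    Apply(func () nil, []);     // force interpreter to notice the change
    result := Apply(fn, args);
    trace := nil;
    Apply(func () nil, []);     // force interpreter to notice the change
    result;
end;
```

If fn throws an exception, tracing stays on, so you may need to set trace back to nil by hand in the Inspector.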
Q I'm using the Newton Toolkit layout editor to organize my data object classes in my
application. I have 20 classes with one layout per object type. To access the objects, I
declare each class layout to the main application. This gives me the benefits of parent
inheritance. Unfortunately, even my test applications are memory hogs. I would expect
a time penalty, but why is there such a large space penalty?
A The space penalty is much larger than it needs to be. You're using a layout editor to
edit your classes so that you can graphically edit the classes' slots. But this has the
disadvantage that you have to specify each class as some sort of view class or
prototype, perhaps a simple clView. It's the cause of your space problem, because you
also carry all the memory and runtime allocation that goes with a view. Since your
layouts are declared to your base application view, and since the default for a clView is
visible, each of your classes is also a full runtime view. That can take a large amount
of space on the NewtonScript heap. For a clView, the penalty is roughly 40 bytes, so
that's an extra 800 bytes of NewtonScript heap that you can free.
A better solution is to avoid using the NewtonScript heap for your class (after all,
that's one of the advantages of prototype inheritance). You can do this in one of two
ways:
• If you still want to use a layout editor to edit your class, you can use a
user prototype instead of a layout. At run time, you'll have access to the data
class using the PT_ syntax documented in the Newton Toolkit User's
Guide (page 4-25). Remember that the user prototype will be read-only.
• The other option is to textually define the class. You can do this in your
Project Data file, or use the Load command to read in a different text file. See
the PIE Developer Technical Support NewtonScript Q&A document on this
issue's CD for more information.
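As an illustration of what a textually defined class can look like, here is a minimal sketch of a data class written as a plain frame with no view behind it. The names accountClass, balance, New, and Deposit are hypothetical. How the definition reaches your package (Project Data or a file read with Load) is covered in the Q&A document mentioned above, not here.

```newtonscript
// Hypothetical data class defined as a plain frame, with no view behind it.
accountClass := {
    balance: 0,
    // make an instance: a small frame that inherits from the class via _proto
    New: func(initial) {_proto: self, balance: initial},
    Deposit: func(amount)
        begin
            balance := balance + amount;
            self;
        end
};

// Example use
acct := accountClass:New(100);
acct:Deposit(50);
Print(acct.balance);
```

Only the small instance frame has to live on the NewtonScript heap; the methods and default values are found through the _proto slot, so they can stay with the class definition.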
The llama is the unofficial mascot of the Developer Technical Support group in
Apple's Personal Interactive Electronics (PIE) division.
Send your Newton-related questions to NewtonMail DRLLAMA or AppleLink
DR.LLAMA. The first time we use a question from you, we'll send you a T-shirt.
Thanks to our PIE Partners for the questions used in this column, and to jXopher,
Bob Ebert, Mike Engber, Kent Sandvik, Jim Schram, and Maurice Sharp for the
answers.
Have more questions? Need more answers? Take a look at PIE Developer Info on
AppleLink.