Mar 91 Mousehole
Volume Number: 7
Issue Number: 3
Column Tag: Mousehole Report
RAM Cache and THINK C
By Larry Nedry, Mousehole BBS
From: Dave
Re: THINK C Compilation and RAM cache
Anyone know why my compiles get slower as my RAM cache gets bigger? There are
multi-second pauses as THINK C reads header files when my cache is at 1024K, and this
is without VIRTUAL!
From: Istewart
Re: THINK C Compilation and RAM cache
Well, if you’re running under the Finder, it could be that you’re reducing the available
heap space by about a megabyte! Just a guess, but that may make the compiler work
harder paging things to and from disk.
Unless you have no other use for a megabyte of memory, it seems a bit extreme to
assign it to a cache!? Try 64 or 128K ...
From: Dave
Re: THINK C Compilation and RAM cache
Well, I have 5 meg on my machine, and I leave plenty of space for System, Finder, and
THINK C to operate even when I have a 1 Meg cache. With all the header files I’m
including, I thought it would be a really good idea to cache them so they don’t have to be
read each time I compile a new file. Am I just being thick or is there something screwy
here?
From: Istewart
Re: THINK C Compilation and RAM cache
I’m assuming that you’re referring to the cache that you set from the control panel ...
(please correct me if I’m wrong on that point!)
Here’s my THEORY (for what it’s worth, someone out there please set me straight if
I’ve got it wrong):
The RAM cache saves a copy of the most recently accessed disk sectors. If it’s still
there when the next request to read the same sector is processed, the OS can just use
the copy in RAM instead of having to wait to read it from disk.
The RAM cache is smaller than the disk; once it’s full, it’ll purge out the least
recently accessed sector to make room for the next one that has to be physically read
from disk.
Several things will affect the efficiency of this process:
1) The type of access you’re doing. If you read each sector once, and don’t re-read it
before it gets purged out, the buffer hasn’t helped you at all. In fact, the overhead of
maintaining it is your loss. If, however, you’re repeatedly accessing a few sectors, they
can all be kept in memory, and the saving should be substantial.
2) The size of the cache. If it’s too small, then blocks will be purged out too often
before they can be reused. If it’s too large, then the system may spend more time
searching for the sector in the buffer than it would have if it had just requested it from
disk. Compounded with that, you may incur this overhead just to find that the sector
ISN’T in the cache, and then have to read it from disk anyway! This is most likely on
long sequential accesses.
3) The relationship between the speed of the cache search and the length of time it
would take to just read the sector directly from the disk. The cache search speed is
affected by CPU speed and the efficiency of the search algorithm. If you have a slow
search time and a fast HD, the point of diminishing (and subsequently negative) returns
would be reached earlier than it would if you had a faster CPU and/or a slower disk.
In your case, I think the cache size is so big that it’s counterproductive. You’ve got
1Mb of cache, enough for about 2,000 512-byte sectors. If it’s doing a linear search of
this, then it’s possible that it’s spending more time searching the cache than it’s saving
you on disk accesses!
That’s my theory, for what it’s worth! I’d suggest experimenting to find the optimum
size for your setup. My own SE (4MB, slowish HD) has it set at 128K, and I’m happy
enough with that setting. I think my MacII at work (5MB, faster HD) is set to 128K
also, though I use that mostly for WP and running a terminal emulator.
(If anyone out there really KNOWS what’s going on inside there, please let us know!!
If I had to design it, I’d keep a doubly linked list, with the most recently used sector on
one end, and the least recently on the other. When it searched for a sector, it would
start with the most recently used, progressing towards the least. To discard a sector to
make room for a new one, it would simply drop off the one at the other end of the chain!
Anyone have any better ideas?)
From: Dave
Re: THINK C Compilation and RAM cache
I have a better idea, I think. I would keep a fixed-size hash table for all blocks in the
cache, reducing search time to a constant. Especially for compilation, where header
files have a certain order of precedence, a least recently used cache replacement
algorithm will just waste time if the cache is too small. I think I want an INIT that tries
to do some special kind of caching for compilation.
From: Istewart
Re: THINK C Compilation and RAM cache
I thought of hashing; it would be great for finding a sector already in the cache.
However, I had a problem figuring out how to determine the least recently used sector
quickly.
Remember that the cache is general purpose - it’s not necessarily designed to optimize
one specific type of task!
I wonder if anyone’s created an INIT that does anything more specific for compilation?
I think I saw one on AOL that claimed to speed up something to do with THINK C, but I
never got further than the title, so I don’t know how it works.
From: Dave
Re: THINK C Compilation and RAM cache
Remember, LRU is not necessarily a good thing unless your cache is very large.
Header files have an implicit ordering on them (especially in object-oriented
programs), because they often must include each other. If the cache is smaller than the
total size of all headers, you will often replace the lowest files in the ordering at the
end of compiling one C file, only to proceed to the next C file and replace the files you
will need later. This leads to a vicious “MISS-EVERY-TIME” cycle. Think about it. You
really want something like “most-often-used”. That’s an easier criterion to meet.
From: Btoback
Re: THINK C Compilation and RAM cache
But “most-often-used” is the keep criterion, which means “least-often-used” is the
discard strategy. That is the same as “least-recently-used” unless cache-miss statistics
are kept on the sectors that aren’t in the cache. For that to be useful, the cache has to be
resettable, or cache statistics have to be kept on every sector on the disk. In practice, if
LRU isn’t good enough, the cache isn’t going to help anyway.
From: Istewart
Re: THINK C Compilation and RAM cache
I remember that in my last post I pointed out that the cache is general-purpose, and
not specifically designed for the type of access made by compilers.
I agree with you, this general-purpose scheme is not helpful when processing header
files.
If I were devising a scheme specifically for this situation, it would probably be based on
good old-fashioned double buffering, if the hardware/OS will support asynchronous disk
access, though I guess there must be plenty of approaches that can be applied in specific
situations.