Mar 91 Mousehole
Volume Number: 7
Issue Number: 3
Column Tag: Mousehole Report
RAM Cache and THINK C
By Larry Nedry, Mousehole BBS
From: Dave
Re: THINK C Compilation and RAM cache
Anyone know why my compiles get slower as my RAM cache gets bigger? There are
multi-second pauses as THINK C reads header files when my cache is at 1024K, and this
is without VIRTUAL!
From: Istewart
Re: THINK C Compilation and RAM cache
Well, if you’re running under the Finder, it could be that you’re reducing the available
heap space by about a megabyte! Just a guess, but that may make the compiler work
harder paging things to and from disk.
Unless you have no other use for a megabyte of memory, it seems a bit extreme to
assign it to a cache!? Try 64 or 128K ...
From: Dave
Re: THINK C Compilation and RAM cache
Well, I have 5 meg on my machine, and I leave plenty of space for System, Finder, and
THINK C to operate even when I have a 1 Meg cache. With all the header files I’m
including, I thought it would be a really good idea to cache them so they don’t have to be
read each time I compile a new file. Am I just being thick or is there something screwy
here?
From: Istewart
Re: THINK C Compilation and RAM cache
I’m assuming that you’re referring to the cache that you set from the control panel ...
(please correct me if I’m wrong on that point!)
Here’s my THEORY (for what it’s worth, someone out there please set me straight if
I’ve got it wrong):
The RAM cache saves a copy of the most recently accessed disk sectors. If it’s still
there when the next request to read the same sector is processed, the OS can just use
the copy in RAM instead of having to wait to read it from disk.
The RAM cache is smaller than the disk; once it’s full, it’ll purge out the least
recently accessed sector to make room for the next one that has to be physically read
from disk.
Several things will affect the efficiency of this process:
1) The type of access you’re doing. If you read each sector once, and don’t re-read it
before it gets purged out, the buffer hasn’t helped you at all. In fact, the overhead of
maintaining it is your loss. If, however, you’re repeatedly accessing a few sectors, they
can all be kept in memory, and the saving should be substantial.
2) The size of the cache. If it’s too small, then blocks will be purged out too often
before they can be reused. If it’s too large, then the system may spend more time
searching for the sector in the buffer than it would have if it had just requested it from
disk. Compounded with that, you may incur this overhead just to find that the sector
ISN’T in the cache, and then have to read it from disk anyway! This is most likely on
long sequential accesses.
3) The relationship between the speed of the cache search and the length of time it
would take to just read the sector directly from the disk. The cache search speed is
affected by CPU speed and the efficiency of the search algorithm. If you have a slow
search time and a fast HD, the point of diminishing (and subsequently negative) returns
would be reached earlier than it would if you had a faster CPU and/or a slower disk.
In your case, I think the cache size is so big that it’s counterproductive. You’ve got
1Mb of cache, enough for about 2,000 512-byte sectors. If it’s doing a linear search of
this, then it’s possible that it’s spending more time searching the cache than it’s saving
you on disk accesses!
That’s my theory, for what it’s worth! I’d suggest experimenting to find the optimum
size for your setup. My own SE (4MB, slowish HD) has it set at 128K, and I’m happy
enough with that setting. I think my MacII at work (5MB, faster HD) is set to 128K
also, though I use that mostly for WP and running a terminal emulator.
(If anyone out there really KNOWS what’s going on inside there, please let us know!!
If I had to design it, I’d keep a doubly linked list, with the most recently used sector on
one end, and the least recently on the other. When it searched for a sector, it would
start with the most recently used, progressing towards the least. To discard a sector to
make room for a new one, it would simply drop off the one at the other end of the chain!
Anyone have any better ideas?)
From: Dave
Re: THINK C Compilation and RAM cache
I have a better idea, I think. I would keep a fixed-size hash table for all blocks in the
cache, reducing search time to a constant. Especially for compilation, where header
files have a certain order of precedence, a least recently used cache replacement
algorithm will just waste time if the cache is too small. I think I want an INIT that tries
to do some special kind of caching for compilation.
From: Istewart
Re: THINK C Compilation and RAM cache
I thought of hashing; it would be great for finding a sector already in the cache.
However, I had a problem figuring out how to determine the least recently used sector
quickly.
Remember that the cache is general purpose - it’s not necessarily designed to optimize
one specific type of task!
I wonder if anyone’s created an INIT that does anything more specific for compilation?
I think I saw one on AOL that claimed to speed up something to do with THINK C, but I
never got further than the title, so I don’t know how it works.
From: Dave
Re: THINK C Compilation and RAM cache
Remember, LRU is not necessarily a good thing unless your cache is very large.
Header files have an implicit ordering on them (especially in object-oriented
programs), because they often must include each other. If the cache is smaller than the
total size of all headers, you will often replace the lowest files in the ordering at the
end of compiling one C file, only to proceed to the next C file and replace the files you
will need later. This leads to a vicious “MISS-EVERY-TIME” cycle. Think about it. You
really want something like “most-often-used”. That’s an easier criterion to meet.
From: Btoback
Re: THINK C Compilation and RAM cache
But “most-often-used” is the keep criterion, which means “least-often-used” is the
discard strategy. That is the same as “least-recently-used” unless cache-miss statistics
are kept on the sectors that aren’t in the cache. For that to be useful, the cache has to be
resettable, or cache statistics have to be kept on every sector on the disk. In practice, if
LRU isn’t good enough, the cache isn’t going to help anyway.
From: Istewart
Re: THINK C Compilation and RAM cache
I remember that in my last post I pointed out that the cache is general-purpose, and
not specifically designed for the type of access made by compilers.
I agree with you, this general-purpose scheme is not helpful when processing header
files.
If I were devising a scheme specifically for this situation, it would probably be based on
good old-fashioned double buffering, if the hardware/OS will support asynchronous disk
access, though I guess there must be plenty of approaches that can be applied in specific
situations.