LIVING WITH DOS:  DISK CACHES       
by Barry Simon

Copyright (c) 1987, Capital PC User Group Inc.
This material may be republished only for internal use
by other not-for-profit user groups.

Posted on Compuserve with permission of CPCUG.  May not be
reproduced without including the above copyright notice.

Published in the March 1987 issue of the Capital PC Monitor; 
discussion of extended memory has been changed from the published 
article.


I/O, I/O, It's Off to Work We Go!

There is much noise made about running 286-based machines at 8, 10
or even 12 megahertz.  While running your computer's
microprocessor at a faster speed will make a difference, for many
tasks the difference is bounded because the limiting factor is
often the speed of your input and output devices, known
collectively as I/O.  That these devices slow down the CPU can be
seen from the typical times involved.  8 MHz means that the CPU
goes through 8 million cycles per second.  Since a single
instruction on the 80xx family of chips takes from two to over
twenty cycles, a CPU in the current generation of MS-DOS machines
can run at roughly 1 MIPS (millions of instructions per second).
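
To see roughly where that 1 MIPS figure comes from, take an
illustrative average of eight cycles per instruction from the
two-to-twenty range just mentioned:

  8,000,000 cycles per second / 8 cycles per instruction
      = about 1,000,000 instructions per second, or 1 MIPS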

Memory chips are rated at speeds of 70-200 nanoseconds.  A
nanosecond is a billionth of a second, which means that such chips
are capable of speeds comparable to CPU speeds.  That the speeds
are slightly less is shown by the need for "wait states", which
slow down the CPU to allow access to memory at its speed; RAM
speeds, however, are roughly equal to those of the CPU.  I/O
speeds are considerably less.  Even a fast hard disk rated at 20
milliseconds has an access time 100,000 times longer than that of
RAM.  Of course, because the RAM speed applies to every access
while the hard disk access time covers only the first access of a
disk sector, the actual ratios are not that bad.

But memory access, even by slow memory chips, is much faster than
even speedy hard disks; diskettes are even slower.  While disk
transfer rates are slower than RAM exchanges, they are speedy
compared to output through parallel or serial ports, where
transfer rates are measured in 100's of bytes per second.  (1200
baud, for example, means roughly 120 characters per second.)  And
your console, the name for the combined keyboard/monitor I/O
device, must interface with the computer's slowest component --
you; its speeds are often the slowest of all.
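
To put rough numbers on these ratios, using the ratings quoted
above and assuming the usual ten-bit character frame on a serial
line:

  20 milliseconds per disk access / 200 nanoseconds per RAM access
      = 100,000 to 1
  1200 bits per second / 10 bits per character
      = about 120 characters per second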

There are software tools that try to speed up I/O, especially by
using RAM for certain operations.  This month, I'll discuss one
category of those tools -- disk caches, programs that can
substantially speed up disk access.

In this article, I discuss six commercial and one shareware disk
cache programs; the programs are:

o  Emmcache, a shareware product by Frank Lozier;

o  Lightning from the Personal Computer Support Group;

o  Polyboost from Polytron;

o  Quickcache from Microsystems Developers, Inc.;

o  Speedcache from FSS Ltd;

o  Super PC-Kwik from Multisoft Corp.; and

o  Vcache from Golden Bow Systems.


What Is a Disk Cache?

Disk caches are based on the idea that you are likely to want to
access a file that you accessed recently.  This is not only true
for obvious data files like a database which you might search
several times in a row, but also for program overlays and for the
files that DOS often consults to locate other files: the FAT and
the various directories, especially the root directory.

Thus every time that a file is accessed, a cache will keep a copy
of that file in memory set aside especially for that purpose.
Since this special memory is limited, the cache has to have an
algorithm to decide which parts of the cache to clear out to make
room for new sectors.  All the caches under discussion use the
algorithm of discarding those parts of the cache which were least
recently accessed; that is, not the ones that were first read the
longest ago, but rather the ones which were needed longest ago.
Whenever DOS calls for a sector from disk, the cache program
intercepts the call to check if the requested material is in the
cache memory.  If it is, the copy in memory is used and a disk
access is saved.  A cache can avoid anywhere from one-third to
two-thirds of your disk accesses.  To allow a large cache, it is
natural to put the data part of the cache (that is, the copies of
the sectors which were read rather than code that controls this
data) in extended or expanded memory.
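
To make the bookkeeping concrete, here is a toy sketch in C of a
least-recently-used sector cache.  The names, the slot count and
the placeholder read_from_disk routine are all invented for
illustration; a real cache program works on raw sectors through
the BIOS or DOS and is far more careful than this.

  #define SLOTS        64      /* number of cached sectors (illustrative)     */
  #define SECTOR_SIZE  512

  struct slot {
      long sector;             /* which disk sector this slot holds           */
      long last_used;          /* "clock" value of the last access; 0 = empty */
      unsigned char data[SECTOR_SIZE];
  };

  static struct slot cache[SLOTS];   /* zero-filled: all slots start empty    */
  static long clock_tick = 0;

  static void read_from_disk(long sector, unsigned char *buf)
  {
      /* placeholder: a real cache issues a BIOS or DOS disk read here */
      (void)sector; (void)buf;
  }

  /* Return the sector's data, touching the disk only on a miss. */
  unsigned char *cache_read(long sector)
  {
      int i, oldest = 0;

      for (i = 0; i < SLOTS; i++) {
          if (cache[i].last_used != 0 && cache[i].sector == sector) {
              cache[i].last_used = ++clock_tick;   /* hit: no disk access     */
              return cache[i].data;
          }
          if (cache[i].last_used < cache[oldest].last_used)
              oldest = i;                          /* track the LRU slot      */
      }
      /* miss: recycle the least recently used (or an empty) slot */
      read_from_disk(sector, cache[oldest].data);
      cache[oldest].sector = sector;
      cache[oldest].last_used = ++clock_tick;
      return cache[oldest].data;
  }

Whenever a slot has to be recycled, it is the one whose clock
value is smallest -- that is, the one needed longest ago.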

For safety's sake, you would not want these programs to delay
writing material that DOS wants written to disk; this is called
keeping dirty buffers, and none of these programs keeps dirty
buffers.  However, as I'll explain, DOS does some of its own disk
caching and it does keep dirty buffers, which can produce
problems.

Do not confuse keeping dirty buffers, that is, delaying writing to
disk, with caching writes.  The latter means that the cache still
writes to disk but also keeps a copy of the written material,
updating its copy if it differs from the one that was read
previously.  For example, if you load a file in your word
processor, change it and save it, a program that caches writes
will keep a copy of the final file version in its cache while one
that does not cache writes will not keep such a copy.  All the
commercial programs discussed in this article cache writes, but
Emmcache does not.
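
Continuing the same toy sketch (write_to_disk is again only a
placeholder), the difference shows up in what happens on a write:
the data always goes straight to disk, so there are no dirty
buffers either way, but a cache that caches writes also refreshes
its in-memory copy so that a later read of the same sector is
still a hit.

  #include <string.h>    /* memcpy */

  static void write_to_disk(long sector, const unsigned char *buf)
  {
      /* placeholder: a real cache issues a BIOS or DOS disk write here */
      (void)sector; (void)buf;
  }

  void cache_write(long sector, const unsigned char *buf)
  {
      int i;

      write_to_disk(sector, buf);              /* never delayed              */
      for (i = 0; i < SLOTS; i++) {
          if (cache[i].last_used != 0 && cache[i].sector == sector) {
              memcpy(cache[i].data, buf, SECTOR_SIZE);  /* keep copy fresh   */
              cache[i].last_used = ++clock_tick;
              return;
          }
      }
      /* A cache that does not cache writes would stop after the call to
         write_to_disk() and would invalidate any stale copy instead of
         refreshing it. */
  }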

When I first started using a cache, I found the experience eerie.
I'd do some action that I often did and wonder why my disk access
light wasn't going on.


Types of Memory

In our discussion of caching, various references will be made to
the different kinds of memory that are available to microcomputer
users.  These include:

o  Conventional memory, the 640K of Random Access Memory (RAM)
that is readily accessible by most 8088/8086/80186 computers.

o  Extended memory, the memory above 1 megabyte (up to 16
megabytes) that is accessible by 80286 computers.  This memory
is not normally accessible for use as conventional memory but is
generally used for RAM disks, disk caches or print spoolers.  

o  Expanded memory, defined by the Lotus/Intel/Microsoft Expanded
Memory Specification (LIM EMS).  Memory on supporting boards (up to
8 megabytes) is paged in and out of conventional memory, thereby
providing the user with additional memory for supported software.


Not a Memory Cache

You should be careful to distinguish between a disk cache and a
memory cache.  There are circumstances where some of your RAM runs
at a higher speed than the rest of your RAM.  In that case, it may
pay to cache some of the reading of instructions from the slow RAM
to speed up programs with loops.  Two such situations are when you
add a speedup (usually 80186- or 80286-based) board to a PC with
lots of old RAM, typically rated at 200 nanoseconds, and on 386
machines, where RAM that keeps up with the processor should be
rated at 100 or even 70 nanoseconds.  In any event, these
situations involve a memory cache, not a disk cache, which is the
subject of this article.


Caches Versus RAM Disks

You can also cut down on access to a physical disk by using a RAM
disk, that is by setting aside a part of RAM as a virtual disk
which DOS accesses as if it were an ordinary disk.  There are
several differences between RAM disks and disk caches.  Accessing
files from a RAM disk is often slightly faster as our time tests
will show.  Moreover, the first access of a file with a cache will
be slower than later accesses.  On the negative side, you must
decide in advance which files you'll want on the RAM disk; you'll
also have to be sure to copy any changed data files from the RAM
disk to a real disk or risk losing them when you power down or if
your system crashes.

Which should you use?  That depends on how you use your computer.
If you only use a few programs without extensive data files, a RAM
disk is probably better, provided you can make one large enough to
hold what you need.  In other circumstances, a cache may be
preferable.  If you have the RAM, there may be sense in using
both:  a RAM disk for your common programs and a cache to take up
the slack.  Most of the cache programs have built-in procedures
to avoid caching programs from the RAM disk, allowing you to save
valuable cache space for files from your physical disks.
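
As one example of such a combined setup (the sizes here are purely
illustrative, and the RAM disk line assumes the VDISK.SYS driver
discussed later in this article), config.sys might contain

  device=vdisk.sys 512 512 64 /e
  buffers=20

which sets up a 512K RAM disk, with 512-byte sectors and 64
directory entries, in extended memory.  Your autoexec.bat would
then copy your everyday programs to the RAM disk and load the
cache with whatever command and switches its own manual calls for.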


Read Ahead

Many caches will "read ahead", that is, read in an entire track
whenever any reading takes place.  If your files are large and not
fragmented, this can give you a real speed advantage but if not,
your cache will fill up with unused material.  On a hard disk
with many isolated bad sectors, read ahead can actually slow down
disk access because of phantom disk errors.  Lightning, Super
PC-Kwik, and Vcache have read ahead while the others do not.
Super PC-Kwik has the advantage of having read ahead as an option
that you can turn off.  The makers of Polyboost maintain that
since most hard disks have errors and fragmented files, their
lack of read ahead is a gain over the competition, but I think it
will depend very much on your individual setup.  In my own case,
for example, I have turned read ahead off when running on my main
machine because of the isolated bad sectors on my hard disk.
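
In terms of the toy cache sketched earlier, read ahead simply
turns one miss into a whole-track read.  The sectors-per-track
figure below is only illustrative, and a real cache issues a
single multi-sector read for the track rather than looping.

  #define SECTORS_PER_TRACK 17     /* illustrative; varies by disk         */

  /* On a miss, pull in the whole track on the bet that neighboring
     sectors will be wanted next. */
  unsigned char *cache_read_with_read_ahead(long sector)
  {
      long first = sector - (sector % SECTORS_PER_TRACK);  /* track start  */
      long s;

      for (s = first; s < first + SECTORS_PER_TRACK; s++)
          cache_read(s);           /* fills the cache with the whole track */

      return cache_read(sector);   /* now certainly a hit                  */
  }

If the files are fragmented or the track contains bad sectors,
most of that extra reading is wasted, which is exactly the
trade-off described above.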


Are Caches Dangerous?

If your word processor fouls up a file write, all you are likely
to lose is the file you wanted to save.  Typically, the files in
your cache include the FATs and root directories of your disks.
If these go bad, you are likely to have real problems getting to
any of the data on your entire disk.  There are various tools
which can help you recover from such a disaster, but they may not
always work.  This means that caches have an inherent danger to
them.  Of course, since DOS is also writing these files all the
time, you could make the argument that caches are no more
dangerous than DOS; perhaps even less so, since DOS keeps dirty
buffers.

I cannot answer the question of whether disk caches are really 
dangerous.  I can report that I've met several users who are sure 
that problems they've had with FATs were caused by cache programs. 
This may well be true, although it is also true that if you have 
any problems with the logical structure of your disk and you have 
a cache, you are likely to blame the cache.  During the testing of 
cache programs which went over six months, I lost the contents of 
one of my hard disks three times.  Two seemed to be hardware 
problems solved in one case by a low level reformat and in the 
other by a disk replacement.  But the third one involved a piece 
of software crashing the system; after rebooting, the root 
directory on the hard disk was chopped liver.  I'm suspicious that 
the culprit was the cache I was using but maybe it was DOS' dirty 
buffers or the program that crashed in the first place.  All I can 
say is that caching may be risky.  You should be sure to back up 
often but especially so if you have a cache.  In fact, unless you 
are willing to back up regularly, I recommend strongly against a 
cache.  On the other hand, caches are rather useful.  I'm still 
using a cache in spite of the problems that I had and some of 
those who are certain that they had cache related problems are 
still using them.  And I've met people who feel that caches are 
among their most important utilities. 


Non-standard Setups

Because of the inherent dangers in caching and because caching 
involves modifications of the disk BIOS, you need to be extremely 
careful if your disk setup is non-standard.  You may need to 
consult the vendors.  Super PC-Kwik explicitly says not to use it 
if you have a Bernoulli Box while Vcache says that it supports 
these devices.  The publishers of Vcache warned me not to use 
Vcache with my 60 Meg Priam disk which I partitioned with Priam's 
software into two 30 Meg drives; only large disks handled with the 
VFEATURE program they publish are compatible with Vcache.  On 
the other hand, Super PC-Kwik warns against disks with non-
standard sector sizes but said that it should work with software 
making multiple standard DOS partitions.  I was warned that they 
had not tested the program with the Priam software but I can 
report that it worked perfectly.  Here, my advice is to check with 
the publishers, be sure that you are backed up and run CHKDSK 
several times a day when you first try a caching program with 
anything non-standard. 

With these programs, you cannot cache a network by having a cache
on your work station although you can sometimes cache the network
disks with a cache on the server.  These are complex issues and
before attempting to use caches on machines connected to LANs,
you should be sure to speak with both the cache vendor and the
network vendor.

There is a second warning that needs to be made about using these
programs with AT extended memory, an option that is only available
with Polyboost, Super PC-Kwik and Vcache.  Unfortunately, there
is no memory management protocol for AT extended memory provided
by the current versions of DOS.  This lack of a standard means
that programs you try to load there may not know of each other's
existence and may therefore overwrite each other.  Since IBM
publishes the source code for VDISK, all these
programs know about its protocol and can avoid clobbering it.
The situation is not so good for other virtual disk programs.
I've seen complaints about problems with AST's SUPERSPL program
and I've had problems with a cache in extended memory overwriting
a RAM disk set with the RAMDRV program included with Microsoft
Windows and with some versions of MS-DOS.  It is unfortunate that
Microsoft has not published the specifications that this program
uses to access extended memory.  So, if you are using any other
programs in extended memory and using an extended memory cache,
be sure to check out the operation of the other programs after
the cache is loaded.  Super PC-Kwik and Vcache have a command
line parameter which you can use to give the program an absolute
address in extended memory at which to load and so avoid the
conflict "by hand".  That they have to resort to such a kludge
speaks to the rather sorry state of extended memory support in
DOS 3.x.

A second aspect of caches in extended memory is that access to
extended memory involves features in the ROM BIOS that are not
often used in the current generation of AT software.  Thus, these
features may not work properly on some AT clones.  In fact, Vcache
comes with a program to test the BIOS access of extended memory.
If there is a problem, the clone maker must correct it.
Given the advent of a DOS that will access extended memory, it is
essential to get such problems rectified.

Two of the programs, Speedcache and Quickcache, load as device
drivers rather than as COM files.  Conventional wisdom would hold
that device drivers are somewhat less prone to compatibility
problems but I don't know if that is valid in these cases.


Use Your Free Cache

If you don't purchase and use one of these stand-alone caching
programs, you should at least be sure to make use of the free
cache that comes with DOS.  The cache size is set in units of 512
bytes called buffers.  The default number, which DOS uses if you
don't specify otherwise, is two for 8088 machines and three for
80286-based machines; both are woefully inadequate.  To increase
the number of buffers you must include a line

  buffers=nn

in your config.sys file.  Here nn is the number of buffers that
you want and the recommended numbers tend to be from 15 to 20.

Why not take buffers=99?  The algorithms that DOS uses are not as
efficient as those in commercial caches, so the time it takes to
search the buffers to see if the proper sector is there negates
the time saved once the number of buffers becomes too high.
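
A sketch of the kind of lookup involved makes the point; the
structure names are invented, and only the linear search matters
here:

  #include <stddef.h>    /* NULL */

  struct dos_buffer {
      long sector;                   /* sector held, if any                */
      struct dos_buffer *next;       /* DOS chains its buffers together    */
      unsigned char data[512];
  };

  /* Walk the whole chain before giving up.  With buffers=99, a miss
     pays for a long walk on top of the disk read, and even a hit
     averages a long walk -- hence the point of diminishing, and
     eventually negative, returns. */
  struct dos_buffer *find_buffer(struct dos_buffer *head, long sector)
  {
      struct dos_buffer *b;

      for (b = head; b != NULL; b = b->next)
          if (b->sector == sector)
              return b;              /* found: no disk access needed       */
      return NULL;                   /* not found: DOS must read the disk  */
  }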

What are the disadvantages of using buffers for a cache?  First
there is the issue of dirty buffers.  Actually, just using a
commercial cache doesn't affect this, since caches still use DOS
for reading and writing and so the DOS buffers will still get
used.  However, a cache that lets you decrease the number of
buffers that you use will force DOS to write its buffers to disk
more often because of space considerations.  Another disadvantage
of DOS buffers is that, since they are based on 512-byte chunks, if a
program requests more than that at once, DOS will always go to
disk and not check to see if the request is residing in its
buffers.  Finally, there is the size issue that I mentioned; for
really large caches, you'll need a commercial program.

In short, if you don't use a commercial caching program, be sure
to put a line like

  buffers=20

into your config.sys file.


Parameters

Once loaded, cache programs act in the background and require no
action or input from the user.  But some of these programs have
option switches which you'll need to study carefully so that the
program loads and operates in an optimal manner.  For many, the
defaults will be correct, but you'll at least want to adjust the cache
size.

What is the proper size?  That's a trade-off between what else you
want to use your RAM for and how you use your machine.  I have the
impression that unless your cache is at least 60K, you may be
better served by DOS buffers although for some operations, a 20K
cache will show a noticeable improvement.

Lightning has the annoying feature of using EMS memory if you have
it, even if you'd prefer to use conventional memory; it does not
support AT extended memory.  As the name implies, Emmcache uses
only EMS memory.  Speedcache supports the special bank switching
protocol on the Tall Tree JRAM boards as well as conventional
and EMS memory.  For the other programs, you'll have to decide
whether your cache will reside in conventional, EMS or AT extended
memory and how much memory it will take.  Be warned that some of
the programs default to rather unreasonable values of cache size,
such as all the remaining EMS memory or all the conventional
memory except for 232K for your remaining programs.  Other
parameters vary from program to program and concern things like
what drives to cache and what algorithms to use in specific
cases.  For everything but what kind of memory to use and how
much, you can probably get away with using the defaults initially.

Super PC-Kwik has many switches and it may pay to vary some of
the switches and do some testing if some aspects of performance
seem below what you expect.  For example, on the Kaypro 286i,
changing the diskette parameter from the default /d+ to /d-
resulted in an improvement of the diskette test by a factor of
more than 4!


Memory Usage

Table 1 shows memory usage of the cache; it lists the amount of
conventional memory used by the control part of the software
exclusive of the memory taken by the cache.  If you put the cache
in conventional memory, the amount in this table will be
overwhelmed by the amount of memory taken by the cache itself
but, if you place the cache in EMS or extended memory, this
figure will be quite important.  For some of the conventional
memory caches, you pick only the total size of cache plus
controlling code.  For these, the amount of memory in the control
part cannot be determined; these are indicated in the Table with
an *.  All numbers are in kilobytes except for the first row.
For those that allow you to decrease the number of DOS buffers,
the second row can show a rather significant savings.  The
figures for diskette cache give the amount needed to cache two
diskette drives; for several of the programs, diskette caching is
automatic and this amount is then listed as zero.  Polyboost
suggests that you won't need to cache diskette drives if you have
a hard drive; depending on your mode of operation, that may be
true.

All the programs except for Polyboost will cache several hard
disks from the same cache with only one loading of the control
software.  Polyboost requires multiple loading of its hard disk
cache which has two unfortunate consequences: you double the
overhead involved with the cache control software, and you must
dedicate memory to either one hard disk or the
other; this isn't useful if you tend to work on one hard disk for
a while and then switch to the other.  Polyboost's caching is
limited to two hard disks.  Two of the programs, Quickcache and
Speedcache, use an "advanced" EMS call not supported in the
current version of the Xebec Amnesia board software which I was
using; therefore, I am not able to report their memory usage.  In
this instance, Speedcache printed an error message and exited
without loading and Quickcache crashed the system.

     (Table 1 goes here)


Time tests

Table 2 shows the results of time tests.  The tests are intended
to be "real world" tests.  Tests 1-4 are tests of cache read
functions.  Test 1 is the time to sort a 140K database that I
had just sorted a different way.  This demonstrates the savings
you would get from repeated access to a database.  Test 2 is the
time to spell check a 40K document through the first pass which
checks for possible misspellings.  Test 3 is the time it took to
convert a 500K database from one version of a database I had to
another.  Test 4 is the time to compile, link and EXE2BIN a 100K
file which I had just treated by MASM, LINK and EXE2BIN on a hard
disk and edited.  This is typical of a situation where you may
get a compiler error, correct the source file, and then
recompile.

Tests 5 and 6 test the ability to speed up disk writing.  Test 5
is a PC Magazine "write random sectors" test.  This test writes
the same data repeatedly to sectors which may be the same, and so
it is particularly sensitive to the trick that caches use of
suppressing a rewrite of data identical to what was earlier
written to disk.  Test 6 is a patched version of test 5 which
writes different data each time.  It was supplied to me by the
publisher of Super PC-Kwik but I think it is a more significant
test than the original test 5.

The remaining tests attempt to check cache overhead or special
elements and are not as significant.  Test 7 is the time it took to
copy 10 files totaling 350K from a hard disk to a floppy and
test 8 is the same for a floppy to floppy copy.

Tests 9, 10 and 11 are Norton's disk test program on a hard disk,
1.2 megabyte floppy, and regular floppy, respectively.  The
Norton tests are included because the results are so dramatic.
These dramatic speed increases over DOS are due mainly to read
ahead as can be seen by running Super PC-Kwik with this option
turned off.  The copy tests check on whether there is time lost
because of cache overhead.

The three columns listing DOS nn are tests done with no cache and
nn buffers.  Tests 1, 2, 4, 5 and 6 were also done from a 1
megabyte RAM disk and Test 3 using two 1 megabyte RAM disks.  For
vague comparison purposes only, three other times are reported
within asterisks:  The time for a Norton disk test on a 2.4 Meg
RAM disk (#9), and the times to copy the same set of files used
in Tests 7 and 8 from a hard disk to a RAM disk (as #7) and from
one RAM disk to another (as #8).

All the tests are done on a Kaypro 286i with a Xebec EMS board.
To check how much overhead EMS causes, I ran the tests for Super
PC-Kwik in both EMS and conventional memory.  This overhead is
due to the lack of DMA support in EMS and not to the bank
switching.  Since I could not get Quickcache and Speedcache to
run under this EMS setup, I did their tests in conventional
memory which gives them a slight advantage.  I used the
recommended number of DOS buffers, or buffers=20 in those cases
with no recommendation about decreasing the number of buffers.  I
used 256K of cache.  For all the tests but Tests 8, 10 and 11,
the cache was only hard disk for those programs (Polyboost,
Vcache) with separate diskette caches.  For Vcache, I used a 240K
vs. 24K split between disk and diskette caches and for Polyboost,
which requires separate caches for each diskette, I used a 256K
hard disk cache and 16K for each diskette.

    (Table 2 goes here)

First, the test results illustrate the importance of increasing
buffers above the default 2 or 3 if you are not using a cache;
they also illustrate that there is a break point where too many
buffers can hurt you.  On things that caches do well (Tests 1-4),
caches are competitive with RAM disks.

On Test #1 which is the most typical application of a cache, the
cache programs all showed the same rather substantial gain.  While
there is some spread in the other figures, the read tests really
don't distinguish between the different caches.  On writing, I'd
give the nod to Super PC-Kwik and note that none of the tests
adequately check for caching writes.  The lack of this feature in
Emmcache made me lean towards Super PC-Kwik.  While Super PC-Kwik
stands out as special in a positive way on writes, it also stands
out negatively on diskette copies.

While on the subject of time tests, I should mention that
Lightning allows you to call up a screen which tells you how much
time you have saved by using the cache.  Its figures are pure
fairy tale!  I found that often it told me that I'd saved time in
situations where I'd actually taken more time than using
buffers=20.  Presumably, it was using some algorithm giving me a
comparison with some kind of slow 8088-based machine with buffers=2.
Super PC-Kwik and Vcache will give you the more accurate listing
of the number of accesses that have been from the cache as
opposed to disk accesses.


Screen Speedup

Polyboost and Vcache come with screen speedup programs;
Polyboost also has a keyboard speedup program which I did not
test.  Table 3 shows tests that I did in typing the same 111K
file to the screen that I used in my earlier articles on console
software.  RAW is a program which turns on DOS' raw mode (see
February Monitor).  The tests with the CRTBOOST and EGABOOST
programs that come with Polyboost are done with their optional
parameters set to 1 and to 5. Setting this parameter to 6 is
equivalent to setting it to 5 and turning RAW on. Setting the
parameter to 1 is recommended for most users.  Times are given in
seconds.  For comparison, times are given for some of the other
screen management programs that I have considered.  Fansi Console
has a "quick" parameter which can be turned on and off.

     (Table 3 goes here)


While the times on CRTBOOST and EGABOOST are impressive, these
programs have some bugs.
When EGABOOST was installed, even with its speed parameter set to
the slowest value (1), I was unable to change monitors on a two
monitor system with either DOS' MODE command or a public domain
program that I use.  There are programs that require me to use
Fansi's capability to turn Q=1 on and off from BATch files.  These
programs do not work properly with CRTBOOST at its highest
settings.  You can change to a setting where they do work but only
with a menu driven utility.  Finally, both CRTBOOST and VSCREEN
suffer from the defect that screen speedup can be a disadvantage
if you don't also have screen scrolling memory.  I have not
tested all screen scrolling memory programs with these two speedup
programs but I'd expect at least some incompatibilities.  Fansi
comes with its own screen scrolling memory which even supports
EMS.


Summary

Lightning comes in both copy protected and unprotected versions;
indeed, the price difference is so great that I'd call it
ransomware.  Because you'll want to load the program as part of
your autoexec.bat and the copy protection is of the key-disk
variety, you will really need the unprotected version.  None of
the other programs is copy protected.

It seems to me that these programs, as a group, are somewhat
overpriced.  They are subtle but not that complicated, as can be
seen from the fact that the main programs are typically about 5K.
Indeed, in cost per byte, they may be the most expensive class of
programs on the market.

On the basis of time tests alone, it is difficult to pick one
among these programs.  Your choice will have to depend on factors
like the amount of conventional memory they use, the particular
characteristics of your system as they relate to issues like read
ahead, and price.

Emmcache is a free program by Frank Lozier of Cleveland State
University.  It is available to CPCUG members in a file called
EMMCACHE.ARC on the MIX BBS, (301) 480-0350.

Lightning is published by the Personal Computer Support Group,
11035 Harry Hines Blvd., #206, Dallas, TX  75229, (214) 351-0564.
The non-copy protected version is $89.95 and the copy protected
version is $49.95.

Polyboost is published by POLYTRON, 1815 Northwest 169th Place,
Suite 2110, Beaverton, OR  97006 (503) 645-1150 and lists for
$79.95.  The package includes screen and keyboard speedup in
addition to the caching software.

Quickcache is published by Microsystems Developers, Inc., 214-1/2
West Main Street, St. Charles, IL  60174; it lists for $49.95.

Speedcache is published by FSS Ltd, 2275 Bascom Ave., Suite 304,
Campbell, CA  95008, (408) 371-6242 and lists for $69.95.

Super PC-Kwik is published by Multisoft Corp., 18220 SW Monte
Verdi, Beaverton, OR  97007, (503) 642-7108 and lists for $79.95.
Also available is a conventional memory cache called Personal
PC-Kwik for $39.95 and a cache without all the options and
"advanced support" called Standard PC-Kwik for $49.95.

Vcache, which includes the Vdiskette and Vscreen programs, is
published by Golden Bow Systems, P.O. Box 3039, San Diego, CA
92103, (619) 298-9349 and lists for $49.95.