Second Cache-off: Official Tests

IRCache Cache-Offs

1. Overview

Our major objective in selecting the test suite is to give participants an opportunity to show off their products in two distinct environments:

Each ``environment'' corresponds to a single performance test. A vendor can run both performance tests or just one. The results of the two performance tests are not meant to be compared directly due to significant differences in the workload parameters.

Each performance test is supplemented by three other experiments: the MSL test, the filling-the-cache test, and the downtime test. All four kinds of tests are discussed below.

2. MSL test

Polyteam will use the msl_test tool that comes with Polygraph to verify that a product complies with the minimum MSL requirement: the MSL value must be at least 30 seconds.

The MSL test must be run prior to all other tests.

3. Filling-the-cache

We have simplified filling-the-cache rules to reduce the number of ``cheats'' and arguments. The filling rules for the second cache-off are:

  1. A participant clears (flushes, empties) the cache using whatever means the Box provides.

  2. A participant declares the total size of the disk cache, called S. This size is not affected by the high watermarks that some caches may be configured with. S is simply the total size of all disk blocks where a Web object may reside.

  3. Polyteam runs the filling-the-cache test configured to generate a fill stream (i.e., cachable misses) totaling 2S.

  4. When the test is over, the cache must be full, subject to high watermarks and such.

We assume that a Box caches most of the cachable traffic. Pumping more data than needed to fill the cache (a) ensures that the cache is indeed full and (b) leaves the cache in a state where the replacement policy has been exercised and the placement of objects on disk is not artificially ideal.
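The arithmetic behind rule 3 can be sketched as follows. This is only an illustration, not part of the official tools: the function name and the mean object size parameter are hypothetical, and the real fill workload parameters are set by Polyteam.

```python
def fill_plan(cache_size_bytes: int, mean_object_bytes: int) -> dict:
    """Estimate the fill stream for a declared disk cache size S.

    cache_size_bytes  -- S, as declared by the participant
    mean_object_bytes -- assumed mean size of a cachable object
                         (hypothetical; not specified by the rules)
    """
    if cache_size_bytes <= 0 or mean_object_bytes <= 0:
        raise ValueError("sizes must be positive")

    # Rule 3: the fill stream of cachable misses totals 2S.
    fill_bytes = 2 * cache_size_bytes

    # Rough number of cachable misses needed, rounded up.
    cachable_misses = -(-fill_bytes // mean_object_bytes)

    return {"fill_bytes": fill_bytes, "cachable_misses": cachable_misses}


if __name__ == "__main__":
    # Example: a Box declaring S = 10 GB, assuming 11 KB objects.
    plan = fill_plan(10 * 2**30, 11 * 1024)
    print(plan["fill_bytes"], plan["cachable_misses"])
```

Because there is no special time limit for the fill, a participant could use such an estimate only to budget the time the fill will take at a given request rate.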

Parameters for the filling-the-cache workload will be announced later, but the workload will resemble the filling-the-cache workload used at the first cache-off. Participants will not be allowed to adjust the parameters.

Note that there is no special time limit for this test. A participant is only constrained by the time allocated to finish all the tests.

4. Performance Test

This test uses the PolyMix-2 workload. A performance test must be preceded by a filling-the-cache experiment.

Performance tests must report a document hit ratio of at least 25% (averaged over a phase) and a transaction error rate of at most 3% (within a phase) for all reported test phases.

5. Downtime test

This test is intended to measure the downtime a user would experience if a Box unexpectedly reboots. The test also helps to estimate how long the outgoing network links will have to handle increased (no hits) traffic. Note that a Box includes all network gear in the participant's setup. The test plan follows.

  1. The Box is subjected to the best-effort workload for 5 minutes.
  2. All power to the Box components is turned off and then turned back on after about 5 seconds, simulating a typical short power outage.
  3. When the power goes off, the workload changes so that Polygraph emits about 2 requests every second.
  4. The test terminates after detecting the first successful miss and the first successful hit.

Time-to-first-miss and time-to-first-hit (counting from the power outage) are measured and may be reported with appropriate rounding.
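The two measurements can be derived from a timestamped transaction log as in the following sketch. The log format, the outcome labels, and the function name are assumptions made for illustration; they do not describe Polygraph's actual output.

```python
from typing import Iterable, Optional, Tuple


def downtime_metrics(
    power_off_at: float,
    events: Iterable[Tuple[float, str]],
) -> Tuple[Optional[float], Optional[float]]:
    """Return (time_to_first_miss, time_to_first_hit) in seconds.

    power_off_at -- timestamp of the power outage
    events       -- (timestamp, outcome) pairs, where outcome is one of
                    "miss", "hit", or "error" (hypothetical labels)
    A value of None means that outcome was never observed after the outage.
    """
    first_miss = None
    first_hit = None
    for ts, outcome in sorted(events):
        if ts < power_off_at:
            continue  # ignore the best-effort phase before the outage
        if outcome == "miss" and first_miss is None:
            first_miss = ts - power_off_at
        elif outcome == "hit" and first_hit is None:
            first_hit = ts - power_off_at
        if first_miss is not None and first_hit is not None:
            break  # the test ends once both have been seen
    return first_miss, first_hit


if __name__ == "__main__":
    log = [(295.0, "hit"), (300.5, "error"), (342.0, "miss"),
           (342.5, "miss"), (350.0, "hit")]
    print(downtime_metrics(300.0, log))
```

With about 2 requests per second after the outage, the resolution of either measurement is roughly half a second, which is one reason the results are reported with rounding.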

UPS devices of any kind will not be allowed during this test because our intention is to measure the downtime when a proxy gets rebooted (for whatever reason). Flipping the power is just the simplest and most natural way to simulate the right conditions. A customer may use the results of the test to decide whether a UPS or extra power supplies are required for her particular setup.

The downtime test must be run against a full cache, after completing performance tests.



$Id: tests.sml,v 1.2 1999/08/12 18:06:09 rousskov Exp rousskov $