Third Cache-off: Meeting Minutes

IRCache Cache-Offs

IRCache hosted an organizational meeting for interested parties in Boulder, Colorado on April 3, 2000.

This document is intended to represent the discussion that took place. It does not represent any final decisions. Almost everything written here is still open for further discussion, and may be changed at a future date. If you want to add to or clarify something in this document (even if you didn't attend the meeting), please let us know.

Table of Contents

1. Executive Summary
    1.1 Resolved issues
    1.2 Pending issues
2. Attendees
3. Schedule
4. Inter-cache-off tests
5. Changes and Additions to Cache-off Rules
    5.1 Minimum hit ratio
    5.2 Limits on number of entries per vendor/software
    5.3 Cache size calculations
    5.4 Testing time
    5.5 Failed Entries
    5.6 Bailout loopholes
6. Cache-off #3 test suite
7. PolyMix-3 Workload Features
8. PolyMix-3 Workload Parameters
    8.1 Life-cycle model settings
9. Performance Metrics
10. Miscellaneous

1. Executive Summary

This section summarizes meeting results. The rest of this document gives the details.

1.1 Resolved issues

The following decisions were made at the meeting mostly as a result of a compromise or agreement among participants.

  1. Cache-off #3 schedule was proposed.
  2. Inter-cache-off test schedule was proposed.
  3. Polyteam intends to have no artificial cache-off participation limits.
  4. Polyteam guarantees at least 55 hours of testing time.
  5. Presentation format for failed entries was set.
  6. Failed entries will be given an option to be re-tested within two weeks after the cache-off.
  7. Cache-off test suite was set. Downtime test is likely to get a new component to test fault-tolerant clusters of caches.
  8. PolyMix-3 workload features were prioritized.
  9. Some Object Life Cycle model settings were set.
  10. Polyteam will try to use relative (to no-proxy) metrics for response time and similar measurements in the official report. Response time percentiles should be reported. Bandwidth savings per cost metric will be added, provided BHR model is improved. Rack space will be reported.
  11. The ``miscellaneous'' section lists a few other decisions.

While these decisions are still open for discussion, one would have to convince Polyteam and the majority of vendors that an alternative approach should be used for a given issue.

1.2 Pending issues

Below are some of the issues that remain open either because they were not discussed at the meeting or because there was no consensus among the participants.

  1. Minimum hit ratio was not raised from 25% to 50%.
  2. Maximum response time requirement was not discussed.
  3. Cache size calculations for the fill phase remain unclear.
  4. What distributions for object modification times should we use?
  5. Should Polyteam continue to take photos at the cache-offs?

Eventually, Polyteam will have to decide on these issues based on the meeting discussions and after-meeting feedback.

2. Attendees

The following people were present:
  • Duane Wessels, IRCache
  • Alex Rousskov, IRCache
  • Becky Larsen, Novell
  • Amnon Horowitz, Microsoft
  • Kenichi Chinen, NAIST
  • Ray Schlott, Eolian
  • Carlos Maltzahn, NetApp
  • Solom Heddaya, InfoLibria
  • Henry Guillen, Compaq
  • Pei Cao, Cisco
  • Chris Riegel, Stratacache
  • Tom Morris, Alpha Processor Inc.

Polyteam presented these slides to the group.

3. Schedule

One of the first controversial points was the schedule and timeline for cache-off #3. At the previous organizational meeting, there was rough consensus that cache-offs should occur every six months. Thus, IRCache had tentatively planned for July 17 as the first day of cache-off #3.

A number of attendees felt that six months is too frequent. Some said that the six-month schedule is overly demanding on human resources. Others said that the popular media does not care about cache-offs unless something significant occurs (such as a drastic new workload?). A number of people proposed nine months between cache-offs.

A couple of attendees felt that six months is good. They envision cache-offs eventually becoming obsolete, to be replaced with a SPEC/TPC type system. Frequent cache-offs accelerate this process and evolve the rules and workloads faster.

There was general concern about Polyteam's ability to freeze code and workload with enough lead time for participants to become comfortable with the tool. A number of people were critical of Polyteam's previous behavior in this regard.

After some time, a proposal was made for a cache-off date of Sept 18th, which is eight months after cache-off #2. The resulting schedule is sort of ``adjusted'' to account for Labor Day weekend, which is the first weekend in September:

Each entry gives the date and time, the number of weeks before the cache-off (where applicable), and the milestone(s):

April 3, 9:00 MDT (24 weeks before)
    Organizational meeting in Boulder.

May 15, 15:00 MDT (18 weeks before)
    Feature freeze and workload beta. By this date, Polyteam knows and has announced all of the features that will be present in the PolyMix-3 workload. Polyteam has begun to implement these features in Polygraph. A draft (beta) specification of the PolyMix-3 workload will be distributed.

June 19, 15:00 MDT (13 weeks before)
    Polygraph Beta Release. By this date, Polygraph is able to successfully complete a PolyMix-3 run with a proxy cache. Any known bugs should be fixed by this time. Polyteam will distribute documentation that is sufficient to run a PolyMix-3 test. This is the participant's cue to take the software and begin running tests in their own labs. During the next four weeks, participants must provide feedback on the code and documentation.

July 10, 17:00 MDT (10 weeks before)
    Soft Registration. Participants must provide (a) an auto-generated report from a successful PolyMix-3 run or a detailed description explaining why no report can be generated, and (b) a non-refundable deposit of US$2,000. The purpose of the report is to prove to Polyteam and the participant that Polygraph can complete the PolyMix-3 run without bugs and/or errors. If a participant has an issue with the PolyMix-3 workload, the Polygraph code, or the documentation, it must be brought to Polyteam's attention by this date, or the complaints may not be considered. The contents of auto-generated reports are kept absolutely confidential by Polyteam.

July 24, 17:00 MDT (8 weeks before)
    Everything frozen. By this date, all problems brought to light during the ``beta test'' phase must be fixed by Polyteam. After this date, there are to be no changes to the source code and workload parameters. Documentation frozen too?

Aug 18, 17:00 MDT (4 weeks before)
    Firm Registration. Participants must pay the full entry fee to Polyteam by this date. Polyteam needs four weeks lead time to rent computer and networking equipment.

Sept 18, 08:00 MDT (0 weeks before)
    First day of Cache-off #3. When participants come to the cache-off they must provide an auto-generated Polygraph report. This report must be generated between July 25th and September 17th. The report must prove they have completed a successful PolyMix-3 run. The first official cache-off test for a participant must be at the same peak request rate as in this report, or a lower one, to ensure that there will be at least one successful run at the cache-off.

Oct 4, 17:00 MDT
    Cache-off report draft released to the participants.

Oct 10, 11:00 MDT
    ``Vendor comments'' sections are due. Vendors may submit official comments to be included into the report. Vendors may also submit private comments, of course. Private comments will be taken into consideration by Polyteam in order to improve the quality of the report.

Oct 11, 11:00 MDT
    Cache-off report released to the public.

Dec 12, 00:00 MST
    New official PolyMix-3 results can be published for cache-off participants.

Jan 12, 00:00 MST
    New official PolyMix-3 results can be published for any vendor.

4. Inter-cache-off tests

We had a good discussion about the importance and consequences of inter-cache-off official tests. There was general consensus on the following points:

Some attendees feel that we should not place any restrictions on inter-cache-off test results. However, most feel that unrestricted results would make it easy for a company to skip the cache-off and show better results than the competition.

Some attendees felt that we should never publish an official result after a cache-off. In other words, the cache-off becomes the only forum for producing an official test result (for now).

The point was made more than once that if Polyteam does not perform inter-cache-off tests, then some other third party will instead. Many felt that such third party tests are harmful to the overall process of having comparable official results and official workloads. Besides, it may leave Polygraph development underfunded.

The following was proposed as a compromise between the two extremes. In this timeline, all dates refer to publication dates. Tests would always take place 2-4 weeks prior to result publication. All dates are approximate, rounded to whole-month increments. This example assumes an eight month span between cache-offs.

Months since cache-off
result publication        What
    -1                    Cache-off
     0                    Polyteam releases the official report.
     2                    New results can be published for cache-off participants.
     3                    New results can be published for anyone.
     5                    No new results will be published.
     7                    Next Cache-off testing occurs.
     8                    Next Cache-off results published.

Using this scheme, for an eight month cache-off period, we have three months between cache-offs to publish new results. For two months after, and three months before cache-off result publication, no new results can be published.

Using a seven month cache-off period, the new-result publication window shrinks to just two months. For a six month period, we would need to revise this proposal.
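The window arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name and its parameters are hypothetical, and it assumes the quiet periods stay fixed at two months after and three months before result publication, as in the eight month example.

```python
def publication_window(period_months, quiet_after=2, quiet_before=3):
    """Return (opens, closes, length) in months since result publication.

    Assumption (not an official rule): the quiet periods do not scale
    with the cache-off period.
    """
    opens = quiet_after                     # first month new results may appear
    closes = period_months - quiet_before   # month when publication stops
    return opens, closes, max(0, closes - opens)


for period in (8, 7, 6):
    opens, closes, length = publication_window(period)
    print(f"{period}-month period: window from month {opens} to {closes}, "
          f"{length} month(s) long")
```

Under these assumptions a six month period would leave a single month for new results, which is consistent with the remark that the proposal would need revising for that case.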

5. Changes and Additions to Cache-off Rules

5.1 Minimum hit ratio

Polyteam received a suggestion via email that the minimum hit ratio requirement should be increased from 25% to 50%. The point was made that the majority of the cache-off #2 results had quite low hit ratios (less than 40%?), and thus the event doesn't look much like a caching cache-off. The concern is that products can boast a higher throughput at a lower hit ratio, and many customers reading cache-off results only look at throughput.

Some participants feel that the throughput/hit-ratio tradeoff is entirely a design decision and the testing rules should place as few restrictions as possible on hit ratios. An e-mail exchange between Solom and Alex provides more information.

There was a proposal to display results separately for transparent and non-transparent configurations. This could imply to the reader that a given product works only in a non-transparent configuration, and is not compatible with L4 switches, or vice-versa.

There was a proposal to display results separately for products that achieved hit ratios higher than 50% and lower than 50%. Some felt this was extremely arbitrary.

5.2 Limits on number of entries per vendor/software

We again raised the issue that many cache-off participants may all use the same software. Some are still frustrated by this, but there was no good proposal for a fair way to limit the number of entries. Polyteam expects there will be no limits for cache-off #3.

5.3 Cache size calculations

Caching products can be configured to use a fraction (e.g. 50%) of available disk space. Many products exhibit higher throughput when not using the entire disk capacity.

At cache-off #2, some vendors were surprised to find that Polyteam interpreted the rule differently than they had. Polyteam used the full physical disk capacity as the parameter for the cache fill run. Vendors expected to use the configured cache size.

The issue was raised here, and a poll indicated a 60/40 split in favor of using physical disk capacity.

5.4 Testing time

At cache-off #2, some participants felt cheated out of testing time because Polyteam was not ready when they were, and/or Polyteam incorrectly filled some entries due to configuration errors.

Polyteam is willing to guarantee cache-off participants 55 hours of testing time, which also includes 15 minutes to prepare a run. If the participant shows up late, or is not ready for some reason, time is subtracted from their 55 hours until they are ready for testing. Practice shows that many vendors will have more than 55 hours of test time.

5.5 Failed Entries

There was consensus to make failed entries more obvious in the report, especially in the executive summary. Failed entries will be listed in the executive summary table with measurements replaced with "n/a".

5.6 Bailout loopholes

Because of events at cache-off #2, there is a concern that participants will scour documentation and other materials for loopholes that allow them to bail out by blaming Polyteam. The consensus was that an entry failure always looks bad, no matter what, and that Polyteam should deal with cases as they arise and should not make any specific rules at this time.

6. Cache-off #3 test suite

The test suite for the third cache-off remains the same:

  1. MSL (time wait) test
  2. PolyMix-3, which now combines cache-fill and performance test.
  3. Downtime test.

Some felt that the downtime test did not show off features of fault tolerant configurations (e.g., a multi-head cluster behind an L4 switch). Polyteam may add a test whereby one system from a cluster configuration is taken down while the other(s) remain(s) running.

7. PolyMix-3 Workload Features

Given the following 13 Polygraph/workload additions/changes, vendors came up with this prioritization:

Polyteam will work on Polygraph improvements according to the assigned priorities, subject to other projects and deadlines.

8. PolyMix-3 Workload Parameters

8.1 Life-cycle model settings

PolyMix-2 uses simple, static life cycle settings. Every response was last modified a year in the past, and expires a year in the future. We agreed to use something more sophisticated for PolyMix-3.

Every response will still include an Expires header, but there will be a distribution of timestamp values. The same applies to Last-Modified headers. Every response will still include a Date header. All timestamps will be perfectly accurate: If Polygraph says the expiration time is T, then an IMS request before time T always returns NOT MODIFIED, and an IMS request at or after time T returns new content. The distributions are yet to be proposed.

20% of Polygraph client requests will include an If-Modified-Since header. Since the Polygraph server is perfectly accurate, a proxy could be configured to return cache hits for all IMS requests. A cache that does not take that approach is going to be penalized with a higher response time.
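The ``perfectly accurate'' behavior described above can be illustrated with a short Python sketch. It is not Polygraph code: the function names are hypothetical, and it only models the stated rule that content changes exactly at the advertised expiration time T.

```python
from datetime import datetime, timedelta

NOT_MODIFIED = 304  # HTTP status for an unchanged object
OK = 200            # HTTP status for a response carrying new content


def origin_reply(request_time: datetime, expires_at: datetime) -> int:
    """Status a perfectly accurate origin returns for an IMS request."""
    return NOT_MODIFIED if request_time < expires_at else OK


def cache_may_answer_ims(request_time: datetime, expires_at: datetime) -> bool:
    """True when a cache can answer an IMS request without contacting
    the origin, because the origin would say NOT MODIFIED anyway."""
    return request_time < expires_at


expires = datetime(2000, 9, 18, 12, 0, 0)
just_before = expires - timedelta(seconds=1)
print(origin_reply(just_before, expires), cache_may_answer_ims(just_before, expires))
print(origin_reply(expires, expires), cache_may_answer_ims(expires, expires))
```

Because the two functions agree for every request time, a cache that trusts the Expires header never serves stale content under this model, while a cache that always revalidates pays an extra round trip to the origin.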

Polyteam feels strongly that a realistic OLC model must include some uncertainty, and therefore the possibility of stale content. Caches should choose the tradeoff: better response time or fewer stale responses. This idea was not well received. Some attendees say that realistic OLC parameters are not known and are difficult to determine. Published results are old and now invalid.

Some say that the Web is becoming very black and white: Some Web objects are always dynamic and uncachable (e.g., HTML pages of a news Web site) while other objects are always static, cachable, and are never modified (e.g., images with unique persistent URLs). Content providers never reuse filenames, but always create new URIs instead. Polyteam feels real traffic is still far from this description.

9. Performance Metrics

Some vendors want to see additional performance metrics developed and used in the cache-off official report. There is a concern that throughput (or throughput per dollar) is given too much prominence and nobody considers other parameters.

A number of cache-off #2 entries had higher response time than a no-proxy test, but this might be overlooked by some readers because of the way the numbers are presented.

It might be good to have a "bandwidth savings per cost" metric.

It might be good to have a "latency reduction per cost" metric. This would probably have to be "total time saved per $1000" rather than "seconds per request saved per $1000" because one cannot spend more money to reduce the response time past the theoretical minimum.

Can we publish a response time delta from the no-proxy case? Perhaps we can use each entry's no-proxy response time, since we run this test already, and it accounts for whatever switch the participant brings.
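As a rough illustration of these two ideas, the Python sketch below computes a response time delta against the no-proxy run and a "total time saved per $1000" figure. The formulas, function names, and sample numbers are assumptions made for this example only; no metric definitions were agreed on at the meeting.

```python
def response_time_delta(proxy_sec, no_proxy_sec):
    """Mean response time with the proxy minus the no-proxy baseline.

    A negative value means the proxy made responses faster.
    """
    return proxy_sec - no_proxy_sec


def time_saved_per_1000_dollars(no_proxy_sec, proxy_sec, requests, price_usd):
    """Total seconds saved over a run, per $1000 of product price.

    Hypothetical definition: per-request saving times the number of
    requests in the run, divided by the price in thousands of dollars.
    """
    total_saved = (no_proxy_sec - proxy_sec) * requests
    return total_saved / (price_usd / 1000)


# Made-up numbers: 3.0 s without a proxy, 2.0 s with it,
# one million requests in the run, and a $50,000 product.
print(response_time_delta(2.0, 3.0))
print(time_saved_per_1000_dollars(3.0, 2.0, 1_000_000, 50_000))
```

Scaling by the number of requests is what distinguishes "total time saved" from a per-request figure: a faster product that sustains more requests keeps gaining on this metric even when the per-request saving is already near its floor.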

There was a request to bring back response time histograms (CDFs?) with the 25th, 50th, 75th, and 95th percentiles.
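For reference, the requested percentiles could be computed from raw response times as in the Python sketch below. It uses the nearest-rank method, which is only one of several common percentile definitions; the method that the report would use was not specified, and the sample values are made up.

```python
def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it.

    `p` is an integer between 1 and 100.
    """
    ordered = sorted(samples)
    # Integer ceiling of p * n / 100, which avoids floating-point rounding.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Hypothetical response times in milliseconds.
response_times_ms = [120, 80, 200, 150, 95, 300, 110, 130]
for p in (25, 50, 75, 95):
    print(f"{p}th percentile: {percentile(response_times_ms, p)} ms")
```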

The report should also list rack space requirements.

Many vendors felt that no photos should be taken at the cache-off. For example, some products were prototypes and shipping versions are in much smaller enclosures. Polyteam's primary intention was to document the products ``as tested'' rather than the products ``as sold''.

10. Miscellaneous

Please send feedback to Polyteam.



$Id: minutes.sml,v 1.11 2001/05/21 16:39:52 wessels Exp $