| Third Cache-off: Meeting Minutes |
|---|
| IRCache Cache-Offs |
IRCache hosted an organizational meeting for interested parties in Boulder, Colorado on April 3, 2000.
This document is intended to represent the discussion that took place. It does not represent any final decisions. Almost everything written here is still open for further discussion, and may be changed at a future date. If you want to add to, or clarify something in this document (even if you didn't attend the meeting), please let us know.
1. Executive Summary
1.1 Resolved issues
1.2 Pending issues
2. Attendees
3. Schedule
4. Inter-cache-off tests
5. Changes and Additions to Cache-off Rules
5.1 Minimum hit ratio
5.2 Limits on number of entries per vendor/software
5.3 Cache size calculations
5.4 Testing time
5.5 Failed Entries
5.6 Bailout loopholes
6. Cache-off #3 test suite
7. PolyMix-3 Workload Features
8. PolyMix-3 Workload Parameters
8.1 Life-cycle model settings
9. Performance Metrics
10. Miscellaneous
1. Executive Summary
This section summarizes meeting results. The rest of this document gives the details.
1.1 Resolved issues
The following decisions were made at the meeting, mostly as a result of compromise or agreement among participants.
These decisions are still open for discussion, but anyone proposing an alternative approach to a given issue would have to convince Polyteam and the majority of vendors.
1.2 Pending issues
Below are some of the issues that remain open, either because they were not discussed at the meeting or because there was no consensus among the participants.
Eventually, Polyteam will have to decide on these issues based on the meeting discussions and after-meeting feedback.
2. Attendees
The following people were present:
Polyteam presented these slides to the group.
3. Schedule
One of the first controversial points was the schedule and timeline for cache-off #3. At the previous organizational meeting, there was rough consensus that cache-offs should occur every six months. Thus, IRCache had tentatively planned for July 17 as the first day of cache-off #3.
A number of attendees felt that six months is too frequent. Some said that a six-month schedule is overly demanding on human resources. Some said that the popular media does not care about cache-offs unless something significant occurs (such as a drastic new workload?). A number of people proposed nine months between cache-offs.
A couple of attendees felt that six months is good. They envision cache-offs eventually becoming obsolete, to be replaced with a SPEC/TPC type system. Frequent cache-offs accelerate this process and evolve the rules and workloads faster.
There was general concern about Polyteam's ability to freeze code and workload with enough lead time for participants to become comfortable with the tool. A number of people were critical of Polyteam's previous behavior in this regard.
After some time, a proposal was made for a cache-off date of September 18, which is eight months after cache-off #2. The resulting schedule is "adjusted" to account for Labor Day weekend, which is the first weekend in September:
4. Inter-cache-off tests
We had a good discussion about the importance and consequences of inter-cache-off official tests. There was general consensus on the following points:
Some attendees feel that we should not place any restrictions on inter-cache-off test results. However, most feel that unrestricted inter-cache-off testing makes it easy for a company to skip the cache-off and then show better results than the competition.
Some attendees felt that we should never publish an official result after a cache-off. In other words, the cache-off becomes the only forum for producing an official test result (for now).
The point was made more than once that if Polyteam does not perform inter-cache-off tests, then some other third party will instead. Many felt that such third party tests are harmful to the overall process of having comparable official results and official workloads. Besides, it may leave Polygraph development underfunded.
The following was proposed as a compromise between the two extremes. In this timeline, all dates refer to publication dates. Tests would always take place 2-4 weeks prior to result publication. All dates are approximate, rounded to whole-month increments. This example assumes an eight-month span between cache-offs.
Using this scheme, for an eight-month cache-off period, we have three months between cache-offs in which new results can be published. For two months after, and three months before, cache-off result publication, no new results can be published.
Using a seven-month cache-off period, the new-result publication window shrinks to just two months. For a six-month period, we would need to revise this proposal.
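The window arithmetic above can be checked with a small sketch. The function name and parameter names are hypothetical; the two-month and three-month quiet periods are the ones from the proposal.

```python
def publication_window(period_months, quiet_after=2, quiet_before=3):
    """Months between two cache-off publications in which new
    inter-cache-off results may be published.

    quiet_after:  months after a cache-off publication with no new results
    quiet_before: months before the next cache-off publication with no new results
    """
    return period_months - quiet_after - quiet_before


# Compare the cache-off periods discussed at the meeting.
for period in (8, 7, 6):
    print(period, "month period ->", publication_window(period), "month window")
```

Under the same rule, a six-month period would leave a single month for new results.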
5. Changes and Additions to Cache-off Rules
5.1 Minimum hit ratio
Polyteam received a suggestion via email that the minimum hit ratio requirement should be increased from 25% to 50%. The point was made that of the 2nd cache-off results, the majority had quite low hit ratios (less than 40%?) and thus it doesn't look much like a caching cache-off. The concern is that products can boast a higher throughput at a lower hit ratio, and many customers reading cache-off results only look at throughput.
Some participants feel that the throughput/hit-ratio tradeoff is entirely a design decision and the testing rules should place as few restrictions as possible on hit ratios. An e-mail exchange between Solom and Alex provides more information.
There was a proposal to display results separately for transparent and non-transparent configurations. This could imply to the reader that a given product works only in a non-transparent configuration, and is not compatible with L4 switches, or vice-versa.
There was a proposal to display results separately for products
that achieved hit ratios above 50% and below 50%.
Some felt this was extremely arbitrary.
5.2 Limits on number of entries per vendor/software
We again raised the issue that many cache-off participants may all use the
same software. Some are still frustrated by this, but there was
no good proposal for a fair way to limit the number of entries.
Polyteam expects there will be no limits for cache-off #3.
5.3 Cache size calculations
Caching products can be configured to use a fraction (e.g. 50%) of available disk space. Many products exhibit higher throughput when not using the entire disk capacity.
At cache-off #2, some vendors were surprised to find that Polyteam interpreted the rule differently than they had. Polyteam used the full physical disk capacity as the parameter for the cache fill run. Vendors expected to use the configured cache size.
The issue was raised here, and a poll indicated a 60/40 split in favor
of using physical disk capacity.
5.4 Testing time
At cache-off #2, some participants felt cheated out of testing time because Polyteam was not ready when they were, and/or Polyteam incorrectly filled some entries due to configuration errors.
Polyteam is willing to guarantee cache-off participants 55 hours of testing
time, which also includes 15 minutes to prepare a run. If the
participant shows up late, or is not ready for some reason, time is subtracted
from their 55 hours until they are ready for testing. Practice shows that many
vendors will have more than 55 hours of test time.
5.5 Failed Entries
There was consensus to make failed entries more obvious in the report, and
especially the executive summary. Failed entries will be listed in the
executive summary table with measurements replaced with "n/a".
5.6 Bailout loopholes
Because of events at cache-off #2, there is a concern that participants will
scour documentation and other materials for loopholes that allow them to bail
out by blaming Polyteam. The consensus was that an entry failure always looks
bad, no matter what, and that Polyteam should deal with cases as they arise
and should not make any specific rules at this time.
6. Cache-off #3 test suite
The test suite for the third cache-off remains the same:
Some felt that the downtime test did not show off features of fault-tolerant configurations (e.g., a multi-head cluster behind an L4 switch). Polyteam may add a test whereby one system from a cluster configuration is taken down while the other(s) remain(s) running.
7. PolyMix-3 Workload Features
Given the following 13 Polygraph/workload additions/changes, vendors came up with this prioritization:
Top priority items:
Medium priority items:
Low priority items:
Polyteam will work on Polygraph improvements according to the assigned priorities, subject to other projects and deadlines.
8. PolyMix-3 Workload Parameters
8.1 Life-cycle model settings
PolyMix-2 uses simple, static life-cycle settings. Every response was last modified a year in the past, and expires a year in the future. We agreed to use something more sophisticated for PolyMix-3.
Every response will still include an Expires header, but there will be a distribution of timestamp values. The same applies to Last-Modified headers. Every response will still include a Date header. All timestamps will be perfectly accurate: if Polygraph says the expiration time is T, then an If-Modified-Since (IMS) request before time T always returns 304 Not Modified, and a request at or after time T returns new content. The distributions are yet to be proposed.
20% of Polygraph client requests will include an If-Modified-Since header. Since the Polygraph server is perfectly accurate, a proxy could be configured to return cache hits for all IMS requests. A cache that does not take that approach is going to be penalized with a higher response time.
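The "perfectly accurate" behavior described above can be sketched as follows. This is only an illustration of the rule as stated in these minutes, not Polygraph code; the class and function names are hypothetical, and times are plain numbers in arbitrary units.

```python
from dataclasses import dataclass


@dataclass
class SimObject:
    """Hypothetical stand-in for one simulated origin-server object."""
    last_modified: float  # when the current version was created
    expires: float        # time T at which the content changes


def ims_status(obj, now):
    """HTTP status an If-Modified-Since request would receive at time `now`.

    Before the advertised expiration time the object is never modified
    (304); at and after that time the server returns new content (200).
    """
    return 304 if now < obj.expires else 200


def proxy_may_answer_from_cache(obj, now):
    """Because the timestamps are exact, a proxy holding an unexpired copy
    can answer without asking the server (an assumed proxy policy)."""
    return now < obj.expires


obj = SimObject(last_modified=0.0, expires=100.0)
print(ims_status(obj, 99.0), ims_status(obj, 100.0))
```

A proxy that revalidates anyway gets the same answer from the server but pays a round trip for it, which is the response time penalty mentioned above.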
Polyteam feels strongly that a realistic object life cycle (OLC) model must include some uncertainty, and therefore the possibility of stale content. Caches should choose the tradeoff: better response time or fewer stale responses. This idea was not well received. Some attendees said that realistic OLC parameters are not known and are difficult to determine, and that published results are old and no longer valid.
Some said that the Web is becoming very black and white: some Web objects are always dynamic and uncachable (e.g., HTML pages of a news Web site) while other objects are always static, cachable, and never modified (e.g., images with unique persistent URLs). In this view, content providers never reuse filenames, but always create new URIs instead. Polyteam feels real traffic is still far from this description.
9. Performance Metrics
Some vendors want to see additional performance metrics developed and used in the cache-off official report. There is a concern that throughput (or throughput per dollar) is given too much prominence and nobody considers other parameters.
A number of cache-off #2 entries had higher response time than a no-proxy test, but this might be overlooked by some readers because of the way the numbers are presented.
It might be good to have a "bandwidth savings per cost" metric.
It might be good to have a "latency reduction per cost" metric. This would probably have to be "total time saved per $1000" rather than "seconds per request saved per $1000" because one cannot spend more money to reduce the response time past the theoretical minimum.
Can we publish a response time delta from the no-proxy case? Perhaps we can use each entry's no-proxy response time, since we run this test already, and it accounts for whatever switch the participant brings.
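As a rough sketch of how the response time delta and a "total time saved per $1000" figure might be computed: the formulas below are one possible reading of the suggestions above, not agreed definitions, and the function names and sample numbers are purely illustrative.

```python
def response_time_delta_ms(no_proxy_ms, proxy_ms):
    """Mean response time without the proxy minus mean response time with it.
    Positive means the proxy helped; negative means it was slower than no proxy."""
    return no_proxy_ms - proxy_ms


def time_saved_per_1000_dollars(no_proxy_ms, proxy_ms, requests, price_dollars):
    """Total seconds saved over `requests` requests, per $1000 of product price.
    Assumes (hypothetically) that the saving per request is the mean delta."""
    saved_seconds = response_time_delta_ms(no_proxy_ms, proxy_ms) / 1000.0 * requests
    return saved_seconds / (price_dollars / 1000.0)


# Illustrative numbers only, not measurements from any cache-off.
print(response_time_delta_ms(3000, 2500))
print(time_saved_per_1000_dollars(3000, 2500, 1_000_000, 50_000))
```

An entry with a higher response time than its no-proxy run would show a negative delta, which makes that case hard to overlook.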
There was a request to bring back response time histograms (CDFs?) with the 25th, 50th, 75th, and 95th percentiles.
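For illustration, the requested percentiles could be derived from a list of measured response times with the nearest-rank method, as sketched below. The choice of method and the sample values are assumptions; the minutes do not say how the percentiles would be computed.

```python
def percentile(values, p):
    """Nearest-rank percentile: the smallest value such that at least
    p percent of the samples are less than or equal to it.
    `p` is an integer between 1 and 100."""
    ordered = sorted(values)
    # Integer ceiling of p * n / 100, avoiding floating-point rounding.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Illustrative response times in milliseconds, not real measurements.
response_times_ms = [120, 2900, 150, 3100, 140, 2800, 160, 3300, 130, 3000]
for p in (25, 50, 75, 95):
    print(p, percentile(response_times_ms, p))
```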
Report rack space requirements.
Many vendors felt that no photos should be taken at the cache-off. For
example, some products were prototypes and shipping versions are in much
smaller enclosures. Polyteam's primary intention was to document the products
"as tested" rather than the products "as sold".
10. Miscellaneous
Please send feedback to Polyteam.
$Id: minutes.sml,v 1.11 2001/05/21 16:39:52 wessels Exp $