QoS in Web Caching
Manuel Afonso
Alexandre Santos
Vasco Freitas
March 31, 1998
Campus Gualtar
4710 Braga - Portugal
Abstract
Nowadays, hierarchical organisation of caches is commonly used. The main objective is to overcome the cumbersome communication paths among servers and so improve the response times experienced by end users. Efforts are mostly focused on the still exponentially growing Web. There is an interest in analysing different caching architectures while keeping the same basic hardware, software and network setups. In order to analyse the influence of architectural changes, caching server QoS specification must be studied.
This paper presents an approach to evaluating, from the users' point of view, Web caching parameters to be used in QoS characterisation.
Several tools have been developed to evaluate QoS in Web caching. These tools have been applied to a case study and the results obtained are analysed.
Keywords: Proxy caching architectures, QoS, Web caching, ICP, HTTP.
Contents
1. Introduction
2. Basics of Internet object caching
2.1 Simple caches
2.2 Co-operating caches
2.2.1 The role of protocols
2.2.2 Known problems
2.2.3 A new proposal: HTCP
3. Architecture testbed
4. Measuring the Web caching QoS
5. Preliminary results
5.1 Proxy-www
5.2 RCCN proxies
6. Conclusions and further work
1. Introduction
The Internet is now a widespread medium of interaction among people everywhere. The World Wide Web, along with its servers and browsers, is without doubt the most popular set of Internet applications. At first glance it seems quite nice, but a more attentive analysis reveals some technological problems. As we all know, the Web is resource intensive, consuming a lot of bandwidth when documents are transferred - documents which can be as small as a few Kbytes or as large as several Mbytes (especially sound, video clip and image files). Of course, one can always think of upgrading circuits for higher bandwidth, buying faster computers, extending memory, disks... Nevertheless, this solution is almost always economically impracticable, so high can costs grow compared with the short/medium term benefits; demand, moreover, is always growing.
The current and most commonly used solution to overcome the lack of bandwidth for such a high number of Web requests is Web caching. This technique uses the knowledge acquired from several analyses of server access logs and from observing Web users' behaviour, both individually and as members of an organisation, to reduce the latency experienced by end users when fetching documents through their Web browsers [1, 2].
The basic concept of caching is the intermediate storage of copies of popular Web documents close to end users. It takes advantage of the temporal locality [3] of accesses - for example, in our University it is very probable that several users will read the morning e-newspaper headlines within a short period of time. Normally, Web documents are requested much more than once.
2. Basics of Internet object caching
Andrew Cormack [4] considers two distinct types of caches: simple caches and co-operating caches. In simple caching, communication is only possible hierarchically, through TCP connections; caches at the same level are not accessible. In co-operative caching, all caches can participate in the process of satisfying user requests.
Simple caches are being abandoned because they waste both bandwidth and disk space. With this type of cache, if an object is not present, a request is issued to the cache one level up in the hierarchy. ICP [5, 6] is not used, so this is no longer a good solution.
Co-operating caches, unlike simple ones, admit richer co-operation, which makes them quite powerful. Nevertheless, there are some unwanted effects that need further analysis.
The next sections are devoted to the analysis of several aspects of the behaviour of these caches.
2.1 Simple caches
Without any caching mechanism, when a browser needs to get an object from a specified host (both present in the URL), it simply makes a direct connection to that host and tries to retrieve the object. Of course, if the object does not exist the end user receives an error message. One advantage of using caching (either simple or co-operative) is that these messages can be avoided. Most recent proxies can use Proxy Automatic Configuration, PAC: the browser is configured to use a script, which can include pre-configured alternatives for the case where a proxy-cache or origin server is not available. Even though this technique can increase response times, the increase is negligible compared with the improved availability of information. PAC is also used for load-balancing purposes in clusters of caches.
With simple caching there is the possibility of making requests to a cache. Each time a user makes a request through the browser, a TCP connection is made to the cache server instead of to the origin server, with the expectation of reducing the time needed to service requests. Within an organisation it is highly probable that more than one user requests the same object, so the cache can intercept requests for the same object and avoid a direct connection to the origin server for each of them, making only one request. This technique can reduce both bandwidth wastage and the latency of serving requests.
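The behaviour just described - serve from the local store if possible, otherwise fetch upstream once and keep a copy - can be illustrated with a minimal sketch (Python; the class and names are ours, for illustration only; a real proxy adds freshness checks, ACLs and eviction):

```python
# Minimal sketch of a simple (hierarchical) cache: serve from the local store,
# otherwise fetch through the upstream (parent cache or origin server) and
# keep a copy so later requests for the same object need no upstream fetch.

class SimpleCache:
    def __init__(self, fetch_upstream):
        self.store = {}                       # URL -> object body
        self.fetch_upstream = fetch_upstream  # parent cache or origin server

    def get(self, url):
        if url in self.store:                 # HIT: no upstream connection
            return self.store[url], "HIT"
        body = self.fetch_upstream(url)       # MISS: a single upstream request
        self.store[url] = body                # cached for the next user
        return body, "MISS"

origin_calls = []
def origin(url):
    origin_calls.append(url)                  # count connections to the origin
    return f"<body of {url}>"

cache = SimpleCache(origin)
cache.get("http://example.org/news")   # first user: MISS, fetched upstream
cache.get("http://example.org/news")   # second user: HIT, served locally
```

Two users requesting the same page thus cost a single upstream fetch, which is exactly where the bandwidth and latency savings come from.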
Another useful aspect is that if the contacted cache doesn't have the requested object, it is able to forward the request either to a parent cache or to the original server.
However, these kinds of operations have some limitations and problems.
First, the hierarchy cannot have more than two or three levels [7], because an object retrieved from an origin server (or from an upper-level cache) through intermediate caches will be stored in every cache used to convey the object to the user. This means that caches at higher levels need a lot of disk space; otherwise objects will be discarded, usually by a Least Recently Used (LRU) policy, before they become stale.
The second disadvantage is the need for a TCP connection (which requires the exchange of at least eight packets) each time we want to retrieve an object. This is quite heavy. A better solution is to use ICP for querying neighbour caches. ICP, however, has its own limitations, most of them caused by the lack of information in ICP headers; only part of the information in HTTP headers is present in ICP headers. In order to solve these problems, the ICP Working Group is developing the Hyper Text Caching Protocol, HTCP [8] - also known as HoT CraP.
HTCP messages are richer than ICP ones. Particularly, there are special headers carrying information about caching.
2.2 Co-operating caches
An institution with several departments may wish to have some caches at the same level (one per department, for example), able to co-operate among themselves to serve requests. This co-operation is possible using the ICP protocol.
There are several possible methods of co-operation. Depending on the way they collaborate, caches are known as siblings or parents. The difference between these two types of co-operation is straightforward: a parent can help serve a request it receives even if it does not have a copy of the object; a sibling can only serve a request if it already has a local copy. The way proxy-caches are organised defines a particular caching architecture.
The HTTP/1.0 protocol [9], used for Web transfers, is quite complex and heavy. ICP, a simpler, lightweight protocol, was designed for querying among caches (and HTCP is under development).
2.2.1 The role of protocols
Any proxy caching server (a cache, for short) able to co-operate with other caches is said to be a peer or neighbour of those caches. Peers admit further classification, into parent/child and sibling relationships.
Communication among peers is, for the time being, accomplished by means of the ICP protocol, as stated before.
As an example, let us consider the relationships among peers pictured in figure 1, where C1 has two siblings (C3 and C4) and one parent (C2). Each time cache C1 receives a request, it can send queries to caches C2, C3 and C4 using the ICP_QUERY message. (It also sends a message to the origin server, as part of the selection process for serving the request.) If one of the peers of C1 has a fresh copy of the requested object, it replies with an ICP_HIT or ICP_HIT_OBJ. If a peer does not have the object, or the object will become stale within the next 30 seconds, it returns an ICP_MISS. Figure 2 shows a detailed diagram of what happens when a cache receives an ICP request (opcode ICP_QUERY). The ICP_DENIED message should only occur if the client cache is not authorised to communicate with the cache receiving the ICP request. In such cases, administrators should contact each other to solve the problem, which normally means changing the configuration file's ACLs. Another possible message is ICP_MISS_NOFETCH, which occurs when a parent cache is not able to forward requests, perhaps due to network connection problems. However, this cache continues to receive ICP queries (to determine when the problems are solved) and is able to serve requests for objects present in the cache.
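The peer's side of this exchange can be sketched as follows (Python; the opcode names and the 30-second staleness window follow the description above, while the data structures and function names are our illustration):

```python
# Sketch of a peer cache answering an ICP_QUERY, following the rules above:
# ICP_HIT for a fresh local copy, ICP_DENIED for unauthorised clients,
# ICP_MISS_NOFETCH while unable to forward requests, ICP_MISS otherwise.
import time

STALE_WINDOW = 30  # seconds: objects becoming stale this soon count as a miss

def answer_icp_query(cache, url, client, now=None):
    now = time.time() if now is None else now
    if client not in cache["acl"]:              # client not in the ACLs
        return "ICP_DENIED"
    expiry = cache["objects"].get(url)          # URL -> expiry timestamp
    if expiry is not None and expiry - now > STALE_WINDOW:
        return "ICP_HIT"
    if not cache["can_forward"]:                # e.g. upstream network problems
        return "ICP_MISS_NOFETCH"
    return "ICP_MISS"

peer = {"acl": {"C1"}, "can_forward": True,
        "objects": {"http://example.org/": 1000.0}}
answer_icp_query(peer, "http://example.org/", "C1", now=900.0)  # fresh: HIT
answer_icp_query(peer, "http://example.org/", "C1", now=980.0)  # stale in 20 s: MISS
answer_icp_query(peer, "http://example.org/", "C9", now=900.0)  # unauthorised client
```

Note how an object present in the cache still yields an ICP_MISS when it is about to go stale, matching the 30-second rule above.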
Let us look now at how ICP replies are processed. Figure 3 shows how ICP replies are processed in order to select one peer cache from which to get a particular object.
According to Figure 3, one of the following situations can happen:
An HTTP request contains a method, a URL and some headers. The available methods and options depend on the version of the HTTP protocol.
The important point here is that some of the fields present in HTTP headers are not in ICP queries. So an ICP reply can indicate that a particular object is present in a cache and fresh (a HIT), and yet, when the HTTP request is made, a response can be issued indicating that the request cannot be fulfilled. The next sub-section discusses some of these problems.
2.2.2 Known problems
HTTP/1.1 is progressively being introduced, but HTTP/1.0 is still the most widely used. So let us stick with this version and analyse some of the options that can affect the behaviour of caching mechanisms. The options presented here are absent from ICP messages but present in HTTP ones.
Content-Encoding: An object can be encoded in a way not accepted by the cache/browser that made the request. A TCP_MISS will then occur even though a cache previously sent an ICP message indicating that it had the object.
Expires: It indicates the time after which the object will be considered stale; the object can be cached for that period of time. This header can cause freshness problems, as there is no means for caches to negotiate freshness parameters: a cache with stricter freshness rules can get stale data from a peer with more relaxed rules. The result is a TCP_MISS although the ICP message pointed to the existence of the requested object. Normally, administrators agree on similar freshness parameters.
If-Modified-Since: Permits asking whether an object has been modified since the indicated date. It is used with the GET method. If the requested object has not been modified, a "not modified" response is issued. Once again, the freshness problem can arise.
Authorization: It is the way a user can authenticate itself with a server. Responses containing an authorisation field are never cached. Squid 1.1 [11] considers objects with this header as private.
Last-Modified: A field enabling the calculation of TTL values, generally as a percentage of the object's age.
Pragma: When the directive "no-cache" is present in a request message, the cache chain must not be used; instead, the request must go to the origin server. This directive is typically issued when users press the browser's "reload" button.
2.2.3 A new proposal: HTCP
Attempts to solve these problems are being made. The ICP Working Group proposes a new protocol, the already mentioned HTCP. Its purposes are wider, pointing to a complete change in today's caching philosophy: a proposal to migrate from client-driven caching (where users' requests determine which objects are cached, if cachable) to "pro-active" caching.
HTCP messages carry full HTTP request and response headers plus extra, useful caching information headers. In particular, the Cache-Location:, Cache-Policy: and Cache-Expiry: headers are especially important where caching is concerned.
Cache-Location: adds flexibility. A cache can indicate alternative suppliers for the requested object, augmenting the availability of information.
Cache-Policy: determines, for instance, whether an object is cachable and/or can be shared (similar to, but more efficient than, Squid 1.1's private/public notion of objects).
Cache-Expiry: indicates for how long an object can be considered fresh.
In spite of this, some researchers say that the long-term solution will be "Adaptive Web Caching" [12]. Briefly, this technique would use the theory of group communication, taking advantage of IP multicast and achieving reliability through the Scalable Reliable Multicast protocol, SRM [13].
3. Architecture testbed
Commonly used caching architectures have only peers that behave as parents or siblings. Portugal has four top-level domain proxy-caches (two in Porto and two in Lisbon, known as the RCCN [14] proxies), and most, if not all, higher education institutions have their own proxy-cache co-operating with one of these top-level caches.
The University of Minho has about 1000 teachers and around 14000 students accessing the WWW, either via LAN or dial-up. Our University shares the 10 Mbps RCCN backbone with other education institutions; the link connecting the campus to this backbone has 4 Mbps of bandwidth. With the European TERENA project [15] our international connections are much better, but still not enough for the high, and still growing, volume of WWW traffic.
The easiest and most cost-effective way to improve response times is, of course, the use of caching techniques.
The current architecture is composed of one proxy-cache (proxy-www) connecting the University to the "Internet world" - parent caches or remote servers - with a large number of children and siblings attached to this cache. As there is a firewall, proxy-www is the means of accessing servers outside it; accesses inside the firewall are made through direct connections.
The goal is to keep the same set of servers and study the effects of establishing new architectural relations among them; being able to analyse the QoS variation that accompanies those architectural changes is another goal.
Figure 5 shows, by means of cumulative values, the typical distribution of the sizes of documents accessed at the University of Minho through a proxy-cache Web server (results obtained by analysing 42 non-consecutive, randomly selected days).
4. Measuring the web caching QoS
There are several ways of evaluating the performance of co-operating proxy-caches. Some approaches use information about the utilisation of computational resources, such as memory, disk space and CPU usage; others consider bandwidth utilisation or the latency perceived by the end user.
This section describes a new way of measuring the proxy-caches QoS in terms of response time, i.e., how long it takes to serve end user requests.
At first glance, it may seem that computing the average response time per request would be easy. However, in order to compute a significant measure, useful for comparing the performance of different proxy-cache architectures, some other considerations have to be taken into account.
As the objective is to test the performance of different architecture configurations for a particular proxy-cache, some decisions were taken:
Class i | Sizes
0 | 0KB-1KB
1 | 1KB-5KB
2 | 5KB-10KB
3 | 10KB-50KB
4 | 50KB-100KB
5 | 100KB-500KB
6 | 500KB-1MB
7 | 1MB-5MB
8 | 5MB-10MB
9 | >= 10MB
Table I - Size categories
Local hits are those requests whose "Log Tags" field - in the logs produced by the Squid software - contains one of the following results:
The category of hits at most one level up in the hierarchy corresponds to those requests that resulted in one of the following responses:
The case of using the hierarchy at more than one level up occurs when all the peers have responded negatively to ICP queries for an object. We assume that parents are working, so we only consider requests with the hierarchical access log tag FIRST_PARENT_MISS (l=7). This premise is adequate for the architectures we plan to test, but other access log tags could be considered; for example, the case where only one parent is available (SINGLE_PARENT).
The last case, direct access to origin servers, considers all requests with the hierarchical access log code DIRECT (l=8).
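This four-way classification can be implemented directly over the access logs. A sketch (Python; the field layout assumed is Squid's native access.log format - time, elapsed ms, client, tag/status, bytes, method, URL, ident, hierarchy/host, type - and the set of local-hit tags shown is illustrative, not exhaustive):

```python
# Sketch: bucket each Squid access.log line into the four access classes
# defined above: local hit, parent hit (one level up), more than one level
# up (FIRST_PARENT_MISS, l=7) and direct origin access (DIRECT, l=8).
LOCAL_HIT_TAGS = {"TCP_HIT", "UDP_HIT"}   # illustrative set of hit tags

def classify(line):
    fields = line.split()
    tag = fields[3].split("/")[0]        # e.g. TCP_MISS out of "TCP_MISS/200"
    hierarchy = fields[8].split("/")[0]  # e.g. DIRECT out of "DIRECT/host"
    elapsed_ms = int(fields[1])
    size_bytes = int(fields[4])
    if tag in LOCAL_HIT_TAGS:
        category = "local_hit"
    elif hierarchy == "PARENT_HIT":
        category = "one_level_up"
    elif hierarchy == "FIRST_PARENT_MISS":
        category = "more_levels_up"
    elif hierarchy == "DIRECT":
        category = "origin"
    else:
        category = "other"
    return category, elapsed_ms, size_bytes

line = ("890000000.123 940 10.0.0.7 TCP_MISS/200 2456 GET "
        "http://example.org/ - DIRECT/example.org text/html")
classify(line)   # -> ("origin", 940, 2456)
```

Each classified request carries its elapsed time and size, which is all that is needed for the per-category (i,j,k,l) aggregation described next.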
Other considerations could be applied and are still under study. For example, our institution has a firewall, but this discussion is not relevant to the objective of testing the proxy-cache's architectural performance, because these aspects of the configuration will remain unchanged even though the architecture will change.
For the purpose of calculating caching server QoS within an architecture, we obtain for each (i,j,k,l) the following values:
Finally, the QoS measure is obtained with the following calculation:
where:
As is well known, the mean has some limitations as a representative measure; for example, it can be affected by extreme values. So we complement the QoS, as defined above, with two other measures, CVB and CVE, which show the degree of variability in size and in response time. They are defined as follows:
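One plausible numerical reading of these measures can be sketched as follows (Python). Here QoS is taken as total latency over total bytes for a category (ms/byte, the units used in tables II, IV and VI) and CVB/CVE as coefficients of variation (standard deviation over mean) of size and elapsed time; this is a reconstruction under stated assumptions, not necessarily the exact definition:

```python
# Sketch of the proposed measures for one category of requests:
# qos in milliseconds per byte transferred; CV = std/mean, showing how
# strongly the mean is influenced by extreme values.
import statistics

def qos(elapsed_ms, sizes_bytes):
    return sum(elapsed_ms) / sum(sizes_bytes)    # ms per byte transferred

def cv(values):
    return statistics.pstdev(values) / statistics.mean(values)

elapsed = [120.0, 250.0, 90.0, 4000.0]   # one extreme response time
sizes = [800.0, 3000.0, 500.0, 1200.0]

category_qos = qos(elapsed, sizes)   # overall ms/byte for this category
cvb = cv(sizes)                      # CVB: variability in size
cve = cv(elapsed)                    # CVE: variability in response time
```

In this example the single extreme response time makes CVE much larger than CVB, which is exactly the situation the complementary measures are meant to expose.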
5. Preliminary results
The analysis was applied to the proxy-cache at the University of Minho, proxy-www, and to the RCCN Web proxies located in Porto.
5.1 Proxy-www
The logs of the last 8 days (20 to 27 March 1998) were used to compute the defined measures, and some other useful information was obtained. During this period, 414,242 requests were made (392,422 ICP and 21,820 TCP). All requests were considered, but only those related to HTTP are presented. Table II summarises some of the results - each cell value is the amount of time divided by the respective number of bytes in the category (the category QoS). Some results were expected, but others need further study. For instance, why are the origin servers' QoS values so low for all classes but the first one? Probably it is due to the cost of making a TCP connection for a small amount of exchanged data; however, this needs further analysis.
Class of sizes | Local HITs (UDP_HIT) | Local HITs (TCP_HIT) | Hierarchical use (only PARENT_HIT) | Hierarchical use (+1 level up) | Origin server accesses
0KB-1KB | 0.00429 | 0.28624 | 20.53164 | 34.12768 | 51.30882
1KB-5KB | - | 0.12103 | 1.10786 | 4.69322 | 0.56243
5KB-10KB | - | 0.04898 | 2.55196 | 2.05087 | 0.05316
10KB-50KB | - | 0.01870 | 0.19991 | 1.19320 | 0.01646
50KB-100KB | - | 0.17426 | 0.05325 | 1.37516 | -
100KB-500KB | - | 1.78998 | 0.99472 | 0.34784 | -
500KB-1MB | - | 0.13229 | - | 0.63952 | -
1MB-5MB | - | 0.12732 | - | 0.00097 | -
5MB-10MB | - | - | - | - | -
>= 10MB | - | - | - | - | -
Table II - QoS by categories for HTTP (values in milliseconds per byte transferred). A dash (-) indicates that no requests existed in that category.
The overall performance of proxy-www is characterised by the following values:
This value was obtained by considering, for each (i,j,k,l), the biggest latency/size ratio among all the requests in the category.
5.2 RCCN proxies
The objects of analysis are the access logs of two RCCN proxies (let us call them proxy-1 and proxy-2).
For these two caches, we started by building a matrix over the variables size and latency, each divided into several categories; the other measures were then calculated.
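Building such a matrix is a straightforward two-way bucketing pass over the log. A sketch (Python; the bin edges are taken from tables III and V, while the function names are ours):

```python
# Sketch: count requests into the size x latency matrix of tables III and V.
from bisect import bisect_right

SIZE_EDGES_KB = [1, 5, 10, 50, 100, 500, 1024, 5120, 10240]         # upper bounds
LATENCY_EDGES_S = [0.2, 0.5] + list(range(1, 11)) + list(range(15, 51, 5))

def matrix(requests):
    """requests: iterable of (size_bytes, elapsed_seconds) pairs."""
    rows = len(LATENCY_EDGES_S) + 1      # last row is '>= 50 s'
    cols = len(SIZE_EDGES_KB) + 1        # last column is '>= 10 MB'
    m = [[0] * cols for _ in range(rows)]
    for size_bytes, elapsed_s in requests:
        col = bisect_right(SIZE_EDGES_KB, size_bytes / 1024)  # size bucket
        row = bisect_right(LATENCY_EDGES_S, elapsed_s)        # latency bucket
        m[row][col] += 1
    return m

reqs = [(512, 0.05), (512, 0.05), (2048, 0.8), (2000000, 12.0)]
m = matrix(reqs)
m[0][0]   # requests under 1 KB answered in under 200 ms
```

The top-left cell of the resulting matrix is the one singled out in the analysis below: small objects served quickly.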
Table III shows proxy-1's number of requests aggregated by categories. In the period of analysis, we see that of the 2,718,861 requests (2,284,075 ICP and 434,787 TCP, totalling 3.884 Gbytes) about 90% are for objects smaller than 1 Kbyte. The first cell alone, where response times are below 200 ms, holds 85% of the total requests. These results encourage the concentration of efforts on these requests.
Latency \ Sizes | 0-1KB | 1-5KB | 5-10KB | 10-50KB | 50-100KB | 100-500KB | 0.5-1MB | 1-5MB | 5-10MB | >= 10MB
< 200 ms | 2306568 | 17624 | 6689 | 5243 | 5 | 0 | 0 | 0 | 0 | 0
< 500 ms | 18846 | 9267 | 2274 | 2355 | 59 | 0 | 0 | 0 | 0 | 0
< 1 s | 34432 | 19718 | 3781 | 2361 | 121 | 3 | 0 | 0 | 0 | 0
< 2 s | 20789 | 21988 | 7043 | 5931 | 148 | 35 | 0 | 0 | 0 | 0
< 3 s | 7972 | 7633 | 3201 | 3938 | 186 | 40 | 0 | 0 | 0 | 0
< 4 s | 13605 | 8653 | 2796 | 2578 | 122 | 29 | 0 | 0 | 0 | 0
< 5 s | 10414 | 10179 | 3237 | 2723 | 118 | 33 | 0 | 0 | 0 | 0
< 6 s | 3647 | 4626 | 2698 | 2762 | 83 | 20 | 0 | 0 | 0 | 0
< 7 s | 3011 | 3123 | 1774 | 2174 | 75 | 21 | 1 | 0 | 0 | 0
< 8 s | 2402 | 2721 | 1540 | 1955 | 80 | 14 | 2 | 0 | 0 | 0
< 9 s | 1422 | 2041 | 1397 | 1890 | 65 | 16 | 2 | 0 | 0 | 0
< 10 s | 3460 | 2322 | 1225 | 1710 | 91 | 19 | 2 | 0 | 0 | 0
< 15 s | 6840 | 8761 | 4871 | 7258 | 316 | 52 | 8 | 1 | 0 | 0
< 20 s | 2486 | 3425 | 2405 | 4945 | 325 | 56 | 9 | 3 | 0 | 0
< 25 s | 2223 | 3262 | 1769 | 3570 | 291 | 44 | 11 | 1 | 0 | 0
< 30 s | 808 | 1558 | 1169 | 2773 | 300 | 41 | 6 | 5 | 0 | 0
< 35 s | 784 | 1122 | 775 | 1969 | 278 | 49 | 8 | 3 | 0 | 0
< 40 s | 346 | 590 | 478 | 1493 | 248 | 39 | 5 | 4 | 0 | 0
< 45 s | 343 | 622 | 403 | 1224 | 215 | 44 | 2 | 4 | 0 | 0
< 50 s | 568 | 701 | 415 | 1042 | 219 | 33 | 1 | 4 | 0 | 0
>= 50 s | 4551 | 3335 | 2898 | 7477 | 2175 | 1208 | 165 | 249 | 32 | 7
Total | 2445517 | 133271 | 52838 | 67371 | 5520 | 1796 | 222 | 274 | 32 | 7
% | 90.35% | 4.92% | 1.95% | 2.49% | 0.20% | 0.07% | 0.01% | 0.01% | 0.00% | 0.00%
Table III - Requests of proxy-1 distributed by size and response times - only HTTP
The QoS by categories is presented in table IV (values represent ms/byte).
Class of sizes | Local HITs (UDP_HIT) | Local HITs (TCP_HIT) | Origin server accesses
0KB-1KB | 0.02549 | 0.34049 | 51.46991
1KB-5KB | 0.00215 | 0.15918 | 5.31436
5KB-10KB | 0.00047 | 0.11093 | 2.91213
10KB-50KB | 0.00045 | 0.21652 | 1.85093
50KB-100KB | - | 0.60325 | 1.68454
100KB-500KB | - | 1.10225 | 3.34183
500KB-1MB | - | 1.06230 | 1.74949
1MB-5MB | - | 1.27303 | 0.56491
5MB-10MB | - | 0.36965 | 0.02444
>= 10MB | - | - | 0.11946
Table IV - QoS by categories for proxy-1 - only HTTP.
The overall metrics are:
The cache proxy-2 gave similar results. During the period of analysis, 2,338,865 requests were made (1,933,970 ICP and 404,896 TCP), representing 3.300 Gbytes. As in proxy-1, almost 90% of the requests are for objects smaller than 1 Kbyte, and of these around 81% have response times below 200 ms (table V).
Latency \ Sizes | 0-1KB | 1-5KB | 5-10KB | 10-50KB | 50-100KB | 100-500KB | 0.5-1MB | 1-5MB | 5-10MB | >= 10MB
< 200 ms | 1881406 | 9951 | 3442 | 1953 | 0 | 0 | 0 | 0 | 0 | 0
< 500 ms | 71718 | 5637 | 1415 | 1516 | 12 | 0 | 0 | 0 | 0 | 0
< 1 s | 29922 | 14372 | 2697 | 1690 | 68 | 5 | 0 | 0 | 0 | 0
< 2 s | 23546 | 20892 | 6451 | 4933 | 111 | 9 | 0 | 0 | 0 | 0
< 3 s | 10518 | 10374 | 3869 | 3763 | 203 | 14 | 0 | 0 | 0 | 0
< 4 s | 9995 | 8723 | 2818 | 2658 | 183 | 20 | 0 | 0 | 0 | 0
< 5 s | 8828 | 9453 | 3262 | 2595 | 178 | 30 | 0 | 0 | 0 | 0
< 6 s | 5421 | 5468 | 2526 | 2329 | 140 | 38 | 0 | 0 | 0 | 0
< 7 s | 4196 | 4208 | 1848 | 1865 | 108 | 35 | 0 | 0 | 0 | 0
< 8 s | 3247 | 3620 | 1657 | 1721 | 107 | 28 | 0 | 0 | 0 | 0
< 9 s | 2426 | 2715 | 1483 | 1685 | 85 | 31 | 0 | 0 | 0 | 0
< 10 s | 2759 | 2671 | 1306 | 1431 | 83 | 27 | 1 | 0 | 0 | 0
< 15 s | 10047 | 10528 | 5223 | 6313 | 280 | 103 | 0 | 0 | 0 | 0
< 20 s | 4374 | 4764 | 2875 | 4241 | 240 | 74 | 2 | 0 | 0 | 0
< 25 s | 3251 | 4094 | 1957 | 3031 | 193 | 52 | 5 | 0 | 0 | 0
< 30 s | 1846 | 2474 | 1235 | 2283 | 183 | 60 | 4 | 1 | 0 | 0
< 35 s | 1411 | 1782 | 902 | 1620 | 160 | 43 | 6 | 1 | 0 | 0
< 40 s | 916 | 1091 | 553 | 1214 | 133 | 38 | 4 | 1 | 0 | 0
< 45 s | 712 | 873 | 394 | 1007 | 117 | 48 | 2 | 2 | 0 | 0
< 50 s | 893 | 920 | 425 | 862 | 124 | 33 | 5 | 0 | 0 | 0
>= 50 s | 10314 | 6297 | 2979 | 6220 | 1284 | 854 | 208 | 263 | 29 | 4
Total | 2087746 | 130907 | 49317 | 54930 | 3992 | 1542 | 237 | 268 | 29 | 4
% | 89.64% | 5.62% | 2.12% | 2.36% | 0.17% | 0.07% | 0.01% | 0.01% | 0.00% | 0.00%
Table V - Requests of proxy-2 distributed by size and response times - only HTTP
The QoS by categories is given in table VI.
Class of sizes | Local HITs (UDP_HIT) | Local HITs (TCP_HIT) | Origin server accesses
0KB-1KB | 0.43797 | 0.49242 | 91.17730
1KB-5KB | - | 2.99400 | 6.92480
5KB-10KB | - | 0.66970 | 3.17078
10KB-50KB | - | 0.48805 | 1.70526
50KB-100KB | - | 0.63836 | 0.89883
100KB-500KB | - | 0.55416 | 1.61413
500KB-1MB | - | 0.41508 | 0.63983
1MB-5MB | - | 0.42522 | 0.88920
5MB-10MB | - | 0.21700 | 0.43159
>= 10MB | - | 0.17303 | 0.07662
Table VI - QoS by categories for proxy-2 - only HTTP.
The overall metrics are:
6. Conclusions and further work
The proposed measures may give some information about performance in terms of response times to requests.
However, some aspects need further analysis. The category of requests below 200 ms should probably be split in order to obtain more detailed results; at present a lot of requests are aggregated in that single category, which means losing information. The latency times presented are absolute values; perhaps relative values, in milliseconds per byte transferred, would be more useful.
Concerning objects' sizes, there is some research indicating that requested objects larger than some number of standard deviations should not be considered, which may be a better solution. However, it may be difficult to determine the optimal number of standard deviations; this needs further study.
Another non-negligible aspect is the time of day at which requests are made. It is known, for instance, that accesses are faster at late hours and slower during working hours. For this reason, the time of day should probably be considered in the analysis.
In spite of this, for the purpose of tuning a particular cache for better performance, i.e., choosing the better architecture, it is believed that these results can be of real help.
At Minho University, planned experiments will evaluate the performance of ICP multicast-based architectures, and the results will be compared with the current architecture, in which multicast is not used and proxy-www has only parents.
Also interesting, but perhaps difficult to achieve, would be the characterisation of access patterns. Knowing at least some of the characteristics of the Portuguese community's WWW access patterns could be rewarding. This knowledge could be very useful for international or transcontinental accesses - load balancing could be based on rigorous data, and pre-fetching could improve end users' response times. Caches could also be specialised by domains.