Abstract
Well-organized proxy caching systems can greatly reduce user-perceived latency and network bandwidth consumption. In this paper, we propose Tulip, a new hash-based web caching architecture. Tulip extends the locality-based algorithm of UCFS [15] to serve as the basic data grouping scheme in hash-based proxy systems: it aggregates web objects that are likely to be accessed together into object clusters and uses these clusters as the primary unit of transfer between memory and disk, which greatly reduces the overhead of slow disk I/O. Tulip also introduces a simple and efficient data duplication scheme. Together with the local caching strategy, this scheme lets Tulip achieve both fault tolerance and load balancing with minimal overhead. Our simulation results show that Tulip is scalable and robust, and that it outperforms previous approaches.