Abstract
We tackle the problem of maintaining materialized top-k views in this paper. Top-k queries, including MIN and MAX as important special cases, occur frequently in common database workloads. A top-k view can be materialized to improve query performance, but in general it is not self-maintainable unless it contains all tuples in the base table. Deletions and updates on the base table may cause tuples to leave the top-k view, resulting in expensive queries over the base table to "refill" the view. In this paper, we propose an algorithm that reduces the frequency of refills by maintaining a top-k' view instead of a top-k view, where k' changes at runtime between k and some kmax ≥ k. We show that in most practical cases, our algorithm can reduce the expected amortized cost of refill queries to O(1) while still keeping the view small. The optimal value of kmax depends on the update pattern and the costs of querying the base table and updating the view. Compared with the simple approach of maintaining either the top-k view itself or a copy of the base table, our algorithm can provide orders-of-magnitude improvements in performance with appropriate kmax values. We show how to choose kmax dynamically to adapt to the actual system workload and performance at runtime, without requiring accurate prior knowledge.