2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO)

Abstract

Instruction prefetching is a standard approach for improving the performance of operating system (OS) intensive workloads such as web servers, file servers, and database servers. Sophisticated instruction prefetching techniques such as PIF [12] and RDIP [17] record the execution history of a program in dedicated hardware structures and use this information to prefetch when a known execution pattern repeats. The storage overhead of these additional hardware structures is prohibitively high (64–200 KB per core), which makes such schemes difficult to deploy in real systems. We propose a solution that tackles this problem with minimal hardware modifications. We observe that the execution of server applications keeps switching between tasks such as the application, system call handlers, and interrupt handlers. Each task has a distinct instruction footprint, and consecutive tasks are separated by a special OS event. We propose a technique that captures the instruction stream in the vicinity of such OS events; the captured information is then compressed significantly and stored in a process's virtual address space. Special OS routines then use this information to prefetch instructions for both OS and application code. Using modest hardware support (4 registers per core), we report an increase in instruction throughput of 2–14% (mean: 7%) over state-of-the-art instruction prefetching techniques for a suite of 8 popular OS-intensive applications.
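
The capture, compress, and replay flow described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the mechanism evaluated in the paper: the 64-byte cache-line size, the delta encoding, and the names `compress`, `decompress`, and `EventPrefetcher` are all hypothetical choices made for illustration.

```python
LINE_BYTES = 64  # assumed cache-line size; not taken from the paper


def compress(addresses):
    """Reduce fetch addresses to unique cache lines, then delta-encode them.

    Delta encoding stands in here for the paper's (unspecified) compression.
    """
    lines = sorted({addr // LINE_BYTES for addr in addresses})
    if not lines:
        return []
    return [lines[0]] + [b - a for a, b in zip(lines, lines[1:])]


def decompress(deltas):
    """Rebuild cache-line start addresses from the delta-encoded form."""
    targets, line = [], 0
    for delta in deltas:
        line += delta
        targets.append(line * LINE_BYTES)
    return targets


class EventPrefetcher:
    """Hypothetical per-process table mapping an OS event to a footprint."""

    def __init__(self):
        self.history = {}  # event id -> compressed instruction footprint

    def record(self, event_id, fetched_addresses):
        # Capture the instruction addresses seen right after the event.
        self.history[event_id] = compress(fetched_addresses)

    def prefetch_targets(self, event_id):
        # On a repeat of the event, return the cache lines to prefetch.
        return decompress(self.history.get(event_id, []))


prefetcher = EventPrefetcher()
prefetcher.record("syscall_read", [0x1000, 0x1004, 0x1040, 0x2000, 0x1008])
targets = prefetcher.prefetch_targets("syscall_read")
```

In this model, several fetches that fall in the same cache line collapse to a single prefetch target, and an event that has never been recorded yields no targets.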