Abstract
The gap between memory and processor speeds accounts for a substantial share of the idle time of current processors. To reduce the impact of this so-called "memory gap problem," many software techniques (e.g., code layout reorganization) and hardware mechanisms (cache memory, translation lookaside buffer, branch prediction, speculative execution, trace cache, instruction reuse, and so on) have been successfully implemented. In this paper we present experiments that help explain why these mechanisms and techniques are so effective. We found that only a small fraction of the object code is actually executed: in our experiments, more than 50% of the instructions were never executed during the whole run, and the percentage of basic blocks that remained unused was slightly greater. Beyond instruction and block usage, the paper provides further insights into the behavior of application programs and offers some suggestions for additional performance gains.