Abstract
We present a high-level analytical model for chip-multiprocessors (CMPs) that encompasses processors, memory, and communication in an area-constrained, global optimization process. Applying this analytical model to the design of a symmetric CMP for speech recognition, we demonstrate a methodology for estimating model parameters prior to design exploration. Then we present an automated approach for finding the optimal high-level CMP architecture. The result is the ability to find the allocation of silicon resources for each architectural element that maximizes overall system performance. This balances the performance gains from parallelism, processor microarchitecture, and cache memory with the energy-delay costs of computation and communication.