Abstract
This work presents a SystemC-based simulation approach for fast performance analysis of parallel software components, using source code annotated with low-level timing properties. In contrast to other source-level approaches to performance analysis, timing attributes obtained from the binary code can be annotated even when compiler optimizations are enabled, and without requiring changes to the compiler. To account accurately for concurrent accesses to shared resources such as caches during source-level simulation, an extension of the SystemC TLM-2.0 standard that reduces the necessary synchronization overhead is also proposed. This enables the simulation of low-level timing effects without a full-fledged instruction set simulation, at speeds close to native execution.
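
As an illustration of the general idea of source-level timing annotation, the following minimal sketch shows source code instrumented with per-block cycle counts that are accumulated locally and reported to the simulation only when a threshold is exceeded. It is written in plain C++ without SystemC and is not the extension proposed in this work. The names `TimingContext`, `consume`, `sync`, and `quantum`, as well as all cycle values, are hypothetical and chosen for illustration only; in a real setup the cycle counts would be derived from the binary code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-thread timing context (not part of SystemC or TLM-2.0).
struct TimingContext {
    std::uint64_t local_cycles = 0;   // cycles accumulated since the last synchronization
    std::uint64_t global_cycles = 0;  // cycles already reported to the simulation
    std::uint64_t quantum;            // local offset that triggers a synchronization
    unsigned sync_count = 0;          // number of synchronizations performed

    explicit TimingContext(std::uint64_t q) : quantum(q) { assert(q > 0); }

    // Stand-in for a synchronization with the simulation kernel.
    void sync() {
        global_cycles += local_cycles;
        local_cycles = 0;
        ++sync_count;
    }

    // Annotation call inserted once per basic block of the source code.
    void consume(std::uint64_t cycles) {
        local_cycles += cycles;
        if (local_cycles >= quantum) sync();
    }
};

// Annotated source function; the cycle counts are made-up example values.
int sum_array(const int* data, int n, TimingContext& t) {
    t.consume(4);                     // entry block
    int sum = 0;
    for (int i = 0; i < n; ++i) {
        t.consume(3);                 // loop body block
        sum += data[i];
    }
    t.consume(2);                     // exit block
    return sum;
}
```

Because the annotated function executes natively and only the cycle bookkeeping is added, most of the simulation cost lies in the synchronization calls. Deferring them, as the threshold in this sketch does, trades timing resolution for speed.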