VLSI Design, International Conference on

Abstract

This paper presents a new high-level synthesis methodology to generate optimized implementations for multi-process behavioral descriptions. The concurrent communicating processes specification paradigm is widely used in digital circuit and system design, and is employed in all popular hardware description languages. It has been shown that interprocess communication and synchronization can result in complex timing inter-dependencies, which significantly affect the performance of a multi-process system. However, previous research on high-level synthesis typically takes a one-process-at-a-time approach, and the effects of inter-process communication and synchronization are ignored when performing tasks such as scheduling, resource sharing, etc. In this paper, we demonstrate that state-of-the-art high-level synthesis tools can generate significantly sub-optimal implementations for behaviors that contain concurrent communicating processes. We present an analysis of how inter-process communication impacts high-level synthesis steps, and describe a new methodology to adapt existing high-level synthesis tools to optimize multi-process descriptions. Our methodology is based on executing multi-process performance analysis and process-by-process scheduling in an iterative manner. The results of performance analysis are used to identify critical and near-critical operations, and to judiciously partition the global resource budget into constraints for each process. The process-level constraints are used to drive scheduling for individual processes, so as to speed up the overall system critical path. We present algorithms for key steps in the proposed methodology. We have performed extensive experiments in the context of a commercial high-level design flow to evaluate the proposed techniques. The results clearly demonstrate the utility of our techniques in synthesizing implementations with superior area, performance, and energy consumption. 
For example, performance improved by up to 40.0% (35.6% on average) with little or no area overhead (4.8% on average). In effect, the proposed techniques shift the entire area-delay trade-off curve of a design to include superior design points that were hitherto infeasible. The techniques simultaneously yield up to 50.0% (average of 33.5%) improvement in energy and up to 69.0% (average of 58.3%) improvement in the energy-delay product.