Abstract
VLIW processors, with static instruction scheduling and therefore deterministic execution times, are well suited to high-performance real-time DSP applications. However, two major weaknesses prevent the integration of more functional units (FUs) for a higher instruction issue rate: the rapidly growing complexity of the register file (RF) and poor code density. In this paper, we propose a novel ring-structure RF, which partitions the centralized RF into 2N sub-blocks with an explicit N-by-N switch network for N FUs. Each sub-block requires access ports for only a single FU. We also propose a hierarchical VLIW encoding with variable-length RISC-like instructions and NOP removal. Compared with the centralized RF, the ring-structure RF saves 91.88% of the silicon area and reduces access time by 77.35%. Our simulation results show that the proposed instruction set architecture with the exposed ring-structure RF achieves performance comparable to that of state-of-the-art DSP processors. Moreover, the hierarchical VLIW encoding reduces code size by 32% to 50%.
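
To make the idea of NOP removal concrete, the following is a minimal sketch and not the hierarchical encoding proposed in this paper. It assumes a hypothetical layout in which each VLIW bundle is stored as a per-slot validity mask followed by only the non-NOP operations, and it assumes a fixed 32-bit slot width (the proposed encoding uses variable-length instructions instead). The names `compress_bundle`, `decompress_bundle`, `code_size_bits`, and `SLOT_BITS` are illustrative only.

```python
from typing import List, Optional, Tuple

# Assumption: every operation occupies a fixed 32-bit slot.
SLOT_BITS = 32

# A bundle holds one entry per functional unit; None marks a NOP slot.
Bundle = List[Optional[int]]


def compress_bundle(bundle: Bundle) -> Tuple[int, List[int]]:
    """Return (mask, ops): bit i of mask is set when slot i holds a real operation."""
    mask = 0
    ops: List[int] = []
    for slot, op in enumerate(bundle):
        if op is not None:
            mask |= 1 << slot
            ops.append(op)
    return mask, ops


def decompress_bundle(mask: int, ops: List[int], num_slots: int) -> Bundle:
    """Rebuild the full bundle, re-inserting a NOP wherever the mask bit is clear."""
    remaining = iter(ops)
    return [next(remaining) if (mask >> slot) & 1 else None
            for slot in range(num_slots)]


def code_size_bits(bundles: List[Bundle], num_slots: int) -> Tuple[int, int]:
    """Return (uncompressed_bits, compressed_bits) under the assumed layout."""
    raw = len(bundles) * num_slots * SLOT_BITS
    packed = sum(num_slots + SLOT_BITS * len(compress_bundle(b)[1])
                 for b in bundles)
    return raw, packed
```

Under these assumptions, a 4-slot bundle containing two NOPs occupies 68 bits (a 4-bit mask plus two 32-bit operations) instead of 128 bits. The figures are specific to this toy layout and are not the code-size results reported above.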