Abstract
Reducing delay of a digital circuit is an important topic in logic synthesis for standard cells and LUT-based FPGAs. This paper presents a simple, fast, and very efficient synthesis algorithm to improve the delay after technology mapping. The algorithm scales to large designs and is implemented in a publicly-available technology mapper. The code is available online. Experimental results on industrial designs show that the method can improve delay after standard cell mapping by 30% with the increase in area 2.4%, or by 41% with the increase in area by 3.9%, on top of a high-effort synthesis and mapping flow. In a separate experiment, the algorithm was used as part of a complete industrial standard cell design flow, leading to improvements in area and delay after place-and-route. In yet another experiment, the algorithm was applied before FPGA mapping into 4-LUTs, resulting in 16% logic level reduction at the cost of 9% area increase on top of a high-effort mapping.