Abstract
Recent developments in High-Level Synthesis (HLS) for FPGAs make it possible to “run” C code on FPGAs, thereby making modern programming environments available to FPGA developers. In this paper, C code for a complex optical-flow algorithm is optimized for both a desktop PC and an FPGA-based system, the Xilinx Zynq-7000, a device containing both programmable fabric and two ARM cores. The paper discusses how the code is optimized and restructured to execute effectively on the programmable fabric and the ARM cores. The resulting Zynq version of the C code is competitive with the desktop PC but consumes only one-seventh as much energy.