Abstract
Givens Rotation is a key computation-intensive block in embedded wireless applications. To achieve an efficient mapping that scales smoothly to the underlying architecture, we propose two new Column-based Givens Rotation algorithms, derived from the traditional Fast Givens and Square-root and Division Free Givens algorithms. These algorithms annihilate multiple elements in a column of the input matrix simultaneously, without a dependency bottleneck, which allows increased parallelism, resource sharing, and scalability. The ease of mapping and the scalability have been tested on a layered coarse-grained reconfigurable architecture, with results close to optimal for highly parallel architectures.