Abstract
In this article, we investigate how the distribution of dense matrices across a cluster of multi-core systems affects the performance of linear algebra codes. These codes have the property of communicating only within rows or columns of a processor grid. In particular, we consider orthogonalization methods such as Gram-Schmidt orthogonalization. Experiments on two different multi-core clusters connected with InfiniBand measure the performance of different topology-aware mappings and show how performance is influenced by these mappings.