Abstract
This paper presents a case study on optimizing the sparse matrix-matrix product (SMP), i.e. the product in which both operand matrices are sparse. We first study baseline loop-nest algorithms corresponding to the storage formats most commonly used for sparse matrices, namely DNS, CSR and COO. We then derive several versions by applying optimization techniques such as scalar replacement, loop-invariant code motion and loop unrolling, combined with different compiler options such as -O0, -O1, -O2, -O3, -O4, -Os and -funroll-loops. We focus on the GAXPY-Row kernel, in which all matrices are accessed row-wise. A multi-fold theoretical performance study allows us to establish accurate comparisons between the different versions. Our contribution is validated through a series of experiments carried out on a set of real sparse matrices of different sizes and densities, on Grid'5000, using different kinds of Intel Xeon processors. We show that our algorithms outperform the SparseBlas library.
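To make the row-wise access pattern concrete, the following is a minimal sketch in C of a product of two CSR matrices in which every row of the result is accumulated from rows of the second operand. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the type csr_t and the function csr_spmm_row are hypothetical names, the result is stored as a dense row-major array to keep the example short, and the loop order is assumed to match what the abstract calls GAXPY-Row. The sketch applies scalar replacement (a_ik) and hoists loop-invariant values (c_row, a_end, b_end) by hand.

```c
#include <stddef.h>

/* Hypothetical CSR container; field names are illustrative only. */
typedef struct {
    size_t nrows, ncols;
    const size_t *row_ptr;  /* length nrows + 1 */
    const size_t *col_idx;  /* length nnz, column of each stored entry */
    const double *val;      /* length nnz, value of each stored entry */
} csr_t;

/*
 * Computes C = A * B with A and B in CSR format.
 * C is dense, row-major, of size a->nrows x b->ncols (a simplification
 * assumed here so that no symbolic phase is needed).
 * A, B and C are all traversed row by row.
 */
void csr_spmm_row(const csr_t *a, const csr_t *b, double *c)
{
    const size_t n = a->nrows;
    const size_t p = b->ncols;

    for (size_t i = 0; i < n * p; ++i)
        c[i] = 0.0;

    for (size_t i = 0; i < n; ++i) {
        double *c_row = c + i * p;              /* invariant: start of row i of C */
        const size_t a_end = a->row_ptr[i + 1]; /* invariant: end of row i of A */

        for (size_t ka = a->row_ptr[i]; ka < a_end; ++ka) {
            const double a_ik = a->val[ka];     /* scalar replacement of A(i,k) */
            const size_t k = a->col_idx[ka];
            const size_t b_end = b->row_ptr[k + 1];

            /* C(i,:) += A(i,k) * B(k,:), touching only stored entries of B */
            for (size_t kb = b->row_ptr[k]; kb < b_end; ++kb)
                c_row[b->col_idx[kb]] += a_ik * b->val[kb];
        }
    }
}
```

The innermost loop is the natural candidate for unrolling, whether by hand or through a compiler option such as -funroll-loops; whether this pays off depends on the number of stored entries per row of the second operand.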