Proceedings 20th IEEE International Parallel & Distributed Processing Symposium

Abstract

Many scientific applications make intensive use of MPI collective communications. Efficient and scalable implementation of collective operations is therefore critical to the performance of such applications running on clusters. Quadrics QsNet II is a high-performance interconnect for clusters that implements some collectives at the Elan level, and the corresponding MPI collectives use these directly. Quadrics software supports point-to-point striping over multi-rail QsNet II networks, but it has not supported multi-rail collectives. In this work, we propose a number of RDMA-based multi-port collectives over multi-rail QsNet II clusters, implemented directly at the Elan level. Our performance results indicate that the proposed multi-port gather improves on the native elan_gather by up to a factor of 6.35 for 1 MB messages. The proposed multi-port all-to-all outperforms the native elan_alltoall by a factor of 2.19 for 16 KB messages. We also propose two algorithms for the scatter operation.