27th Annual IEEE Conference on Local Computer Networks, 2002. Proceedings. LCN 2002.

Abstract

Large-scale clusters built out of commercial components face scalability obstacles similar to those of the massively parallel processors (MPPs) of the 1980s. This is especially true when they are used for scientific computing. Their networks are the descendants of the MPP networks, but the communication software in use has been designed for wide-area networks with client/server applications in mind. We present a communication protocol which has been designed specifically for large-scale clusters with a scientific application workload. The protocol takes advantage of the low error rate and high performance of these networks. It is adapted to the peculiarities of these MPP-like networks and the communication characteristics of scientific applications. This paper presents only the protocol itself and the ideas behind it. We refer the reader to other publications for more information about scalability, performance, and usage of the protocol presented here.