2008 IEEE International Parallel & Distributed Processing Symposium

Abstract

The achievable performance on Infiniband networks is governed by the latencies and bandwidths of communication channels as well as by contention within the network. Infiniband currently routes messages statically and thus does not take into account the dynamic loading of the channels. By interrogating the network routing tables, we quantify the contention that occurs for a number of communication patterns on a large-scale (1024-processor) system. Empirical data confirm our contention calculations almost exactly. We define custom routing tables that provide both optimum and worst-case performance for a wide range of communication patterns. Performance differences between optimum and worst case can be as large as 12x. Two large-scale applications show a runtime improvement of 10–20%, and up to a 40% improvement in communication time alone, when using optimized routing tables. The approach is applicable to many Infiniband systems, and we expect the performance improvements to be even greater on larger-scale systems.
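
As a rough illustration of the idea of quantifying contention from static routes, the sketch below counts how many messages of a communication pattern share each inter-switch link and reports the maximum. It is not the paper's method or tooling: the toy two-level topology, the names `make_route`, `link_loads`, and `max_contention`, and both routing rules are assumptions made purely for illustration.

```python
from collections import Counter

# Hypothetical toy network: 2 leaf switches with 4 nodes each,
# every leaf connected to 4 spine switches by one link per spine.
NODES_PER_LEAF = 4
NUM_SPINES = 4


def make_route(spine_for_dst):
    """Build a static, destination-based routing function.

    spine_for_dst maps a destination node id to the spine switch used
    to reach it (an assumed stand-in for a routing-table lookup).
    """
    def route(src, dst):
        src_leaf = src // NODES_PER_LEAF
        dst_leaf = dst // NODES_PER_LEAF
        if src_leaf == dst_leaf:
            return []  # traffic stays inside one leaf switch
        spine = spine_for_dst(dst)
        return [("up", src_leaf, spine), ("down", spine, dst_leaf)]
    return route


def link_loads(pattern, route):
    """Count how many messages of the pattern cross each directed link."""
    loads = Counter()
    for src, dst in pattern:
        for link in route(src, dst):
            loads[link] += 1
    return loads


def max_contention(pattern, route):
    """Largest number of messages sharing any single link."""
    return max(link_loads(pattern, route).values(), default=0)


# Two assumed routing tables: one spreads destinations over the spines,
# the other funnels every destination through spine 0.
spread = make_route(lambda dst: dst % NUM_SPINES)
funnel = make_route(lambda dst: 0)

# Pattern: nodes 0..3 each send one message to nodes 4..7.
pattern = [(i, i + NODES_PER_LEAF) for i in range(NODES_PER_LEAF)]

print(max_contention(pattern, spread))
print(max_contention(pattern, funnel))
```

In this toy case the spreading table gives each message its own link, while the funneling table forces all four messages onto one link, which is the kind of gap between favorable and unfavorable static routes that the abstract describes.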