Proceedings of the Third IEEE/ACM International Symposium on Cluster Computing and the Grid

Abstract

As computing systems grow in complexity, the cluster and grid communities require more sophisticated tools to diagnose, debug and analyze such systems. We have developed a toolkit called MAGNET (Monitoring Apparatus for General kerNel-Event Tracing) that provides a detailed look at operating-system kernel events with very low overhead. The fine-grained information that MAGNET exports from kernel space makes otherwise challenging problems amenable to identification and correction. In this paper, we first present the design, implementation and evaluation of MAGNET. We then show its use as a diagnostic tool, as an online-monitoring tool and as a tool for building adaptive applications in clusters and grids.
