Fourth International Conference on Networking and Services (icns 2008)
Download PDF

Abstract

Service Level Agreements (SLAs) have been introduced into the Grid in order to build a basis for its commercial uptake. The challenge for Grid providers in agreeing and operating SLA-bound jobs is to ensure their fulfillment even in the case of failures. Hence, fault-tolerance mechanisms are an essential means of the provider?s SLA management. The high utilization of commercial operated clusters leads to scenarios in which typically a job migration effects other jobs scheduled. The effects result from the unavailability of enough free resources which would be needed to catch all resource outages. Consequently before initiating a migration, its effects for other jobs have to be compared and the initiation of fault tolerance(FT-) mechanisms have to be evaluated recursively. This paper presents a measurement for the benefit of initiating a FT mechanism,the recursive evaluation, and termination condition. Performing such an impact evaluation of an initiated chain of FT-mechanisms is often more profitable than performing a single FT-mechanism and accordingly this is important for the Grid commercialization.
Like what you’re reading?
Already a member?Sign In
Member Price
$11
Non-Member Price
$21
Add to CartSign In
Get this article FREE with a new membership!

Related Articles