Advanced Search
CS Search Google Search
Subscribers, please login

Published Articles >> Table of Contents >> Abstract

International Symposium on Code Generation and Optimization (CGO'07)   pp. 244-258
Compiler-Managed Software-based Redundant Multi-Threading for Transient Fault Detection

Full Article Text: Download PDF of full textBuy this articleGet full text from IEEE Xplore

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/CGO.2007.7
Send link to a friend

Abstract
As transistors become increasingly smaller and faster with tighter noise margins, modern processors are becoming increasingly more susceptible to transient hardware faults. Existing Hardware-based Redundant Multi-Threading (HRMT) approaches rely mostly on special-purpose hardware to replicate the program into redundant execution threads and compare their computation results. In this paper, we present a Software-based Redundant Multi-Threading (SRMT) approach for transient fault detection. Our SRMT technique uses compiler to automatically generate redundant threads so they can run on general-purpose chip multi-processors (CMPs). We exploit high-level program information available at compile time to optimize data communication between redundant threads. Furthermore, our software-based technique provides flexible program execution environment where the legacy binary codes and the reliability-enhanced codes can co-exist in a mix-and-match fashion, depending on the desired level of reliability and software compatibility. Our experimental results show that compiler analysis and optimization techniques can reduce data communication requirement by up to 88% of HRMT. With general-purpose intra-chip communication mechanisms in CMP machine, SRMT overhead can be as low as 19%. Moreover, SRMT technique achieves error coverage rates of 99.98% and 99.6% for SPEC CPU2000 integer and floating-point benchmarks, respectively. These results demonstrate the competitiveness of SRMT to HRMT approaches.
Additional Information

Citation:  1 Wang, 1 Kim, 1 Wu, 1 Ying, "Compiler-Managed Software-based Redundant Multi-Threading for Transient Fault Detection," cgo, pp. 244-258,  International Symposium on Code Generation and Optimization (CGO'07),  2007

Similar Articles

Abstract Contents
Abstract
Citation




Free access to

  • Abstracts
  • Selected PDFs

Electronic subscribers login to:

  • Access HTML/PDFs of full text articles

Subscription information

Get a Web account

PDFs require Adobe Acrobat Reader.

Peer Review Notice

Give us Feedback