2014 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT)

Abstract

Dataflow Error Detection and Recovery (DFER) has been shown to be an effective approach to handling errors in parallel programs. Previous work showed that the technique performs well, imposing little overhead in error-free executions. However, when errors do occur, excessive rollbacks may be triggered, which characterizes the Domino Effect. In this paper we propose a scheme that protects execution from the Domino Effect. Our experimental results show that, without adding significant overhead to the original DFER version, the scheme reduces total execution time by up to 40% when errors are detected. Furthermore, because no significant overhead is added, execution time in error-free runs remains the same as in the baseline.