Correcting Response Failure Errors in Multi-Objective Optimisation in Unreliable Distributed Computing Environments

Author(s)
Rawlins, T
Lewis, A
Year published
2010
Abstract
Population-based, multi-objective optimisation algorithms are increasingly making use of distributed, parallel computing environments. In these cases it is a commonsense precaution to consider the possibility of a variety of failures. In particular, errors caused by response failures are more likely to arise in these environments than in homogeneous parallel computers. While masking errors using redundant computation is simple and reasonably reliable, it is expensive in terms of the computing resources required. An alternative approach is presented that uses a Byzantine agreement methodology, utilising only results already computed. In computational experiments it has a demonstrated ability to correct errors and salvage useable results from unreliable, distributed computing environments. With increasing reliance on computing resources provided and operated by external agencies, error detection and correction can be expected to become more important to a range of applications.
Conference Title
RPC 2010 - 1st Russia and Pacific Conference on Computer Technology and Applications
Copyright Statement
© 2010 Academic Alliance International. The attached file is reproduced here in accordance with the copyright policy of the publisher. Please refer to the conference's website for access to the definitive, published version.
Subject
Optimisation
Software engineering not elsewhere classified