Reliability assessment for distributed systems via communication abstraction and refinement
Author(s)
Gui, L
Sun, J
Liu, Y
Dong, JS
Griffith University Author(s)
Year published
2015
Metadata
Show full item recordAbstract
Distributed systems like cloud-based services are ever more popular. Assessing the reliability of distributed systems is highly non-trivial. Particularly, the order of executions among distributed components adds a dimension of non-determinism, which invalidates existing reliability assessment methods based on Markov chains. Probabilistic model checking based on models like Markov decision processes is designed to deal with scenarios involving both probabilistic behavior (e.g., reliabilities of system components) and non-determinism. However, its application is currently limited by state space explosion, which makes reliability ...
View more >Distributed systems like cloud-based services are ever more popular. Assessing the reliability of distributed systems is highly non-trivial. Particularly, the order of executions among distributed components adds a dimension of non-determinism, which invalidates existing reliability assessment methods based on Markov chains. Probabilistic model checking based on models like Markov decision processes is designed to deal with scenarios involving both probabilistic behavior (e.g., reliabilities of system components) and non-determinism. However, its application is currently limited by state space explosion, which makes reliability assessment of distributed system particularly difficult. In this work, we improve the probabilistic model checking through a method of abstraction and reduction, which controls the communications among system components and actively reduces the size of each component. We prove the soundness and completeness of the proposed approach. Through an implementation in a software toolkit and evaluations with several systems, we show that our approach often reduces the size of the state space by several orders of magnitude, while still producing sound and accurate assessment.
View less >
View more >Distributed systems like cloud-based services are ever more popular. Assessing the reliability of distributed systems is highly non-trivial. Particularly, the order of executions among distributed components adds a dimension of non-determinism, which invalidates existing reliability assessment methods based on Markov chains. Probabilistic model checking based on models like Markov decision processes is designed to deal with scenarios involving both probabilistic behavior (e.g., reliabilities of system components) and non-determinism. However, its application is currently limited by state space explosion, which makes reliability assessment of distributed system particularly difficult. In this work, we improve the probabilistic model checking through a method of abstraction and reduction, which controls the communications among system components and actively reduces the size of each component. We prove the soundness and completeness of the proposed approach. Through an implementation in a software toolkit and evaluations with several systems, we show that our approach often reduces the size of the state space by several orders of magnitude, while still producing sound and accurate assessment.
View less >
Conference Title
2015 International Symposium on Software Testing and Analysis, ISSTA 2015 - Proceedings
Subject
Software engineering not elsewhere classified