Designing elicitation of expert knowledge into conditional probability tables in Bayesian networks: choosing scenario
For many researchers there is increasing pressure to collect and analyze bigger datasets, from sources such as analytics, online surveys or spatial datasets. Bayesian networks (BNs) provide a feasible and intuitive means of developing explanatory models with diverse stakeholders who may have limited quantitative expertise. However, the larger the number of variables and levels involved as potential inputs to a BN, the more resources are required to evaluate alternative models. Our motivation is to design a large species distribution modeling (SDM) experiment in the Biodiversity and Climate Change Virtual Laboratory. We show how BNs, elicited from experts, can be used to inform the design of these kinds of large computing experiments. In this context we examine how the settings of some SDM algorithms potentially affect the quality of the prediction. For example, one setting could be the choice of covariates used as input to the SDM, with three levels: a minimal set, an extensive set, or something in between. A conditional probability table (CPT) quantifies the child node (e.g. quality of prediction) as it depends conditionally on each of the parents (here, the settings). Guidelines on eliciting CPTs generally advise modellers to simplify the elicitation task by keeping the number of parent nodes and parent/child states to a minimum. The literature on BNs indicates that elicitation of more complex CPTs may be too demanding for experts, because of the time required. With large CPTs, an often encountered problem is the sheer amount of information asked of the expert (the number of scenarios). Here we propose that an elicitation strategy can be designed according to statistical criteria: to ensure adequate coverage of the CPTs, in an efficient manner, making best use of scarce resources such as the valuable time of the experts. This is essentially a problem of experimental design.
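To see why the number of scenarios becomes a burden, note that the CPT has one row per combination of parent levels, so its size grows multiplicatively. A minimal sketch, assuming three hypothetical 3-level SDM settings (the names are illustrative, not the paper's actual settings):

```python
from itertools import product

# Hypothetical SDM algorithm settings (parent nodes of the BN) and
# their levels; names are illustrative only.
settings = {
    "covariate_set": ["minimal", "intermediate", "extensive"],
    "absence_data": ["pseudo", "real", "mixed"],
    "function_complexity": ["linear", "quadratic", "cubic"],
}

# Each combination of parent levels is one CPT scenario the expert
# would be asked to assess in a full elicitation.
scenarios = list(product(*settings.values()))
print(len(scenarios))  # 3 * 3 * 3 = 27 scenarios
```

Adding a fourth 3-level setting would triple this to 81 scenarios, which is where a designed subset of scenarios becomes attractive.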
Some software tools, such as CPT calculator, support the specification of large CPTs, but implicitly adopt a particular kind of experimental design. Here we conduct experiments to evaluate designs for eliciting expert knowledge to help quantify the CPTs that define BNs. We consider three elicitation designs: a Taguchi design as a kind of screening design, CPT calculator's design, and a composite. In the case study, we asked modellers to consider how different settings affect the quality of an algorithm used to construct an SDM. Limiting the number of scenarios avoids tiring the experts, which can lead to inaccuracies. Eliciting and encoding CPTs was examined using a model-based "outside-in" Elicitator approach to quantitative elicitation, which allows experts to specify their opinions with uncertainty. Our results indicated that the most important settings, with the largest positive impacts on the quality of prediction, were: the choice of real absence data, quadratic complexity of the function, and the choice of the expert's minimal subset of variables. In addition, there were differences arising from the choice of design and the elicitation scenarios. Overall, we found the Taguchi OA design more efficient, because the effect sizes estimated under the CPT calculator design carried more uncertainty than those under the Taguchi design.
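The screening idea can be illustrated with a standard Taguchi L9(3^4) orthogonal array, which covers three 3-level factors in 9 balanced scenarios instead of all 27: each level of each factor appears equally often, and every pair of factors is observed at all nine level combinations. A minimal sketch, not the paper's implementation, reusing the same hypothetical settings:

```python
from collections import Counter

# First three columns of the standard Taguchi L9(3^4) orthogonal array,
# with levels coded 0, 1, 2.
L9 = [
    (0, 0, 0), (0, 1, 1), (0, 2, 2),
    (1, 0, 1), (1, 1, 2), (1, 2, 0),
    (2, 0, 2), (2, 1, 0), (2, 2, 1),
]

# Hypothetical settings and levels (illustrative names only).
levels = {
    "covariate_set": ["minimal", "intermediate", "extensive"],
    "absence_data": ["pseudo", "real", "mixed"],
    "function_complexity": ["linear", "quadratic", "cubic"],
}
names = list(levels)

# Map each array row to a concrete elicitation scenario.
design = [
    {names[j]: levels[names[j]][row[j]] for j in range(3)}
    for row in L9
]
for scenario in design:
    print(scenario)

# Balance check: each level of each factor appears exactly 3 times.
for j in range(3):
    assert Counter(row[j] for row in L9) == {0: 3, 1: 3, 2: 3}
```

The expert would then assess only these 9 scenarios, and main effects of each setting can still be estimated because of the array's balance.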
22nd International Congress on Modelling and Simulation: Managing cumulative risks through model-based processes (MODSIM2017)
© 2017 Modelling & Simulation Society of Australia & New Zealand. The attached file is reproduced here in accordance with the copyright policy of the publisher. For information about this conference please refer to the conference’s website or contact the author(s).