dc.description.abstract | Ecologists and environmental managers regularly need to make decisions
about restoration and management with limited information and uncertainty
about the outcomes of system interventions. Uncertainty is an inevitable
component of environmental management and decision making (as it is in the
management of other complex systems); however, the significance of
environmental problems necessitates the effort to quantify and, where possible,
minimise the uncertainty around the outputs of models of a given system.
Uncertainty about the model output stems from uncertainty about the model
inputs (arising from different sources) and about the model structure. The latter
is the more complex of the two and, as such, is challenging to address.
There is a range of quantitative methods available to assist with environmental
management and decision making. However, while finding one specific model
to represent a system effectively is an ideal goal, the selection of the most
suitable model among those available is a challenging task for the modeller.
Model selection includes at least two aspects: the selection of appropriate
variables and an appropriate model structure. Model structure describes the
nature of the mathematical representation of the cause and effect relationships
that are quantified by the model. Many approaches for variable selection
already exist; however, methods to guide the quantitative selection of an
appropriate model structure are not so well developed. Model structure
selection is an important step in modelling: the chosen structure must not only
include the essential variables and processes of the system, but also avoid
unnecessary complexity that does not improve the modelling results.
Selecting an appropriate model structure involves several related steps. One
step is to determine the possible nature of the cause and effect relationships
among the variables. A second step is to investigate carefully the amount and
the quality of the available information, along with evaluating the uncertainty
about the conditions, the variables and the modelling parameters, and then
selecting one out of many model structures. For example, the modeller may
decide to use a deterministic model, or to include stochastic components
in the model and assign probabilities to the outcomes of random variables. For
the latter, they may decide to use a statistical model or another probabilistic
model, such as a Bayesian network. They may decide to represent the whole
system with a single model, or may choose to break the large-scale system down
into simpler ones and use different models to represent the clearer cause and
effect relationships among variables.
Among the many aspects of model structure for a modeller to select, there is
the choice between a single-level and a multi-level (hierarchical) model
structure. Single-level models represent the relationships between variables
with fixed coefficients and, where the data are grouped, ignore the differences
between groups. Conversely,
hierarchical models consider the relationships within and between levels of
grouped data and can account for the variation between groups. Hierarchical
models can include varying coefficients to quantify how relationships between
variables at one level may depend on variables at other levels. Models of
complex and heterogeneous environmental and ecological systems can benefit
from a hierarchical structure, which accommodates more complexity and allows
different principles to apply at different scales. However, compared to
single-level models, hierarchical models are considerably more complex and
more difficult to implement. Therefore, there is a strong need to understand the
conditions where the additional work of fitting a hierarchical model is necessary.
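To make the distinction concrete, the following is a minimal Python sketch and not the method used in this research. It assumes an intercept-only Poisson model, simulated counts for three hypothetical groups, maximum-likelihood rates in place of a Bayesian fit, and AIC as an illustrative penalised criterion; the function names and rate values are invented for illustration. Group-specific rates stand in for the varying coefficients of a hierarchical model, without the partial pooling that a true hierarchical model would add.

```python
import math
import random


def poisson_sample(rate, rng):
    """Draw one Poisson variate with Knuth's method (adequate for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def poisson_loglik(counts, rate):
    """Poisson log-likelihood of a list of counts under a single rate."""
    if rate <= 0:
        # A zero rate can only explain all-zero counts.
        return 0.0 if not any(counts) else float("-inf")
    return sum(y * math.log(rate) - rate - math.lgamma(y + 1) for y in counts)


def compare_structures(groups):
    """Return maximised log-likelihoods (pooled, by_group) for grouped counts."""
    all_counts = [y for g in groups for y in g]
    # Single-level structure: one rate shared by every observation.
    pooled = poisson_loglik(all_counts, sum(all_counts) / len(all_counts))
    # Group-varying structure: each group gets its own rate.
    by_group = sum(poisson_loglik(g, sum(g) / len(g)) for g in groups)
    return pooled, by_group


rng = random.Random(1)
true_rates = [2.0, 5.0, 9.0]  # hypothetical group-level rates
groups = [[poisson_sample(r, rng) for _ in range(50)] for r in true_rates]

pooled, by_group = compare_structures(groups)
# AIC = 2k - 2 log L, with k = 1 rate (pooled) or one rate per group.
aic_pooled = 2 * 1 - 2 * pooled
aic_by_group = 2 * len(groups) - 2 * by_group
print(f"AIC pooled: {aic_pooled:.1f}, AIC by group: {aic_by_group:.1f}")
```

Because the group-specific fit nests the pooled one, its log-likelihood can never be lower, which is why some penalty for complexity is needed before concluding that the extra structure is worthwhile.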
In this research, I aimed to identify the statistical conditions under which
hierarchical models provide a better fit to complex data than single-level
models. This involved the analysis of an empirical ecological dataset in tandem
with a large simulation study of 70,000 datasets. The simulation study provided
a way to analyse a large range of datasets with known structure, uncertainty
and relationships among the variables, while the empirical study provided an
avenue to test the approach in a real setting with noisy data.
For the simulation study, I specified both the single-level and the hierarchical
models as Poisson regressions (due to the importance of this distribution in a
large number of ecological studies) in a Bayesian framework. The Bayesian
framework is a flexible approach that is used increasingly to quantify
environmental and ecological processes, and guide decision making. A key
feature of Bayesian approaches is the capacity to quantify uncertainty at all
levels of the model and propagate that uncertainty through to the response or
outcome variable. This ensures that uncertainty around predictions from the
model, be they ecological responses to natural disturbances or management
interventions, is naturally included in the model output. Moreover, hierarchical
Bayesian models offer great promise in quantifying multiscale processes and
developing complex probabilistic models that reflect underlying ecological
processes. The results of the simulation study identified the datasets, among
the 70,000, for which a hierarchical model fit better than a single-level model.
Based on the findings, I developed a statistical tool for model-structure selection
that can be applied efficiently to a dataset and recommends a model
structure with an accompanying reliability of recommendation. This tool
provides a quantitative approach to inform users when the additional effort of
hierarchical modelling would provide a better model fit and when the simpler
single-level model structures are appropriate. To demonstrate the applicability
of the proposed model-structure selection tool, I applied it to
three empirical ecological datasets. I also developed both single-level and
hierarchical models on these datasets and compared standard goodness-of-fit
metrics to the recommendation from my proposed tool. For all datasets, the
proposed tool recommended, with a very high reliability of recommendation, the
same model structure that the modelling results selected as the better fit. | |