Is correlation dimension a reliable proxy for the number of dominant influencing variables for modeling risk of arsenic contamination in groundwater?
The correlation dimension (CD) of a time series provides information on the number of dominant variables present in the evolution of the underlying system dynamics. In this study, we use logistic regression (LR) to explore possible physical connections between the CD and the mathematical modeling of the risk of arsenic contamination in groundwater. Our database comprises a large-scale arsenic survey conducted in Bangladesh. Following the recommendation of Hossain and Sivakumar (Stoch Environ Res Risk Assess 20(1-2):66-76, 2006), who reported CD values ranging from 8 to 11 for this database, 11 variables are considered herein as indicators of the aquifer's geochemical regime with potential influence on the arsenic concentration in groundwater. A total of 2,048 possible combinations of influencing variables are considered as candidate LR risk models to delineate the impact of the number of variables on the prediction accuracy of the model. We find that the uncertainty associated with the LR risk model's prediction of wells as safe or unsafe declines systematically as the total number of influencing variables increases from 7 to 11. The sensitivity of the mean predictive performance also increases noticeably over this range. The consistent reduction in predictive uncertainty, coupled with the increased sensitivity of the mean predictive behavior within the universal sample space, exemplifies the ability of CD to function as a proxy for the number of dominant influencing variables. Such a rapid proxy, based on non-linear dynamic concepts, appears to have considerable merit for application in current management strategies for arsenic contamination in developing countries, where both time and resources are very limited. © Springer-Verlag 2006.
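As a minimal sketch of the model space described in the abstract, the following Python enumerates every subset of 11 variables and counts the subsets by size. It rests on two assumptions: the variable names are hypothetical placeholders (the abstract does not list the 11 variables), and the figure of 2,048 is read as 2^11, that is, all subsets including the empty one. Fitting an LR model to each subset is omitted.

```python
from itertools import combinations
from math import comb

# Hypothetical placeholder names; the abstract does not name the 11 variables.
VARIABLES = [f"var_{i}" for i in range(1, 12)]


def candidate_subsets(variables):
    """Yield every subset of the variables, from the empty set to the full set."""
    for size in range(len(variables) + 1):
        for subset in combinations(variables, size):
            yield subset


subsets = list(candidate_subsets(VARIABLES))

# Count how many candidate models exist for each number of variables.
counts_by_size = {}
for subset in subsets:
    counts_by_size[len(subset)] = counts_by_size.get(len(subset), 0) + 1

# Each count should match the binomial coefficient C(11, k).
assert all(counts_by_size[k] == comb(11, k) for k in counts_by_size)

print(len(subsets))                                   # total number of subsets
print({k: counts_by_size[k] for k in range(7, 12)})   # subsets with 7 to 11 variables
```

Under this reading, the range of 7 to 11 variables highlighted in the abstract covers only a small share of the candidate models, since most subsets of 11 variables contain between 4 and 7 members.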
Stochastic Environmental Research and Risk Assessment
Engineering not elsewhere classified