Random forest algorithm yields accurate quantitative prediction models of benthic light at intertidal sites affected by toxic Lyngbya majuscula blooms
Targeted high-frequency monitoring and modern machine learning methods are shown to yield highly predictive models of benthic light flux. A state-of-the-art machine learning technique was used in conjunction with a high-frequency dataset to calibrate and test predictive benthic light models for sites dominated by the toxic cyanobacterium Lyngbya majuscula. Datasets of incident light flux, water depth, wind speed, wind direction and benthic light flux were collected at two sites in Moreton Bay, Queensland (Deception Bay and Eastern Banks), both of which are affected by the toxic marine cyanobacterium L. majuscula. Random forest models were calibrated and validated using these datasets. The models predict benthic photosynthetically active radiation (PAR) based on wind speed, wind direction, water depth and incident light. The two benthic light models produced have high predictive capacity, with R² values of the order of 0.8 for both calibration and validation. Wind speed and/or direction influenced the prediction of benthic PAR in the model at each site, suggesting that wind-driven sediment resuspension is an important process in controlling water clarity, and hence benthic PAR, at both sites; this result is consistent with many previous examples from coastal regions. In particular, wind direction had a marked effect on benthic PAR at the partially sheltered Deception Bay site compared to the fully exposed Eastern Banks site. Consistent with other harmful algal bloom (HAB) species, eutrophication has been identified as a common link between sites affected by L. majuscula blooms. However, we show here that at least one major bloom of L. majuscula in Deception Bay coincided with light conditions for optimal L. majuscula photosynthesis, supporting the hypothesis that toxic blooms result from a confluence of causes rather than just one. We argue that at eutrophied sites, variation in parameters such as temperature and light, rather than nutrients, is more likely to cause blooms.
This study demonstrates that collecting high-frequency variables related to aquatic light flux, combined with state-of-the-art machine learning techniques such as the random forest algorithm, can produce highly accurate models of light flux at shallow coastal sites. These models provide management with the beginnings of an early warning system, and scientists with a practical tool for catching a bloom at the incipient stage rather than after it has already developed.
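As an illustration of how the calibration and validation scores quoted above might be computed, the following is a minimal sketch that assumes R² denotes the usual coefficient of determination, 1 - SS_res / SS_tot. The record does not state which definition the authors used. The function name `r_squared` and the sample values are hypothetical and are not data from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumption: this is the R^2 definition behind the reported scores;
    the source record does not specify it.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared error of the model predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: squared spread of observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical benthic PAR values, for illustration only.
observed_par = [10.0, 20.0, 30.0, 40.0]
predicted_par = [12.0, 18.0, 33.0, 37.0]
score = r_squared(observed_par, predicted_par)
print(round(score, 3))
```

Under this definition, the same function would be applied once to the calibration data and once to held-out validation data. Similar scores on both, as reported here, suggest that a model is not merely fitting noise in the calibration set.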
Conservation and Biodiversity