A novel protocol for exploratory analysis of unknown sound-types in large acoustic datasets
File version
Version of Record (VoR)
Author(s)
Turlington, K
Suárez-Castro, AF
Teixeira, D
Linke, S
Sheldon, F
Abstract
Current ecoacoustic analysis methods are unsuitable for exploring unknown sound-types in large acoustic datasets. Ecoacoustic studies can collect considerable quantities of audio with minimal field effort; however, analysing these recordings effectively remains a challenge. Manual annotation is labour-intensive, acoustic indices only summarise soundscape patterns, and machine learning tools like BirdNET enable species-level identification but are not optimised for non-terrestrial taxa and cannot explore unknown sound-types. This creates a clear need for exploratory methods that can efficiently identify unknown sound-types, particularly in data-deficient environments. We present a protocol to identify sound-types in ecoacoustic recordings using beta acoustic indices and nested clustering (a multi-level method where clusters contain sub-clusters). Compared to existing methods, our protocol offers a more adaptable framework for identifying sound-types in unsurveyed or data-poor environments. It is suitable for large acoustic datasets and does not require advanced computational skills. To our knowledge, this is the first protocol to combine beta acoustic indices with nested clustering to identify sound-types in ecoacoustic data. Limited evidence exists on how window length (WL) influences beta index calculations, so we tested 11 indices against six WLs and multiple cluster quantities. Our nested clustering approach addressed challenges including background noise, acoustic overlap and data imbalances (e.g. unequal quantities of recurring sound-types). We tested our protocol in a stream soundscape because freshwater ecosystems are underexplored, lack a global underwater sound database and contain many unidentified sounds. We used a systematic testing framework to identify optimal combinations of beta indices and WL for sound-type clustering. For the studied system, the Kolmogorov–Smirnov index with a WL of 2048 produced the best results.
This combination scored ≥0.75 (normalised, 0–1) on all external validation metrics, achieved a true positive rate of over 90% and identified almost 90% of sound-types. This work presents a streamlined approach for identifying unknown sound events in large audio datasets using minimal manual effort and demonstrates a novel use-case for beta indices. We anticipate this method will inspire new applications of beta indices and user-friendly analysis tools for big data, advancing ecoacoustic analyses alongside technological developments.
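The abstract's core idea, a beta index computed at a chosen window length and fed into nested clustering, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the Kolmogorov–Smirnov index is the maximum absolute difference between the cumulative, sum-normalised mean spectra of two clips, and that WL is the FFT window length in samples. Average-linkage hierarchical clustering is a stand-in, since the abstract does not name the clustering algorithm. All function names and the parameters `k_top` and `k_sub` are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def mean_spectrum(signal, wl=2048):
    """Mean magnitude spectrum over non-overlapping Hann windows of length wl,
    scaled to sum to 1 so it can be treated as a probability mass function."""
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // wl
    if n_frames == 0:
        raise ValueError("signal is shorter than one window")
    frames = signal[: n_frames * wl].reshape(n_frames, wl) * np.hanning(wl)
    spec = np.abs(np.fft.rfft(frames, axis=1)).mean(axis=0)
    return spec / spec.sum()


def ks_distance(spec_a, spec_b):
    """Assumed KS index: largest gap between the two cumulative spectra."""
    return float(np.max(np.abs(np.cumsum(spec_a) - np.cumsum(spec_b))))


def distance_matrix(clips, wl=2048):
    """Pairwise KS distances between the mean spectra of all clips."""
    specs = [mean_spectrum(c, wl) for c in clips]
    n = len(specs)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d[i, j] = d[j, i] = ks_distance(specs[i], specs[j])
    return d


def cluster(d, k):
    """Average-linkage hierarchical clustering, cut into at most k clusters.
    (Stand-in algorithm; the source does not specify one.)"""
    if len(d) < 2:
        return np.ones(len(d), dtype=int)
    z = linkage(squareform(d, checks=False), method="average")
    return fcluster(z, t=k, criterion="maxclust")


def nested_clusters(d, k_top=2, k_sub=2):
    """Cluster once, then re-cluster inside each top-level cluster.
    Returns one (top_label, sub_label) pair per clip."""
    top = cluster(d, k_top)
    sub = np.zeros(len(d), dtype=int)
    for t in np.unique(top):
        idx = np.where(top == t)[0]
        sub[idx] = cluster(d[np.ix_(idx, idx)], k_sub)
    return list(zip(top.tolist(), sub.tolist()))
```

Under these assumptions, `nested_clusters(distance_matrix(clips, wl=2048))` would assign each clip a top-level cluster and a sub-cluster. Re-running with other `wl` values or another distance function mirrors, in miniature, the kind of index-by-WL comparison the abstract describes.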
Journal Title
Methods in Ecology and Evolution
Rights Statement
© 2025 The Author(s). Methods in Ecology and Evolution published by John Wiley & Sons Ltd on behalf of British Ecological Society. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
Note
This publication has been entered in Griffith Research Online as an advance online version.
Subject
Ecology
Zoology
Environmental management
Citation
Turlington, K; Suárez-Castro, AF; Teixeira, D; Linke, S; Sheldon, F, A novel protocol for exploratory analysis of unknown sound-types in large acoustic datasets, Methods in Ecology and Evolution, 2025