Application of ASR to a sociolinguistic corpus of Australian English
File version
Version of Record (VoR)
Author(s)
Weiss, M
Gnevsheva, Ksenia
Travis, Catherine
Docherty, Gerard
Date
2024
Location
Melbourne, Australia
Abstract
This study applies Automatic Speech Recognition (ASR) to a sociolinguistic corpus of Australian English. We compare a human transcription of excerpts from 20 urban and regional speakers with a transcription generated by Microsoft's Azure AI Speech. The Word Error Rate is comparable to that reported in previous studies and is not affected by the sociolinguistic variables of speaker region and gender, nor by the phonetic variable of vowel formants. Despite the overall low rate of transcription errors, our findings suggest that the quality of certain vowel categories that are particularly characteristic of Australian English can affect the accuracy of the ASR-generated transcription.
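As a rough illustration of the evaluation metric named in the abstract, the Python sketch below computes a Word Error Rate between a human reference transcript and an ASR hypothesis using word-level Levenshtein distance. The function name and the example sentences are hypothetical and are not taken from the study's materials.

    def word_error_rate(reference: str, hypothesis: str) -> float:
        """WER = (substitutions + deletions + insertions) / number of reference words."""
        ref = reference.lower().split()
        hyp = hypothesis.lower().split()
        if not ref:
            raise ValueError("Reference transcript must contain at least one word.")

        # Dynamic-programming edit distance over words (Levenshtein).
        d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            d[i][0] = i
        for j in range(len(hyp) + 1):
            d[0][j] = j
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                cost = 0 if ref[i - 1] == hyp[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,          # deletion
                              d[i][j - 1] + 1,          # insertion
                              d[i - 1][j - 1] + cost)   # substitution
        return d[len(ref)][len(hyp)] / len(ref)

    if __name__ == "__main__":
        # Illustrative toy pair only; not drawn from the corpus.
        ref = "she lives out near the coast"
        hyp = "she lives out near the ghost"
        print(f"WER: {word_error_rate(ref, hyp):.2f}")  # 1 substitution over 6 words, about 0.17

In the study itself the comparison is made over excerpts from 20 speakers rather than a single sentence pair; the toy example only shows how the metric is defined.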
Conference Title
Proceedings of the Nineteenth Australasian International Conference on Speech Science and Technology
Rights Statement
This work is covered by copyright. You must assume that re-use is limited to personal use and that permission from the copyright owner must be obtained for all other uses. If the document is available under a specified licence, refer to the licence for details of permitted re-use. If you believe that this work infringes copyright please make a copyright takedown request using the form at https://www.griffith.edu.au/copyright-matters.
Subject
Phonetics and speech science
Sociolinguistics
Citation
Weiss, M; Gnevsheva, K; Travis, C; Docherty, G, Application of ASR to a sociolinguistic corpus of Australian English, Proceedings of the Nineteenth Australasian International Conference on Speech Science and Technology, 2024, pp. 27-31