A Comparison of LSTM and GRU for Bengali Speech-to-Text Transformation
File version
Author(s)
Sultana, Zakia
Chowdhury, Fahim
Ahmed, Sajjad
Parvez, Mohammad Zavid
Barua, Prabal Datta
Chakraborty, Subrata
Griffith University Author(s)
Primary Supervisor
Other Supervisors
Editor(s)
Date
Size
File type(s)
Location
Orlando, USA
License
Abstract
This paper presents an approach to speech-to-text conversion for the Bengali language, an area in which most existing methodologies focus on languages other than Bengali. We first prepared a novel dataset of 56 unique words recorded from 160 individual subjects. We then illustrate our approach to increasing speech-to-text accuracy for Bengali, initially using both Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) algorithms. On further observation, we found that the GRU failed to produce stable output, so we moved entirely to the LSTM algorithm, achieving 90% accuracy on an unseen dataset. Voices from several demographic populations, with added noise, were used to validate the model. In the testing phase, we evaluated a variety of classes varying in length, complexity, noise, and speaker gender. We expect this research to help in developing a real-time Bengali speech-to-text recognition model.
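As background to the GRU-versus-LSTM comparison described above, the following is a minimal NumPy sketch of a single time step of each recurrence. The weight names, shapes, and initialisation are illustrative only and are not taken from the paper; they simply show the structural difference the abstract alludes to: the GRU uses two gates and one hidden state, while the LSTM uses three gates plus a separate cell state.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU time step: update gate z, reset gate r, candidate state."""
    z = sigmoid(Wz @ x + Uz @ h)               # update gate
    r = sigmoid(Wr @ x + Ur @ h)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))   # candidate hidden state
    return (1 - z) * h + z * h_tilde           # interpolated new hidden state

def lstm_step(x, h, c, Wi, Ui, Wf, Uf, Wo, Uo, Wc, Uc):
    """One LSTM time step: input/forget/output gates and a cell state c."""
    i = sigmoid(Wi @ x + Ui @ h)               # input gate
    f = sigmoid(Wf @ x + Uf @ h)               # forget gate
    o = sigmoid(Wo @ x + Uo @ h)               # output gate
    c_new = f * c + i * np.tanh(Wc @ x + Uc @ h)  # updated cell state
    h_new = o * np.tanh(c_new)                 # new hidden state
    return h_new, c_new

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d_in, d_h = 13, 8   # e.g. 13 MFCC features per frame, hidden size 8 (illustrative)
    W = lambda m, n: rng.standard_normal((m, n)) * 0.1
    x, h, c = rng.standard_normal(d_in), np.zeros(d_h), np.zeros(d_h)

    h_gru = gru_step(x, h, W(d_h, d_in), W(d_h, d_h),
                     W(d_h, d_in), W(d_h, d_h),
                     W(d_h, d_in), W(d_h, d_h))
    h_lstm, c_lstm = lstm_step(x, h, c,
                               W(d_h, d_in), W(d_h, d_h),
                               W(d_h, d_in), W(d_h, d_h),
                               W(d_h, d_in), W(d_h, d_h),
                               W(d_h, d_in), W(d_h, d_h))
    print(h_gru.shape, h_lstm.shape, c_lstm.shape)
```

In a speech-to-text model such as the one described, either step would be applied across the sequence of acoustic frames for an utterance, with the final hidden state fed to a classifier over the 56-word vocabulary; the extra cell state and forget gate give the LSTM more capacity to retain long-range context, which is consistent with the stability difference the authors report.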
Journal Title
Conference Title
Proceedings of the 2023 International Conference on Advances in Computing Research (ACR’23)
Book Title
Edition
Volume
700
Issue
Thesis Type
Degree Program
School
Publisher link
Patent number
Funder(s)
Grant identifier(s)
Rights Statement
Item Access Status
Note
Access the data
Related item(s)
Subject
Persistent link to this record
Citation
Jahan, N; Sultana, Z; Chowdhury, F; Ahmed, S; Parvez, MZ; Barua, PD; Chakraborty, S, A Comparison of LSTM and GRU for Bengali Speech-to-Text Transformation, Proceedings of the 2023 International Conference on Advances in Computing Research (ACR’23), 2023, pp. 214-224