Impact of Feature Selection Algorithm on Speech Emotion Recognition Using Deep Convolutional Neural Network

File version

Version of Record (VoR)

Author(s)
Farooq, Misbah
Hussain, Fawad
Baloch, Naveed Khan
Raja, Fawad Riasat
Yu, Heejung
Zikria, Yousaf Bin
Date
2020
Abstract

Speech emotion recognition (SER) plays a significant role in human-machine interaction. Recognizing emotion from speech and classifying it precisely is a challenging task because a machine cannot understand the context of an utterance. For accurate emotion classification, emotionally relevant features must be extracted from the speech data. Traditionally, handcrafted features were used to classify emotions from speech signals; however, they do not depict the emotional state of the speaker accurately enough. In this study, the benefits of a deep convolutional neural network (DCNN) for SER are explored. For this purpose, a pretrained network is used to extract features from state-of-the-art speech emotion datasets. Subsequently, a correlation-based feature selection technique is applied to the extracted features to select the most appropriate and discriminative features for SER. For the classification of emotions, we utilize support vector machines, random forests, the k-nearest neighbors algorithm, and neural network classifiers. Experiments are performed for speaker-dependent and speaker-independent SER using four publicly available datasets: the Berlin Database of Emotional Speech (Emo-DB), Surrey Audio-Visual Expressed Emotion (SAVEE), Interactive Emotional Dyadic Motion Capture (IEMOCAP), and the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). In the speaker-dependent SER experiments, our proposed method achieves an accuracy of 95.10% for Emo-DB, 82.10% for SAVEE, 83.80% for IEMOCAP, and 81.30% for RAVDESS. Moreover, for speaker-independent SER, our method yields the best results when compared with existing SER approaches based on handcrafted features.
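The correlation-based feature selection step named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used CFS merit heuristic, merit = k·r_cf / sqrt(k + k(k−1)·r_ff), with a greedy forward search, and it uses absolute Pearson correlation as the correlation measure. This record does not state the authors' exact search strategy or correlation measure, so those choices, along with the function names `cfs_merit` and `cfs_forward_select`, are assumptions made for illustration only.

```python
import numpy as np


def cfs_merit(r_cf, r_ff, subset):
    """CFS merit of a feature subset: k*mean(r_cf) / sqrt(k + k*(k-1)*mean(r_ff)).

    r_cf: 1-D array of |feature-class| correlations.
    r_ff: 2-D array of |feature-feature| correlations.
    subset: list of feature indices.
    """
    k = len(subset)
    mean_cf = r_cf[subset].mean()
    if k == 1:
        return float(mean_cf)
    sub = r_ff[np.ix_(subset, subset)]
    # Average over off-diagonal entries only (pairwise redundancy).
    mean_ff = (sub.sum() - np.trace(sub)) / (k * (k - 1))
    return float(k * mean_cf / np.sqrt(k + k * (k - 1) * mean_ff))


def cfs_forward_select(X, y, max_features=None):
    """Greedy forward search that adds the feature giving the highest merit.

    Assumption: Pearson correlation against numeric labels stands in for
    the correlation measure; it is crude for multi-class labels.
    Stops when no candidate improves the merit.
    """
    n_features = X.shape[1]
    corr = np.abs(np.corrcoef(np.column_stack([X, y]), rowvar=False))
    corr = np.nan_to_num(corr)  # constant columns produce NaN correlations
    r_cf, r_ff = corr[:-1, -1], corr[:-1, :-1]

    selected, best_merit = [], 0.0
    limit = max_features or n_features
    while len(selected) < limit:
        candidates = [j for j in range(n_features) if j not in selected]
        merits = [cfs_merit(r_cf, r_ff, selected + [j]) for j in candidates]
        best = int(np.argmax(merits))
        if merits[best] <= best_merit:
            break  # no candidate improves the subset
        best_merit = merits[best]
        selected.append(candidates[best])
    return selected
```

In a pipeline like the one the abstract outlines, `X` would hold the features extracted by the pretrained network (one row per utterance), and the columns returned by `cfs_forward_select` would be passed on to the classifier.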

Journal Title

Sensors

Volume

20

Issue

21

Rights Statement

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Subject

Artificial intelligence

Analytical chemistry

Ecology

correlation-based feature selection

deep convolutional neural network

speech emotion recognition

Citation

Farooq, M; Hussain, F; Baloch, NK; Raja, FR; Yu, H; Zikria, YB, Impact of Feature Selection Algorithm on Speech Emotion Recognition Using Deep Convolutional Neural Network, Sensors, 2020, 20 (21), pp. 6008
