Spectral Subband Centroids for Robust Speaker Identification using Marginalization-based Missing Feature Theory

File version

Accepted Manuscript (AM)

Author(s)
Nicolson, Aaron
Hanson, Jack
Lyons, James
Paliwal, Kuldip
Date
2018
Abstract

Until now, marginalization-based Missing Feature Theory (MFT) for speech classification has been limited to the use of Log Spectral Subband Energies (LSSEs) as features. These features are highly correlated, and thus suboptimal for classification with diagonal-covariance Gaussian Mixture Models (GMMs), a common classifier in marginalization-based MFT. In this paper, we propose that Spectral Subband Centroids (SSCs) are better suited to marginalization-based MFT, as they are both decorrelated and spectrally local. Our results show that, in a marginalization-based MFT Automatic Speaker Identification (ASI) system using diagonal-covariance GMMs, SSC features are more robust than LSSE features at all tested SNR values (with Additive White Gaussian Noise (AWGN)). It is also shown that a fully-connected Deep Neural Network (DNN) can accurately estimate the Ideal Binary Mask (IBM) used for MFT.
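
As a rough illustration of the two ideas named in the abstract, the following Python sketch computes subband centroids from a power spectrum and evaluates a diagonal-covariance GMM on only the reliable feature dimensions. It is not the authors' implementation. The rectangular, non-overlapping subbands, the exponent `gamma`, and the function names `subband_centroids` and `marginal_log_likelihood` are assumptions made for this example; the abstract does not specify the filterbank or any parameter values.

```python
import numpy as np

def subband_centroids(power_spectrum, freqs, band_edges, gamma=1.0):
    """Centroid frequency of each subband: sum(f * P^gamma) / sum(P^gamma).

    Assumes rectangular, non-overlapping subbands given by band_edges
    (an illustrative choice, not taken from the paper).
    """
    centroids = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        in_band = (freqs >= lo) & (freqs < hi)
        weights = power_spectrum[in_band] ** gamma
        centroids.append(np.sum(freqs[in_band] * weights) / np.sum(weights))
    return np.array(centroids)

def marginal_log_likelihood(x, mask, weights, means, variances):
    """Log-likelihood of x under a diagonal-covariance GMM, using only
    the dimensions marked reliable (True) in the binary mask.

    With diagonal covariances, marginalizing out an unreliable dimension
    amounts to dropping its term from each component's product.
    """
    r = np.asarray(mask, dtype=bool)
    diff = x[r] - means[:, r]                      # shape (K, n_reliable)
    log_gauss = -0.5 * (np.log(2 * np.pi * variances[:, r])
                        + diff ** 2 / variances[:, r])
    log_comp = np.log(weights) + log_gauss.sum(axis=1)
    m = log_comp.max()                             # log-sum-exp for stability
    return m + np.log(np.exp(log_comp - m).sum())
```

In a speaker identification setting, one such GMM would be trained per speaker, and the speaker whose model gives the highest summed marginal log-likelihood over all frames would be selected. The mask would come from an estimate of the IBM.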

Journal Title

International Journal of Signal Processing Systems

Volume

6

Issue

1

Rights Statement

© 2018 Springer New York. This is an electronic version of an article published in Journal of Signal Processing Systems, Volume 6, No. 1, March 2018. Journal of Signal Processing Systems is available online at: http://link.springer.com/ with the open URL of your article.

Subject

Natural Language Processing
