dc.contributor.author | Stark, A | |
dc.contributor.author | Paliwal, K | |
dc.date.accessioned | 2017-05-03T12:28:10Z | |
dc.date.available | 2017-05-03T12:28:10Z | |
dc.date.issued | 2011 | |
dc.date.modified | 2012-04-10T23:50:26Z | |
dc.identifier.issn | 0167-6393 | |
dc.identifier.doi | 10.1016/j.specom.2010.11.004 | |
dc.identifier.uri | http://hdl.handle.net/10072/44400 | |
dc.description.abstract | In this paper, we derive a minimum mean square error log-filterbank energy estimator for environment-robust automatic speech recognition. While several such estimators exist in the literature, most involve trade-offs between simplifications of the log-filterbank noise distortion model and analytical tractability. To avoid this limitation, we extend a well-known spectral domain noise distortion model for use in the log-filterbank energy domain. To do this, several mathematical transformations are developed to transform spectral domain models into filterbank and log-filterbank energy models. As a result, a new estimator is developed that allows for robust estimation of both log-filterbank energies and subsequent Mel-frequency cepstral coefficients. The proposed estimator is evaluated on the Aurora2 and RM speech recognition tasks, with results showing a significant reduction in word recognition error over both baseline results and several competing estimators. | |
dc.description.peerreviewed | Yes | |
dc.description.publicationstatus | Yes | |
dc.language | English | |
dc.language.iso | eng | |
dc.publisher | Elsevier | |
dc.publisher.place | Netherlands | |
dc.relation.ispartofstudentpublication | N | |
dc.relation.ispartofpagefrom | 403 | |
dc.relation.ispartofpageto | 416 | |
dc.relation.ispartofissue | 3 | |
dc.relation.ispartofjournal | Speech Communication | |
dc.relation.ispartofvolume | 53 | |
dc.rights.retention | Y | |
dc.subject.fieldofresearch | Artificial intelligence not elsewhere classified | |
dc.subject.fieldofresearch | Cognitive and computational psychology | |
dc.subject.fieldofresearch | Linguistics | |
dc.subject.fieldofresearchcode | 460299 | |
dc.subject.fieldofresearchcode | 5204 | |
dc.subject.fieldofresearchcode | 4704 | |
dc.title | MMSE estimation of log-filter bank energies for robust speech recognition | |
dc.type | Journal article | |
dc.type.description | C1 - Articles | |
dc.type.code | C - Journal Articles | |
gro.faculty | Griffith Sciences, Griffith School of Engineering | |
gro.date.issued | 2011 | |
gro.hasfulltext | No Full Text | |
gro.griffith.author | Paliwal, Kuldip K. | |