dc.contributor.advisor | Paliwal, Kuldip | |
dc.contributor.author | Stark, Anthony | |
dc.date.accessioned | 2018-01-23T02:30:14Z | |
dc.date.available | 2018-01-23T02:30:14Z | |
dc.date.issued | 2011 | |
dc.identifier.doi | 10.25904/1912/2288 | |
dc.identifier.uri | http://hdl.handle.net/10072/366490 | |
dc.description.abstract | Speech is the dominant mode of communication between humans; simple to learn, easy to use and integral to modern life. Given the importance of speech, development of a human-machine speech interface has been greatly anticipated. This challenging task is encapsulated in the digital speech processing research field. In this dissertation, two specific areas of research are considered: 1) the use of short-time Fourier spectral phase in digital speech processing and 2) use of the minimum mean square error spectral energy estimator for environment-robust automatic speech recognition. In speech processing and modelling, the short-time Fourier spectral phase has been considered of minor importance. This is because classic psychoacoustic experiments have shown speech intelligibility to be closely related to short-time Fourier spectral magnitude. Given this result, it is unsurprising that the majority of speech processing literature has involved exploitation of the short-time magnitude spectrum. Despite this, recent studies have shown useful information can be extracted from the spectral phase of speech. As a result, it is now known that spectral phase possesses much of the same intelligibility information as spectral magnitude. It is this avenue of research that is explored in greater detail within this dissertation. In particular, we investigate two phase-derived quantities – the short-time instantaneous frequency spectrum and the short-time group delay spectrum. The properties of both spectra are investigated mathematically and empirically, identifying the relationship between known speech features and the underlying phase spectrum. We continue the investigation by examining two related quantities – the instantaneous frequency deviation and the group delay deviation. As a result of this research, two novel phase-based spectral representations are proposed, both of which show a high degree of information applicable to speech processing. | |
dc.language | English | |
dc.publisher | Griffith University | |
dc.publisher.place | Brisbane | |
dc.rights.copyright | The author owns the copyright in this thesis, unless stated otherwise. | |
dc.subject.keywords | Digital speech processing | |
dc.subject.keywords | Speech recognition | |
dc.subject.keywords | Fourier spectral magnitude | |
dc.subject.keywords | Automatic speech recognition | |
dc.title | Phase Spectrum Based Speech Processing and Spectral Energy Estimation for Robust Speech Recognition | |
dc.type | Griffith thesis | |
gro.faculty | Science, Environment, Engineering and Technology | |
gro.rights.copyright | The author owns the copyright in this thesis, unless stated otherwise. | |
gro.hasfulltext | Full Text | |
dc.contributor.otheradvisor | So, Stephen | |
dc.rights.accessRights | Public | |
gro.identifier.gurtID | gu1341791048255 | |
gro.source.ADTshelfno | ADT0 | |
gro.source.GURTshelfno | GURT1240 | |
gro.thesis.degreelevel | Thesis (PhD Doctorate) | |
gro.thesis.degreeprogram | Doctor of Philosophy (PhD) | |
gro.department | Griffith School of Engineering | |
gro.griffith.author | Stark, Anthony P. | |