A response generation in the Mongolian spoken language system for accessing to multimedia knowledge base
Using automatic speech recognition (ASR) and text-to-speech (TTS) systems, which have been available for Mongolian for the last few years, this research set out to implement a new version of the Mongolian Virtual Education Environment (VEE), which until now has not included a speech interface. The spoken language system aims to provide a natural interface between trainees and the environment, using simple, natural dialogues that let the user access the multimedia knowledge base of the VEE. We have worked on the response generation part of the system. This paper describes a TTS system for the VEE, intended for university courses held in Mongolian. Response generation uses a concatenative speech synthesizer for Mongolian, and the Mongolian voice was built with the Festvox framework for unit selection speech synthesis. We discuss aspects of the voice development process and the results of a perceptual test of the synthesized voice.
2008 IEEE Workshop on Spoken Language Technology
Copyright 2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.