    A Text to speech synthesis system for the Mongolian language

    Author(s)
    Davaatsagaan, Mugi
    Paliwal, Kuldip
    Griffith University Author(s)
    Paliwal, Kuldip K.
    Davaatsagaan, Mugi
    Year published
    2007
    Abstract
    The first Text-to-Speech (TTS) system for the Mongolian language has been implemented using the general speech synthesis architecture of Festival. The conversion from input text to an acoustic waveform is performed in a number of steps, each handled by a functional component. The TTS is based on diphone concatenative synthesis using the TD-PSOLA technique. Hand-written letter-to-sound rules are applied in sequence, mapping strings of letters to strings of phones. Prosodic phrasing is provided by a CART tree that makes decisions based on distance from punctuation and on whether the current word is a function or content word. Intonation is provided by a CART tree predicting ToBI accents and by an F0 contour generated from a model trained on natural speech. The duration model is also trained from data using a CART tree. The quality of the synthesised speech is assessed in terms of acceptability and intelligibility. The synthetic speech produced by the current version of the system is intelligible, but utterances sometimes suffer from a lack of naturalness and fluency.
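
    The two front-end steps named in the abstract (ordered letter-to-sound rules, then diphone units for concatenation) can be sketched roughly as follows. This is only a minimal illustration in Python, not the authors' Festival implementation: the rule table, the phone symbols, the silence label `pau`, and the function names `letters_to_phones` and `phones_to_diphones` are all hypothetical and use a toy Latin-letter alphabet rather than real Mongolian rules.

```python
# Toy rule table: (letter string, phone). HYPOTHETICAL values for illustration
# only; these are not the rules or the phone set used in the paper.
# Rules are tried in order, so longer patterns are listed before shorter ones.
RULES = [
    ("sh", "S"),
    ("ch", "tS"),
    ("aa", "a:"),
    ("a", "a"),
    ("b", "b"),
    ("n", "n"),
    ("r", "r"),
]


def letters_to_phones(word, rules=RULES):
    """Map a string of letters to a list of phones using ordered rules."""
    phones = []
    i = 0
    while i < len(word):
        for pattern, phone in rules:
            # The first rule whose pattern matches at position i wins.
            if word.startswith(pattern, i):
                phones.append(phone)
                i += len(pattern)
                break
        else:
            # No rule matched: fail loudly instead of guessing a phone.
            raise ValueError(f"no rule matches {word[i:]!r}")
    return phones


def phones_to_diphones(phones, silence="pau"):
    """Pair adjacent phones into diphone names, padded with silence."""
    padded = [silence] + list(phones) + [silence]
    return [f"{left}-{right}" for left, right in zip(padded, padded[1:])]


if __name__ == "__main__":
    phones = letters_to_phones("baar")
    print(phones)
    print(phones_to_diphones(phones))
```

    In a diphone synthesiser each resulting name would index a recorded unit in the diphone database; the prosody models (phrasing, intonation, duration) and the TD-PSOLA signal processing described in the abstract are outside the scope of this sketch.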
    Conference Title
    Griffith School of Engineering Research Conference (GSERC)
    Publisher URI
    https://www.griffith.edu.au/griffith-sciences/school-engineering-built-environment
    Publication URI
    http://hdl.handle.net/10072/141604
    Collection
    • Conference outputs
