Please use this identifier to cite or link to this work: https://hdl.handle.net/1946/39414
A vocoder is a mechanism that encodes and decodes a speech signal so that it can be represented and/or transmitted efficiently. Vocoders are widely used in statistical parametric and neural speech synthesis, where a regressor predicts the vocoder's representation from input text. This study assesses five different vocoders based on three fundamentally different approaches, with the aim of showing how robust they are with respect to Icelandic. The study shows that the Wavenet and Magphase vocoders perform best in terms of retaining quality, but the Wavenet vocoder uses by far the most resources, both in computation and in memory. The study also shows that the vocoders perform equally well for English and Icelandic, but a Wavenet vocoder trained on English needs to be adapted using Icelandic data in order to maintain good performance.
Filename | Size | Access | Description | File type | |
---|---|---|---|---|---|
Vocoders.pdf | 3.86 MB | Open | Full text | PDF | View/Open |