Abstract
In recent years, the development of automatic speech recognition systems has ensured their widespread use in a broad range of areas. Most of these systems, however, require large amounts of training data, making them less suitable for low-resourced languages and for smaller varieties of (well-resourced) languages. This paper focuses on improving automatic speech recognition for Austrian German by means of training data augmentation through neural network-based text-to-speech synthesis. For this purpose, speaker embedding vectors are extracted from an existing corpus, and new voices are generated by interpolating between these vectors. The synthesised speech is then used to train an automatic speech recognition system, comparing training sets that contain different proportions of synthesised speech. Overall, we find that performance improves when the amounts of real and synthesised speech are of the same order of magnitude.
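The voice-generation step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which interpolation scheme, weighting, or normalisation the authors use, so the linear interpolation, the uniformly sampled weight, the unit-length normalisation, and all function names below are assumptions made for illustration only.

```python
import math
import random


def interpolate_embeddings(a, b, alpha):
    """Linearly interpolate between two speaker embeddings.

    alpha = 0.0 returns a, alpha = 1.0 returns b.
    Linear interpolation is an assumption; the abstract does not name the scheme.
    """
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimensionality")
    return [(1.0 - alpha) * x + alpha * y for x, y in zip(a, b)]


def l2_normalise(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return list(v) if norm == 0.0 else [x / norm for x in v]


def generate_voices(embeddings, n_new, seed=0):
    """Create n_new synthetic speaker embeddings from random speaker pairs.

    Hypothetical helper: pair selection and weight sampling are illustrative.
    """
    rng = random.Random(seed)
    voices = []
    for _ in range(n_new):
        a, b = rng.sample(embeddings, 2)  # two distinct source speakers
        alpha = rng.uniform(0.0, 1.0)     # interpolation weight (assumed uniform)
        voices.append(l2_normalise(interpolate_embeddings(a, b, alpha)))
    return voices
```

Each resulting vector could then condition a multi-speaker text-to-speech model in place of a real speaker's embedding, which is how a new voice would be obtained under these assumptions.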
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 20th International Congress of Phonetic Sciences |
| Place of Publication | Prague |
| Publisher | International Phonetic Association |
| Pages | 3126-3130 |
| ISBN (Electronic) | 978-80-908114-2-3 |
| Publication status | Published - 2023 |
| Event | 20th International Congress of Phonetic Sciences: ICPhS 2023 - Prague, Czech Republic. Duration: 7 Aug 2023 → 11 Aug 2023. https://www.icphs2023.org/call-for-papers/ |
Conference
| Conference | 20th International Congress of Phonetic Sciences |
|---|---|
| Abbreviated title | ICPhS 2023 |
| Country/Territory | Czech Republic |
| City | Prague |
| Period | 7/08/23 → 11/08/23 |
Fingerprint
Dive into the research topics of 'Speaker interpolation based data augmentation for automatic speech recognition'. Together they form a unique fingerprint.
Projects
- 1 Finished
- FWF - Spontansprache - Cross-layer language models for conversational speech
  Schuppler, B. (Consortium manager resp. coordinator with external organisations) & Schuppler, B. (Project manager on research unit)
  1/11/19 → 31/10/24
  Project: Research project