Evaluating State of the Art Voice Conversion Models for Dysphonic and Electro-Larynx Speech

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

Pathological speech, caused by dysphonia or produced via electro-larynx devices, often suffers from poor intelligibility and unnatural prosody. In this paper, we investigate the potential of four state-of-the-art voice conversion models (FreeVC, QuickVC, LLVC, and XVC) for restoring healthy-sounding speech. All models are fine-tuned on Austrian-German datasets and evaluated using objective and subjective metrics. Results show substantial gains in intelligibility, naturalness, and perceived vocal health. QuickVC, FreeVC, and XVC perform similarly and achieve the highest preference scores, exceeding those of unprocessed pathological speech by up to 200%. These findings highlight the potential to improve communication for individuals with voice disorders and motivate further development of efficient, high-quality conversion systems.
Original language: English
Title of host publication: Models and Analysis of Vocal Emissions for Biomedical Applications
Subtitle of host publication: 14th International Workshop
Publisher: Firenze University Press
Pages: 33-36
ISBN (Electronic): 979-12-215-0821-5
ISBN (Print): 979-12-215-0820-8
Publication status: Published - 2025
Event: 14th International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2025 - Firenze, Italy
Duration: 16 Dec 2025 to 17 Dec 2025

Conference

Conference: 14th International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2025
Country/Territory: Italy
City: Firenze
Period: 16/12/25 to 17/12/25

Fields of Expertise

  • Information, Communication & Computing
