Master Thesis and Semester Projects
Synthetic Data Generation for Automated Speech Recognition in Impaired Speech
Automatic Speech Recognition (ASR) for individuals with impaired speech is severely hampered by data scarcity. This project addresses the problem by developing a personalized Text-to-Dysarthric-Speech (TTDS) model to serve as an advanced data augmentation method. Unlike assistive technologies that aim to correct speech impairments, the primary goal here is to faithfully clone a speaker’s unique impaired speech patterns. Using state-of-the-art generative audio models (e.g., VITS), the system will learn to generate synthetic yet realistically impaired speech data from very few recordings. A key innovation will be to leverage phoneme uncertainty analyses from prior work [5] to guide the synthesis process, enabling the targeted generation of more realistic phonetic deviations. This project is designed for a highly motivated, independent individual ready to take ownership of a challenging research topic.