Welcome

Thanks for clicking the link. This site hosts the evaluation of Text-To-Speech models developed as part of the research for my master's thesis.

Text-To-Speech systems convert written text into spoken audio.

The evaluation set contains 50 sentences, with three generated audio samples for each. The full evaluation should take about 15 minutes.

During the evaluation, you should take into account two parameters: intelligibility and naturalness.

Intelligibility is how well you can understand the words spoken in the audio. Naturalness is how close the audio sounds to natural human speech.

Based on these two parameters, you should give each audio sample a score from 1 to 5.

There are known problems with audio playback in Firefox and Safari. Google Chrome is known to work as expected.

Start