Online / 5 & 6 February 2022


How to Start a Language on Mozilla Common Voice?

A case study for the under-resourced Turkish language


As of December 2021, Mozilla Common Voice lists 154 locales, but only 87 have fulfilled the requirements to start collecting voices, and 27 of those are fairly new. In this two-part presentation, we offer some starting points for new language communities and share the knowledge we accumulated over the last year while working on the under-resourced Turkish language, along with initial training results.

The presentation covers the following topics: resources on Mozilla Common Voice, how to analyze your dataset, how to set goals, how to design a social media campaign, what tools you can use, Google Colab notebooks, Coqui STT, and our roundups on training with the Common Voice Turkish dataset, v1 through v7.0. Throughout, we present the successes and failures of the Common Voice Turkish Volunteers group as lessons learned.

Speakers

Bülent Özden
