2019:Languages/Wikispeech - making Wikipedia accessible through speech technology/notes

From Wikimania

SESSION OVERVIEW

Title
Wikispeech - making Wikipedia accessible through speech technology
Day & time
Friday 16 August, 13:30
Session link
https://wikimania.wikimedia.org/wiki/2019:A11y/Wikispeech_-_making_Wikipedia_accessible_through_speech_technology
Speakers
Sebastian Berlin (WMSE), André Costa (WMSE)
Notetakers
Jon Harald Søby
Attendees
~40 (room capacity was 30, but tables were moved out to make more space)


SESSION SUMMARY

  • Problems:
    • 15–30 % of people are auditory learners (= learn better by listening)
    • 14 % illiteracy rate worldwide
    • These people are not served by our projects at the moment. How can we solve this?
  • Existing solutions are proprietary, normally available only in commercially profitable languages, and can be quite expensive
  • Wikispeech's purpose is to make Wikipedia accessible to anyone who faces reading difficulties
  • Wanted a server-based solution, so that not all processing has to happen on the user's device. Also, while Wikispeech is centered on the Wiki* projects, the components should be reusable by outside users as well
  • Modular, to allow as many languages as possible
  • First project period was 2016–2018
  • Currently supports English, Arabic and Swedish
  • Developer version available at wikispeech.wmflabs.org
  • Showed a video presentation of how it works
  • Second project period 2019–2021:
    • Finalize the Wikispeech reader, take it into beta
  • New related project: Wikispeech Speech Data Collector
  • Without speech data, doing text-to-speech for a language is impossible
  • Toolkit to crowdsource collecting speech data
    • Toolkit will collect, validate, and annotate speech data, and create phonetically balanced manuscripts
  • Future uses:
    • More voices, new supported languages
    • Use for Wikidata & Wiktionary
  • Possible future uses:
    • Language preservation
    • Oral citations
    • Third party tools
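
Note on "phonetically balanced manuscripts": the session did not describe how the Speech Data Collector will build them. As a rough illustration only, one common simplification is to pick recording sentences greedily so that together they cover as many distinct phonemes as possible. The sketch below assumes each candidate sentence has already been transcribed into phoneme symbols; the function name `select_balanced` and the toy data are hypothetical and not part of Wikispeech.

```python
def select_balanced(candidates, max_sentences):
    """Greedily pick sentences that add the most not-yet-covered phonemes.

    candidates: dict mapping sentence text -> list of phoneme symbols
                (assumed to be transcribed beforehand; toy data below).
    Returns the chosen sentences and the set of phonemes they cover.
    """
    covered = set()
    chosen = []
    remaining = dict(candidates)
    while remaining and len(chosen) < max_sentences:
        # Score each sentence by how many new phonemes it would add;
        # ties are broken by sentence text to keep the result deterministic.
        best = max(remaining, key=lambda s: (len(set(remaining[s]) - covered), s))
        gain = set(remaining[best]) - covered
        if not gain:
            break  # nothing left would add a new phoneme
        chosen.append(best)
        covered |= gain
        del remaining[best]
    return chosen, covered


# Hypothetical toy transcriptions, for illustration only.
toy_candidates = {
    "the cat sat": ["DH", "AH", "K", "AE", "T", "S", "AE", "T"],
    "a cat": ["AH", "K", "AE", "T"],
    "she sells": ["SH", "IY", "S", "EH", "L", "Z"],
}

chosen, covered = select_balanced(toy_candidates, max_sentences=3)
```

With this toy data, "a cat" is never chosen because it adds no phoneme that "the cat sat" does not already cover. A real manuscript would likely also weigh phoneme frequencies and combinations, which this sketch ignores.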


  • Q: It should be possible to choose the gender and dialect of the voice that speaks.
    • A: The possibility to choose different voices exists, but we need more free voices to use
  • Q: Are you consulting experts to ensure the correctness of the pronunciations (e.g. for languages like Hebrew and Arabic)?
    • A: The team is in touch with an expert from a technical university in Stockholm
  • Q: In some programs for Arabic/Hebrew, they use probability instead of grammar rules. What is used for Wikispeech?
    • A: Wikispeech uses probability, but users can correct the result with annotations where it is wrong
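
Note on the last answer: the idea of a probabilistic choice that users can correct can be sketched as follows. This is a minimal illustration under stated assumptions, not Wikispeech's actual model or API: the function `pick_pronunciation`, the probability table, and the annotation store are all hypothetical.

```python
def pick_pronunciation(word, candidates, annotations):
    """Return a user-corrected pronunciation if one exists,
    otherwise the most probable candidate.

    candidates:  dict word -> {pronunciation: probability} (hypothetical model output)
    annotations: dict word -> pronunciation supplied by a user correction
    """
    if word in annotations:
        return annotations[word]  # a user correction always wins
    options = candidates.get(word)
    if not options:
        return None  # no candidate known for this word
    # Choose the pronunciation with the highest probability.
    return max(options, key=options.get)


# Hypothetical toy data: an English word with two possible readings.
toy_candidates = {"read": {"r iy d": 0.6, "r eh d": 0.4}}

default_choice = pick_pronunciation("read", toy_candidates, {})
corrected_choice = pick_pronunciation("read", toy_candidates, {"read": "r eh d"})
```

Without an annotation the more probable reading is used; once a user has annotated the word, the annotation overrides the probabilistic guess.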