2025:Program/Wikispeech - continuing the work for a Wikipedia you can listen to

From Wikimania
Session title: Wikispeech - continuing the work for a Wikipedia you can listen to

Session type: Lecture
Track: Diversity & Inclusion
Language: en

In this presentation we will tell you about [Wikispeech](https://meta.wikimedia.org/wiki/Wikispeech), a solution that lets people listen to Wikipedia. We'll give a brief recap of what we've done so far, then describe what we're doing right now and in the near future.

In late 2024 we received a grant from the Swedish Inheritance Fund. This has helped us restart our work on Wikispeech and plan further development. We'll be working on things such as improving the experience for the end user, investigating new, modern voices, and finalising tools for crowdsourcing.

Recording: https://w.wiki/FQNX - moved

Description

Wikispeech is a text-to-speech (TTS) solution that allows people to listen to Wikipedia. It's been developed by Wikimedia Sverige since 2016 with help from expert partner organisations. In late 2024 we received a grant from the Swedish Inheritance Fund that will help us continue working on Wikispeech.

On the technical side, Wikispeech is [an extension for MediaWiki](https://www.mediawiki.org/wiki/Extension:Wikispeech), which means any wiki running that platform can install it. It uses a text-to-speech service called Speechoid. Both components are developed as free and open-source software. Wikipedia is the primary project for Wikispeech, but we plan to add support for more Wikimedia projects in the future.
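Because Wikispeech is a regular MediaWiki extension, enabling it on a wiki follows the standard pattern: place the extension code in the `extensions/` directory and load it from `LocalSettings.php`. A minimal configuration sketch (the `wfLoadExtension` call is MediaWiki's standard loading mechanism; any Speechoid-specific settings are omitted here, since their exact names should be taken from the extension's documentation):

```php
// In LocalSettings.php, after downloading the extension
// into extensions/Wikispeech/:
wfLoadExtension( 'Wikispeech' );
// Speechoid-related settings (service URL, enabled namespaces, etc.)
// would be configured here per the extension's documentation.
```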

In this project we'll be adding and improving features. These include reading out more complex parts of the content, such as tables, images and references. We'll add support for reading the interface and for use while editing. You'll also be able to listen offline and download articles as audio files.

Like all other work done in the Wikimedia projects, Wikispeech can be improved through crowdsourcing. It includes a tool for editing the pronunciations used by the TTS. Using this tool, you can improve the listening experience for everyone listening to the same article. If you don't have the knowledge to improve pronunciations directly, you can still help out by flagging when something sounds wrong. We'll also be looking at how Wikidata's lexemes can be used by Wikispeech, and how Wikispeech in turn can help by contributing new lexemes.

Another way to improve and expand Wikispeech is by contributing speech data. This data can be used to add new voices and languages to Wikispeech. Since it will be freely licensed, it can also be used by other initiatives where speech data is needed. This will help in the development of ethical AI. As part of Wikispeech we've developed the Speech Data Collector. We'll continue working on this tool to give contributors an easy way to help out.

A lot has happened in the field of TTS since we first started working on Wikispeech. This includes more natural sounding voices and the ability to create new voices with relatively little speech data. We'll be looking into what components fit in our open source solution and how to incorporate them.

Alongside the continued development we'll be starting up an accessibility academy. This will create educational resources that will help when collecting speech data. In the future it'll also expand to other accessibility areas.

How does your session relate to the event theme, Wikimania@20 – Inclusivity. Impact. Sustainability?

Wikispeech is strongly connected to inclusivity. Not everyone who wants to access the knowledge generated by the Wikimedia movement is able to do so, because much of it is presented as text. Wikispeech can help those who need or prefer to listen instead.

What is the experience level needed for the audience for your session?

Everyone can participate in this session

Resources

Speakers

  • Sebastian Berlin
...
  • Viktoria Hillerud (WMSE)