2019:Research/How can Research Help in Reducing the Language Gap on Wikipedia

Abstract

While Wikipedia exists in 287 languages, its content is unevenly distributed among them. It is therefore of the utmost social and cultural interest to address languages whose native speakers have access only to an impoverished Wikipedia. In this presentation, we will present work from three research directions. The first is concerned with creating stub articles in the Wikipedias of underserved languages; here we will present our previous work on the ArticlePlaceholder project and its textual extension.

The second research direction is concerned with building a better editing experience for Wikipedia editors in underserved languages. Here we will present several research projects on article and section suggestion, the Content Translation tool, and finally the ongoing Scribe project.

Finally, we believe that one of the crucial success factors for any research project is listening to its target audience. In addition to the language gap, there is another gap between the research community and the Wikipedia editor community. To close this gap, community studies are essential for better understanding the community's needs and expectations. In the last part of our presentation, we will present several qualitative studies in which we interviewed active editors from underserved language communities on Wikipedia to understand their perception of automatically generated text on Wikipedia and their needs in terms of tools to enhance their editing experience.

Authors

Hady Elsahar (Naver Labs Europe), Lucie-Aimée Kaffee (University of Southampton & TIB Hannover)

Relevance to Wikimedia Community

In our presentation, we will present several projects related to closing the gap between different language Wikipedias. The first part will cover the ArticlePlaceholder project, which is already deployed to 14 languages on Wikipedia. The other project we will discuss is Scribe, our Wikimedia-funded project, which will run from July 2019 to July 2020. For both projects, we will introduce the related research, the current results, and promising future directions that fellow researchers may be interested in working on. To show where both projects sit in the Wikipedia research ecosystem, we will also include other prominent projects from the Wikimedia research community.

Session type

22-min presentation.

