2023:Program/Research, Science, and Medicine/XLHYKP-Bridge the Digital Language Divide: Can Machine Translation Narrow Knowledge Gap?

From Wikimania

Title: Bridge the Digital Language Divide: Can Machine Translation Narrow Knowledge Gap?

Speakers:

Kai Zhu

I am an Assistant Professor at Bocconi University. I study how technology and digitization change our society.

Pretalx link

Etherpad link

Room: Room 324

Start time: Fri, 18 Aug 2023 11:15:00 +0800

End time: Fri, 18 Aug 2023 11:35:00 +0800

Type: No (pretalx) session type id specified

Track: Research, Science, and Medicine

Submission state: confirmed

Duration: 20 minutes

Do not record: true

Presentation language: en


Abstract & description

Abstract

This research project investigates the role of machine translation in content creation across different language editions of Wikipedia, with a focus on its impact on the knowledge gap.

Description

The partnership between Wikipedia and Google Translate offers a rare opportunity to understand whether and how a new AI-enabled technology can narrow the knowledge gap in sociotechnical systems. Leveraging this unique setting, we use tools and techniques from econometric modeling, causal inference, and natural language processing to investigate three sets of closely related questions about the impact of machine translation on Wikipedia. First, how does a better machine translation service enable knowledge transfer between languages? Does it mostly support knowledge outflow from a few major language editions such as English and French, or does it support a bidirectional, more mutual exchange of information between language editions? Second, how does Google Translate change the patterns of collaboration and coordination between human editors and machine intelligence? Specifically, how do human editors change their roles in content production when machines provide a good initial translation? Third, a large portion of the articles in each Wikipedia language edition is locally relevant, culture-specific content. Does machine translation also help such local content spread?

Further details

Qn. How does your session relate to the event themes: Diversity, Collaboration, Future?

Knowledge gaps are widespread in online spaces and digital systems. The term usually refers to disparities in the distribution of information and knowledge throughout a social system. Despite being one of the most successful open collaboration platforms, Wikipedia also suffers from this problem (Graham et al., 2014; Zhu et al., 2020). With the objective of providing “free access to the sum of all human knowledge”, Wikipedia is now part of the essential infrastructure of knowledge repositories in the digital space. As the 2030 Wikimedia Strategic Direction points out (Leila et al., 2019), it is becoming increasingly critical to address the knowledge gap on Wikipedia so that it can better serve audiences, communities, and cultures that have traditionally been left out by structures of power and privilege. The knowledge gap across languages is notoriously hard to close, however, because it is difficult to recruit volunteers to contribute content in low-resource languages. In this study, we examine whether and how state-of-the-art neural machine translation can narrow the knowledge gap across different language editions of Wikipedia.

Qn. What is the experience level needed for the audience for your session?

Everyone can participate in this session

Qn. What is the most appropriate format for this session?

  • Onsite in Singapore (selected)
  • Remote online participation, livestreamed (not selected)
  • Remote from a satellite event (not selected)
  • Hybrid with some participants in Singapore and others dialing in remotely (not selected)
  • Pre-recorded and available on demand (not selected)