2023:Program/Submissions/LLMs and Wikipedia: Use Cases and Limitations - XJLNZR

From Wikimania

Title: LLMs and Wikipedia: Use Cases and Limitations

Speakers:

William Bothe (User:DraconicDark)

I have been active on Wikimedia projects since 2017, and am primarily active on English Wikipedia and Wikisource. I am from the USA, and have attended Wikimedia events previously: Wikimania 2019 in Stockholm, and WikiConference North America 2019 in Boston.

Pretalx link

Etherpad link

Room:

Start time:

End time:

Type: Lecture

Track: Technology

Submission state: submitted

Duration: 30 minutes

Do not record: false

Presentation language: en


Abstract & description

Abstract

The release of ChatGPT, a large language model (LLM) that can generate realistic and engaging conversations, has sparked a surge of interest in and adoption of LLM technology. In this presentation, I explore how LLMs can be applied to Wikipedia, the largest online encyclopedia, to enhance its quality, coverage, and accessibility. I will discuss the opportunities and challenges of using LLMs on Wikipedia, presenting various use cases, my evaluation of how LLMs perform at each, and ideas for future directions the Wikimedia movement could take with LLMs.

Description

In this presentation, I will share my experience and findings from using large language models (LLMs) on Wikipedia. LLMs are neural networks that generate natural-language text based on a given input or context. The text they generate is impressively realistic, but LLMs can also hallucinate, that is, produce inaccurate information.

I have tested LLMs on several Wikipedia-related tasks, such as:

1. Article generation: creating a new Wikipedia article from scratch, given a topic or a title. I will describe results from experiments on generating entire articles about different topics, and comment on the quality of the results. Note that any article generated by an LLM should be reviewed for accuracy and properly cited before being published.
2. Article outlining/pre-writing: generating a structured outline for a new or existing article, with sections and sub-sections. In addition, some LLMs with web search capabilities, such as Bing, can be used for source-finding; I will briefly discuss this as well.
3. Issue template resolution: addressing the issues that Wikipedia editors flag using templates, such as citation needed, tone, or clarity. I have tested LLMs' ability to resolve the issue indicated by the template, and I will describe some of the results. The issue resolutions I have tested include: lead too short, lead too long, citation needed, and list to prose.

I have tested each task on several publicly available LLMs and tools. Different models, such as ChatGPT, Bing (an implementation of GPT-4 specialized for web search), and Bard, can give different outputs, and I will describe my experiences testing similar prompts on each.

Finally, I will discuss some possible future applications of LLMs on Wikipedia, including the possibility of Wikimedia-specific LLM tools and what I believe these will look like.

Further details

Qn. How does your session relate to the event themes: Diversity, Collaboration, Future?

My session relates to the event themes in the following ways:

  • Diversity: I will show how LLMs can be used to improve the diversity and inclusivity of Wikipedia content, by generating articles on underrepresented topics, languages, and perspectives, and by addressing the biases and gaps in existing articles.
  • Collaboration: I will show how LLMs can be used as tools to facilitate collaboration and communication among Wikipedia editors and contributors. When used correctly, LLMs can enhance the capabilities of Wikipedia editors and solve problems faced by Wikipedia contributors, as illustrated by the issue template resolution examples.
  • Future: I will show how LLMs can enhance the future of Wikipedia as a source of reliable and accessible knowledge, by improving the quality, coverage, and readability of Wikipedia articles. In addition, I will illustrate what uses and impact the development of Wikimedia-specific LLM tools could have.

Qn. What is the experience level needed for the audience for your session?

Everyone can participate in this session

Qn. What is the most appropriate format for this session?

  • Tick Onsite in Singapore
  • Empty Remote online participation, livestreamed
  • Empty Remote from a satellite event
  • Tick Hybrid with some participants in Singapore and others dialing in remotely
  • Empty Pre-recorded and available on demand