2023:Program/Submissions/Using LLMs to Overcome Bias: An Experiment - XJ8SN3


Title: Using LLMs to Overcome Bias: An Experiment

Speakers:

Clifford Anderson

Clifford B. Anderson is Director of Digital Research at the Center of Theological Inquiry in Princeton, NJ. He is also Chief Digital Strategist at the Vanderbilt University Library and holds a secondary appointment as Professor of Religious Studies in the College of Arts & Science at Vanderbilt University. Among other works, he is co-author of XQuery for Humanists (Texas A&M University Press, 2020) and editor of Digital Humanities and Libraries and Archives in Religious Studies (De Gruyter, 2022). He serves on the steering committee of the Wikimedia and Libraries User Group and is an active member of the Women in Religion User Group.

Colleen D Hartung

Colleen D. Hartung, PhD, is co-founder and coordinator of the Women in Religion Wikipedia Project, where she works to develop global programs to address gender bias on digital platforms like Wikipedia. She teaches people around the globe how to write and edit biographical entries about women in religion. She was the series editor for the Atla Women in Religion series and is editor of a forthcoming book of biographies of women important in the history of the Parliament of the World’s Religions, to be published in conjunction with the Parliament of the World’s Religions and the United Nations. Colleen is also the author of “Faith and Polydoxy in the Whirlwind,” a contribution to Polydoxy: Theology of Multiplicity and Relation (Routledge, 2012). She is a homilist at Holy Wisdom Monastery outside of Madison, WI, and Chair of the Board for the Benedictine Women of Madison.

Christine Meyer

Rosalind Hinton

Pretalx link

Etherpad link

Room:

Start time:

End time:

Type: Lecture

Track: Equity, Inclusion, and Community Health

Submission state: submitted

Duration: 29 minutes

Do not record: false

Presentation language: en


Abstract & description

Abstract

The potential for bias in artificial intelligence systems has been well-documented. But can artificial intelligence be used to rectify inequities as well? In this talk, members of the Women in Religion User Group discuss how large language models (LLMs) could counteract gender bias. Could the use of LLMs accelerate the improvement of gender representation on Wikipedia by assisting with common editorial tasks, such as transcribing oral histories, summarizing academic papers and news articles, and drafting initial versions of Wikipedia articles about notable women?

Description

The potential for bias in artificial intelligence systems has been well-documented (O’Neil 2016; Eubanks 2018; Noble 2018). But can artificial intelligence be used to rectify inequities as well? In this talk, members of the Women in Religion User Group analyze how large language models (LLMs) could address gender bias in the English-language Wikipedia. Could the use of LLMs accelerate the addition of articles about women in religion by assisting with common editorial tasks, such as transcribing oral histories, summarizing academic papers and news articles, and drafting initial versions of Wikipedia articles about notable women?

This paper presents the findings of an experiment conducted by the Women in Religion User Group. Members of the User Group have published three peer-reviewed volumes on notable women, and the chapters in these volumes provide reputable sources for corresponding articles on Wikipedia. Our study compares human-authored Wikipedia articles with articles drafted by GPT-4 about the same women, generated both with and without the chapters from these publications supplied as source material.
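For readers curious about the mechanics, the sketch below shows one way such paired drafts might be generated with the OpenAI Python SDK. The helper name draft_article, the prompt wording, the subject name, and the chapter_text placeholder are illustrative assumptions; this is not the User Group's actual experimental protocol.

```python
# Illustrative sketch only (not the User Group's actual protocol): draft two
# GPT-4 biographies of the same woman, one unaided and one grounded in a
# peer-reviewed chapter, so both can be compared with the human-authored
# Wikipedia article. Model name, prompts, and variable names are assumptions.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable


def draft_article(subject: str, source_text: str | None = None) -> str:
    """Ask GPT-4 for a Wikipedia-style biography, optionally grounded in a source."""
    prompt = f"Write a neutral, well-sourced Wikipedia-style biography of {subject}."
    if source_text is not None:
        prompt += (
            "\n\nBase the article only on the following peer-reviewed chapter"
            " and do not state facts that it does not support:\n\n" + source_text
        )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


# Hypothetical usage: chapter_text would hold the full text of a chapter
# from one of the User Group's peer-reviewed volumes.
chapter_text = "..."  # placeholder, deliberately elided
unaided_draft = draft_article("Jane Doe (scholar of religion)")
grounded_draft = draft_article("Jane Doe (scholar of religion)", chapter_text)
```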

We hope this study will contribute to significant questions about artificial intelligence now being discussed in the larger Wikimedia community. To what extent can GPT-4 assist with the summarization and drafting of articles? Given the ability to supply peer-reviewed documents to these LLMs as source material, can we improve the quality of the output? What kinds of prompts generate the best results? Is there a danger of so-called “hallucination,” and how do we fact-check machine-generated articles?

We will also reflect on our own experience as editors using GPT-4. How might reliance on LLMs affect our User Group, encouraging or discouraging new editors? And, finally, what are the prospects for developing a coordinated effort with other user groups to harness artificial intelligence to identify and potentially correct patterns of bias in Wikipedia?

Further details

Qn. How does your session relate to the event themes: Diversity, Collaboration, Future?

We believe that, by harnessing machine learning and artificial intelligence for the improvement of human welfare, we may discover new ways to increase the diversity of Wikipedia.

Qn. What is the experience level needed for the audience for your session?

Everyone can participate in this session.

Qn. What is the most appropriate format for this session?

  • ☑ Onsite in Singapore
  • ☐ Remote online participation, livestreamed
  • ☐ Remote from a satellite event
  • ☑ Hybrid with some participants in Singapore and others dialing in remotely
  • ☐ Pre-recorded and available on demand