2023:Program/Submissions/The Challenges of Creating a Lexeme Database from Scratch - JKJACD

From Wikimania

Title: The Challenges of Creating a Lexeme Database from Scratch


Leon Liesener

As a student of Lithuanian philology, I am excited by the prospect of having the Lithuanian language documented in a structured, open database. That is why I am a Wikidata lexeme enthusiast and believe lexemes can be a meaningful contribution. I would like to share this enthusiasm and inspire others to discover this niche area of the wiki world for themselves.


Start time:

End time:

Type: Lecture

Track: Open Data

Submission state: submitted

Duration: 30 minutes

Do not record: false

Presentation language: en

Abstract & description

Abstract

Wikidata lexemes have given us the opportunity to record human language in a structured way. But languages are complex, and so is the resulting structured data. This session will introduce you to the challenges a lexeme editor might face along the way.

Description

This session is based on the experience I gained while trying to systemise the Lithuanian lexemes on Wikidata. The aim is to inspire participants to systemise the lexemes of a language of their own choice, while giving them insight into the challenges that await them along the way.

Further details

Qn. How does your session relate to the event themes: Diversity, Collaboration, Future?

Open lexeme databases can prove especially useful for smaller languages for which no alternatives exist. The language I am focusing on, Lithuanian, is exactly such a case. While researchers have previously considered creating a lexeme database for Lithuanian, it has never actually been done. Wikidata can be a real gain here.

Qn. What is the experience level needed for the audience for your session?

Everyone can participate in this session.

Qn. What is the most appropriate format for this session?

  • [x] Onsite in Singapore
  • [ ] Remote online participation, livestreamed
  • [ ] Remote from a satellite event
  • [x] Hybrid with some participants in Singapore and others dialing in remotely
  • [ ] Pre-recorded and available on demand