2021:Submissions/Searching for similar sentences within Wikipedia articles

From Wikimania


Speakers

Mark Shuttleworth, Tzl

Abstract

As part of our research into the role of translation in Wikipedia, we have developed a tool that detects the presence and extent of different types of similarity between pairs of articles. Built on the Sentence Transformers Python framework, the tool detects translated sentences by comparing parallel articles in different languages. It can also track the evolution of an article by aligning versions from its revision history, giving the user an idea of the nature and rate of the editing taking place. Taking our cue from the well-known WikiWho tool, we plan to add dynamic highlighting in the next stage of development to increase ease of use. The tutorial will include a demonstration of the tool.
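The sentence-alignment idea described in the abstract can be illustrated with a minimal sketch. This is not the tool itself: the function names (`toy_embed`, `cosine`, `align_sentences`), the threshold of 0.6, and the sample sentences are all assumptions made for illustration. To keep the example runnable without downloading a model, a simple bag-of-words embedding stands in for Sentence Transformers; it only works within a single language, so a cross-language comparison would need a multilingual sentence-embedding model plugged in through the `embed` and `similarity` parameters.

```python
import math
import re
from collections import Counter


def toy_embed(sentence):
    """Stand-in embedding (an assumption, not the tool's method):
    a bag-of-words count vector. A real setup would instead encode
    sentences with a Sentence Transformers model."""
    return Counter(re.findall(r"\w+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def align_sentences(source, target, embed=toy_embed,
                    similarity=cosine, threshold=0.6):
    """For each source sentence, find the most similar target sentence.

    Returns a list of (source_index, target_index, score) tuples;
    target_index is None when the best score is below the threshold,
    which suggests the sentence has no counterpart in the other text.
    """
    target_vecs = [embed(s) for s in target]
    matches = []
    for i, sentence in enumerate(source):
        vec = embed(sentence)
        best_j, best_score = None, 0.0
        for j, tvec in enumerate(target_vecs):
            score = similarity(vec, tvec)
            if score > best_score:
                best_j, best_score = j, score
        if best_score < threshold:
            best_j = None
        matches.append((i, best_j, best_score))
    return matches


# Hypothetical sample data: two revisions of the same article.
old_revision = [
    "Translation plays a role in Wikipedia.",
    "The article was created in 2010.",
]
new_revision = [
    "The weather is often sunny.",
    "Translation plays a significant role in Wikipedia.",
]

matches = align_sentences(old_revision, new_revision)
for src, tgt, score in matches:
    status = "no match" if tgt is None else f"matches new sentence {tgt}"
    print(f"old sentence {src}: {status} (score {score:.2f})")
```

In this sketch the first old sentence aligns with the lightly edited sentence in the new revision, while the second has no counterpart above the threshold and is reported as unmatched. The same loop applies to parallel articles in two languages once `embed` returns vectors from a multilingual model and `similarity` compares those vectors.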

Session Outcomes

Although it is not always obvious, translation is often present in Wikipedia articles and can play a very significant role.

How to access and use the tool.

What can be learnt from using the tool.

I'm planning to attend this session live!

  1. Add your name here