2023:Program/Submissions/Towards a global and multilingual public domain searcher with Wikidata - LUVJFW

From Wikimania

Title: Towards a global and multilingual public domain searcher with Wikidata


Jorge Gemetto

Member of Wikimedistas de Uruguay and contributor to Wikipedia since 2007. He also participates in the Uruguayan chapter of Creative Commons, as well as in various initiatives to promote access to knowledge and digital rights.


Start time:

End time:

Type: Lightning talk

Track: GLAM, Heritage, and Culture

Submission state: submitted

Duration: 10 minutes

Do not record: false

Presentation language: en

Abstract & description

Abstract

The talk presents a project to create a global and multilingual search engine for works and authors in the public domain, built with Wikidata, Pywikibot and Flask. The goals are to offer a friendly interface for identifying the copyright status of a work or of an author's works, to let users download public domain works from Wikisource and other sites, and to invite users to add missing information to Wikidata.

Description

The objective of the talk is to present a project to create a global and multilingual search engine for works and authors in the public domain.

The project builds on some of the problems and objectives identified in an earlier project, the Public Domain Awareness Project, which was developed by Creative Commons and presented at Wikimania Stockholm: https://upload.wikimedia.org/wikipedia/commons/1/1f/Public_Domain_Awareness_Project.pdf

That project pointed out, among other things, how the enormous complexity of copyright acts as a barrier to accessing works in the public domain. It therefore highlighted the need for tools that lower this barrier, including tools to improve the metadata of works, to carry out copyright clearance in different jurisdictions and to facilitate access to the works.

The project that I present in this talk seeks to put some of these ideas into practice by developing a tool based on Wikidata. Specifically, the project's initial form is a web application demo developed in Python, using Pywikibot and Flask. The tool is intended to be a user-friendly interface for identifying the copyright status of a work or of an author's works, downloading public domain works from Wikisource, Wikimedia Commons and other sites linked in Wikidata, and inviting users to add missing information to Wikidata.
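As a rough illustration of the kind of check such a tool might perform, the sketch below estimates a copyright status from an author's year of death. It is not the project's actual code. The function names, the 70-year default term and the simplification that the term runs to the end of the calendar year are all assumptions, and real rules vary by jurisdiction and by type of work.

```python
from datetime import date
from typing import Optional


def public_domain_year(death_year: int, term_years: int = 70) -> int:
    """Return the first calendar year in which the works would be free.

    Assumption: the term lasts `term_years` after the author's death and
    expires at the end of that calendar year, so the works become free
    on 1 January of the following year.
    """
    return death_year + term_years + 1


def copyright_status(
    death_year: Optional[int],
    term_years: int = 70,
    today: Optional[date] = None,
) -> str:
    """Classify an author's works as public domain, in copyright or unknown.

    `death_year` is None when the date of death is missing from the data,
    which is the case where a tool could invite the user to add it.
    """
    if death_year is None:
        return "unknown"
    current_year = (today or date.today()).year
    if current_year >= public_domain_year(death_year, term_years):
        return "public domain"
    return "in copyright"
```

In a real application the year of death would come from Wikidata rather than being passed in by hand, and the term would depend on the jurisdiction the user selects.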

Other tools based on Wikidata, such as Crotos, OpenArtBrowser or Inventaire, among others, already facilitate knowledge of and access to cultural works. This project focuses instead on:
  • identifying the public domain status of works of any kind,
  • encouraging people to contribute to Wikidata if they know of missing data,
  • helping to think through best practices for modeling data related to the public domain in Wikidata.
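One possible way to find missing data is a SPARQL query listing an author's works that have no copyright status statement. The sketch below only builds the query string and does not contact any server. P50 (author) and P6216 (copyright status) are existing Wikidata properties, but the function name, its parameters and the choice of query are hypothetical, and the project may model this differently.

```python
def missing_copyright_status_query(
    author_qid: str, language: str = "en", limit: int = 50
) -> str:
    """Build a SPARQL query for works by an author that lack P6216.

    Hypothetical helper: it only returns the query text, intended for
    the Wikidata Query Service.
    """
    # Basic validation so that arbitrary text is not placed in the query.
    if not (author_qid.startswith("Q") and author_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {author_qid!r}")
    if not language.replace("-", "").isalnum():
        raise ValueError(f"not a language code: {language!r}")

    return f"""
SELECT ?work ?workLabel WHERE {{
  ?work wdt:P50 wd:{author_qid} .
  FILTER NOT EXISTS {{ ?work wdt:P6216 ?status . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{language}" . }}
}}
LIMIT {int(limit)}
""".strip()
```

For example, `missing_copyright_status_query("Q42", language="es")` returns a query whose labels are requested in Spanish; Q42 is used here only as a placeholder item ID.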

The presenter has long experience in projects that facilitate access to the public domain, carried out by Creative Commons Uruguay and other organizations. For example: https://www.creativecommons.uy/2016/04/11/felisberto-hernandez-libre-en-internet/. However, he has little programming experience, so part of the aim of this talk is to publicize the project and meet more experienced people who can offer recommendations on its programming aspects.

Further details

Qn. How does your session relate to the event themes: Diversity, Collaboration, Future?

The tool focuses on diversity, since it aims to be global and multilingual rather than limited to the United States and Europe. It also seeks collaboration among copyright experts, developers, heritage professionals and users, and it aims to contribute to the future through innovation that improves access to the public domain.

Qn. What is the experience level needed for the audience for your session?

Everyone can participate in this session

Qn. What is the most appropriate format for this session?

  • [x] Onsite in Singapore
  • [ ] Remote online participation, livestreamed
  • [ ] Remote from a satellite event
  • [ ] Hybrid with some participants in Singapore and others dialing in remotely
  • [ ] Pre-recorded and available on demand