2021 talk:Submissions/A future Wikimedia Language Diversity hub

Session notes

SESSION OVERVIEW

Title
A future Wikimedia Language Diversity hub
Day & time
Tuesday August 17, 17:30 – 18:15 UTC
Session wikilink
https://wikimania.wikimedia.org/wiki/2021:Submissions/A_future_Wikimedia_Language_Diversity_hub#Speakers
Facilitators
Jon Harald Søby (WMNO), Astrid Carlsen (WMNO)
Notetakers
One notetaker per table
Attendees will be invited to split up into two tables.

• Notes Table 1: https://etherpad.wikimedia.org/p/Wikimania2021-LangHub-1
• Notes Table 2: https://etherpad.wikimedia.org/p/Wikimania2021-LangHub-2

Attendees (optional, self-reported)
- 54 attendees in Remo at 17:34 UTC


Questions for discussion:

1: Coordinate collaboration opportunities

A hub can provide support on how to establish partnerships for affiliates and language projects. Wikimedia Norge, for example, is currently working with the UNESCO Decade of Indigenous Languages 2022-2032. Which organizations would your community want to collaborate with, and why?

  • Amir: My answer is somewhat generic because I'm not representing a particular community, but the "language diversity geeks" group itself :) But generally, the most relevant *kind* of community for such a project is academics and other respected knowledgeable people who can provide information about languages and the cultures around them.
  • KuboF Hromoslav (Esperanto): cooperating with quite a few Esperanto orgs. Right now there is a UNESCO competition on eowiki.
  • Gereon Kalkuhl: Cameroon: half of the [...] languages are on the brink of extinction. [...] has the dream of documenting these langs for Wiktionary, etc. Going to the lang communities, working with the chiefs, and recording the langs would require money. He found a Christian org that he could work with; better would be to find someone who doesn't need money.
  • Susan Kung: archivist at the Archive of the Indigenous Languages of Latin America. Brings up the good point that people in these communities don't have the privilege of working for free or for little money. It is not appropriate to say we need to look for contributors or participants who don't need to be paid to get valuable information. We need to find a way to support people, since they're not doing their job during the time they are working on wm projects.
  • Astrid Carlsen: WMNO has found that working with students is one way to get this done.
  • Ganesh K. Paudel: From Nepal, where there are 123 langs. Agrees with Susan. The lingua franca is Nepali, whose wp is still in a growing phase; the others are growing very, very slowly. The Nepali community has to support other lang communities as well.
  • Anna Torres, from Argentina: it would be nice if all the problems could be solved by paying people, but this isn't the only challenge we face. In our case, they often don't have a computer or a good Internet connection. We took a step back and we have a small fund that provides support for making sure that they can get the technical resources they need in order to even start thinking about participating. She also agrees with Susan.
  • WereSpielChequers: I just wanted to comment on (forgets woman's name) the fact that there will be communities that don't have the spare time and you will have to compensate people. Yes, there are lots of people like this in both rich and poor countries. Need to lower the expectation so that just a couple of hours a month will be enough. The people who turn this hobby into their life shouldn't be crowding out the people who only have a couple of hours a month. Says that many communities where [...]. Secondary motivations, for ex. on enwiki: English is a second or third lang and it is commercially worth being able to use English, so they use it to gain practice in it. There are many people who are willing to put in the time to preserve their language and culture.
  • Vera de Kok: essential [...]. Maintaining a small lang version without even a small group of technically skilled staff/Wikipedians is difficult. Ptwiki turned off edits by IP addresses, even though it is comparable in size to the nlwiki. WMNL has the fortune of having access to a lot of CC-0 material and technically skilled people.
  • Trond Trosterud: works with sewiki and smnwiki. Hard to come up with things if we don't have concrete examples. Is there a written standard? Is there a keyboard? Are there people who actually know how to write the lang? If this group is small, do they want to spend their time writing on wp or would they prefer to write dictionaries, grammars, pedagogical material, etc.? Does the lang have readers? Saddest example: Greenlandic wp would have 50,000 readers but the wp is sh*tty. They would need it because they don't have the lang skills to read the Danish wp. Another sad example: the Northern Sámi know too many languages and prefer to use nbwiki, nnwiki, svwiki, fiwiki and prefer to spend their time writing other types of material. Need to prevent wikipedia hijackers (+1). Wikipedia hijackers ≈ non-speakers who think they know a language and are saving it by imposing their non-version of the language on the community by trying to create a wp in the language. They often outright lie about knowing a language, especially when others try to get them to understand the disservice they are doing to the community's welfare. They will also claim to be native speakers when they clearly aren't, often of multiple languages that the general community is not familiar with, because then it's harder for them to be outed as non-speakers. Extinct languages are a favourite because there aren't any speakers left to tell them their "work" is crud. They will wholesale import words, grammar, etc. from major languages nearby under the pretext of "expanding" the language.
  • Rebecca O'Neill: Irish has two elements to this - the Irish language (GA), but also an indigenous community, Irish Travellers, with their own language but with a *very* small community of speakers and writers. I know the GA community are interested, but generally overstretched. As for the Travelling community, I would defer to their advocacy groups (Pavee Point etc.), but they are wary of Wikipedia, as how they are currently represented on EN is deeply problematic.
  • Kimberli Mäkäräinen: being overstretched is an unfortunate reality with indigenous and minority language communities :/ (so, so true, even with Irish - an official state language)


2: Technical support

Can you give examples of technical support and tools that are important for your community? Can you think of special support for projects in the incubator that would make it easier to recruit and retain editors?

  • Quiddity: I think the https://meta.wikimedia.org/wiki/Small_wiki_toolkits is immensely useful, and could grow even better. Please comment/contribute!
  • Quiddity: +1 to clearer / more detailed documentation on /about https://incubator.wikimedia.org/
  • Amir: I totally agree, and I kind of have a plan for that - https://phabricator.wikimedia.org/T228745 . But it needs stakeholders who would demand it. I am mostly alone in asking the relevant people for it. I don't have the resources to implement it alone. So please ask for it, tell your friends to ask for it, and keep asking for it, and don't give up. If you don't know where to raise your voice and ask for it, ask me for more info: User:Amire80 / Telegram: @amire80 / Twitter: @aharoni – See also https://www.youtube.com/watch?v=DdyzrDzD0qg (as linked within the phab)
  • Quiddity: Ciell's session had some good related ideas: https://wikimania.wikimedia.org/wiki/2021:Submissions/How_can_I_help%3F_Reviving_a_wiki_of_which_I_do_not_speak_the_language
  • Quiddity: Providing automatic access to more Toolforge tools (e.g. the links at the top of the "History" tabs, or "Special:Contributions" pages, in bigger wikis). These tools are missing at many (most?) of the medium and small wikis. I have a plan, essentially putting these into TranslateWiki, but I need help with researching and deciding the details - I think the main challenge is deciding which links to include, and what order to put them in. See this page, and comment or ping me: https://www.mediawiki.org/wiki/Notes_for_potentially_moving_some_mediawiki_system_messages_into_wikimediamessages
  • KuboF: currently ContentTranslation (also working in Incubator would be great!); aspirationally Global Templates, easier use of gadgets
  • Amir: has a plan for making the incubator better; contact him to know more and to find a way to make it happen. See the main etherpad for this session.
  • Rebecca O'Neill: I would be really interested in exploring deploying the front page developed for Saami as an alternative to the traditional front page currently in use on Vicipéid. Broken templates are the biggest complaint I get from GA editors; if there was a way to turn them off in the Translation Tool, that would save some heartache :)
  • K: This is definitely a problem when the community doesn't have anyone technically skilled to fix them. The sewiki has one template with a lot of errors in it, but even I can't figure out how to fix it. I can imagine less experienced users would look at it and leave in frustration.
  • Vera de Kok: technical bg, Zotero translator, a bit of JavaScript code that provides accurate metadata about sources to Zotero
  • Trond Trosterud: the tools that writers of minority and indigenous lang wps need first and foremost: keyboards for their own langs, spellcheckers for their own languages -- this is actually a very important point, e.g., https://giellalt.github.io/LanguageModels.html, an environment to write a literature with which to find the facts in their own langs. All of those things come before money, since money might just be buying non-essential things instead of essential tools being developed first. If there is no capacity to write, money would actually do harm, since it could have been spent better on other purposes (e.g. on teaching people to write).
  • [Antiqueight]: I think Trond has a huge point - what is the purpose of the wiki? For some of us it's about saving a language at all, and for others it's a living language but a minority in the world, like Dutch.
  • T: Yes: We (and the writers) should think through what a wp with 4000 articles with 15 words each can be used for. I have an answer to that: With IW linking and a good and balanced selection of articles even such a minimal WP would be useful, IF THE ARTICLE NAMES CAN BE TRUSTED. Also, for languages with schoolchildren learning the language at school, basic and school-related articles could provide models for argumentative writing, formulation of definitions, etc. Example: The former Sámi Parliament president Ole Henrik Magga told how for him as a kid reading the Bible (the only literature available) taught him a language capable of handling abstract concepts in complex narrative structures as a written language. I do not expect small WPs to reach this linguistic level, but at least their topic selection should be more relevant than the Bible's.
  • K: Fortunately lang communities are not limited to just wikipedias: images and videos can be captioned in different languages; labels, descriptions, and properties can be added in Wikidata; extant material can be uploaded into Wikisource with the right license; Wiktionary can be used to create a dictionary for the language; Wikibooks can be used for creating and storing textbooks, which are also easier to update than traditional textbooks, etc. For some reason, we often only think about Wikipedia when we start working with a language community.
  • (Trond): Technical support from WM community to small WP communities: infoboxes, photos (but not 30 photos for a 1 1/2 line long article -- the example is sadly typical), IW links, such things.
  • Anna Torres: this is not answering the tool thing, but one thing that I've been advising others with similar ideas on: there should be a global advisory committee that has contacts with the lang communities. Needs are super different in different parts of the world, so one solution is not going to be ok.
  • Astrid Carlsen: we already have done this and we have an ad hoc steering committee from all over the world with ties to small and large indigenous communities
  • something we need that I forgot: new metrics. We can't evaluate this work as we are doing it.
  • Trey Jones (WMF): Are there obvious ways to get support from larger technical organizations within the Movement, such as the WMF or WMDE? Are there technical wishes that could be built that would provide useful tools for many smaller language communities? Is there a central place to document useful contacts with teams or individuals who can provide technical support for smaller language communities? (For example, I'm on the WMF Search Platform Team, and I've worked with the Mirandese and Khmer communities to make improvements specific to their language needs. The most difficult part of the process is making connections with the right people in the communities to figure out what needs to be done.)


3: Capacity building

What support can a language diversity hub give on capacity building? Can you give examples of documentation, learning patterns, templates, initiatives, etc. that could be helpful for you and your community?

  • Abbad: Building a hub for minor languages is a little bit miscellaneous: it gathers a large number of loosely related communities that have little in common except for the fact that they share similar problems. In such scenarios, I believe it tends to be extremely hard for people to communicate or collaborate together at a community level. It's likely that only more English-fluent and wiki-literate members would be able to participate in the hub and bring issues to the table on behalf of their community. ESEAP is probably the most inspiring example within the movement for such a case, and it can be surveyed for the capacity building needs, learning patterns, etc. that they found useful to share. In my experience, multilingual and tech-savvy translation communities, like Khan Academy, are also successful in sharing such solutions across languages. Again, however, this should be mindful of the loose relation between such languages and the fact that a minority of each community can meaningfully participate in a common hub that builds their "capacity".
  • KuboF: using Global Templates, bots, gadgets, templates with data from Wikidata
  • Like for WMCEE, a goal for the hub could be to help set up user groups for communities that aren't represented by one yet
  • Kimberli Mäkäräinen: Amir Aharoni's Global templates would be a wonderful solution to some of the problems of keeping small wikipedias up-to-date.
  • Getting rid of wp hijackers first, they take up way too much of the small lang wp communities' time
  • it is also imperative that we acknowledge that not all lang communities are going to want to work with us and respect that. Their knowledge is not free knowledge and not free to be taken. This is something the wp hijackers don't understand.
  • Rebecca: technical support to lessen the load on small wps. Welsh wp has done a fine job of dynamically generating articles


4: Provide resources

What types of activities do you think your community would prioritize if you had more resources (for example different equipment, data packages, funds for events and salaries etc)?

  • Abbad: Besides typical support and capacity building (probably more related to the previous question), I'd say that the biggest single need is tech expertise support. While this probably includes some capacity building and training, it's often the case that minor communities don't even have people with the availability or the capability to learn how to perform technical tasks, even with support. Often, this is a task-based rather than training-based support, and it's essential. For example, it's usual for smaller Wikipedias to have no bots at all or only a couple of barely functional ones.
  • Article writing contests with prizes are one opportunity if there are funds for it
  • share bot from WMNO
  • Trond Trosterud: terminological resources. The main challenge for small language communities is TO WRITE GOOD TEXT: Text written in good, idiomatic language, without orthographic or grammatical errors, with an adequate terminology and a balanced mix of technical and pedagogical content. For majority languages, this is self evident. WPs used in revitalisation contexts constantly run the risk of producing bad language. (text written by non-fluent speakers, or even worse, by non-speakers)


General feedback

  • Set up a landing page for the info gathered so far and next steps
  • Workshops like this; 45 minutes is not enough time
  • Format suggestion: mix of speaking, reflecting, notetaking