2019:Transcription/Using transcribed content for community building and more...
This is an Accepted submission for the Transcription space at Wikimania 2019.
Description
Short presentation based on the Polish Wikisource community's activity and its experience with utilizing Wikisource content.
Relationship to the theme
This session will address the conference theme (Wikimedia, Free Knowledge and the Sustainable Development Goals) in the following manner:
4, 8, 9
Session outcomes
At the end of the session, the following will have been achieved:
The Polish Wikisource community's experience with utilizing Wikisource content outside of Wikimedia will be presented.
Session leader(s)
- Ankry (talk) - Polish Wikisource, WMPL
Session type
Each Space at Wikimania 2019 will have specific format requests. The program design prioritises submissions which are future-oriented and directly engage the audience. The format of this submission is a:
- Lecture
SESSION SUMMARY
- Copy of the notes from the Etherpad, taken by VIGNERON and Peter Isotalo
I'm Ankry, mainly from Polish Wikisource but also active on other Wikisources (multilingual, French, English, etc.); I like to see how it works in other languages. I am also active on Commons, since a lot of Wikisource work has to be done on Commons.
Active for about 10 years, active in the chapter for ~4 years.
This presentation is based on my experience on Polish Wikisource.
General problem.
On Wikipedia you need to "edit"; on Wikisource it is easier, you just "copy" (most of it being done automatically).
Users are introverted, and you can see it in this room. They often avoid interpersonal communication, and when they do communicate they use private channels, for the most part by invitation to IRC channels or even through e-mail.
Orthography changes a lot every few years, and the 70-year public domain term is a very long time for orthography. Some letters have changed or disappeared. So a faithful transcription of the original may not be helpful, especially for young people.
We need people. Who are we looking for?
First, who we are not looking for: not people from Wikipedia (they tend to be too creative, and original research and plagiarism are not a good fit for Wikisource), and we don't work in schools. We are looking at e-book related sites with a community, and at retired people (who love books and have a lot of time, but they learn slowly and training them is more work).
In 2014 a Wikipedian posted about Wikisource on his e-book blog site, and (thankfully) he warned the Wikisource community in advance. Over 100 new users arrived daily for the next few days, and 5-10 remain active to this day. A lot of Wikisource users have no previous experience with other Wikimedia projects. Those who do have experience with other projects are still mostly active on Wikisource.
Facebook can have a significant impact: publish "fun" books and interesting statistics. https://www.facebook.com/wikizrodlaPL/
Proofreading needs at least 2 people. And again, they're introverts, so they don't ask for a proofread...
- Balaji: how do you use the three quality levels?
- Ankry: we'll talk about it after the break; for instance, pure OCR is marked as problematic.
New users prefer personal communication. They are not Wikimedians and are not technically skilled, so the tools should be simpler for them.
Weak point: poor documentation and help pages; some inspiration comes from the much better pages on the English Wikisource.
Źródłosłów: joint conference for Polish Wikisource and Wiktionary
- 2017 in Poznań
- 2018 in Łódź
- 2019 in Warsaw (planned)
Only 20-25 participants.
Utilising Wikisource content: generating e-books with wsexport: https://tools.wmflabs.org/wsexport/tool/book.php Up-to-date vs. downloaded on click? Which is more important?
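As an illustration of on-request generation with wsexport, here is a minimal sketch that only builds a download link for a single work. The base URL is the one given in the notes; the query parameter names (`lang`, `page`, `format`), the helper name `build_export_url`, and the example title are assumptions made for illustration and were not part of the talk.

```python
from urllib.parse import urlencode

# Base URL as quoted in the notes; everything appended to it is an assumption.
WSEXPORT_BOOK_URL = "https://tools.wmflabs.org/wsexport/tool/book.php"


def build_export_url(lang: str, page: str, fmt: str = "epub") -> str:
    """Build a wsexport link for one Wikisource page (hypothetical helper).

    lang -- Wikisource language code, e.g. "pl" (assumed parameter name)
    page -- title of the work on that Wikisource (assumed parameter name)
    fmt  -- requested e-book format (assumed parameter name "format")
    """
    query = urlencode({"lang": lang, "page": page, "format": fmt})
    return f"{WSEXPORT_BOOK_URL}?{query}"


if __name__ == "__main__":
    # Example title chosen only for illustration.
    print(build_export_url("pl", "Pan Tadeusz"))
```

A link of this kind would correspond to the "up-to-date" option in the question above, since the e-book is produced from the current text at request time rather than from a previously stored file.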
Wikisource e-books
Aug-Jul; data from two sites: http://wsexport.wikizrodla.pl & https://tools.wmflabs.org/wsexport/tool/ The Polish server is slower, but it works.
Commercial services take Wikisource texts (often without attribution).
Some audiobook projects are supported by Wikimedia Polska.
Printed books are made from a manually modified EPUB (with some preprocessing).