For the past few years, I have been part of the c3subtitles team, which aims to create high-quality transcripts of the conference recordings of the annual Chaos Communication Congress. I will describe how we process recordings, which tools and infrastructure we use, and which challenges we have encountered along the way.
This session will address the conference theme — Wikimedia, Free Knowledge and the Sustainable Development Goals — in the following manner:
Video transcripts help to make videos accessible to a wider audience, thus improving the quality of education (SDG 4) and reducing inequalities (SDG 10). Furthermore, transcripts enable indexing and searching of video content (SDG 9).
At the end of the session, the following will have been achieved:
The audience understands which tools and techniques are used to transcribe the recordings of the Chaos Communication Congress.
Each Space at Wikimania 2019 will have specific format requests. The program design prioritises submissions which are future-oriented and directly engage the audience. The format of this submission is a:
The session will work best with these conditions:
projector + screen
This might be a bit of a niche topic, so I'm not sure about the appropriate size of the audience. No prior knowledge is required.
appropriate for recording
CCC = Chaos Communication Congress, held between Christmas and New Year, now in Leipzig, 17,000 visitors.
4 days of presentations and much more
Mostly volunteer work, and a lot of fun.
The goal is to provide real-time captions
- with minimum delay
- but people speak fast, so it doesn't work well
- Speech: 200 words per minute (WPM), 3-4 times faster than typing
- 1200 strokes per minute (SPM)
- Respeaking, which is based on speech recognition, works well for clearly defined vocabularies, but doesn't work at all for vocabularies that are too large or too specialised
- Stenographers: 300 WPM
- None of these work for them
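The figures in the list above can be put side by side with a little arithmetic. This is only a sketch: it assumes that the 1200 SPM figure describes the same speaking rate as the 200 WPM figure, which the notes do not state explicitly, and the function names are made up for illustration.

```python
import math

# Figures taken from the notes.
SPEECH_WPM = 200   # speaking rate in words per minute
SPEECH_SPM = 1200  # strokes per minute (assumption: same rate as SPEECH_WPM)
STENO_WPM = 300    # stenographer rate in words per minute


def implied_typing_wpm(speech_wpm: float, factor: float) -> float:
    """Typing speed implied by 'speech is `factor` times faster than typing'."""
    return speech_wpm / factor


def typists_needed(speech_wpm: float, typing_wpm: float) -> int:
    """Number of people typing in parallel needed to keep up with the speaker."""
    return math.ceil(speech_wpm / typing_wpm)


strokes_per_word = SPEECH_SPM / SPEECH_WPM
slow = implied_typing_wpm(SPEECH_WPM, 4)  # speech is 4 times faster
fast = implied_typing_wpm(SPEECH_WPM, 3)  # speech is 3 times faster

print(f"strokes per word: {strokes_per_word:.0f}")
print(f"implied typing speed: {slow:.0f}-{fast:.0f} WPM")
print(f"typists needed at {slow:.0f} WPM: {typists_needed(SPEECH_WPM, slow)}")
print(f"stenographer vs. speaker: {STENO_WPM / SPEECH_WPM:.1f}x")
```

Under these assumptions a single typist falls short of the speaker by a factor of 3-4, while a stenographer at 300 WPM would have headroom.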
Real-time captions
- Our solution: if one person is too slow, just use 4 people.
- This can work if people coordinate. It needs focus, and errors will happen anyway!
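The notes do not say how the four people coordinate. One hypothetical scheme, shown here purely as an illustration, is a round-robin: the talk is cut into short time slices and each slice is handed to the next typist in rotation. The names `Slice`, `round_robin_slices` and `merge`, as well as the 5-second slice length, are assumptions and not taken from the talk.

```python
from dataclasses import dataclass


@dataclass
class Slice:
    start: float  # seconds from the beginning of the talk
    end: float    # seconds from the beginning of the talk
    typist: int   # index of the person responsible for this slice


def round_robin_slices(duration: float, slice_len: float, typists: int = 4) -> list[Slice]:
    """Cut [0, duration) into slices and assign them to typists in rotation."""
    slices = []
    i = 0
    while i * slice_len < duration:
        start = i * slice_len
        end = min(start + slice_len, duration)
        slices.append(Slice(start, end, i % typists))
        i += 1
    return slices


def merge(captions: list[tuple[Slice, str]]) -> str:
    """Join the typed text of all slices into one stream, ordered by start time."""
    ordered = sorted(captions, key=lambda pair: pair[0].start)
    return " ".join(text for _, text in ordered)


# Hypothetical 30-second excerpt, 5-second slices, 4 typists.
plan = round_robin_slices(30.0, 5.0)
print([s.typist for s in plan])
```

With four typists, each person gets four slice lengths of wall-clock time to type one slice of speech, which is roughly the factor by which speech outpaces typing. The scheme still depends on everyone knowing where their slice starts and ends, which matches the remark that coordination and focus are needed.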
Offline subtitles, made afterwards, for high-quality video only: the goal is no longer to be fast, since you have time.
Speech recognition software produces a first transcript
Angels (volunteers) turn it into a proper transcript; it doesn't need too much correction
There is a quality control step to fix spelling errors and unify the style and alignment.
Align transcript and audio using YouTube
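Once transcript and audio are aligned, each piece of text has a start and end time and can be written out as a subtitle file. The notes do not name the output format; the sketch below assumes SubRip (SRT) as one common choice, and the segment data is invented for illustration.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(segments: list[tuple[float, float, str]]) -> str:
    """Turn (start, end, text) segments into the text of an SRT file."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # SRT entries are separated by a blank line.
    return "\n".join(blocks)


# Hypothetical aligned segments: (start seconds, end seconds, text).
segments = [
    (0.0, 2.5, "Welcome to the talk."),
    (2.5, 5.0, "Let's get started."),
]
print(to_srt(segments))
```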