Just a little update for thread-watchers so they know what's happening after that last page of posts.
First off, we found official subtitles for Polish, Greek, Turkish, and European Portuguese. This means improvements for Turkish (especially ROTJ), promoting Greek from unverified to verified, adding a long-absent Portuguese dialect, and possibly some minor improvement for Polish. Some of this work is already done.
Then, there's the effort I have taken to calling "Operation Eyestrain". The goal is to no longer have any graphical-only subtitles, using a mixture of our newfound OCR method and a painfully slow phase of manual transcription and correction. Japanese is furthest along, thanks to Sadako. Mandarin/Traditional and Cantonese are probably in a very good state, since our OCR software seemed to handle Chinese characters very well, but some manual correction is undoubtedly necessary, and I'm hoping I can lean on Sadako for that as well. The surprise was Thai, which I thought would be easy because it uses an alphabet with a much more manageable number of character permutations than Chinese. Instead, the OCR fell down hard on this text, and it's hard to find a single line that doesn't require at least one manual correction, if not several. Feallan is taking two films and I'm taking one. I predict this will be the slowest of the jobs, since neither of us knows the language at all.