To update everyone on what's cooking for 8.1 (likely timed to match Harmy's Despecialized Jedi 2.0):
- As mentioned earlier, I'm hoping for more progress from Operation Eyestrain. Thai and Japanese still have one movie apiece that is graphical-only
- "Matching" subtitles for Despecialized Jedi 2.0 (the only thing on my to-do list since day one, finally done!)
- Improved Arabic subtitles. I'm diving in and doing this manually, character-by-character, and it is not easy
- CJK subtitles will use different fonts for their graphical subtitles
- Maybe some new languages. Slovak is most likely, but you never know what might come along
- Miscellaneous minor updates, I'm sure
The deal with the CJK subtitles and fonts is interesting. Like a lot of Westerners, I'm sure, I believed all I really needed was one good-quality Unicode font and all of my language needs would be met. As it turns out, even with the best-looking Unicode font out there, you're going to run into problems with CJK characters.
As I'm sure you know, Japanese and Korean use characters of Chinese origin in their writing systems. The Unicode standard has a "unified CJK" section where these characters are mapped. The problem is that the characters are often written slightly differently in Simplified Chinese, Traditional Chinese, Japanese, and Korean, even when there is only one shared character defined in Unicode. So if you use one Unicode font for everything, it will look a little off for some CJK text. The solution is to use a different font for each of the four systems, which is what I'll do.
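For anyone curious what "a different font for each of the four systems" might look like in practice, here is a minimal Python sketch. The language tags, the font names, and the helper functions are illustrative assumptions, not the fonts or tooling actually used for these subtitles. It also shows that a shared character occupies a single "unified" code point, so the text alone cannot tell a renderer which regional shape to draw.

```python
import unicodedata

# Hypothetical mapping: one font per writing system. Any four
# region-specific fonts would serve; these names are placeholders.
FONT_BY_LANGUAGE = {
    "zh-Hans": "Noto Sans CJK SC",  # Simplified Chinese
    "zh-Hant": "Noto Sans CJK TC",  # Traditional Chinese
    "ja": "Noto Sans CJK JP",       # Japanese
    "ko": "Noto Sans CJK KR",       # Korean
}
DEFAULT_FONT = "Noto Sans"  # assumed fallback for non-CJK languages


def pick_font(language_tag):
    """Return the font for a subtitle track, based on its language tag."""
    return FONT_BY_LANGUAGE.get(language_tag, DEFAULT_FONT)


def is_unified_ideograph(char):
    """True if the character lives in a unified CJK block of Unicode."""
    return unicodedata.name(char, "").startswith("CJK UNIFIED IDEOGRAPH")


# U+76F4 is one code point shared by Chinese and Japanese text, even
# though the two traditions commonly draw it a little differently.
sample = "\u76f4"
if is_unified_ideograph(sample):
    for tag in ("zh-Hans", "zh-Hant", "ja", "ko"):
        print(tag, "->", pick_font(tag))
```

The point of the design is that the font is chosen from the language of the subtitle track, never from the characters themselves, since the characters are identical at the code point level.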
The problem isn't huge, and it's fairly common in the computer age, but enough people complain that the Japanese text on their iPhone looks like it was written by a Chinese speaker that I figured I'd do my best to avoid annoying people further. I've heard the problem compared to someone reversing every R and N while writing English, though maybe not quite that bad. The text is still completely readable, but it looks wrong: nobody who knew English would ever write it like that, and it gives the text a vaguely Cyrillic feel.