Non-Latin scripts


I wonder whether any thought has been given to supporting non-Latin scripts. Awareness of grapheme clusters is one key part of this. I don’t have an invite yet, but I looked at the autoflow package, and it didn’t seem to pay any attention to this.

JavaScript is rather weak here, so it’s not going to work without thought and effort.


Only one data point, but it seems to support Japanese just fine, as far as I’ve been able to test.


With Japanese, the issues could be:

  • line wrapping (allowed not just at spaces)
  • word selection
  • characters outside the BMP
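
To make the last point concrete, here is a minimal Python sketch of why characters outside the BMP trip up code that counts UTF-16 code units, which is what JavaScript's string length reports. The helper name utf16_length is made up for illustration, and nothing here reflects how the editor actually handles strings.

```python
def utf16_length(s: str) -> int:
    """Number of UTF-16 code units in s (hypothetical helper).

    This mirrors what a UTF-16 based string length would report.
    """
    # Each UTF-16 code unit is 2 bytes; the "-le" variant adds no BOM.
    return len(s.encode("utf-16-le")) // 2


bmp_text = "日本語"          # three characters, all inside the BMP
non_bmp = "\U00020BB7"       # U+20BB7, a CJK ideograph outside the BMP

# Python's len() counts code points; utf16_length counts code units.
print(len(bmp_text), utf16_length(bmp_text))   # 3 3
print(len(non_bmp), utf16_length(non_bmp))     # 1 2
```

Any cursor movement or selection logic that steps one code unit at a time can land in the middle of that surrogate pair.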

I was thinking more of Arabic, Indic and South-East Asian languages, where the character-to-glyph mapping is not 1:1.
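
As a small illustration of that, here is a Python sketch using a Devanagari syllable chosen as an example (it is not from the editor or the autoflow package). It only uses the standard unicodedata module, and the helper name describe is hypothetical.

```python
import unicodedata


def describe(s: str) -> list[tuple[str, str, str]]:
    """Return (code point, general category, name) for each code point in s."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))
        for ch in s
    ]


# A reader sees one syllable, but it is stored as two code points:
# a consonant followed by a dependent vowel sign.
syllable = "\u0928\u093f"

print(len(syllable))  # 2
for row in describe(syllable):
    print(row)
```

The second code point has category Mc (spacing combining mark) and is drawn to the left of the consonant, so the visual order and the stored order differ. An editor that treats each code point as its own caret stop or deletion unit will split the syllable.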

With Arabic, there’s also bidi, of course.


Here is a tweet from someone reporting that Shift-JIS text gets corrupted.

I just tested ISO-2022-JP and Shift-JIS as well, and both get corrupted. I do not have an editor that can save in EUC-JP, so I have not been able to test that.
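
One possible explanation, and this is only a guess since I can't see how files are read, is that the bytes are being decoded as UTF-8 regardless of their actual encoding. A minimal Python sketch of what that would look like (the helper name misread_as_utf8 is made up, and the codec names are Python's, not anything from the editor):

```python
def misread_as_utf8(text: str, codec: str) -> str:
    """Encode text with a legacy codec, then decode the bytes as UTF-8.

    Assumption: this models a reader that always expects UTF-8.
    Invalid byte sequences become U+FFFD, the replacement character.
    """
    return text.encode(codec).decode("utf-8", errors="replace")


sample = "日本語"

for codec in ("shift_jis", "iso2022_jp", "euc_jp"):
    # Decoding with the matching codec round-trips cleanly.
    assert sample.encode(codec).decode(codec) == sample
    # Decoding as UTF-8 does not give the original text back.
    print(codec, repr(misread_as_utf8(sample, codec)))
```

Under that assumption, Shift-JIS input ends up with replacement characters, while ISO-2022-JP, being a 7-bit encoding, decodes without errors but comes out as escape sequences and ASCII noise. If the corruption I saw matches that pattern, EUC-JP would likely be affected as well.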