Word Count is not correct for Chinese Characters


Word counts are incorrect when FoldingText handles Chinese characters. The count seems to be determined by spaces.

In English, words are separated by spaces; in Chinese, they are not.
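To illustrate the symptom, here is a minimal sketch of a whitespace-based counter. It is not FoldingText's actual code, only an assumption about how a space-delimited count behaves; the function name and the sample strings are hypothetical.

```python
def whitespace_word_count(text: str) -> int:
    """Count 'words' as runs of non-whitespace characters (illustrative only)."""
    return len(text.split())


# Three space-delimited English words are counted as expected.
print(whitespace_word_count("three English words"))  # 3

# A run of four Chinese characters contains no spaces, so it counts as one 'word'.
print(whitespace_word_count("中文数字"))  # 1
```

Any unbroken run of Chinese text, however long, comes out as a single word under this approach.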

Can you post some example Chinese text and the correct word count? I think I read somewhere that for Chinese, a character count works better?

3 English words, separated by 2 spaces.

Actually, it’s 4 Chinese words, not 1

Thanks. I’ve been reading more about word count… gets complex quickly when you add in other languages. It’s on my list now, but I don’t have a fix yet.

Actually, it’s 4 Chinese Words, not 1

Or, of course, two … (中文 + 数字)

The difficulty here is the distinction between ‘word counts’ (‘word’ boundaries are not marked in Chinese script) and ‘character’ counts.

A character count (汉字 rather than 单词) for Chinese could be written as a plugin, or the existing code could report character counts (better for CJK) as well as word counts (better for Roman scripts with space-delimited ‘words’).

(As long as you don’t mind 葡萄 being counted as 2 rather than 1 :-)
(It’s two glyphs, but one morpheme, and one word …)
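A rough sketch of what reporting both figures could look like, in Python rather than as a FoldingText plugin. The function names are hypothetical, and it assumes that "Chinese character" means a code point in the CJK Unified Ideographs block or Extension A; other CJK ranges, kana, and hangul are not handled.

```python
import re

# Assumption: only CJK Unified Ideographs (U+4E00-U+9FFF) and
# Extension A (U+3400-U+4DBF) are treated as Chinese characters.
HAN = re.compile(r"[\u3400-\u4dbf\u4e00-\u9fff]")


def han_char_count(text: str) -> int:
    """Number of Chinese characters (汉字) in the text."""
    return len(HAN.findall(text))


def mixed_count(text: str) -> int:
    """Each Chinese character counts as 1; remaining text is counted by spaces."""
    # Replace every Chinese character with a space so it cannot glue
    # neighbouring Roman-script words together, then split on whitespace.
    remainder = HAN.sub(" ", text)
    return han_char_count(text) + len(remainder.split())


print(han_char_count("葡萄"))                  # 2 (two glyphs, one word)
print(mixed_count("中文数字"))                  # 4
print(mixed_count("three English words"))      # 3
print(mixed_count("FoldingText 中文 test"))     # 4
```

This counts characters, not words: real segmentation of Chinese text into words (中文 + 数字 as two) needs a dictionary or a segmentation library, which this sketch does not attempt.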

The latest dev release makes some changes to word count that I think should fix this issue.