Monday, 7 April 2008


I found this Telugu word for "orange" (nāriṃza), which won't display properly unless you have a Telugu Unicode font, in Charles Philip Brown's Telugu-English Dictionary (it is the last entry on the page). The last character in the word shows up blank. What's going on?

Thanks to Unicode Checker, I see that its code point is U+0C5B, which is unassigned. The equivalent code point in Devanāgarī is U+095B ज़, used for Hindi /za/. The mystery Telugu character must be the obsolete grapheme for ẓa, which can be seen on the second page of this document. The people behind the Digital Dictionaries of South Asia either know something about a future version of Unicode or are hoping that this character will eventually be supported. The forthcoming Unicode 5.1.0 has additional Telugu characters that "expand the support of Sanskrit", but the announcement doesn't say which ones. In any case, I don't think this grapheme, if it represents a voiced sibilant, would have been used for Sanskrit.
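For anyone without Unicode Checker, the same lookup can be roughly sketched in Python with the standard unicodedata module. This is only an illustration: the helper names describe and devanagari_counterpart are my own, and the offset arithmetic assumes the Telugu and Devanāgarī blocks line up position for position, which holds for most letters but is not guaranteed for every code point. What it reports for U+0C5B depends on the Unicode version bundled with your Python.

```python
import unicodedata

# Block starts for the two scripts (from the Unicode code charts).
TELUGU_BASE = 0x0C00
DEVANAGARI_BASE = 0x0900


def describe(cp):
    """Return a short description of a code point.

    Unassigned code points have no name, so a placeholder is used instead.
    """
    ch = chr(cp)
    name = unicodedata.name(ch, "<no name: unassigned or unnamed>")
    return f"U+{cp:04X} {name} (category {unicodedata.category(ch)})"


def devanagari_counterpart(telugu_cp):
    """Map a Telugu code point to the same offset in the Devanagari block.

    Assumption: the two blocks are laid out in parallel at this position.
    """
    return telugu_cp - TELUGU_BASE + DEVANAGARI_BASE


mystery = 0x0C5B
print(describe(mystery))
print(describe(devanagari_counterpart(mystery)))
```

The second line printed should name the Devanāgarī letter at U+095B; the first depends on whether your Unicode data assigns anything to U+0C5B.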
