
Malayalam Related Topics

[See instructions to install an old orthography Malayalam Unicode font which is required to read the posts below]
 

Issues in representing a ചില്ലക്ഷരം (chillu letter) as Consonant+Virama+ZWJ

ZWJ and ZWNJ are supposed to be font directives that tell a font which of two or more semantically identical renderings to select. For Malayalam this no longer holds: ZWJ has become a language construct, alien to Malayalam, that Unicode introduced to produce ചില്ലക്ഷരങ്ങള്‍. As a result, two semantically different words can differ only by a ZWJ in their Unicode representation. Eg: അവന്‍ & അവന്‌.
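To make the point concrete, here is a minimal Python sketch that builds the two words from explicit code points and lists what each contains. The variable names and the `describe` helper are illustrative only, and the second word is assumed to end in a bare ന + virama with no joiner.

```python
import unicodedata

ZWJ = "\u200D"      # ZERO WIDTH JOINER
VIRAMA = "\u0D4D"   # MALAYALAM SIGN VIRAMA

# A + VA + NA + VIRAMA + ZWJ: the word ending in a chillu
avan_chillu = "\u0D05\u0D35\u0D28" + VIRAMA + ZWJ
# A + VA + NA + VIRAMA: the same letters ending in a plain virama (assumed spelling)
avan_virama = "\u0D05\u0D35\u0D28" + VIRAMA


def describe(text):
    """Return one 'U+XXXX NAME' string per code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


if __name__ == "__main__":
    for label, word in (("chillu", avan_chillu), ("virama", avan_virama)):
        print(label)
        for line in describe(word):
            print("   ", line)
    # The two strings are unequal, yet removing the ZWJ makes them identical.
    print(avan_chillu != avan_virama)
    print(avan_chillu.replace(ZWJ, "") == avan_virama)
```

The only difference between the two strings is the trailing U+200D, which is exactly the character a renderer-agnostic process would be tempted to discard.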

When a word is searched for in Unicode text, the search algorithm should ignore ZWJ and ZWNJ, because it should not care how the word is rendered. By the reasoning above, that does not hold for Malayalam. But if the search does not ignore ZWJ and ZWNJ, it will surely miss some words that are semantically the same but rendered differently by using or omitting ZWJ/ZWNJ.
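The dilemma can be sketched in a few lines of Python. The `strip_joiners` and `find` functions below are hypothetical helpers written for this illustration, not part of any search library, and the sample strings are assumed spellings built from code points.

```python
ZWJ = "\u200D"    # ZERO WIDTH JOINER
ZWNJ = "\u200C"   # ZERO WIDTH NON-JOINER


def strip_joiners(text):
    """Remove ZWJ and ZWNJ, as a rendering-agnostic search might."""
    return text.replace(ZWJ, "").replace(ZWNJ, "")


def find(needle, haystack, ignore_joiners):
    """Plain substring search, optionally ignoring ZWJ/ZWNJ on both sides."""
    if ignore_joiners:
        needle, haystack = strip_joiners(needle), strip_joiners(haystack)
    return needle in haystack


# Two different words: one ends in a chillu (NA + VIRAMA + ZWJ), one in a plain virama.
chillu_word = "\u0D05\u0D35\u0D28\u0D4D" + ZWJ
virama_word = "\u0D05\u0D35\u0D28\u0D4D"

# The same cluster spelled two ways: NA + VIRAMA + MA, with and without ZWNJ.
conjunct = "\u0D28\u0D4D\u0D2E"
explicit = "\u0D28\u0D4D" + ZWNJ + "\u0D2E"

if __name__ == "__main__":
    # Ignoring joiners conflates two different words (false hit).
    print(find(chillu_word, virama_word, ignore_joiners=True))
    # Respecting joiners misses a mere rendering variant (false miss).
    print(find(conjunct, explicit, ignore_joiners=False))
```

Neither setting of `ignore_joiners` gives the right answer for both pairs, which is the problem described above.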

This ZWJ/ZWNJ 'ad hoc fix' for Malayalam also raises a minor issue: an encoding cannot give the font the freedom to choose between the conjunct form and the virama-separated form, as it can in other Indian languages. Example: there is no way to let the font choose between ന്മ and ന്‌മ. For Malayalam only, ന + virama + മ must always prefer the ന്മ conjunct, because ന + virama + ZWJ forms ന്‍.
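Under the convention just described, the job a virama is doing can be read off from the character that follows it. The sketch below is a simplified illustration: `virama_roles` and `is_malayalam_consonant` are hypothetical helpers, the consonant test is a rough range check on U+0D15 to U+0D39, and no attempt is made to model which consonants actually have chillu or conjunct forms in a given font.

```python
VIRAMA = "\u0D4D"  # MALAYALAM SIGN VIRAMA
ZWJ = "\u200D"     # ZERO WIDTH JOINER
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER


def is_malayalam_consonant(ch):
    """Rough check: MALAYALAM LETTER KA (U+0D15) through HA (U+0D39)."""
    return "\u0D15" <= ch <= "\u0D39"


def virama_roles(text):
    """Return (index, role) for every virama in text.

    Roles under the Consonant+Virama+ZWJ convention:
      'chillu'         virama followed by ZWJ
      'visible virama' virama followed by ZWNJ, end of text, or a non-consonant
      'conjunct'       virama followed directly by a consonant
    """
    roles = []
    for i, ch in enumerate(text):
        if ch != VIRAMA:
            continue
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if nxt == ZWJ:
            roles.append((i, "chillu"))
        elif nxt and is_malayalam_consonant(nxt):
            roles.append((i, "conjunct"))
        else:
            roles.append((i, "visible virama"))
    return roles


if __name__ == "__main__":
    print(virama_roles("\u0D28\u0D4D\u0D2E"))         # NA + VIRAMA + MA
    print(virama_roles("\u0D28\u0D4D\u200C\u0D2E"))   # NA + VIRAMA + ZWNJ + MA
    print(virama_roles("\u0D28\u0D4D\u200D"))         # NA + VIRAMA + ZWJ
```

Note that the first case has no "let the font decide" option: the bare sequence is already committed to the conjunct reading because the ZWJ form is taken by the chillu.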

Comments:
ചില്ലക്ഷരങ്ങള്‍ are supposed to be the simplest letters in മലയാളം, but the current Unicode implementation, which uses "Consonant + Virama + ZWJ" to get a ചില്ലക്ഷരം, makes ചില്ലക്ഷരങ്ങള്‍ complex letters. A problem similar to the ചില്ല-able അക്ഷരം issue mentioned in the previous post is also observed for other letters in the Unicode implementation of the Supersoft Thoolika font: many times we get unexpected കൂട്ടക്ഷരങ്ങള് (conjuncts). Hence I find that ചന്ദ്രക്കല (Virama) can stand for 3 different things:
1. ചന്ദ്രക്കല to be shown as ചന്ദ്രക്കല
2. ചന്ദ്രക്കല to make കൂട്ടക്ഷരം
3. ചന്ദ്രക്കല to make ചില്ലക്ഷരം
If Latin script can have 3 different "dashes" (dash, em-dash, en-dash), why can't Malayalam have 3 different Viramas: "Virama-standalone", "Virama-കൂട്ടക്ഷരം" and "Virama-ചില്ലക്ഷരം"?
~ WikiPedia:User:Bijee
 