[See instructions to install an old orthography Malayalam Unicode font which is required to read the posts below]

Unicode: Collation: suggestions

Suggested modifications to the Malayalam entries in the AllKeys table, shown through examples:

-x is the sign (dependent form) of vowel x
m_ is the anuswara
~ is the virama
H is the visarga

n_ = n ~ zw space= [1A2B.0020]
n~ = n ~ = [1A2B.0021]
nu~ = n ~ = [1A2B.0022]

na = n ~ -a = [1A2B.0021], [1A08.0020]
naa = n ~ -aa = [1A2B.0021], [1A0A.0020]
nau = n ~ -au = [1A2B.0021], [1A17.0020]

n~a = n ~ a = [1A2B.0021], [1A08.0021]
n~au = n ~ au = [1A2B.0021], [1A17.0021]

nka = n ~ k ~ -a = [1A2B.0021], [1A18.0021], [1A08.0020]
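
The ordering implied by these examples can be checked with a small sketch. It assumes each bracket is a [primary.secondary] pair and that keys are compared level by level (all primaries first, then all secondaries), roughly as UTS #10 describes. The names `sort_key`, `N_CHILLU` and so on are hypothetical, and the weights are simply the example values above, not real DUCET entries.

```python
# Example weights copied from the listing above, as (primary, secondary) pairs.
# These are illustrative values from this post, not actual DUCET weights.
N_CHILLU = [(0x1A2B, 0x0020)]                      # n_  = n ~ zw space
N_VIRAMA = [(0x1A2B, 0x0021)]                      # n~
N_SAMVR  = [(0x1A2B, 0x0022)]                      # nu~
NA       = [(0x1A2B, 0x0021), (0x1A08, 0x0020)]    # na  (vowel sign)
N_A      = [(0x1A2B, 0x0021), (0x1A08, 0x0021)]    # n~a (full vowel)
NAA      = [(0x1A2B, 0x0021), (0x1A0A, 0x0020)]    # naa
NKA      = [(0x1A2B, 0x0021), (0x1A18, 0x0021), (0x1A08, 0x0020)]  # nka


def sort_key(elements):
    """Build a two-level key: all primary weights, then all secondary weights."""
    primaries = tuple(p for p, _ in elements if p)      # zero means ignorable
    secondaries = tuple(s for _, s in elements if s)
    return (primaries, secondaries)


words = {"nka": NKA, "n~a": N_A, "naa": NAA, "na": NA,
         "nu~": N_SAMVR, "n~": N_VIRAMA, "n_": N_CHILLU}
ordered = sorted(words, key=lambda name: sort_key(words[name]))
print(ordered)
```

Under these assumptions the chillu, virama and nu~ forms share one primary weight and are separated only at the second level, while na and n~a differ only in the secondary weight of the vowel.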

Together with anuswara

m_ka = ng ~ k ~ -a = [1A1C.0021], [1A18.0021], [1A08.0020]
m_ja = ny ~ j ~ -a = [1A21.0021], [1A1F.0021], [1A08.0020]
m_ta = n ~ t ~ -a = [1A2B.0021], [1A27.0021], [1A08.0020]
m_ya = m ~ y ~ -a = [1A30.0021], [1A31.0021], [1A08.0020]
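
The anuswara examples above follow one pattern: the anuswara is weighted as the virama form of the nasal that matches the following consonant. A minimal sketch of that rule, assuming the example weights from this post and covering only the four consonants shown (the names `NASAL_FOR` and `anuswara_elements` are hypothetical):

```python
# Primary weights taken from the examples in this post (illustrative only).
PRIMARY = {
    "k": 0x1A18, "ng": 0x1A1C,
    "j": 0x1A1F, "ny": 0x1A21,
    "t": 0x1A27, "n": 0x1A2B,
    "m": 0x1A30, "y": 0x1A31,
    "-a": 0x1A08,
}

# Nasal that stands in for an anuswara before each consonant,
# limited to the four cases listed above.
NASAL_FOR = {"k": "ng", "j": "ny", "t": "n", "y": "m"}

VIRAMA_FORM = 0x0021   # secondary weight of a virama form
VOWEL_SIGN = 0x0020    # secondary weight of a vowel sign


def anuswara_elements(consonant):
    """Collation elements for anuswara + consonant + inherent vowel sign -a."""
    nasal = NASAL_FOR[consonant]
    return [
        (PRIMARY[nasal], VIRAMA_FORM),
        (PRIMARY[consonant], VIRAMA_FORM),
        (PRIMARY["-a"], VOWEL_SIGN),
    ]


for c in "kjty":
    print("m_" + c + "a", [f"[{p:04X}.{s:04X}]" for p, s in anuswara_elements(c)])
```

With this rule m_ta receives exactly the same elements as an explicit n ~ t ~ -a, so the two spellings would sort together.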



  1. Chillus and virama forms are diacritics of the base character. The diacritics themselves therefore have to be ignorable at level 1, but should carry some weight at level 2. A chillu can also be thought of as a virama form (vowellessness) plus a hidden zero-width space. This can be achieved by keeping the secondary weight of a chillu lower than that of the virama form.
  2. Similarly, the full form of a vowel differs from its sign only in its secondary weight.
  3. The AllKeys file, which contains the Default Unicode Collation Element Table (DUCET), does not currently handle Malayalam accurately. For example, ZWJ is ignorable by default, so NNA + VIRAMA + ZWJ and NNA + VIRAMA are treated as equal.
  4. Why is there an expansion for Malayalam digits? At which level should the digit zero of different scripts differ?
  5. If chillus are encoded, the following equivalence should not be used for tailoring:

    0D7F = 0D15 0D4D 200D

    The behavior of 0D15 0D4D 200D differs from that of a chillu, as explained in the solution to Ken's counter-challenge for chillu-challenge.
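
The ZWJ problem from point 3 can be illustrated with a toy key function. This is only a sketch: `toy_key` is a hypothetical helper that uses code points in place of real collation weights and simply drops whatever is marked ignorable, which is enough to show why the two NNA sequences collapse into one.

```python
ZWJ = "\u200d"
NNA = "\u0d23"      # MALAYALAM LETTER NNA
VIRAMA = "\u0d4d"   # MALAYALAM SIGN VIRAMA


def toy_key(text, ignorable=frozenset({ZWJ})):
    """Toy sort key: code points stand in for weights; ignorables are dropped."""
    return tuple(ord(ch) for ch in text if ch not in ignorable)


chillu_sequence = NNA + VIRAMA + ZWJ   # requests the chillu form
virama_sequence = NNA + VIRAMA         # plain virama form

# With ZWJ ignorable (the default behaviour described above) the keys collide.
print(toy_key(chillu_sequence) == toy_key(virama_sequence))

# If ZWJ carried a weight of its own, the two sequences would stay distinct.
print(toy_key(chillu_sequence, ignorable=frozenset())
      == toy_key(virama_sequence, ignorable=frozenset()))
```

A real tailoring would need to give the chillu sequence its own secondary weight, as suggested in point 1, rather than merely making ZWJ visible.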

[Thanks to Åke Persson for educating me on UTS#10]

