Saturday, August 03, 2013

Collation of Myanmar (Burmese) in Unicode

(မြန်မာ အက္ခရာစဉ်)

Burmese Sorting In A Nutshell
at Lionslayer : The Legend
http://lionslayer.yoeyar.com/?p=632


This is a quick guide to sorting Burmese (မြန်မာ အက္ခရာစဉ်). If your Burmese text is encoded in Unicode, you can sort text and files automatically on Linux distributions and in the OpenOffice suite on any operating system, and you can build applications with Burmese sorting support using the ICU library. Even so, you cannot rely on the machine when you are working with non-Unicode fonts or on paper. Here is a shortcut for quickly memorizing how Burmese sorting works.

Burmese Consonants

    က ခ ဂ ဃ င
    စ ဆ ဇ ဈ ဉ ည
    ဋ ဌ ဍ ဎ ဏ
    တ ထ ဒ ဓ န
    ပ ဖ ဗ ဘ မ
    ယ ရ လ ဝ သ
    ဟ ဠ အ

Dependent Vowels

    အ အာ အိ အီ အု အူ အေ အဲ အော အော် အံ အို

Independent Vowels

    ဣ ဤ ဥ ဦ ဧ ဩ ဪ

Medials

    ျ   ြ   ွ   ှ   (ပင့်ရစ်ဆွဲထိုး)

See the complete list of Burmese characters in the Unicode chart.

The Burmese Sorting Formula

    (1) Consonant*+ Vowel**

    (2) Consonant*+ Vowel***+ (Consonant+Asat)**

    (3) Consonant*+ Medial**+

    - (a) Vowel***

    - (b) Vowel****+ (Consonant+Asat)***

    - (c) (another) Medial*** + Vowel****

    - (d) (another) Medial*** + Vowel***** + (Consonant+Asat)****

    - (e) (another) Medial*** + (another) Medial**** + Vowel*****  (there is only one such case: မြွှာ)

    - (f) (another) Medial*** + (another) Medial**** + Vowel******+ (Consonant+Asat)***** (there is only one such case: မြွှင်း)

#The fewer the stars (*), the higher the priority.
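
One way to read the formula is as a tuple key: compare the consonant first, then the medials, then the final (Consonant+Asat), then the vowel. The Python sketch below is only an illustration of that reading, not a full collation. The names syllable_key, SYLLABLE, VOWELS and TONES are made up for this post. It handles one simple syllable at a time (no stacked consonants, no Kinzi, no independent vowels, no dot below inside a final such as န့်). It also assumes that a missing medial or final sorts before a present one, and that the whole medial cluster is compared before the final. Tone marks are ranked last, following note (3) below.

```python
import re

# Vowel ranks, in the order of the "Dependent Vowels" list in this post.
VOWELS = [
    "",                    # inherent vowel (bare consonant)
    "\u102C",              # aa
    "\u102D",              # i
    "\u102E",              # ii
    "\u102F",              # u
    "\u1030",              # uu
    "\u1031",              # e
    "\u1032",              # ai
    "\u1031\u102C",        # e + aa
    "\u1031\u102C\u103A",  # e + aa + asat
    "\u1036",              # anusvara
    "\u102D\u102F",        # i + u
]

# Tone ranks: none, dot below, visarga, dot below + visarga.
TONES = ["", "\u1037", "\u1038", "\u1037\u1038"]

# One simple syllable: consonant, medials, vowel, optional final, tone marks.
SYLLABLE = re.compile(
    "(?P<initial>[\u1000-\u1021])"
    "(?P<medials>[\u103B-\u103E]*)"
    "(?P<vowel>\u1031\u102C\u103A?|\u102D\u102F|[\u102C-\u1032\u1036]?)"
    "(?P<final>(?:[\u1000-\u1021]\u103A)?)"
    "(?P<tone>\u1037?\u1038?)"
)


def syllable_key(syllable):
    """Return a sort key for one simple Burmese syllable (sketch only)."""
    # Treat tall AA (U+102B) like ordinary AA (U+102C); this is an assumption.
    syllable = syllable.replace("\u102B", "\u102C")
    m = SYLLABLE.fullmatch(syllable)
    if m is None:
        raise ValueError("unsupported syllable: %r" % syllable)
    return (
        m.group("initial"),              # *   consonant
        m.group("medials"),              # **  medial cluster
        m.group("final"),                #     final consonant + asat
        VOWELS.index(m.group("vowel")),  #     vowel
        TONES.index(m.group("tone")),    #     tone marks, lowest priority
    )


# kha, ka+ya medial, ka+na+asat, ka+aa, ka
words = ["\u1001", "\u1000\u103B", "\u1000\u1014\u103A", "\u1000\u102C", "\u1000"]
print(sorted(words, key=syllable_key))
```

A real implementation would first split running text into syllables and would lean on ICU or the CLDR rules rather than a hand-written key like this one.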

Notes:

(1) Asat and Virama are treated as equal. E.g. စက္က and စက်က sort the same. Kinzi (ကင်းစီး) is equal to Nga+Asat (ငသတ်). E.g. အင်္ဂလိပ် is equal to အင်ဂလိပ်.

(2) ည (Nya, or double Nya) comes after ဉ (Nya lay). ယျ (double Ya) comes after ယ (Ya).

(3) Dot below (အောက်မြစ်) and Visarga (ဝစ္စပေါက်) come after their related vowels. E.g. က ကာ ကာ့ ကား ကာ့း.

(4) Independent vowels are equal to A(အ)+Dependent vowels. E.g. ဧ = အေ, ဦ = အူ, ဥုမ် = အုမ် = အုံ etc. မ် (Ma+asat) is equal to သေးသေးတင် (Anusvara).

(5) There is no special rule for sorting numbers, so numbers come before consonants, as in other languages.
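
The equivalences in notes (1) and (4) can be applied as a normalization step before comparing strings. The sketch below is one possible way to do that in Python. The function name normalize_for_sorting is made up for this post, and the full independent-vowel table is my assumption, inferred by pairing the independent and dependent vowel lists above (only ဧ = အေ and ဦ = အူ are given explicitly in the notes). It is for building sort keys only; the result is not meant to be displayed.

```python
ASAT = "\u103A"
VIRAMA = "\u1039"
NGA = "\u1004"
KINZI = NGA + ASAT + VIRAMA  # Kinzi is stored as Nga + Asat + Virama
A = "\u1021"                 # the letter A

# Independent vowel -> A + dependent vowel (assumed pairing, see above).
# Virama -> Asat is folded into the same table, per note (1).
TABLE = str.maketrans({
    "\u1023": A + "\u102D",               # I
    "\u1024": A + "\u102E",               # II
    "\u1025": A + "\u102F",               # U
    "\u1026": A + "\u1030",               # UU
    "\u1027": A + "\u1031",               # E
    "\u1029": A + "\u1031\u102C",         # O
    "\u102A": A + "\u1031\u102C" + ASAT,  # AU
    VIRAMA: ASAT,
})


def normalize_for_sorting(text):
    """Fold the equivalences from notes (1) and (4) into one spelling."""
    # Kinzi first, so its virama is not turned into a second asat.
    text = text.replace(KINZI, NGA + ASAT)
    text = text.translate(TABLE)
    # Note (4): Ma + Asat is equal to Anusvara.
    return text.replace("\u1019" + ASAT, "\u1036")


# Stacked spelling (with Virama) versus Asat spelling of the same letters.
stacked = "\u1005\u1000\u1039\u1000"
with_asat = "\u1005\u1000\u103A\u1000"
print(normalize_for_sorting(stacked) == normalize_for_sorting(with_asat))
```

Running the normalization on both sides of a comparison means the two spellings in each pair produce identical keys, which is what "equal" means in the notes.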

#See the original reference in the Burmese dictionary of the Myanmar Language Commission.

#See the pre-sorted Burmese orthography in Wiktionary.

More Links:

    http://lionslayer.yoeyar.com/?p=450
    http://www.thanlwinsoft.org/ThanLwinSoft/MyanmarUnicode/Sorting/
    http://unicode.org/repos/cldr/trunk/common/collation/my.xml
    http://www.thanlwinsoft.org/ThanLwinSoft/MyanmarUnicode/Sorting/MyanmarCollation20080424.pdf
    http://www.unicode.org/notes/tn11/
    http://std.dkuug.dk/JTC1/SC2/WG2/
    http://demo.icu-project.org/icu-bin/locexp?_=my&d_=en&x=col

Credit: http://lionslayer.yoeyar.com/?p=632

Collation of Myanmar (Burmese) in Unicode
http://developer.mimer.com/collations/myanmar/MyanmarCollation.pdf