Talk:Mongolian (Unicode block)

From Wikipedia, the free encyclopedia

U+1878 MONGOLIAN LETTER CHA WITH TWO DOTS

U+1878 MONGOLIAN LETTER CHA WITH TWO DOTS was added with Unicode 11.0. I'm not sure where it should be added in the "Presentation forms" section of this article and with which title. https://www.unicode.org/L2/L2017/17007-n4781-mongolian.pdf states "This dotted form of the letter cha is used to indicate that it is pronounced as š rather than č, reflecting Buryat pronunciation practice which differs significantly from Classical Mongolian." So, does it go under SH or CH? Also, is "Buryat" a good enough title if the Unicode proposal is cited? DRMcCreedy (talk) 21:32, 18 June 2018 (UTC)

I think it should go in a separate row with the letter labelled "SH š", and as it is an extension of the basic Mongolian set, the subset should be "Basic", with an annotation that it is used for Buryat. BabelStone (talk) 08:42, 19 June 2018 (UTC)
I've made the suggested edit. Take a look and adjust it if I have anything wrong. Thanks. DRMcCreedy (talk) 15:49, 19 June 2018 (UTC)
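
For anyone who wants to check the character locally, here is a minimal sketch using Python's standard `unicodedata` module. It assumes the interpreter bundles Unicode 11.0 or later (the helper name `char_label` is made up for this example); on an older database the lookup falls back to the placeholder string.

```python
import unicodedata

def char_label(ch: str) -> str:
    """Return 'U+XXXX NAME' for one character, or a placeholder if it is unnamed."""
    name = unicodedata.name(ch, "<not in this Unicode database>")
    return f"U+{ord(ch):04X} {name}"

# Which Unicode version this interpreter's character database follows.
print("Unicode database version:", unicodedata.unidata_version)

# The letter discussed in this thread.
print(char_label("\u1878"))
```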

Ali Gali letters

I've added a table of presentation forms for the Ali Gali letters U+1880-18AA (excluding U+18A9 because it's a non-spacing mark) but I feel out of my depth here. Can someone familiar with Mongolian script look it over and give me feedback:

  • Are the letters in the right groups (vowels, signs/marks, and consonants)?
  • Are the letters in the right order within the three groups?
  • Am I missing any defined positional forms? Have I erroneously listed any forms that are not defined?
  • FYI: I feel strongly about having the signs/marks section even though they don't have positional forms because
    • it gives me a logical place to put the standardized variants for U+1880 and U+1881
    • they are categorized as letters in the Unicode character database

Thanks. DRMcCreedy (talk) 23:35, 12 February 2019 (UTC)
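
The point about how these characters are categorized in the Unicode character database can be checked with a short sketch like the one below. It groups U+1880 to U+18AA by general category using Python's `unicodedata` module; the function name `group_by_category` is invented for this example, and the exact grouping depends on the Unicode version bundled with the interpreter, since general categories can be revised between versions.

```python
import unicodedata
from collections import defaultdict

def group_by_category(first: int, last: int) -> dict:
    """Map each general category (e.g. 'Lo', 'Mn') to the code points in [first, last]."""
    groups = defaultdict(list)
    for cp in range(first, last + 1):
        ch = chr(cp)
        entry = (f"U+{cp:04X}", unicodedata.name(ch, "<no name>"))
        groups[unicodedata.category(ch)].append(entry)
    return dict(groups)

# Ali Gali range discussed above, including U+18A9.
ali_gali = group_by_category(0x1880, 0x18AA)

for category, entries in sorted(ali_gali.items()):
    print(f"{category}: {len(entries)} character(s)")
    for code, name in entries:
        print("   ", code, name)
```

Anything reported as `Mn` is a non-spacing mark, so U+18A9 should show up in that group, while characters reported as `Lo` are the ones the database treats as letters.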

1823 1824, 1825 1826 appear identical

Why do these glyphs appear identical on my device? I think a note about why these have different encodings would be beneficial. I'll check whether the Unicode proposal mentions anything. Awelotta (talk) 14:43, 25 September 2025 (UTC)

The two sequences look different from each other on my laptop so it might be a font/support issue. DRMcCreedy (talk) 20:37, 25 September 2025 (UTC)
I've found the answer. Mongolian Unicode support is just really bad (still in 2025?), and o and u are visually identical in all contexts but are historically differentiated in the education of the script.

Why are ⟨o⟩ and ⟨u⟩ separate codepoints? Why are ⟨ö⟩ and ⟨ü⟩ separate codepoints? They are visually identical. Every single encoding differentiates o and u, and ö and ü, and I can't figure out why. Chinese and Mongolian representatives at the Unicode Consortium insisted on it from the beginning.

And from one of the cited sources:

There are one and the same letters for denoting o and u; and ö and ü as well. Though, they are considered as separate letters according to the Jirüken-ü tolta.

Now I'm curious why they appear different for you. For me 1823 and 1824 look identical, and 1825 and 1826 look identical, but the two sets don't look identical. Awelotta (talk) 16:06, 29 September 2025 (UTC)
Sorry... I thought you meant the sequence U+1823 1824 looked identical to the sequence U+1825 1826. Indeed U+1823 and 1824 look identical on my machine. As do U+1825 and 1826. Oddly, Unicode's PDF code chart actually uses different glyphs for U+1825 and 1826. But none of the Mongolian fonts I have on my machine show U+1826 in the shape shown in the PDF. Unicode's PDF glyph for U+1826 looks the same as 1826+FVS2 (Third Isolate Form) in https://www.unicode.org/wg2/docs/n4752r2-16258-mongolian-forms.pdf but I don't know why the PDF uses that shape. DRMcCreedy (talk) 03:54, 30 September 2025 (UTC)
Different forms were chosen for disambiguation. If you look closely, U+1823 and U+1824 are also disambiguated in the code chart by using the "first isolate form" for one and the "second isolate form" for the other. I'm not sure how this disambiguation actually benefits font developers or anyone else if it has to be explained every time anyway.
Source: https://www.colips.org/journals/volume21/21.1.3-Biligsaikhan.pdf (Section 3.2.1 Representative Glyphs on page 33)
I think we should keep showing the isolated forms here instead of the disambiguated forms. For now I'll add a footnote with the "Unicode presentation forms" (aka representative glyph) and an explanation. Awelotta (talk) 14:15, 1 October 2025 (UTC)
Sounds good. DRMcCreedy (talk) 14:21, 1 October 2025 (UTC)
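
To separate "same glyph in my font" from "same character", a rough sketch along these lines may help. It uses Python's `unicodedata` module to show that the four vowels are distinct named code points that compatibility normalization does not merge, and it builds the U+1826 plus FVS2 sequence mentioned above (FVS2 is U+180C). The helper `describe` is a made-up name for this example, and how any font renders the sequence is outside what this code can show.

```python
import unicodedata

FVS2 = "\u180C"  # MONGOLIAN FREE VARIATION SELECTOR TWO

def describe(text: str) -> list:
    """List (code point, character name) for every character in text."""
    return [(f"U+{ord(c):04X}", unicodedata.name(c, "<no name>")) for c in text]

# The two pairs from this thread: o/u and oe/ue.
for a, b in [("\u1823", "\u1824"), ("\u1825", "\u1826")]:
    merged = unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)
    print(describe(a + b), "| merged by NFKC:", merged)

# U+1826 followed by FVS2: two code points that request a specific glyph variant.
ue_fvs2 = "\u1826" + FVS2
print(describe(ue_fvs2))
```

Since the pairs stay distinct at the code point level, identical rendering of U+1823/U+1824 or U+1825/U+1826 is a property of the glyphs, not of the encoding.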