file upload – Sharepoint / OneDrive improper recognition of Tibetan / Choekey / Dzongkha filenames


File names with international characters (UTF-8) are accepted by Sharepoint libraries / OneDrive. However, for Tibetan/Choekey/Dzongkha script, they seem to be saved improperly in the backend, and the characters become somehow indistinguishable from each other once stored.

Make two files:

མགོན་པོ་ཀླུ་སྒྲུབ་ཀྱི་རྣམ་ཐར། ༠༡.txt

མགོན་པོ་ཀླུ་སྒྲུབ་ཀྱི་རྣམ་ཐར། ༠༢.txt

They are different filenames, but the names have the same number of characters. After the first file is uploaded, the second one cannot be, throwing the error “A file with an equivalent name exists.” The file names are not the same — why does SharePoint/OneDrive treat them as “equivalent”?

If the second filename has a different number of characters in it than the first, or is mixed with standard ASCII characters that differ from those in another filename of the same length, the file is accepted for upload without throwing the error.

The error does not apply to other international characters. The following two files upload just fine:

карабкаться.txt

карабкатьяс.txt

Is this a product limitation in the Microsoft cloud storage ecosystem or are there certain settings/alternative encoding to the filename that could be done to make it more acceptable?

The error can be reproduced by anyone (just copy-paste the characters from above into the names of simple dummy files).