Friday, May 26, 2006

.desktop files internationalization

In my last post I covered the basic format of .desktop files. As you may remember, these files are made up of two elements: category headers and key/value pairs that hold various pieces of information about the associated Desktop Entry. Some of that information is shown to the user (e.g. name, description, ...) and must be localized, so the specification includes a way to localize certain keys in the file: those whose value type is declared as localestring.

The approach is simple. One "primary" key holds the default value (normally in English), and more keys with the same name are added, each suffixed with the corresponding POSIX locale in brackets. For example, the key

Name=Menu Editor

would be followed by the keys

Name[ca]=Editor del menú
Name[cs]=Editor nabídky
Name[cy]=Golygydd Dewislen
Name[da]=Menuredigering

If the current locale is not found in the file, the default key must be used.

The way these entries are encoded (ISO-8859-1, UTF-8, ...) is defined in the .desktop file itself through the required Encoding key. This key can take two possible values: UTF-8 or Legacy-Mixed. Although the latter is deprecated, it can (I think) still be found on some systems, so let's see how it works. When Legacy-Mixed mode is active for a file, localized keys are encoded using old-style encodings (i.e. not UTF-8), and the POSIX locale that identifies each localized key also determines its encoding. The following table shows which encoding corresponds to each locale:
Encoding              Aliases  Tags
ARMSCII-8 (*)                  hy
BIG5                           zh_TW
CP1251                         be bg
EUC-CN                GB2312   zh_CN
EUC-JP                         ja
EUC-KR                         ko
GEORGIAN-ACADEMY (*)
GEORGIAN-PS (*)                ka
ISO-8859-1                     br ca da de en es eu fi fr gl it nl no pt sv wa
ISO-8859-2                     cs hr hu pl ro sk sl sq sr
ISO-8859-3                     eo
ISO-8859-5                     mk sp
ISO-8859-7                     el
ISO-8859-9                     tr
ISO-8859-13                    lt lv mi
ISO-8859-14                    cy ga
ISO-8859-15                    et
KOI8-R                         ru
KOI8-U                         uk
TCVN-5712 (*)         TCVN     vi
TIS-620                        th
VISCII
(Table copied directly from the standard.)
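
To make the Legacy-Mixed rule a bit more concrete, here is a minimal Python sketch of how a reader could pick the encoding from the locale suffix of a key. It only covers a handful of rows from the table, skips the ones marked with an asterisk, and the names legacy_encoding_for and decode_legacy_line are made up for this example. It also assumes that unlocalized keys are plain ASCII and that a locale such as pt_BR falls back to its language part, which is my own simplification rather than something taken from the standard.

```python
# Subset of the table: legacy encoding -> locale tags that use it.
# Rows marked (*) are left out on purpose.
LEGACY_TABLE = {
    "BIG5": "zh_TW",
    "EUC-JP": "ja",
    "ISO-8859-1": "br ca da de en es eu fi fr gl it nl no pt sv wa",
    "ISO-8859-2": "cs hr hu pl ro sk sl sq sr",
    "ISO-8859-14": "cy ga",
    "KOI8-R": "ru",
    "KOI8-U": "uk",
}

# Invert the table so we can go from a locale tag to its encoding.
ENCODING_BY_TAG = {
    tag: encoding
    for encoding, tags in LEGACY_TABLE.items()
    for tag in tags.split()
}


def legacy_encoding_for(locale_tag):
    """Return the legacy encoding for a locale tag, or None if unknown."""
    # Drop an optional @MODIFIER and .ENCODING part (e.g. "sr@Latn" -> "sr").
    base = locale_tag.split("@")[0].split(".")[0]
    if base in ENCODING_BY_TAG:
        return ENCODING_BY_TAG[base]
    # Assumption: fall back to the language part (e.g. "pt_BR" -> "pt").
    return ENCODING_BY_TAG.get(base.split("_")[0])


def decode_legacy_line(raw):
    """Decode one raw 'key=value' line of a Legacy-Mixed file."""
    key, _, value = raw.partition(b"=")
    key = key.decode("ascii").strip()
    encoding = "ascii"  # assumption: unlocalized values are plain ASCII
    if key.endswith("]") and "[" in key:
        tag = key[key.index("[") + 1 : -1]
        encoding = legacy_encoding_for(tag)
        if encoding is None:
            raise LookupError(f"no legacy encoding known for locale {tag!r}")
    return key, value.decode(encoding)


raw_line = "Name[cs]=Editor nabídky".encode("iso-8859-2")
print(decode_legacy_line(raw_line))
```

The point of the sketch is only that the key has to be read before the value can be decoded, since the locale in brackets is what selects the encoding.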
Encodings marked with an asterisk are very rare and are not supported by the GNU C library, so they may not be supported by Java either. In any case, since they are so rare and the legacy mode is on its way out, this doesn't look like a problem worth worrying about.
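
Going back to the localized keys described earlier in the post, the lookup with its fallback to the default key could be sketched in Python roughly like this. The function names parse_entries and localized_value are hypothetical, the parser ignores category headers instead of grouping keys under them, and only an exact match on the locale suffix is tried. Any finer matching rules the standard may define are not covered here.

```python
SAMPLE = """\
[Desktop Entry]
Name=Menu Editor
Name[ca]=Editor del menú
Name[cs]=Editor nabídky
Name[cy]=Golygydd Dewislen
Name[da]=Menuredigering
"""


def parse_entries(text):
    """Collect key/value pairs, skipping blank lines, comments and headers."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("["):
            continue
        key, sep, value = line.partition("=")
        if sep:
            entries[key.strip()] = value.strip()
    return entries


def localized_value(entries, key, locale_tag):
    """Return key[locale_tag] if present, otherwise the default key."""
    default = entries.get(key)
    return entries.get(f"{key}[{locale_tag}]", default)


entries = parse_entries(SAMPLE)
print(localized_value(entries, "Name", "ca"))  # localized key exists
print(localized_value(entries, "Name", "fr"))  # falls back to Name
```

With the sample above, asking for the ca locale finds the Name[ca] key, while asking for fr finds nothing and falls back to the plain Name key.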
