0.) Definition of G11n (=Globalization) elements and their relationship
-----------------------------------------------------------------------

language  - usually a two-letter code, e.g. French = 'fr', Japanese = 'ja'
territory - usually a two-letter uppercase code, e.g. Canada = 'CA'
codeset   - e.g. 'UTF-8', 'ISO-8859-1', 'PCK'
variant   - used only when introducing a specifically customized locale

locale = language_territory[.codeset][@variant]
         e.g. 'fr_CA', 'ja_JP.UTF-8', 'de_DE.ISO8859-15@euro'


1.) Introduction of G11n Installation interface
-----------------------------------------------

This document describes a new G11nInstall interface, which I plan to
integrate into OpenSolaris. The interface provides information about G11n
elements (languages, locales) and about their packaging, so that language
and locale bits can be installed and removed more easily in OpenSolaris.

The interface can be consumed, for example, by:
- language selection on LiveCD start-up,
- the system default locale selection screen of the installer,
- a languages installation screen which may be introduced in the installer,
- the IPS package manager.

For example, the package manager will get a list of available
languages/locales from the interface. Later on, when the user decides to
install additional languages, the interface will provide an IPS filter and a
list of packages to which the filter must be applied in order to install the
selected languages on the system. Note that once image-wide filters are fully
supported, the listing of packages will become less important.

The following is an example of a filter returned by the interface. It
installs the French localization and the English files needed by the Canadian
UTF-8 locales (fr_CA, en_CA):

  arch=i386 language=fr | en territory=CA encoding=UTF-8 message=true documentation=false


2.) Functionality of G11nInstall interface
------------------------------------------

a) Major functionality:
   - Get a list of languages and locales installed on the system (IPS image).
   - Get a list of languages and locales available in the IPS network
     repository.
   - Get a list of files (packages) which must be installed (removed) on the
     system in order to support the given language(s) or locale(s).
   - Get and set the system default locale.

b) Additional functionality:
   - Provide localized names of languages and territories (displayed e.g. by
     localized versions of the installer).
   - Differentiate language/locale enablers (I18n) from software messages
     (L10n) when listing files (packages). As a result, the software messages
     can be installed (removed) separately.
   - Differentiate the following (and maybe more) classes of language/locale
     bits:
       - mandatory (e.g. iiimf-cle-sunpinyin, UTF-8 encoding support)
       - recommended (e.g. iiimf-cle-open)
       - optional (e.g. non-UTF-8 legacy encodings support)
   - List G11n files (packages) only if the base products/files are installed
     on the system (formerly SUNW_PKGLIST).
   - Provide info about the translation state of languages (locales), e.g.
     fully/partially/not translated.
   - Suggest one sample locale per language (used e.g. for the LiveCD session).


3.) Implementation of G11nInstall interface
-------------------------------------------

The implementation of the interface will rely on G11n-related data defined
in IPS as file tags (attributes) and dependencies. The interface will query
that meta data from the IPS image and the network repository. See also the
high-level schema (high-level.png).

Note that the IPS team is working on a new concept of "facets and variants".
We will most likely use that mechanism instead of IPS tags. In this document
(and in the above schema) I do *not consider* the facets and variants yet, so
the actual implementation will most likely change.

Here is an example of the meta data stored in IPS (SUNWlang-fr package). See
section 5.) for more examples.

File tags (ULL=usr/lib/locale):

  PATH                               LANGUAGE  TERRITORY  ENCODING  MESSAGE
  $ULL/fr.UTF-8/LC_MESSAGES          fr                   UTF-8     true
  $ULL/fr.UTF-8/LC_MESSAGES/*.mo     fr                   UTF-8     true
  $ULL/fr_FR.UTF-8/fr_FR.UTF-8.so.3  fr        FR         UTF-8
  $ULL/fr_CA.UTF-8/fr_CA.UTF-8.so.3  fr        CA         UTF-8

Note that IPS tags are used mostly because they give us file granularity,
and because filters cannot be applied to package-level attributes or to
packages themselves.

To prevent IPS tags from being consumed directly by the action, they need to
be prefixed with "pkg.", e.g. pkg.language, pkg.message. The same prefix must
also be used in filters.

We may compact the locale-related IPS tags into one, e.g. fr_CA.UTF-8.

Some additional information will not be stored in IPS, but in system message
catalogs and other system files, for example language and locale names and
the translations of those names.

Python will be used for the implementation of the G11n interface. One of the
reasons is better integration with IPS.


4.) Specification of G11nInstall interface
------------------------------------------

class G11nStorage(object):
    def __init__(self, ips_source):
        """ips_source is either an IPS image or an IPS network
           repository."""
    def get_langs(self):
        """Returns list of language objects available at IPS image/repo."""
    def get_sysdefault_locale(self):
        """Get system default locale. Returns locale object."""
    def set_sysdefault_locale(self, locale_obj):
        """Set system default locale."""

class Language(object):
    def __init__(self, langcode_str):
        self.code = langcode_str       # e.g. 'fr', 'de'
    def get_locales(self):
        """Returns list of locale objects available for this language."""
    def get_filter(self):
        """Returns IPS filter which installs (removes) this language."""
    def get_packages(self):
        """Returns list of packages which must be updated to install
           (remove) support for this language."""
    def get_description(self):
        """Returns language description string, e.g.
           'French', 'German'."""
    def get_l10n_description(self, language):
        """Returns localized language description string, e.g.
           'francais', 'allemand'."""
    def get_t18nprogress(self):
        """Returns translation progress of this language as a float
           number, e.g. 0.8 = 80%."""
    def suggest_locale(self):
        """Suggests one sample locale for this language. Returns locale
           object."""

class Locale(object):
    def __init__(self, language, territory, codeset_str='', variant_str=''):
        self.language = language
        self.territory = territory
        self.codeset = codeset_str     # e.g. 'UTF-8', 'ISO8859-1'
        self.variant = variant_str     # e.g. '@euro'
    def get_filter(self):
        """Returns IPS filter which installs (removes) this locale."""
    def get_packages(self):
        """Returns list of packages which must be updated to install
           (remove) support for this locale."""
    def get_code(self):
        """Returns locale code, e.g. 'fr_CA', 'de_AT.UTF-8',
           'es_ES.ISO8859-15@euro'."""

class Territory(object):
    def __init__(self, terrcode_str):
        self.code = terrcode_str       # e.g. 'AT', 'CA'
    def get_description(self):
        """Returns territory description string, e.g. 'Austria',
           'Canada'."""
    def get_l10n_description(self, language):
        """Returns localized territory description string, e.g.
           'Osterreich'."""


5.) Example of G11n-related meta data stored at IPS
---------------------------------------------------

a) Localization files packaged within the base package, e.g. SUNWa2ps:

   Tags (file granularity):
   NAME  PATH                                     LANGUAGE  TERRITORY  ENCODING  MESSAGE
   dir   usr/bin
   file  usr/bin/a2ps
   file  usr/share/locale/fr/LC_MESSAGES/a2ps.mo  fr                             true
   file  usr/share/locale/de/LC_MESSAGES/a2ps.mo  de                             true
   file  usr/share/locale/es/LC_MESSAGES/a2ps.mo  es                             true
   ...

b) Localization files packaged separately in a dedicated package, e.g.
   Japanese message files for Firefox - SUNWfirefoxl10n-ja-JP.
   Note that the approach of separate localization packages is obsolete and
   we are moving away from it.

   Attributes:
   depend fmri=pkg:/SUNWfirefox type=exclude
   depend fmri=pkg:/SUNWfirefox-root type=exclude

   Tags (file granularity):
   NAME  PATH                                LANGUAGE  TERRITORY  ENCODING  MESSAGE
   dir   usr/lib/firefox/chrome              ja                             true
   file  usr/lib/firefox/chrome/ja.jar       ja                             true
   file  usr/lib/firefox/chrome/ja.manifest  ja                             true
   ...

c) Locale enablers and localization files packaged together, e.g. French
   system locales and message files - SUNWlang-fr:

   Attributes:
   depend fmri=pkg:/SUNWlang-common type=require

   Tags (file granularity):
   NAME  PATH                                         LANGUAGE  TERRITORY  ENCODING  MESSAGE
   dir   usr/lib/locale/fr.UTF-8                      fr                   UTF-8
   dir   usr/lib/locale/fr.UTF-8/LC_MESSAGES          fr                   UTF-8     true
   file  usr/lib/locale/fr.UTF-8/LC_MESSAGES/*.mo     fr                   UTF-8     true
   dir   usr/lib/locale/fr_FR.UTF-8                   fr        FR         UTF-8
   file  usr/lib/locale/fr_FR.UTF-8/fr_FR.UTF-8.so.3  fr        FR         UTF-8
   dir   usr/lib/locale/fr_CA.UTF-8                   fr        CA         UTF-8
   file  usr/lib/locale/fr_CA.UTF-8/fr_CA.UTF-8.so.3  fr        CA         UTF-8
   ...


6.) Samples of interface consumer code
--------------------------------------

a) Listing all languages and locales installed on the system.

import G11nInstall

img = G11nInstall.G11nStorage('/')
for la in img.get_langs():
    print('Language %s:' % la.code)
    # Print the locale codes of this language, comma-separated.
    print(', '.join(lo.get_code() for lo in la.get_locales()))

b) Installing the French Canada locale. Note that once image-wide filters
   are fully supported, the code will get simpler.

import os
import G11nInstall

# Locale() takes language and territory separately (see section 4).
canada = G11nInstall.Locale(G11nInstall.Language('fr'),
                            G11nInstall.Territory('CA'))
pkgs = canada.get_packages()
fltr = canada.get_filter()
# tostring() is assumed to render the filter and the package list
# in the form expected on the pkg command line.
os.system('pkg install -f ' + fltr.tostring() + ' ' + pkgs.tostring())
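
The locale syntax from section 0 (language_territory[.codeset][@variant])
can be split into the elements that the Locale constructor of section 4
expects. The following is only a minimal sketch and not part of the proposed
interface: the helper names parse_locale_code and format_locale_code are
hypothetical, the code works on plain strings rather than on Language and
Territory objects, and it assumes two- or three-letter lowercase language
codes and two-letter uppercase territory codes, which section 0 describes
only as the usual case.

```python
import re

# Hypothetical helpers (not part of G11nInstall): split a locale code that
# follows the section 0 syntax language_territory[.codeset][@variant]
# into its elements, and join the elements back into a code.
_LOCALE_RE = re.compile(
    r'^(?P<language>[a-z]{2,3})'      # e.g. 'fr', 'ja' (assumed 2-3 letters)
    r'_(?P<territory>[A-Z]{2})'       # e.g. 'CA', 'JP' (assumed 2 letters)
    r'(?:\.(?P<codeset>[^@]+))?'      # optional, e.g. 'UTF-8'
    r'(?:@(?P<variant>.+))?$'         # optional, e.g. 'euro'
)


def parse_locale_code(code):
    """Return (language, territory, codeset, variant) as strings.

    Missing optional elements are returned as ''. The variant is
    returned without the leading '@'."""
    match = _LOCALE_RE.match(code)
    if match is None:
        raise ValueError('not a language_territory locale code: %r' % code)
    return tuple(match.group(name) or ''
                 for name in ('language', 'territory', 'codeset', 'variant'))


def format_locale_code(language, territory, codeset='', variant=''):
    """Build a locale code from its elements (inverse of parsing)."""
    code = '%s_%s' % (language, territory)
    if codeset:
        code += '.' + codeset
    if variant:
        code += '@' + variant
    return code


if __name__ == '__main__':
    for sample in ('fr_CA', 'ja_JP.UTF-8', 'de_DE.ISO8859-15@euro'):
        print('%s -> %r' % (sample, parse_locale_code(sample)))
```

A consumer that only has a locale string such as 'fr_CA' could use such a
helper to obtain the language and territory codes before constructing the
Language, Territory and Locale objects. Codes like 'C' or 'POSIX', which do
not follow the language_territory pattern, are rejected with ValueError.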