One Pager for Adding Malaysian, Indonesian, and Vietnamese UTF-8 Locales

1. Introduction
   1.1. Project/Component Working Name:
        Adding Malaysian, Indonesian, Vietnamese UTF-8 locales.  

   1.2. Name of Document Author/Supplier:
        William Xue (wei.xue@sun.com)

   1.3. Date of This Document:
        06/20/2008    First Draft

   1.5. Email Aliases:
        1.5.2. Responsible Engineer: William Xue (wei.xue@sun.com)


2. Project Summary
   2.1. Project Description:
        Add support for the following Southeast Asian locales to
        OpenSolaris:
        Malaysian:   ms_MY.UTF-8
        Indonesian:  id_ID.UTF-8
        Vietnamese:  vi_VN.UTF-8

        Add VISCII and TCVN code conversion to the iconv library, so that
        bidirectional conversion between VISCII, TCVN, and UTF-8/UCS-4/UCS-2
        is supported by iconv(1) and iconv(3).

   2.2. Risks and Assumptions:
         
       
3. Business Summary
   3.1. Problem Area:
        So far, Solaris does not support the Malaysian, Indonesian, or
        Vietnamese locales, so applications written for these three
        locales cannot set their locale successfully. This project adds
        the three new locales to OpenSolaris so that applications can run
        in them, which will also make Solaris more convenient for users
        and developers from these regions.

        As there are many different encoding standards for Vietnamese,
        it is necessary to support the conversion feature between these
        native encoding standards and UTF-8/UCS encodings.

   3.2. Market/Requester:
        Solaris L10N marketing

   3.3. Business Justification:
        Malaysia (27,170,000), Indonesia (238,452,952), and Vietnam
        (85,662,800) have large populations and are growing as the next
        user groups, so the three locales offer many opportunities. This
        is a good chance to grow the OpenSolaris community in these
        locales and to attract more developers and users from Malaysia,
        Indonesia, and Vietnam.

   3.4. Competitive Analysis:
        GNU/Linux supports the Malaysian, Indonesian, and Vietnamese locales.
        The GNU libc iconv supports Vietnamese encoding conversion.

   3.5. Opportunity Window/Exposure:
        Solaris Nevada 
        Project Indiana 

   3.6. How will you know when you are done:


4. Technical Description:
   4.1. Details:
        1>
        To add support for a new locale in Solaris, the following locale
        data categories need to be defined:
        LC_CTYPE
        LC_COLLATE
        LC_NUMERIC
        LC_TIME
        LC_MONETARY
        LC_MESSAGES
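
        As a rough illustration of what an application sees once these
        categories are defined, the sketch below probes whether a locale
        can be selected with setlocale(3C). The helper names
        locale_available() and report_new_locales() are hypothetical and
        not part of this project; the locale names are the ones listed
        in section 2.1.

```c
#include <locale.h>
#include <stdio.h>

/* Returns 1 if every locale category can be switched to `name`, else 0.
 * The previous locale is restored before returning. */
int locale_available(const char *name)
{
    char saved[512];
    const char *current = setlocale(LC_ALL, NULL);
    int ok;

    /* Copy the current setting: later setlocale() calls may overwrite it. */
    snprintf(saved, sizeof saved, "%s", current != NULL ? current : "C");

    ok = (setlocale(LC_ALL, name) != NULL);
    setlocale(LC_ALL, saved);
    return ok;
}

/* Prints one line per proposed locale and returns how many are installed. */
int report_new_locales(void)
{
    /* Locale names taken from section 2.1 of this document. */
    static const char *const names[] = {
        "ms_MY.UTF-8", "id_ID.UTF-8", "vi_VN.UTF-8"
    };
    int found = 0;
    size_t i;

    for (i = 0; i < sizeof names / sizeof names[0]; i++) {
        int ok = locale_available(names[i]);
        printf("%-12s %s\n", names[i], ok ? "available" : "not installed");
        found += ok;
    }
    return found;
}
```

        Calling report_new_locales() on a system without the new locale
        packages is expected to report all three names as not installed.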
         
        Because CLDR (Unicode's Common Locale Data Repository) [1] is so far
        the largest and most extensive standard repository of locale data,
        the new UTF-8 locales ms_MY.UTF-8, id_ID.UTF-8, and vi_VN.UTF-8 will
        be created with standard locale data from CLDR.
  
        The l10n (localization) messages for these three languages are not
        covered by this project. For reference, the current status of l10n
        messages for these three languages is as follows:
        
          * CLI (Command Line Interface) messages:
            Localization of the Solaris system libraries and utilities into
            Malaysian, Indonesian, and Vietnamese is not available.
            
          * L10N status for major GUI components by communities

                              Malaysian    Indonesian    Vietnamese
            --------------------------------------------------------------
            Gnome[3]          Yes          Yes           Yes
            Firefox [10]      N/A          Yes           N/A
            Thunderbird [11]  N/A          N/A           N/A
            Openoffice [14]   Yes [12]     N/A           Yes [13]

            The GNOME localization content for these three languages has
            been integrated into the SUNWgnome-l10nmessages-extra package on
            Solaris.
         
        2>
        Enhance the iconv modules for Vietnamese encodings:

        The most popular encoding standards for Vietnamese are:
        VISCII [8]
        TCVN(5712) [7]
        CP1258 [9]
        
        CP1258 is the Microsoft Windows standard and is already supported by
        the current Solaris iconv.
        
        If the Vietnamese locale is supported, VISCII and TCVN encoding
        conversion should be supported as well. The following conversions
        are included:
        VISCII <-> UTF-8/UCS-4/UCS-4BE/UCS-4LE/UCS-2/UCS-2BE/UCS-2LE
        TCVN  <-> UTF-8/UCS-4/UCS-4BE/UCS-4LE/UCS-2/UCS-2BE/UCS-2LE
        VISCII <-> TCVN
        (Note: <-> means from and to.)
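
        The conversion matrix above can be checked on a given system with
        a small probe. This is only a sketch: the helper names
        pair_supported() and count_supported_pairs() are hypothetical,
        and the codeset names are assumed to be spelled exactly as in
        the list above, which may differ between iconv implementations.

```c
#include <iconv.h>
#include <stddef.h>

/* Returns 1 if iconv_open() accepts the conversion from -> to, else 0. */
int pair_supported(const char *to, const char *from)
{
    iconv_t cd = iconv_open(to, from);

    if (cd == (iconv_t)-1)
        return 0;
    iconv_close(cd);
    return 1;
}

/* Probes both directions between each Vietnamese encoding and each
 * Unicode form listed above, plus VISCII <-> TCVN.
 * Returns the number of supported directions (at most 30). */
int count_supported_pairs(void)
{
    /* Codeset names as written in this document (assumed spellings). */
    static const char *const legacy[] = { "VISCII", "TCVN" };
    static const char *const unicode[] = {
        "UTF-8", "UCS-4", "UCS-4BE", "UCS-4LE", "UCS-2", "UCS-2BE", "UCS-2LE"
    };
    const size_t nlegacy = sizeof legacy / sizeof legacy[0];
    const size_t nunicode = sizeof unicode / sizeof unicode[0];
    int supported = 0;
    size_t i, j;

    for (i = 0; i < nlegacy; i++) {
        for (j = 0; j < nunicode; j++) {
            supported += pair_supported(unicode[j], legacy[i]); /* to Unicode */
            supported += pair_supported(legacy[i], unicode[j]); /* from Unicode */
        }
    }
    supported += pair_supported("TCVN", "VISCII");
    supported += pair_supported("VISCII", "TCVN");
    return supported;
}
```

        A return value of 30 from count_supported_pairs() would mean that
        every direction listed above is accepted by iconv_open().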
  
        Since the characters of the Malaysian and Indonesian languages are
        covered by the ISO8859-1 [2] standard, they do not need extra iconv
        modules.
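
        For illustration only: ISO8859-1 byte values coincide with the
        Unicode code points U+0000 to U+00FF, so converting such text to
        UTF-8 needs no mapping table. The function latin1_to_utf8() below
        is a hypothetical sketch of that arithmetic, not the way the
        Solaris iconv modules are implemented.

```c
#include <stddef.h>

/* Converts `len` ISO8859-1 bytes to UTF-8.
 * `out` must have room for up to 2 * len bytes.
 * Returns the number of bytes written. */
size_t latin1_to_utf8(const unsigned char *in, size_t len, unsigned char *out)
{
    size_t i, n = 0;

    for (i = 0; i < len; i++) {
        unsigned char c = in[i];

        if (c < 0x80) {
            out[n++] = c;                                   /* ASCII: unchanged */
        } else {
            out[n++] = (unsigned char)(0xC0 | (c >> 6));    /* lead byte: top 2 bits */
            out[n++] = (unsigned char)(0x80 | (c & 0x3F));  /* continuation: low 6 bits */
        }
    }
    return n;
}
```

        For example, the ISO8859-1 byte 0xE9 becomes the two UTF-8 bytes
        0xC3 0xA9.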
        
        
   4.5. Interfaces:
        1>
        For function:
        iconv_t iconv_open(const char *tocode, const char *fromcode)
        The parameters tocode and fromcode will support: VISCII, TCVN(TCVN5712).
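
        A caller might use the new codeset names as sketched below. The
        wrapper convert_buffer() is hypothetical; iconv_open(), iconv()
        and iconv_close() are the standard iconv(3) calls. Whether
        "VISCII" is accepted depends on the installed iconv modules, so
        the wrapper returns -1 when the conversion is not provided.

```c
#include <iconv.h>
#include <stddef.h>

/* Converts inlen bytes of `in` from `fromcode` to `tocode`, writing at most
 * outcap bytes to `out`. Returns the bytes written, or -1 on any failure. */
long convert_buffer(const char *tocode, const char *fromcode,
                    const char *in, size_t inlen,
                    char *out, size_t outcap)
{
    iconv_t cd = iconv_open(tocode, fromcode);
    /* iconv() takes char ** on most systems; some older headers declare
     * const char ** instead, in which case this cast needs adjusting. */
    char *inp = (char *)in;
    char *outp = out;
    size_t inleft = inlen, outleft = outcap;
    size_t rc;

    if (cd == (iconv_t)-1)
        return -1;              /* conversion not provided by this iconv */

    rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &outp, &outleft);  /* flush pending state */
    iconv_close(cd);

    return rc == (size_t)-1 ? -1 : (long)(outcap - outleft);
}
```

        A call such as convert_buffer("UTF-8", "VISCII", buf, len, out,
        sizeof out) would then convert VISCII text to UTF-8 where the
        module is installed.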

        2> 
        For utility: 
        iconv [-cs] -f frommap -t tomap [file]...
        "frommap" and "tomap" will support VISCII, TCVN(TCVN5712).
 

   4.6. Doc Impact:
        None.

   4.7. Admin/Config Impact:
        None.

   4.8. HA Impact:
        None.

   4.9. I18N/L10N Impact:
        No impact to current XI18N.

   4.10. Packaging & Delivery:
        locale enabling packages (new packages):
        SUNWlang-ms
        SUNWlang-id
        SUNWlang-vi

        iconv packages (updated packages):
        SUNWiconv-extra
        SUNWiconv-unicode

   4.11. Security Impact:
        None.

   4.12. Dependencies:       


5. Reference Documents:
   [1] Unicode CLDR Project: Common Locale Data Repository
      http://unicode.org/cldr
   
   [2] ISO/IEC 8859-1:1998:
      http://anubis.dkuug.dk/JTC1/SC2/WG3/docs/n411.pdf

   [3] Gnome l10n message language list:
      http://l10n.gnome.org/languages/

   [4] Vietnamese Unicode FAQ:
      http://vietunicode.sourceforge.net/

   [5] Vietnamese encoding conversion-tables: 
      http://www.haible.de/bruno/charsets/conversion-tables/Vietnamese.html

   [6] The TCVN 6909 standard:
      http://www.informatik.uni-leipzig.de/~duc/software/misc/tcvn6909.pdf

   [7] TCVN 5712:1993 standard:
      http://www.informatik.uni-leipzig.de/~duc/software/misc/tcvn.txt

   [8] RFC 1456: Conventions for Encoding the Vietnamese Language
      http://tools.ietf.org/html/rfc1456

   [9] Windows 1258 reference:
      http://www.microsoft.com/globaldev/reference/sbcs/1258.mspx
      
   [10] Firefox is available in over 45 languages:   
      http://www.mozilla.com/en-US/firefox/all.html

   [11] Available Thunderbird languages: 
      http://www.mozilla.com/en-US/thunderbird/all.html

   [12] Malaysian openoffice website:   
      http://ms.openoffice.org/
      
   [13] Vietnamese openoffice website:      
      http://vi.openoffice.org/

   [14] Language localization status of openoffice
      http://wiki.services.openoffice.org/wiki/Languages
      

6. Resources and Schedule:
   6.1. Projected Availability:
        October 2008

   6.3. Cost of Capital Resources:
        QA engineers need some knowledge of Malaysian, Indonesian, and
        Vietnamese.

   6.5. ARC review type:
        FastTrack

   6.6. ARC Exposure:
        open


7. Prototype Availability:
   7.1. Prototype Availability:
        None

   7.2. Prototype Cost:
        None

  1. Nov 02, 2008

    rishi_k9 says:

    Hi,

    When are these packages available?
    If they are available as of now, can you please point me to them.
    I am looking for packages for Solaris 10 (x86).

    Regards,
    Rishi
