Using Pootle for translation

Using Pootle for OS.o Messages Translation

One way to translate software messages for OpenSolaris is to use Pootle. Because OpenSolaris uses several file formats, the Translate Toolkit must be used to convert the different sources to the po file format that Pootle can work with, and to convert them back. However, several obstacles prevent Pootle from being used on OpenSolaris messages directly, without modifications. The sections below list the modifications and workarounds that we had to apply before the original Solaris po files could be used in Pootle (and in Solaris).

A detailed analysis of the current problems can be found in the section QA Report.

The following issues were identified in the G11n Live repository as of 2007-07-30 (hg clone ssh://anon@hg.opensolaris.org/hg/nv-g11n/messages, module messages).

Changes in Solaris po files

Pootle can process GNU po files, not the Solaris version

Pootle cannot correctly process po files in which entries are not separated by one or more empty lines (no empty line is needed before the first entry). To fix this, add one or more empty lines between the end of each msgstr string and the msgid of the next message (or the comment section that belongs to the next msgid).

Example 1:

msgid "AMD-8000-MU.action"
msgstr "Schedule a repair procedure to replace the affected CPU."

#
# code: AMD-8000-N7
# keys: fault.memory.page_ck
msgid "AMD-8000-N7.action"
msgstr "No repair action is recommended at this time."

Example 2:

msgid "AMD-8000-MU.action"
msgstr "Schedule a repair procedure to replace the affected CPU."
#
# code: AMD-8000-N7
# keys: fault.memory.page_ck
msgid "AMD-8000-N7.action"
msgstr "No repair action is recommended at this time."

Pootle processes Example 1 correctly, but Example 2 produces a garbled po file because the empty line between the msgstr and the next msgid (or its comment section) is missing. The poval_pootle.sh script can fix this automatically.
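As an illustration only, the separation rule can be sketched in a few lines of Python. This is not the poval_pootle.sh implementation; the function name separate_entries is made up for this example, and the sketch assumes that every line starting with "#" or "msgid" directly after a msgstr (or after one of its continuation lines) begins a new entry.

```python
def separate_entries(po_text: str) -> str:
    """Insert an empty line wherever a new entry directly follows a msgstr."""
    out = []
    in_msgstr = False  # True while the previous line belongs to a msgstr
    for line in po_text.splitlines():
        stripped = line.strip()
        starts_entry = stripped.startswith("#") or stripped.startswith("msgid")
        if in_msgstr and starts_entry:
            out.append("")  # the separator Pootle expects
        out.append(line)
        if stripped.startswith("msgstr"):
            in_msgstr = True
        elif not stripped.startswith('"'):
            # Anything except a quoted continuation line ends the msgstr.
            in_msgstr = False
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    example_2 = (
        'msgid "AMD-8000-MU.action"\n'
        'msgstr "Schedule a repair procedure to replace the affected CPU."\n'
        "#\n"
        "# code: AMD-8000-N7\n"
        "# keys: fault.memory.page_ck\n"
        'msgid "AMD-8000-N7.action"\n'
        'msgstr "No repair action is recommended at this time."\n'
    )
    print(separate_entries(example_2), end="")
```

Applied to Example 2 this yields the layout of Example 1. Files that are already separated are left unchanged, so the function can be run repeatedly.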

Some po files contain randomly generated msgid

Some msgid strings are generated automatically by a msgid tool. As a result, the translator loses the original string once the first version of the translation is done.

# code: AMD-8000-N7
# keys: fault.memory.page_ck
msgid "AMD-8000-N7.action"
msgstr "No repair action is recommended at this time."

The original msgstr ("No repair action is recommended at this time.") is lost after translation. The workaround is to add the English string to the comment section; Pootle then shows the original message to the translator as a comment.

# code: AMD-8000-N7
# keys: fault.memory.page_ck
#
#| msgstr "No repair action is recommended at this time."
#
msgid "AMD-8000-N7.action"
msgstr "No repair action is recommended at this time."

The poval_pootle.sh script can add such comments automatically.

Another solution is to use the poswap script from the Translate Toolkit to create a template file that is then used for translation. The drawback of this approach is that the original file is also needed to transform the msgids back to their proper values.
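The comment workaround can be sketched as follows. This is only a rough Python illustration, not the code of poval_pootle.sh. The function name add_original_as_comment is hypothetical, and the sketch assumes that entries are separated by empty lines and that each msgstr fits on a single line; multi-line and empty msgstr entries are left untouched.

```python
def add_original_as_comment(po_text: str) -> str:
    """Copy each single-line msgstr into a '#| msgstr' comment above its msgid."""
    result = []
    # Assumption: entries are separated by empty lines.
    for block in po_text.strip("\n").split("\n\n"):
        lines = block.split("\n")
        msgid_at = next(
            (i for i, l in enumerate(lines) if l.startswith("msgid ")), None
        )
        msgstr = [l for l in lines if l.startswith("msgstr ")]
        already_done = any(l.startswith("#| msgstr") for l in lines)
        if (
            msgid_at is not None
            and len(msgstr) == 1
            and msgstr[0] != 'msgstr ""'
            and not already_done
        ):
            # Insert the comment block directly above the msgid line.
            lines[msgid_at:msgid_at] = ["#", "#| " + msgstr[0], "#"]
        result.append("\n".join(lines))
    return "\n\n".join(result) + "\n"


if __name__ == "__main__":
    entry = (
        "# code: AMD-8000-N7\n"
        "# keys: fault.memory.page_ck\n"
        'msgid "AMD-8000-N7.action"\n'
        'msgstr "No repair action is recommended at this time."\n'
    )
    print(add_original_as_comment(entry), end="")
```

Entries that already carry a "#| msgstr" comment are skipped, so a second run does not duplicate the comment.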

Wrong structure of some Po files

Missing quotes at the end of a line in some msgstrs in SUNW_OST_OSCMD.po prevent msgfmt from transforming the file into an mo file. This is not related to Pootle, but to integration into OpenSolaris. Related Bug ID: 6586805 Missing quotes at the end of line in .po files (solved by poval_pootle.sh, which adds the double quotes).
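A minimal Python sketch of such a repair is shown below. It is an assumption about how the fix could be done, not the sed code used in poval_pootle.sh, and the name close_open_quotes is invented for this example. It appends a closing double quote to every msgid, msgstr or continuation line whose string is opened but not closed, and treats a final \" as an escaped quote rather than a closing one.

```python
import re

# A line that carries a po string: optional keyword, then an opening quote.
_STRING_LINE = re.compile(r'^\s*(?:msgid|msgstr)?\s*"')


def _is_closed(line: str) -> bool:
    """True if the string opened on this line ends with an unescaped quote."""
    body = line.rstrip()
    rest = body[body.index('"') + 1:]
    if not rest.endswith('"'):
        return False
    before = rest[:-1]
    trailing_backslashes = len(before) - len(before.rstrip("\\"))
    return trailing_backslashes % 2 == 0  # odd count means the quote is escaped


def close_open_quotes(po_text: str) -> str:
    fixed = []
    for line in po_text.splitlines():
        if _STRING_LINE.match(line) and not _is_closed(line):
            line += '"'
        fixed.append(line)
    return "\n".join(fixed) + "\n"


if __name__ == "__main__":
    broken = 'msgid "usage"\nmsgstr "first line\\n\n"second line"\n'
    print(close_open_quotes(broken), end="")
```

Comment lines and lines that are already closed pass through unchanged.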

Solaris po files cannot contain blank msgstr

If some strings remain untranslated, their msgstr should contain the original (English) text. Otherwise the Solaris gettext utility displays an empty string instead of the English message.

Example:

msgid "No repair action is recommended at this time."
msgstr ""

This causes an empty string to be displayed instead of the original English message. To avoid this, either remove such a msgid (not recommended) or copy the original string into msgstr as below:

msgid "No repair action is recommended at this time."
msgstr "No repair action is recommended at this time."

This is not related to Pootle, but to integration into OpenSolaris.
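Copying the English text into empty translations can be automated. The following Python sketch shows one possible way; the function name fill_empty_msgstr is hypothetical. It handles only single-line msgids, and it skips the header entry (msgid "") as well as multi-line entries.

```python
def fill_empty_msgstr(po_text: str) -> str:
    """Replace 'msgstr ""' with the text of the preceding single-line msgid."""
    lines = po_text.splitlines()
    out = []
    msgid_value = None  # quoted string of the most recent single-line msgid
    for i, line in enumerate(lines):
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        continued = nxt.lstrip().startswith('"')  # next line continues a string
        if line.startswith("msgid "):
            # Multi-line msgids are not handled in this sketch.
            msgid_value = None if continued else line[len("msgid "):].strip()
        is_empty = line.strip() == 'msgstr ""' and not continued
        if is_empty and msgid_value not in (None, '""'):
            line = "msgstr " + msgid_value
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = (
        'msgid "No repair action is recommended at this time."\n'
        'msgstr ""\n'
    )
    print(fill_empty_msgstr(sample), end="")
```

Running it on the example above produces the msgid and msgstr pair with the English text in both places.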

QA Report

The majority of problems with using the Pootle editing tool to translate OpenSolaris .po files stem from the incompatibility between Sun .po files and GNU gettext .po files. These differences cause numerous problems (see the lists of .po file incompatibilities and Pootle-specific problems below).

The incompatibility of Sun .po files and GNU gettext .po files

  • GNU gettext requires untranslated messages to be written as 'msgstr ""', but some Solaris files contain just 'msgstr', without the double quotes.
  • GNU gettext treats some escape sequences used in OpenSolaris as errors: \' ,\`, |
  • Solaris .po files do not declare an encoding; GNU gettext reports this as a possible error.
  • In many Solaris .po files 'msgid ""' is repeated several times. This is an error for GNU gettext tools.
  • Duplicate msgids are defined in the scope of each domain. That is, a msgid is considered a duplicate only if the identical msgid exists in the same domain.
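Some of the incompatibilities listed above can be detected mechanically. The Python sketch below is an illustration under simple assumptions and is not one of the scripts described on this page; the function name and the problem texts are invented. It reports bare msgstr lines, repeated 'msgid ""' entries and a missing charset declaration. It does not check escape sequences.

```python
def find_gnu_incompatibilities(po_text: str) -> list:
    """Return (line number, problem) pairs; line 0 means the whole file."""
    problems = []
    lines = po_text.splitlines()
    header_count = 0
    for no, line in enumerate(lines, start=1):
        stripped = line.strip()
        if stripped == "msgstr":
            problems.append((no, "msgstr without double quotes"))
        if stripped == 'msgid ""':
            # lines[no] is the line that follows the current one.
            nxt = lines[no].strip() if no < len(lines) else ""
            if not nxt.startswith('"'):  # not the start of a multi-line msgid
                header_count += 1
                if header_count > 1:
                    problems.append((no, 'repeated msgid ""'))
    if "charset=" not in po_text:
        problems.append((0, "no charset declared"))
    return problems


if __name__ == "__main__":
    sample = 'msgid ""\nmsgstr ""\n\nmsgid "a"\nmsgstr\n'
    for no, problem in find_gnu_incompatibilities(sample):
        print(no, problem)
```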

The Pootle specific problems and complications

  • Pootle cannot handle unmodified Solaris .po files. Pootle requires the empty line, which is otherwise optional, between translation entries (after the msgstr of one entry and before the msgid or comments of the next; not before the first entry).
  • Pootle cannot work with .po files in strict Uniforum format.
  • The majority of Solaris .po files use coded strings as msgids. After translation the original English text is therefore available only in the original .po files. This makes it hard to resolve fuzzy translations and to check the correctness of a translation.

Problem solutions

There are two basic approaches:

  • Make the OpenSolaris .po files fully compatible with GNU .po file format

or

  • Correct the structure of OpenSolaris files to make it usable with Pootle.

Validating structure of .po files to use with Pootle

The only problem that cannot be removed easily is the one with escape sequences. All the other problems can be removed without changing compatibility or meaning.

All the Pootle-specific problems were solved by our script. The script does not make the Solaris .po files compatible with GNU gettext tools, but it makes it possible to use the Pootle translation tool.

poval_pootle.sh can be downloaded here

This is not the final version of the poval_pootle.sh script. Read the comments in the script header for usage help. Take care when rewriting the sed code; it is sensitive to white space at the end of lines.

Checking the validity of .po files, for use with Pootle

This script checks for these pitfalls:

  • Missing empty line at the beginning.
  • Missing number sign (#) separator on the first line (for Uniforum files).
  • Invalid strings in the file (anything that is not a msgid, msgstr, comment or empty line).
  • Comments or empty lines between msgid and msgstr (see examples 1 and 2 below).
  • Missing or corrupted msgids, or additional spurious msgstr lines.
  • Corrupted msgstr entries.

popy_check.py can be downloaded here

Example 1:

This one causes Pootle to ignore the entry:

#
# code: AMD-8000-N7
# keys: fault.memory.page_ck
msgid "AMD-8000-N7.action"

msgstr "No repair action is recommended at this time."

Example 2:

#
# code: AMD-8000-N7
# keys: fault.memory.page_ck
msgid "AMD-8000-N7.action"
#
msgstr "No repair action is recommended at this time."
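The check for the two cases above can be sketched in Python as follows. This is not the code of popy_check.py; it is only an illustration with a made-up function name, and it covers just one item of the list: comments or empty lines between a msgid and its msgstr.

```python
def find_split_entries(po_text: str) -> list:
    """Return line numbers of comments or empty lines between msgid and msgstr."""
    bad = []
    after_msgid = False  # True between a msgid and the msgstr that follows it
    for no, line in enumerate(po_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("msgid"):
            after_msgid = True
        elif stripped.startswith("msgstr"):
            after_msgid = False
        elif after_msgid and (stripped == "" or stripped.startswith("#")):
            bad.append(no)
    return bad


if __name__ == "__main__":
    example_1 = (
        "#\n"
        "# code: AMD-8000-N7\n"
        "# keys: fault.memory.page_ck\n"
        'msgid "AMD-8000-N7.action"\n'
        "\n"
        'msgstr "No repair action is recommended at this time."\n'
    )
    print(find_split_entries(example_1))
```

For both examples the sketch reports line 5, the line that separates the msgid from its msgstr. Quoted continuation lines of a multi-line msgid are not reported.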

Make the OpenSolaris .po files fully compatible with GNU .po file format

The poval_gnu.sh script was implemented to validate the .po files for GNU gettext tools. It uses the GNU gettext tools themselves to make the files compatible with the GNU gettext .po format (most of the work is done by the automatic checks in GNU gettext). This solution is not final because of the unsolved problem with escaped characters.

poval_gnu.sh

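#!/bin/bash
# Simple workaround; presumes correct inputs and an experienced user.
# Usage: poval_gnu.sh <po-file | directory-with-po-files>

if [[ -d "$1" ]]; then
  po_dir="$1"
  # Collect all .po files below the directory (paths must not contain newlines).
  files=()
  while IFS= read -r f; do
    files+=("$f")
  done < <(find "$po_dir" -name "*.po")
else
  files=("$1")
fi

for filename in "${files[@]}"; do
  # Strip the .po suffix and append the new one.
  new_filename="${filename%.po}_val.po"
  tmp_filename="${filename%.po}_tmp.po"

  # Add the missing "" to bare msgstr lines only; lines that already
  # carry a quoted string are left alone.
  sed -e 's/^msgstr[[:space:]]*$/msgstr ""/' "$filename" > "$tmp_filename"
  gmsguniq --strict -E -u "$tmp_filename" > "$new_filename"
  # gmsguniq -u "$filename" > "$new_filename"
  # gmsguniq -d "$filename"

  # Remove the temporary file
  rm "$tmp_filename"
done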