PO4Moz HOWTO

(Or: how I learned to stop foaming at the mouth and love the XPI)

PO4Moz is a set of scripts that convert a recent Mozilla XPI language pack into one huge POT file, which you can then edit with your favourite PO file editor, process with gettext tools (like msgmerge), and so on. Afterwards, you can take the original langpack and the translated PO file and combine them to create a translated langpack.

You can also use an English XPI and a translated XPI together to extract a translated PO file. This is useful for updating a translation. Neat, huh?

The Process

Obtaining a PO File for Translating

Starting A New Translation

Imagine you want to translate any Mozilla product's en-US.xpi language pack into a fictional language called Anklebonian (ISO code ank-ANK).

First, you should download the en-US.xpi file from ftp.mozilla.org (for example, for Firefox 1.0.7, you should download ftp://ftp.mozilla.org/pub/mozilla.org/firefox/releases/1.0.7/linux-i686/xpi/en-US.xpi).

Next, use the unpack_langpack script to turn that compact XPI file into a mess of directories and files. You must provide, as arguments, the pathname of the en-US.xpi file and the name of the new directory that will hold the aforesaid mess:

$ ./unpack_langpack en-US.xpi dir-en-US

After some screenfuls of zip output, it should be finished.
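
An XPI file is an ordinary ZIP archive, so you can peek inside one before unpacking it. The following Python sketch is not part of PO4Moz, and the function name list_xpi_entries is made up for illustration. It only lists what the archive contains; it does not try to reproduce what unpack_langpack does.

```python
import zipfile


def list_xpi_entries(xpi_path):
    """Return the names of the entries stored in an XPI (ZIP) archive."""
    with zipfile.ZipFile(xpi_path) as archive:
        # namelist() returns the entry names in the order they were stored.
        return archive.namelist()


# Example use (assumes en-US.xpi is in the current directory):
#     for name in list_xpi_entries("en-US.xpi"):
#         print(name)
```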

Now, use the extract_pot script to turn the directory into a POT file. You should provide the name of the directory and redirect the script's output into the filename of your choosing:

$ ./extract_pot dir-en-US > firefox-1.0.7.pot

Now you can copy firefox-1.0.7.pot into firefox-1.0.7_ank-ANK.po and start translating.

Importing an Old Translation

The scenario is the same as in the previous section, but now there already is an ank-ANK.xpi file you want to turn into a valid PO file for translating.

First, download both the en-US.xpi file and the ank-ANK.xpi file. Then, unpack them:

$ ./unpack_langpack en-US.xpi dir-en-US
 [...]
$ ./unpack_langpack ank-ANK.xpi dir-ank-ANK

Then, use the extract_po script to generate the translated PO file. The first argument is the English directory's name, the second one is the translation's name. Redirect the output into the filename of your choosing:

$ ./extract_po dir-en-US dir-ank-ANK > firefox-1.0.7_ank-ANK.po

Now you can deal with the PO file directly.

Importing an Old Translation into a New Translation

You have a translation for, say, Firefox 1.0.7, and now Firefox 1.0.8 has come out and you want to provide a translation. One way to do this is to generate a POT file out of 1.0.8's langpack and translate it all again, but it is not the best way.

Probably the best way is to use gettext's msgmerge utility to merge the old translations into the new POT file.

First, obtain a translated PO file for the old version. Second, obtain a POT file for the new version. Third, use msgmerge. The first argument is the old PO translation, the second one the new POT file. Redirect the output into your new PO file:

$ msgmerge firefox-1.0.7_ank-ANK.po firefox-1.0.8.pot > firefox-1.0.8_ank-ANK.po

Now you can update firefox-1.0.8_ank-ANK.po.
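
If you do this for every release, you may want to script the merge. Here is a minimal Python sketch that wraps the msgmerge call shown above. The function names are invented for this example, and it uses msgmerge's -o option to name the output file instead of redirecting standard output.

```python
import subprocess


def msgmerge_command(old_po, new_pot, output_po):
    """Build the msgmerge command that merges old_po into new_pot."""
    # -o writes the merged PO file to output_po instead of standard output.
    return ["msgmerge", "-o", output_po, old_po, new_pot]


def merge_translation(old_po, new_pot, output_po):
    """Run msgmerge; raises CalledProcessError if msgmerge fails."""
    subprocess.run(msgmerge_command(old_po, new_pot, output_po), check=True)


# Example use (assumes gettext is installed and both input files exist):
#     merge_translation("firefox-1.0.7_ank-ANK.po",
#                       "firefox-1.0.8.pot",
#                       "firefox-1.0.8_ank-ANK.po")
```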

Translating the PO File

You have performed one of the actions described before, and now you have a PO file you're eager to edit. Or not, but you're paid for it. Or not, but someone's tricked you into doing it. Anyway, this is what a typical entry in a PO file (generated by these tools) looks like:

#. extracted from content/pref-proxies.xul
#. LOCALIZATION NOTE : FILE The Proxies preferences dialog
msgid ""
"locale/browser/pref/pref-connection.dtd:lHeader:\n"
"Connection Settings"
msgstr ""

Don't panic!

The lines starting with a hash sign (#) are comments. These comments appeared in the language pack and were inserted in the PO file so that you don't lose any information or any important notes regarding the translation.

The line starting with msgid, and the following ones that are enclosed in double quotes, define the string to be translated.

Generally.

In PO files generated by these scripts, the first line of the “string to be translated” tells us the file where this string was found and the tag it had in that file. In this example, the string was in the file locale/browser/pref/pref-connection.dtd and the tag was lHeader. This information is saved because, sometimes, two strings look the same but mean two different things and so must be translated differently, and without this prefix the PO file could not tell two entries with the same msgid apart.
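
To make the layout concrete, here is a small Python sketch that takes such a msgid apart. It is only an illustration: the function name split_msgid is made up, and it assumes the msgid has already been decoded from PO syntax (quoted lines joined, \n turned into a real newline) and that neither the file name nor the tag contains a colon.

```python
def split_msgid(msgid):
    """Split a PO4Moz-style msgid into (source file, tag, text)."""
    # The first line is "file:tag:", everything after it is the real text.
    header, _, text = msgid.partition("\n")
    if header.endswith(":"):
        header = header[:-1]
    # Assumption: the tag is whatever follows the last remaining colon.
    source_file, _, tag = header.rpartition(":")
    return source_file, tag, text


msgid = "locale/browser/pref/pref-connection.dtd:lHeader:\nConnection Settings"
print(split_msgid(msgid))
```

Run on the example entry, this gives back the file locale/browser/pref/pref-connection.dtd, the tag lHeader and the text “Connection Settings”.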

Then, yes, the next lines are the text to be translated. You should read it, translate it as best you can, and put the translation into the msgstr:

#. extracted from content/pref-proxies.xul
#. LOCALIZATION NOTE : FILE The Proxies preferences dialog
msgid ""
"locale/browser/pref/pref-connection.dtd:lHeader:\n"
"Connection Settings"
msgstr "Kunexn Sirns"

Isn't Anklebonian a neat language?

Be sure to use UTF-8 encoding.
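
If you are not sure what your editor saved, you can check the file yourself. The following Python sketch simply tries to decode the file as UTF-8; the function names are invented for this example and nothing here is part of PO4Moz.

```python
def is_valid_utf8(data):
    """Return True if the given bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def po_file_is_utf8(path):
    """Read a file in binary mode and check that it is valid UTF-8."""
    with open(path, "rb") as handle:
        return is_valid_utf8(handle.read())


# Example use:
#     po_file_is_utf8("firefox-1.0.7_ank-ANK.po")
```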

Some things that are not really proper strings appear here, too, like accelerator keys and command keys:

msgid ""
"locale/browser/browser.dtd:tabCmd.label:\n"
"New Tab"
msgstr ""

msgid ""
"locale/browser/browser.dtd:tabCmd.accesskey:\n"
"T"
msgstr ""

msgid ""
"locale/browser/browser.dtd:tabCmd.commandkey:\n"
"t"
msgstr ""

In this example you can see a label for a command (“New Tab”) and its associated access and command keys. Look closely at the strings' tags. The first one is tabCmd.label, the second is tabCmd.accesskey and the third one is tabCmd.commandkey. This way you can tell which one is which and which one goes with which one.

The accesskey is the character that appears underlined in a menu entry or in a dialog box item's label, and that you can press to select that item with the keyboard. The commandkey is a key you can press anywhere, along with the Ctrl key (in Windows/Unix) or the Command key (in Mac), to perform the action. So, the accesskey should be a letter that appears in the translated label (and that is not already used as an accesskey by another item in the same menu or dialog), but the commandkey should stay the same as in English (unless your country's keyboards don't have that key).

msgid ""
"locale/browser/browser.dtd:tabCmd.label:\n"
"New Tab"
msgstr "Profumo Kriexon Av Nutab"

msgid ""
"locale/browser/browser.dtd:tabCmd.accesskey:\n"
"T"
msgstr "N"

msgid ""
"locale/browser/browser.dtd:tabCmd.commandkey:\n"
"t"
msgstr "t"

With this translation, when the user right-clicks on the tab bar, the “Profumo Kriexon Av Nutab” menu entry has the N underlined, but the user can still create new tabs directly by pressing Ctrl-T.
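
These rules are easy to check mechanically. Here is a rough Python sketch with two hypothetical helpers: one checks that an accesskey occurs in its label, ignoring case, and the other reports accesskeys used more than once. The second helper assumes that the keys you pass in all belong to items of the same menu or dialog.

```python
from collections import Counter


def accesskey_in_label(label, accesskey):
    """True if accesskey is one character that occurs in label (any case)."""
    return len(accesskey) == 1 and accesskey.lower() in label.lower()


def duplicated_accesskeys(accesskeys):
    """Return the accesskeys (lowercased, sorted) that appear more than once."""
    counts = Counter(key.lower() for key in accesskeys)
    return sorted(key for key, count in counts.items() if count > 1)


print(accesskey_in_label("Profumo Kriexon Av Nutab", "N"))
print(duplicated_accesskeys(["N", "t", "n"]))
```

The first call accepts N for the Anklebonian label above; the second one reports that n is used twice.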

If you have imported old translations, you may run across this one:

#, fuzzy
msgid ""
"locale/browser/browser.dtd:openLocationCmd.label:\n"
"Open Location..."
msgstr "Apro Filo..."

The #, fuzzy comment indicates that, when msgmerge ran, it did not find the string in the old translation, but it found a similar one, so it put that string's translation here and marked it “fuzzy”. This happens when a string has changed, when a new string is similar to an old one, or when a string has moved to a different file or been given a different tag. Now you should check whether the proposed translation fits the original string, fix it if necessary, and remove the #, fuzzy comment:

msgid ""
"locale/browser/browser.dtd:openLocationCmd.label:\n"
"Open Location..."
msgstr "Apro Lukaxen..."
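
In a huge PO file it helps to know how many fuzzy entries are still waiting for you. The Python sketch below pulls them out of the text of a PO file. It is a simplification, not a real PO parser: the function name fuzzy_entries is made up, and it assumes that entries are separated by blank lines, as in the examples in this document.

```python
import re


def fuzzy_entries(po_text):
    """Return the PO entries (as text blocks) that carry the fuzzy flag."""
    found = []
    # Assumption: entries are separated by at least one blank line.
    for block in re.split(r"\n\s*\n", po_text.strip()):
        for line in block.splitlines():
            # Flag comments start with "#," and hold a comma-separated list.
            if line.startswith("#,"):
                flags = [flag.strip() for flag in line[2:].split(",")]
                if "fuzzy" in flags:
                    found.append(block)
                    break
    return found


# Example use:
#     with open("firefox-1.0.8_ank-ANK.po", encoding="utf-8") as handle:
#         print(len(fuzzy_entries(handle.read())), "fuzzy entries left")
```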

When everything's translated, it will be time to apply your translation to create a new langpack.

Creating a Translated Language Pack

Obtaining a PO file for translating involved unpacking an English XPI file. Now you will use this unpacked English langpack to generate an unpacked translated langpack, and from this, a translated XPI language pack.

To apply your newly translated PO file to the unpacked langpack you must use the script apply_po, giving it as its first argument the directory where you unpacked the original XPI, as the second argument the PO file, and as the third argument the directory where you want to create the unpacked translated langpack:

$ ./apply_po dir-en-US firefox-1.0.7_ank-ANK.po dir-ank-ANK

Now you can pack the dir-ank-ANK directory to create your installable Anklebonian XPI language pack, via the pack_langpack script, whose first argument is the unpacked directory and whose second argument is the file that will be created:

$ ./pack_langpack dir-ank-ANK ank-ANK.xpi
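
Since you will repeat these two steps every time you update the translation, you may want to chain them. This Python sketch runs apply_po and pack_langpack one after the other, with the argument order described above. The function names are invented for this example, and it assumes both scripts live in the current directory.

```python
import subprocess


def langpack_commands(english_dir, po_file, translated_dir, xpi_file):
    """Build the two commands that turn a PO file into a translated XPI."""
    return [
        # apply_po: unpacked English langpack, PO file, output directory.
        ["./apply_po", english_dir, po_file, translated_dir],
        # pack_langpack: unpacked translated langpack, XPI file to create.
        ["./pack_langpack", translated_dir, xpi_file],
    ]


def build_langpack(english_dir, po_file, translated_dir, xpi_file):
    """Run both steps in order, stopping at the first one that fails."""
    for command in langpack_commands(english_dir, po_file, translated_dir, xpi_file):
        subprocess.run(command, check=True)


# Example use:
#     build_langpack("dir-en-US", "firefox-1.0.7_ank-ANK.po",
#                    "dir-ank-ANK", "ank-ANK.xpi")
```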

Distributing the Translation

Use your imagination :-)

Conclusion

If you are already used to gettext and to tools that work with the PO format, and you want to translate Mozilla products, you may be interested in PO4Moz. It makes your work easy if you only want to translate, but it doesn't cost you any flexibility: the full, unpacked contents of the langpacks are exposed to you, so you can modify them in any way you want.

Finally, a word of caution: the PO files generated by these scripts are huge, so you'll need a fairly fast computer for most operations. For example, it takes almost 5 minutes to apply a fully translated PO file to an unpacked Firefox 1.0.7 langpack on a 450-MHz Pentium III machine, and msgmerge operations may take even longer. Fortunately, we only have to do them once for every new version of the software :-)

Credits & License

This document was written by Jacobo Tarrío Barreiro <jaco bo at tarrio dot or g>.

This document is an "associated documentation file" for PO4Moz, so it is subject to its license, whose terms are:

Copyright (c) 2006 Jacobo Tarrío Barreiro

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.