Monday, October 01, 2007

auto l10n build tool: considerations

Axel's email covered a lot of ground; at the bottom of this post you can read the parts I have highlighted. I'm going to send an email out and maybe file a bug to track this project.

Chris Hofmann had been working on an en-IN (India) localization derived from en-GB, and there are some bugs you can read about.

Our 0.1 milestone might be to create the X-dude langpack. First I have to find out what the l10n tools and/or scripts can do for me. I also have to learn Python.

-------------------------
Chris Hofmann's en-IN langpack and the X-dude langpack:
http://l10n.mozilla.org/~chofmann/l10n/tree/build/dist/install/
Chris's blog

The bug talking about en-IN:
https://bugzilla.mozilla.org/show_bug.cgi?id=392945
and another one about en-CA, which mentions the need for an automated l10n build tool:
https://bugzilla.mozilla.org/show_bug.cgi?id=345039#c16

----------------------------
Some of the things I want to highlight from Axel's email:
I envision this code to be python code. We have some existing python code that can extract the localization strings from a working copy,
we're using that to compare localizations, and other tests.

You can find the supporting code on
http://lxr.mozilla.org/mozilla/source/testing/tests/l10n, together with other tools. The scripts are in 'scripts', the modules in lib.

I think that just overwrites are likely not good enough, at least not on
a file level. One way to start would be to just specify the changed entities and use dynamis' l10n-merge work to fill in the rest, if that
could pick strings from other localizations than just en-US. More
flexibility would be a plus, though.
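
To make the "specify only the changed entities" idea concrete for myself, here is a minimal sketch in Python. It is not dynamis' l10n-merge and not the code in testing/tests/l10n; the function names and the sample strings are made up for illustration, and the parser only handles plain key=value lines.

```python
def parse_properties(text):
    """Parse simple key=value lines; comments and blank lines are skipped.

    Hypothetical helper: real .properties files also allow escapes and
    line continuations, which this sketch ignores.
    """
    entries = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(("#", "!")):
            continue
        key, sep, value = stripped.partition("=")
        if sep:
            entries[key.strip()] = value.strip()
    return entries


def overlay(base, overrides):
    """Return the base entities with the changed entities laid on top."""
    merged = dict(base)
    merged.update(overrides)
    return merged


# Invented sample data: an en-GB base and the few entities en-IN changes.
en_gb = parse_properties("colour.label = Colour\nrubbish.label = Rubbish bin\n")
en_in_changes = {"rubbish.label": "Dustbin"}
en_in = overlay(en_gb, en_in_changes)
```

The point of the design is that the derived locale only has to maintain the small dictionary of changes; everything else is picked up from whichever base localization is passed in, not just en-US.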

Try to automate localizations like en-IN or en-CA from en-GB

Once you have installed the modules via setup.py, you should be able to just run it, say:
$> python l10n-diff mozilla/browser/locales/en-US l10n/en-GB/browser
It will spit something out that looks like a unified diff, but really isn't.
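
I have not read the l10n-diff source yet, so the following is only my guess at the kind of comparison involved, not what the script actually does. It assumes both localizations have already been loaded into plain dictionaries of entity name to string; `compare_entities` and the sample entries are hypothetical.

```python
def compare_entities(reference, localized):
    """Classify entities as missing, obsolete, or changed.

    reference and localized are dicts mapping entity names to strings.
    """
    ref_keys = set(reference)
    loc_keys = set(localized)
    return {
        # In the reference but not yet in the localization.
        "missing": sorted(ref_keys - loc_keys),
        # Left over in the localization but gone from the reference.
        "obsolete": sorted(loc_keys - ref_keys),
        # Present in both, with different strings.
        "changed": sorted(
            key for key in ref_keys & loc_keys
            if reference[key] != localized[key]
        ),
    }


# Invented sample entities, for illustration only.
report = compare_entities(
    {"color.label": "Color", "open.label": "Open", "new.label": "New"},
    {"color.label": "Colour", "open.label": "Open", "old.label": "Old"},
)
```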

I'd say that the first good step would be to actually try to reproduce en-IN from en-GB;
the next would be to create en-GB from en-US, up to points like accesskey upper-lower-case.

For en-CA and fr-CA, I guess that the language differences are none, so
you'd really only have to do the post-processing that Chris Hofmann did
for en-IN. intl.properties seems to be a candidate, region.properties,
too, the two defines.inc and bookmarks.html (the links need to have the
locale name twice).
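
The post-processing described above sounds like it is mostly swapping one locale code for another in a handful of files. Here is a rough sketch of that step, assuming a plain text substitution is enough; I have not checked what Chris's changes actually look like, and the function name, the sample key, and the example.invalid URL are made up.

```python
import re


def retarget_locale(text, old_locale, new_locale):
    """Replace whole occurrences of old_locale; return (new_text, count).

    The lookarounds keep the match from firing inside a longer token
    such as 'en-GB-x' or 'den-GB'.
    """
    pattern = re.compile(
        r"(?<![A-Za-z0-9-])" + re.escape(old_locale) + r"(?![A-Za-z0-9-])"
    )
    return pattern.subn(lambda match: new_locale, text)


# Invented sample: one property line plus a link carrying the locale twice.
sample = (
    "sample.locale=en-GB\n"
    '<A HREF="http://example.invalid/en-GB/en-GB/">Help</A>\n'
)
result, count = retarget_locale(sample, "en-GB", "en-IN")
```

Returning the replacement count makes it easy to check that a link really had its locale name replaced in both places.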