
Hello all, I'd like to bring up localization. Right now, translation is done on Transifex and then just put into individual translated pages. The real downside with this is that it's not obvious to users who come to a given page how to get to that page in their language (if it even exists). What would improve this process is using a passive localization script that detects the user's browser language (like l10n.js [1]) and then swaps out chunks of content based on that. You can designate a fallback language. Considering translation is spotty for several key languages, I think that this approach is better than the current model. And because it's pulling in content from translation files, there's less HTML to maintain. Thoughts? ~Griffin [1] https://github.com/eligrey/l10n.js/
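A minimal sketch of the selection logic described above: take the languages the browser reports, pick the first one that has a translation, and otherwise use the designated fallback. This is not the l10n.js API; the function name `pick_language` and its parameters are made up for illustration.

```python
def pick_language(preferred, available, fallback="en"):
    """Return the first preferred language that has a translation.

    preferred: language tags in the user's order of preference,
               e.g. ["pt-BR", "en"] (hypothetical browser-reported list).
    available: set of language tags that actually have translations.
    fallback:  language to use when nothing matches.
    """
    lowered = {tag.lower(): tag for tag in available}
    for tag in preferred:
        tag = tag.lower()
        # Try the exact tag first, then the bare language ("pt-br" -> "pt").
        for candidate in (tag, tag.split("-")[0]):
            if candidate in lowered:
                return lowered[candidate]
    return fallback
```

With spotty translations, the fallback means a visitor always gets a complete page, just not always in their first-choice language.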

Hi, On Fri, Jan 10, 2014 at 6:38 PM, Griffin Boyce <griffin@cryptolab.net> wrote:
I'd like to bring up localization. Right now, translation is done on Transifex and then just put into individual translated pages. The real downside with this is that it's not obvious to users who come to a given page how to get to that page in their language (if it even exists).
IIRC the Tor project is using gettext and Transifex for its programs. There are no translations of the homepage at the moment. Back when they existed they were not managed using gettext. That said, I also like the idea of using gettext for translations.
What would improve this process is using a passive localization script that detects the user's browser language (like l10n.js) and then swaps out chunks of content based on that. You can designate a fallback language. Considering translation is spotty for several key languages, I think that this approach is better than the current model.
You don't even need the magic in the browser. As an example of what could be done, look at the Tails project's homepage, tails.boum.org. This is static HTML compiled from markdown using ikiwiki (basically a Perl-based static generator on top of a VCS). The translations are managed using gettext, that is, chunks of the markdown master are translated in po-files. The one downside is that translators need to be able to edit po-files. Unfortunately I don't think that Transifex helps much here: When you are translating a complete web page, you need a lot more context than when translating isolated strings in a program. But such an interface for translators could maybe be designed?
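To make the "chunks of the markdown master" idea concrete, here is a rough sketch of build-time translation. It does not reproduce what ikiwiki or the Tails build actually does; it assumes chunks are blank-line-separated paragraphs and that the catalog is a plain dict from English chunk (msgid) to translation (msgstr). Both function names are hypothetical.

```python
def split_chunks(markdown_text):
    """Split a Markdown page into paragraph-sized chunks (blank-line separated)."""
    return [block.strip() for block in markdown_text.split("\n\n") if block.strip()]


def translate_page(markdown_text, catalog):
    """Rebuild the page, swapping in translated chunks where they exist.

    catalog maps an English chunk (the msgid) to its translation (the msgstr).
    Chunks with no translation, or an empty one, stay in English.
    """
    translated = [catalog.get(chunk) or chunk for chunk in split_chunks(markdown_text)]
    return "\n\n".join(translated)
```

Because the page is assembled before it is served, a partially translated page still renders in full, with the untranslated paragraphs left in English.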

hello, On 10.01.2014 20:01, Frithjof wrote:
[....] The one downside is that translators need to be able to edit po-files. Unfortunately I don't think that Transifex helps much here: When you are translating a complete web page, you need a lot more context than when translating isolated strings in a program.
disclaimer: i know nothing about transifex, so i can't make useful comparisons. i'm also not a localization expert or a pootle developer. take everything i say with a grain of salt. that being said, may i bring up pootle? it seems that torproject.org once used pootle [http://pootle.translatehouse.org], according to https://trac.torproject.org/projects/tor/ticket/6851, but moved on to transifex. pootle understands PO files (and a bunch of other formats) and the web interface is decent enough. the translation entries are listed in the same order as in the input file, so the translator gets context from the previous and next entries. take a look at this screenshot: https://imgur.com/pZLN5qe (with some entries from the tor FAQ), which seems more helpful regarding context than: https://ds0k0en9abmn1.cloudfront.net/static/images/com/carousel-editor.77070...
some potential improvements to pootle (which also has a focus on software translations):
* sneak links in there, so the translator can jump to the corresponding part of the website to get more context
* highlight outdated paragraphs (instead of filtering as is currently possible, so we maintain the context of the last and following paragraph)
* show rendered markdown (or $format) and show referenced images/screenshots
* make the edit-box a WYSIWYG markdown/$format editor
* ...
But such an interface for translators could maybe be designed?
what are the requirements of the translators? maybe some translators are around here and can tell us about their workflow. once we have the requirements we can start looking into our options (transifex, (vanilla or customized) pootle, custom tool or something different altogether) and how well they fit those requirements. anyway, i'm willing to put some time into making translations "work" (be it integration scripts, hacking on pootle, help with a custom tool) cheers, armin

On Sat, Jan 11, 2014 at 6:42 PM, Armin <armin.baj@gmx.de> wrote:
On 10.01.2014 20:01, Frithjof wrote:
[....] The one downside is that translators need to be able to edit po-files. Unfortunately I don't think that Transifex helps much here: When you are translating a complete web page, you need a lot more context than when translating isolated strings in a program.
disclaimer: i know nothing about transifex, so i can't make useful comparisons. i'm also not a localization expert or a pootle developer. take everything i say with a grain of salt.
This looks more like what I was thinking of.
But such an interface for translators could maybe be designed?
what are the requirements of the translators? maybe some translators are around here and can tell us about their workflow. once we have the requirements we can start looking into our options (transifex, (vanilla or customized) pootle, custom tool or something different altogether) and how well they fit those requirements.
I am actually way out of my knowledge domain here. I started doing translations recently and have been thinking about requirements since. I also tried to look at how other projects are handling translations. The problem is certainly not new, so if somebody knows about good solutions, please tell. There have to be some out there. The main question I see is: How can one make the translations survive? There already were translations once, but at some point many weren't maintained. This leads to some requirements that I think are important:
* In an open-source project it is not clear that someone is caring for all the translations all the time. This might be possible in a commercial setting, but here it might happen that a translation is not maintained for some months, say, until a new maintainer shows up. (Maybe there even isn't a proper maintainer or team and all we can hope for is different people fixing and translating parts randomly.) Thus it is vital to be able to tell which translations are correct or up to date. Wrong documentation is worse than none (or only English documentation). Using something like gettext solves this nicely. Outdated translations are marked fuzzy and are not used until they are corrected or confirmed. This way a translation can deteriorate in extent but not in correctness.
* The translation needs to be sustainable in the sense that translators will leave and the entry barrier for new translators should be low. Even if a whole team closes down, it should be easy for others to inherit. Transifex helps here by making it really easy to get started and for example by having project-specific glossaries, so the knowledge and decisions of past translators are still available (or at least could be). Learning specialized tools and having commit access to some git repository instead already is quite a barrier.
anyway, i'm willing to put some time into making translations "work" (be it integration scripts, hacking on pootle, help with a custom tool)
I am also interested in making this work, but I am far from being a web developer.
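The "deteriorate in extent but not in correctness" property can be sketched as follows. This only imitates the idea behind gettext's fuzzy marking and is not gettext itself: `merge_catalog` and `render` are hypothetical names, the catalog is a plain dict, and the similarity test uses Python's `difflib` rather than gettext's own matching.

```python
import difflib


def merge_catalog(old_catalog, source_chunks, cutoff=0.6):
    """Carry translations over to a changed source, flagging doubtful ones.

    old_catalog maps old English chunks to their translations.
    Returns a list of (chunk, translation, fuzzy) tuples.
    """
    merged = []
    for chunk in source_chunks:
        if chunk in old_catalog:
            # Source unchanged: the translation is still trusted.
            merged.append((chunk, old_catalog[chunk], False))
            continue
        close = difflib.get_close_matches(chunk, list(old_catalog), n=1, cutoff=cutoff)
        if close:
            # Source edited: keep the old translation as a hint, marked fuzzy.
            merged.append((chunk, old_catalog[close[0]], True))
        else:
            # New text: nothing to reuse yet.
            merged.append((chunk, "", False))
    return merged


def render(merged):
    """Use a translation only if it exists and is not fuzzy; else show English."""
    return [translation if translation and not fuzzy else chunk
            for chunk, translation, fuzzy in merged]
```

An edited English paragraph therefore falls back to English until someone confirms or fixes the old translation, so an unmaintained language loses coverage over time but never shows text that no longer matches the source.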

On 11.01.2014 23:40, Frithjof wrote:
On Sat, Jan 11, 2014 at 6:42 PM, Armin <armin.baj@gmx.de> wrote: [...]
But such an interface for translators could maybe be designed?
what are the requirements of the translators? maybe some translators are around here and can tell us about their workflow. once we have the requirements we can start looking into our options (transifex, (vanilla or customized) pootle, custom tool or something different altogether) and how well they fit those requirements.
I am actually way out of my knowledge domain here. I started doing translations recently and have been thinking about requirements since. I also tried to look around how other projects are handling translations. The problem is certainly not new, so if somebody knows about good solutions, please tell. There have to be some out there.
The main question I see is: How can one make the translations survive.
There already were translations once, but at some point many weren't maintained. This leads to some requirements that I think are important:
* In an open-source project it is not clear that someone is caring for all the translations all the time. This might be possible in a commercial setting, but here it might happen that a translation is not maintained for some months, say, until a new maintainer shows up. (Maybe there even isn't a proper maintainer or team and all we can hope for is different people fixing and translating parts randomly.)
Thus it is vital to be able to tell which translations are correct or up to date. Wrong documentation is worse than none (or only English documentation). Using something like gettext solves this nicely. Outdated translations are marked fuzzy and are not used until they are corrected or confirmed. This way a translation can deteriorate in extent but not in correctness.
* The translation needs to be sustainable in the sense that translators will leave and the entry barrier for new translators should be low. Even if a whole team closes down, it should be easy for others to inherit.
Transifex helps here by making it really easy to get started and for example by having project-specific glossaries, so the knowledge and decisions of past translators are still available (or at least could be). Learning specialized tools and having commit access to some git repository instead already is quite a barrier.
i fully agree, the critical problem is to keep the translations up to date. a low barrier of entry is certainly something to shoot for. there are certain (already discussed) things we can do from the software side of things, like showing outdated content (in context) and sending notifications about new/changed content. a glossary helps of course, as well as translation memory (both are available in pootle and mediawiki:translate, i think). but apart from those: not sure there is some magic sauce that will help to keep translations current.
now after some research (as in search engine usage) i want to bring up mediawiki:translate [https://www.mediawiki.org/wiki/Extension:Translate]. Last time i looked at mediawiki itself i wasn't too impressed, but the translation extension seems rather nice (without having actually played with it). they also have a workflow/ui spec at: https://upload.wikimedia.org/wikipedia/commons/4/4a/Translate-workflow-spec.... including a 'page' translation mode, which might be what is called for here: https://commons.wikimedia.org/w/index.php?title=File:Translate-workflow-spec... https://commons.wikimedia.org/w/index.php?title=File:Translate-workflow-spec...
mediawiki:translate is used at https://translatewiki.net to translate a bunch of stuff. integration between the site generator and mediawiki should also be possible (converting between mediawiki syntax and e.g. markdown, if needed). as tor already seems to be okay with using an external service for translations, maybe translatewiki.net is the answer (existing community, page translation mode).
we should maybe start a simple matrix/table, so we can start to evaluate the different tools, narrow things down a bit and identify their pain points. i see the following categories:
* ease of getting started
* glossary support
* support for translating paragraphs in context (and not just short strings)
* "advanced" CAT features (translation memory, online machine translation lookup, etc.)
* "up-to-dateness" support (display of outdated content (in context), notifications)
* review process support
* existing community
* user management/admin
* open source/able to customize, if needed
* integration effort (between site-generation and translation tool)
* ... and probably a bunch of other stuff i forgot
those categories need some fleshing out, of course. and on the tool axis:
* transifex
* pootle
* mediawiki:translate/translatewiki.net
* some other online CAT tool? (poeditor.com?)
then we can grade those in the defined categories (using grades from 1 to 10? or maybe simply bad(-), mediocre(o) and good(+)). i'm willing to set up a pootle and mediawiki:translate instance for evaluation purposes, if there's interest. each category should be graded by the same person, to keep grades consistent.
anyway, i'm willing to put some time into making translations "work" (be it integration scripts, hacking on pootle, help with a custom tool)
I am also interested in making this work, but I am far from being a web developer.
well, that's where i could help :-)
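One possible shape for the evaluation matrix proposed above, using the bad(-), mediocre(o), good(+) scale. This is only a sketch: the category list is shortened, no real grades are filled in, and the names `format_matrix` and `totals` are invented for illustration.

```python
CATEGORIES = ["ease of getting started", "glossary support", "paragraphs in context"]
TOOLS = ["transifex", "pootle", "mediawiki:translate"]
SCORES = {"-": -1, "o": 0, "+": 1}


def format_matrix(grades, categories=CATEGORIES, tools=TOOLS):
    """Render grades[(category, tool)] as a plain-text table; '?' means ungraded."""
    width = max(len(category) for category in categories)
    lines = [" " * width + "  " + "  ".join(tools)]
    for category in categories:
        cells = [grades.get((category, tool), "?").center(len(tool)) for tool in tools]
        lines.append(category.ljust(width) + "  " + "  ".join(cells))
    return "\n".join(lines)


def totals(grades, tools=TOOLS):
    """Sum the numeric value of every grade a tool has received so far."""
    return {tool: sum(SCORES[grade] for (_, graded_tool), grade in grades.items()
                      if graded_tool == tool)
            for tool in tools}
```

Keying the grades by (category, tool) makes it easy to hand one whole category to a single grader, which is the consistency rule suggested above.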

Frithjof:
Unfortunately I don't think that Transifex helps much here: When you are translating a complete web page, you need a lot more context than when translating isolated strings in a program.
My opinion on this: The current community of people working on translating Tor software is on Transifex. The tool is already working for them. I suggest we try to push website translations there and see what they think. If the translators are unhappy with using Transifex to work on web pages, then we can try different things to see what they would like. But let's focus on people here; tools should come after.
The other thing is that I think we should have a staging environment where translators can review their translation on a copy of the website right after pushing a change. This would let them verify their translation in the best context possible. Setting up a staging environment is tracked in <https://bugs.torproject.org/10597>.
Also, it's probably worthwhile to mention that translations *will* have to be manually reviewed before being merged into the official website to prevent attacks based on malicious code.
-- Lunar <lunar@torproject.org>
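As an aid to the manual review mentioned above (not a replacement for it), a reviewer could first run a rough check that flags translations introducing active content or links the English source does not contain. This is a heuristic sketch only; the patterns and the name `review_flags` are assumptions, and it will miss attacks and raise false alarms.

```python
import re

# URLs with or without an explicit scheme, up to the next quote, bracket or space.
URL_RE = re.compile(r"""(?:https?:)?//[^\s"'<>)]+""", re.IGNORECASE)
# Tags and attributes that can run code in a browser.
SUSPICIOUS_RE = re.compile(
    r"<\s*(script|iframe|object|embed)\b|javascript:|\bon\w+\s*=", re.IGNORECASE
)


def review_flags(source, translation):
    """Return human-readable reasons a reviewer should look closely at a translation."""
    flags = []
    if SUSPICIOUS_RE.search(translation) and not SUSPICIOUS_RE.search(source):
        flags.append("introduces active content (script, event handler, ...)")
    new_urls = set(URL_RE.findall(translation)) - set(URL_RE.findall(source))
    if new_urls:
        flags.append("adds URLs not in the source: " + ", ".join(sorted(new_urls)))
    return flags
```

An empty list does not mean a translation is safe; it only means this particular check found nothing, so a human still has to read the change before it is merged.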

On 13.01.2014 02:36, Lunar wrote:
Frithjof:
Unfortunately I don't think that Transifex helps much here: When you are translating a complete web page, you need a lot more context than when translating isolated strings in a program.
My opinion on this:
The current community of people working on translating Tor software is on Transifex. The tool is already working for them. I suggest we try to push website translations there and see what they think. If the translators are unhappy with using Transifex to work on web pages, then we can try different things to see what they would like. But let's focus on people here; tools should come after.
you're probably right. go the familiar route and only change if the pain points are real (and not only imaginary). if transifex is open to adding (small) features for website translations, that of course would be ideal. the main problem is the lost context when you're filtering for untranslated strings. e.g. you select a paragraph/string but have no (easy) way to get to the next or previous paragraphs (assuming they already have been translated). that IMHO is a *real* pain in the ass. but then you probably could just keep a browser with the real website open and get the context from there. will the transifex integration be handled by the software guys? (as in: the same people who did the integration to translate tor itself) or is there need for any help?
The other thing is that I think we should have a staging environment where translators can review their translation on a copy of the website right after pushing a change.
maybe a point to raise with the back-end/cms guys: the build-times for the site-generator shouldn't get out of hand.
This would make them able to verify their translation in the best context possible.
Setting up a staging environment is tracked in <https://bugs.torproject.org/10597>.
Also, it's probably worthwhile to mention that translations *will* have to be manually reviewed before being merged in the official website to prevent attacks based on malicious code.

Armin:
will the transifex integration be handled by the software guys? (as in: the same people who did the integration to translate tor itself) or is there need for any help?
I am not sure I understand this question. We will need to work out the scripts to push and pull original material and translations from Transifex as part of the work on the website. (Also, please don't assume the gender of the people working around here.) -- Lunar <lunar@torproject.org>
participants (4)
- Armin
- Frithjof
- Griffin Boyce
- Lunar