Our (website) translation infrastructure has a pretty high barrier for new translators, especially those who are not familiar with Git and/or the command line. Furthermore, the current process makes it hard to add new languages, as often a team cannot be built easily over a long period of time and a web interface could nevertheless help keep translations until a new person arrives.

Corresponding ticket: #10034

Choosing a translation web platform

MUST

  • provide a usable easy web interface
  • be usable from Tor Browser
  • automatic pull from main Git repo
  • provide a common glossary for each language, easy to use and improve
  • allow translators to view, in the correct order, all strings that come from the entire page being translated, both in English and in the target language
  • make it easy to view the translations in context i.e. when translating an entire page, all strings to be translated should only come from this page. translators should be able to view the page in context.
  • provide user roles (admin, reviewer, translator)

SHOULD

  • be "privacy sensitive", i.e. be operated by a non-profit
  • allow translators to push translations through Git (so that core developers only have to fetch reviewed translations from there)
  • provide support for Git standard development branches (devel, stable, and testing) but we could also agree upon translation only master through this interface
  • provide checks for inconsistent translations
  • provide feature to write/read comments between translators

MAY

  • allow translating topic branches without confusing translators, causing duplicate/premature work, fragmentation or merge conflicts -- e.g. present only new or updated strings in topic branches; see https://mailman.boum.org/pipermail/tails-l10n/2015-March/002102.html for details
  • provide a feature to easily see what is new, what needs updating, what are translation priorities
  • provide possibility to set up new languages easily
  • send email notifications
    • to reviewers whenever new strings have been translated or updated
    • to translators whenever a resource is updated
  • respect authorship (different committers?)
  • provide statistics about the percentage of translated and fuzzy strings
  • Letting translators report about problems in original strings, e.g. with a "Report a problem in the original English text" link, that e.g. results in an email being sent to -l10n@ or -support-private@. If we don't have that, then translate MUST document how to report issues in the original English text.

Setup of our translation platform & integration with our infrastructure

Documentation

We have two documentations: - this one, which presents the current status of our work and all public information - translate-server.git which contains more technical information, cronjobs, configs, inner workings of the setup.

Translation web interface

We are testing a Weblate instance to see if it fits our requirements. Read #11759 for more information.

There are several languages enabled and users can suggest translations in several languages:

Translation status

What we plan to do is:

Schematics of the different Git repos, ikiwiki instances, and their relationships.

Schematics of the infrastructure such as Git hooks.

Repository

Currently the repository used by Weblate is built upon the Tails master repo, but the changes generated in translate.lizard are not fed back onto Tails master automatically.

There are several languages enabled, some of them with few or not translations.

You can check out weblate-generated Tails repo with:

git clone https://translate.tails.boum.org/git/tails/index/

This Tails repository has two main differences with other repos:

  • The ikiwiki.setup file has been changed to build more languages. → Ideally, we would create ikiwiki-translationplatform.setup
  • There is a .gitlab.yml file to trigger tests when pushed to gitlab CI enabled repositories. → We can either add this file to tails/master or document this only for people who use gitlab.
  • There are new language files from currently non-active languages on the main website.

Reviewing translation platform output

For languages like fr, fa, de, and it - that are part of tails master repo - you can get the files to review and submit to tails-l10n:

git remote add translations https://translate.tails.boum.org/git/tails/index/
git checkout tails/master
find . -name '*.fa.po' -exec git checkout translations/master --  {} \;
git reset *

And you will have all the changes to farsi (*.fa.po) to review. The same goes for the other languages.

Staging website

From this (or a modified clone of this) repository, a version of the website with more languages will be built #15077 so that users can see how the file they are translating looks.

Updating the repo

POT, then PO files for enabled languages are built and updated from the *.mdwn files when building the wiki.

This process is currently done outside of the Weblate instance, and merged onto the Weblate repo by hand. As translate.lizard has so many languages, if there are many changes this process can take a while.

The changes generated while building the wiki can be fed back to Weblate by cherry picking.

On the long term, we will automate this process by - pulling automatically from tails/master and applying a merge strategy for po #

Weblate installation and maintenance - a hybrid approach

The Tails infrastructure uses Puppet to make it easier to enforce and replicate system configuration, and usually relies on Debian packages to ensure stability of the system. But the effort to maintain a stable system somehow conflicts with installing and maintaining Weblate, a Python web application, which requires using up-to-date versions of Weblate itself and of its dependencies.

Having that in mind, and taking into account that we already started using Docker to replicate the translation server environment to experiment with upgrading and running an up-to-date version of Weblate, it can be a good trade-off to use Puppet to provide an environment to run Docker, and to use a Docker container to actually run an up-to-date Weblate installation.

From the present state of the Docker image, which currently uses (slightly modified/updated) Puppet code to configure the environment and then sets up Weblate, the following steps could be taken to achieve a new service configuration as described above:

  • Move the database to a separate Docker service.
  • Remove all Puppet code from the Docker image: inherit from the simplest possible Docker image and setup a Weblate Docker image with all needed dependencies.
  • Modify the Puppet code to account for setting up an environment that has Docker installed and that runs the Weblate Docker image.
  • Set up persistence for the Weblate git repository and configuration.
  • Set up persistence and backups for the database service.
  • Update the Puppet code to run tmserver (if/when it's needed -- latest Weblate accounts for basic suggestions using its own database).

After that, we should have a clear separation between stable infrastructure maintenance using Debian+Puppet in one side and up-to-date Weblate application deployment using Docker in the other side. The Docker image would have to be constantly maintained to account for Weblate upgrades, but that should be easier cleaner than deploying Weblate directly in the server.