Making multilingual sites is hard. I’ve been using Drupal since version 5 and I can say a few things about the evolution of Drupal multilingual capabilities:

  • First, Drupal 8 is – in my opinion – the first version of Drupal where someone could say that multilingual works, pretty much out of the box.
  • Second, the documentation about how to deal with different scenarios is quite good.
  • And third, from a user experience perspective, translating the user interface of a site is really hard.

In this post we will talk about the third point and what we did to manage that complexity.

Our Current Scenario

We are building a really complex site, and the main challenges we faced regarding multilingual are:

  • The site uses a multisite architecture: one database, using Organic Groups.
  • Each group represents a country, and each country needs its site in one or more languages.
  • We have several variations of the same language depending on the region this language is spoken in.
  • We don’t want to let content editors translate the site directly from the UI.
  • We don’t speak all the languages the site is in.

The last item is quite relevant: when you don’t speak a language, you cannot even be sure that the string you are copying into a textbox says what it should.

The First Attempt

We started with a translation matrix to store all the translations: a simple Google Drive spreadsheet that tracks each string's translation in each language.

Each column uses the language code as a header.

Using a tool that converts spreadsheets into PO files, we get one translation file per language: fr.po, es.po, pt.po.

We used wichert/po-xls to achieve this with good results.
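To make the conversion step concrete, here is a minimal Python sketch of what turning such a matrix into one PO file per language amounts to. It is not how po-xls is implemented: the function names, the `msgid` header name and the sample rows are assumptions for illustration, and the matrix is read as CSV rather than as a spreadsheet file.

```python
import csv
import io


def po_escape(text):
    """Escape backslashes, double quotes and newlines for a PO string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def matrix_to_po(csv_text, id_column="msgid"):
    """Return {langcode: PO file body} built from a translation matrix in CSV form.

    Assumes the first row is a header holding the message id column
    followed by one column per language code.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    langcodes = [name for name in reader.fieldnames if name != id_column]
    entries = {langcode: [] for langcode in langcodes}
    for row in reader:
        for langcode in langcodes:
            translation = (row[langcode] or "").strip()
            if not translation:
                continue  # Leave pending translations out of the PO file.
            entries[langcode].append(
                'msgid "%s"\nmsgstr "%s"\n'
                % (po_escape(row[id_column]), po_escape(translation))
            )
    return {langcode: "\n".join(items) for langcode, items in entries.items()}


# Hypothetical sample matrix: the Spanish cell for "photo" is still empty.
sample = "msgid,fr,es\nContact us.,Contactez-nous.,Contáctenos.\nphoto,photo,\n"
po_files = matrix_to_po(sample)
print(po_files["fr"])
```

A real PO file would normally also start with a header entry declaring the charset and plural forms, which this sketch leaves out.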

Not So Fast

This initial, somewhat naive, approach had a few problems.

  • Drupal string translations are case sensitive. This means that if you make a typo and write Photo instead of photo, the translation will not be applied.
  • Some strings are the result of a calculation. For example, Downloads: 3 is actually managed by Drupal as Downloads: @count.
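One way to catch the case-sensitivity problem early is to compare the message ids in the spreadsheet against the source strings found in the code. The sketch below is only an illustration of that check; `find_case_mismatches` and the sample lists are hypothetical names and data, not part of our tooling.

```python
def find_case_mismatches(sheet_msgids, source_strings):
    """Return (sheet_msgid, source_string) pairs that match only when case is ignored."""
    by_lower = {}
    for source in source_strings:
        by_lower.setdefault(source.lower(), []).append(source)
    exact = set(source_strings)

    mismatches = []
    for msgid in sheet_msgids:
        if msgid in exact:
            continue  # Exact match: Drupal would find this translation.
        for candidate in by_lower.get(msgid.lower(), []):
            mismatches.append((msgid, candidate))
    return mismatches


# Hypothetical data: the sheet says "Photo" but the code uses "photo".
source = ["photo", "Downloads: @count", "Contact us."]
sheet = ["Photo", "Downloads: @count", "Contact us."]
print(find_case_mismatches(sheet, source))
```

Note that the placeholder form (Downloads: @count) has to appear in the sheet exactly as written in the code, not the rendered text (Downloads: 3).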

But the most complex issue is that Drupal 8 has two ways to translate strings. The first one is inherited from Drupal 7: it uses the well-known t function, for example t('Contact us.').

The other one is a new way that allows site builders to translate configuration entities.

The two scenarios that allow translation of a Drupal site.

Translating Configuration Entities is Really Hard

To translate configuration entities, you need to identify which configuration needs translation, and find the exact part relevant to you. For complex configuration entities like views, this could be quite hard to understand.

Even for an experienced site admin this can be hard to understand.

Another problem that we had to solve was the vast amount of configuration alternatives you have when dealing with a medium-size Drupal site.

Each category has a lot of items to translate.

It was clear to us that in order to translate all those items we needed to find another way.

More problems… Identifying Which Strings to Translate is Hard

One thing to consider when dealing with Drupal translations is that it’s not easy to identify if a string is displayed somewhere in the frontend or if it is only a backend string.

Translating the entire codebase may not be a viable option if you want to keep a short list of translations reviewed by a group of people. In our case, it was important to make sure that translations are accurate, and that translators do not feel overwhelmed.

We don’t have a great solution to this problem yet. One of the strategies we used was to search for all the strings in twig templates and custom modules code using a grep search.

egrep -hro "[\>, ]t\('.*'\)" . | cut -c 5-   # Get strings inside ->t(...) and t(...)
egrep -hro "{{ '.*'\|t" .                    # Get twig strings '....'|t
egrep -hro " trans .*" .                     # Get twig strings inside trans

However, as we figured out later by reading the documentation, twig strings cannot be used as a source for translations. Internally, Drupal maps those strings back to the regular use of t('strings').

This means that strings like:

{% trans %}Copyright {{ year }}{% endtrans %}

Are actually converted to

t('Copyright @year')

And that last string is the one you should use as the source of the translation.

In the end, we cleaned up the spreadsheet list by visual inspection, and so far it has been working fine.
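As a rough illustration of that mapping, the following Python sketch extracts the source string from simple trans blocks. It is an approximation under stated assumptions, not Drupal's actual Twig handling: it only covers plain `{{ variable }}` tokens inside a bare `{% trans %}` block (no `with` options, filters or plural forms), and `trans_source_strings` is a hypothetical helper name.

```python
import re

# Matches a bare {% trans %}...{% endtrans %} block and captures its body.
TRANS_BLOCK = re.compile(r"\{%\s*trans\s*%\}(.*?)\{%\s*endtrans\s*%\}", re.DOTALL)
# Matches a simple {{ variable }} token and captures the variable name.
VARIABLE = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def trans_source_strings(template):
    """Return the t()-style source string for each simple trans block."""
    sources = []
    for body in TRANS_BLOCK.findall(template):
        # Each {{ name }} token becomes an @name placeholder.
        sources.append(VARIABLE.sub(r"@\1", body.strip()))
    return sources


template = "{% trans %}Copyright {{ year }}{% endtrans %}"
print(trans_source_strings(template))
```

The strings it returns are the ones that belong in the message id column of the spreadsheet.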

How We Solved the Problems

To recap the problems we had:

  • We did not want to translate all the available strings.
  • We did not know all the languages, therefore copy and pasting was a risk.
  • Translators were expecting to have a reduced number of strings to translate.
  • Configuration translations are quite complex to track.

As mentioned before, the xls-to-po tool gave us the PO files covering one part of the strings that we needed to translate.

We also used drush_language to automate the process.

drush language-import --langcode=fr path/to/po_files/fr.po

This little snippet iterates over all of the po files in the po_files directory and imports the language using the drush command mentioned above.

find po_files -type f -name '*.po' | xargs basename --suffix=.po | \
xargs -I@ drush language-import --langcode=@ po_files/@.po

The xls spreadsheet has the Message Id in the first column, followed by one column for each language code in the system.

By using conditional cell colors, we can quickly identify which translations are pending.
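The same check can be done outside the spreadsheet, for example before generating the PO files. This is a small Python sketch that assumes the matrix was exported as CSV with a `msgid` header in the first column; the function name and sample data are hypothetical.

```python
import csv
import io


def pending_translations(csv_text, id_column="msgid"):
    """Return {langcode: [message ids whose translation cell is empty]}."""
    reader = csv.DictReader(io.StringIO(csv_text))
    langcodes = [name for name in reader.fieldnames if name != id_column]
    pending = {langcode: [] for langcode in langcodes}
    for row in reader:
        for langcode in langcodes:
            if not (row[langcode] or "").strip():
                pending[langcode].append(row[id_column])
    return pending


# Hypothetical export: Spanish is missing everywhere, French is missing once.
matrix = "msgid,fr,es\nContact us.,Contactez-nous.,\nDownloads: @count,,\n"
print(pending_translations(matrix))
```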

Solving the Configuration Translation Problem

The second part of our problem was a bit more tricky to fix.

We used a custom script to get all the config entity strings that were relevant to us.

Here is a simplified version of the script.

$prefix = 'views.view.custom_view';
$key = 'display.default.display_options.exposed_form.options.reset_button_label';

$configFactory = \Drupal::service('config.factory');
$list = $configFactory->listAll($prefix);

$rows = [];

foreach ($list as $config_name) {
  $columns = [];
  // Add the unique identifier for this field.
  $columns[] = $config_name . ':' . $key;

  // Get the untranslated value from the config.
  $base_config = $configFactory->getEditable($config_name);
  $columns[] = $base_config->get($key);

  $rows[] = $columns;
}

If you wonder how to get the $prefix and $key, they are obtained by inspecting the name of the field we want to translate in the Configuration Translation UI.

You need to inspect the HTML of the page and look at the name attribute.

We print the result of the script to obtain a new CSV file that looks like this

The first column is a unique id that combines the prefix and the key.

Then, we copy and paste this CSV file as a new tab in the general translation matrix, and complete the header with a column for each of the remaining languages.

Finally we use a spreadsheet formula to find the translation we want for the languages we are interested in.

=IFERROR(VLOOKUP($B2,Strings!$A$2:Y299,COLUMN()-1,0),"")

This will search for a match in the Strings matrix, and provide a translation.
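For readers who prefer code to formulas, this is roughly what the lookup does, written as a Python sketch. The data structures, the function name and the second view name (`other_view`) are assumptions made for the example; the real work happens in the spreadsheet.

```python
def fill_config_translations(config_rows, strings_matrix, langcodes):
    """Mimic IFERROR(VLOOKUP(...), ""): look up each source string, default to ""."""
    filled = []
    for config_id, source in config_rows:
        translations = strings_matrix.get(source, {})
        filled.append(
            [config_id, source] + [translations.get(code, "") for code in langcodes]
        )
    return filled


# The "Strings" tab, keyed by source string (hypothetical content).
strings_matrix = {"Reset": {"fr": "Réinitialiser", "es": "Restablecer"}}

# Rows produced by the export script: unique id plus untranslated value.
key = "display.default.display_options.exposed_form.options.reset_button_label"
config_rows = [
    ("views.view.custom_view:" + key, "Reset"),
    ("views.view.other_view:" + key, "Clear"),  # No match in the matrix yet.
]

for row in fill_config_translations(config_rows, strings_matrix, ["fr", "es"]):
    print(row)
```

One difference worth keeping in mind: a Python dict lookup is case sensitive, whereas spreadsheet VLOOKUP generally is not, so the formula can match a string that Drupal itself would treat as different.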

Spreadsheet magic.

Final step: Importing the Configuration Strings Translation Back to Drupal

Once we have all the translations we need, we export the CSV file again and use this other script (simplified version) to do the inverse process:

use Symfony\Component\Serializer\Serializer;
use Symfony\Component\Serializer\Encoder\CsvEncoder;
use Symfony\Component\Serializer\Normalizer\ObjectNormalizer;

$filename = 'path/to/config_translations.csv';

$serializer = new Serializer([new ObjectNormalizer()], [new CsvEncoder()]);
$configFactory = \Drupal::service('config.factory');
$languageManager = \Drupal::service('language_manager');

// Decode the CSV into one associative array per row, keyed by the header.
$data = $serializer->decode(file_get_contents($filename), 'csv');

foreach ($data as $row) {
  $name_key = array_values($row)[0];
  list($name, $key) = explode(':', $name_key);

  // The languages we care about start after the second column.
  $languages = array_filter(array_slice($row, 2));

  foreach ($languages as $langcode => $translation) {
    $config_translation = $languageManager
                            ->getLanguageConfigOverride($langcode, $name);
    $saved_config = $config_translation->get();
    $config_translation->set($key, $translation);
    $config_translation->save();
  }
}

Some Other Interesting Problems We Had

Before finishing the article, we would like to share something interesting regarding translations with contexts. As you may know, context allows you to have variations of the same translation depending on, well… context.

In our case, we needed context to display different variations of a French translation. In particular, this is the string in English that we needed to translate to French:

Our organization in {Group Name}

In France, this translates into Notre organisation en France. But if you want to say the same for Canada, due to French grammatical rules you need to say Notre organisation au Canada (note the change from en to au).

We decided to create a context variation for this particular string using context with twig templating.

{% trans with {'context': group_iso2_code} %}
Our organization in {{ group_name }}
{% endtrans %}

This worked ok-ish, until we realized that it affected all the other languages: we had to specify the same translation for each group, even if the language was not French.

This is not what we want...

After some research we found the translation_fallback module but unfortunately it was a Drupal 7 solution.

Long story short, we ended up with this solution.

{% if group_uses_language_context %}
  {% trans with {'context': country_iso2_code} %}
    Our organization in {{ group_name }}
  {% endtrans %}
{% else %}
  {% trans %}Our organization in {{ group_name }}{% endtrans %}
{% endif %}

This basically provides two versions of the same string, and if the group needs some special treatment, we have the chance to override it. Luckily for us, xls-to-po supports strings with context. This is how we structured the translations for strings that require context:

CA, in this case, is the ISO code for Canada
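To show why the context had to be conditional, here is a simplified Python model of the lookup behavior we ran into. It is a sketch, not Drupal's implementation: translations are keyed by language, context and source string, and a lookup with a context does not fall back to the translation without one. The function names and the sample translations are illustrative assumptions.

```python
def translate(translations, langcode, source, context=""):
    """Return the translation stored for (langcode, context, source), else the source."""
    return translations.get((langcode, context, source), source)


def organization_label(translations, langcode, country_code, uses_language_context):
    """Only pass a context for groups that need a special variation."""
    source = "Our organization in @group_name"
    context = country_code if uses_language_context else ""
    return translate(translations, langcode, source, context)


source = "Our organization in @group_name"
translations = {
    ("fr", "", source): "Notre organisation en @group_name",
    ("fr", "CA", source): "Notre organisation au @group_name",
    ("es", "", source): "Nuestra organización en @group_name",
}

# French for Canada uses the context-specific variation.
print(organization_label(translations, "fr", "CA", True))
# Spanish skips the context and finds the regular translation.
print(organization_label(translations, "es", "CA", False))
# Always passing the context leaves Spanish untranslated.
print(organization_label(translations, "es", "CA", True))
```

The last call is the situation of our first attempt: with the context always set, every language needs its own entry per country code.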

Conclusion

For us, this is still a work in progress. We will have to manage around 20 or more languages at some point in the project. By that point, having everything in a single spreadsheet may not be maintainable anymore. There are other tools that could help us to organize source strings. But so far a shared Google Sheet worked.

We still use configuration management to sync the strings in production. The snippets provided in this post are run against a backup database so we can translate all the entities with more confidence. Once we have run the script, we use drush config:export to save all the translations to the filesystem.