Localisation and Translation on the Web

Coming from the English-speaking world, it can be easy to stay inside the bubble that is the English-speaking World Wide Web. But in fact, more than half of web pages are written in languages other than English.

Since starting work at eyeo, I’ve had to think a lot more about localisation and translation, because most of our websites are translated into several languages, something I previously didn’t have to consider. Once you decide to translate a web page, there are many things to take into account, and I’ve found that a lot of them are useful even if your website is written in only one language.

What Language is the Website?

One of the first things we need to think about is what language the HTML document (or the elements within it) is written in. Using the lang attribute, we can let browsers know what language the web page is written in.

Typically, we would add this attribute to the root element of the document, which is in most cases the <html> element.

<html lang="en">

Adding this attribute to the root element is really important, particularly for users whose main language on their machine is not the same as the web page’s language, for example a French-speaking user who visits this blog.

In the absence of the lang attribute, the browser will assume the web page is written in the user’s default language, which can lead to some strange results. For example, a screen reader may read an English web page in a French accent because of a missing lang attribute.

The lang attribute is one of the global HTML attributes, which means it can be applied to any HTML element. This lets us mark different sections of our web page as being written in different languages. This can be really useful, for example, if you are writing an article that references a text in a different language, comme ça, par exemple.

<html lang="en">
  <body>
    <h1>Localisation and Translation on the Web</h1>
    <p>
      This can be really useful, for example, if you are writing an article that references a text in a different language, <strong lang="fr">comme ça, par exemple</strong>.
    </p>
  </body>
</html>
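
To make the inheritance concrete, here is a minimal sketch of how the language of an element can be resolved: use its own lang attribute if it has one, otherwise fall back to the nearest ancestor that declares one. The SimpleNode type and the effectiveLang function are hypothetical names invented for this illustration, not a browser API.

```typescript
// Hypothetical, simplified stand-in for a DOM element: only the parts
// needed to show how the lang attribute is inherited.
interface SimpleNode {
  tag: string;
  lang?: string; // value of the lang attribute, if present
  ancestor?: SimpleNode; // parent element; undefined for the root
}

// Walk up the ancestors until a lang attribute is found.
// Returns "" (unknown language) when nothing declares one.
function effectiveLang(node: SimpleNode): string {
  let current: SimpleNode | undefined = node;
  while (current !== undefined) {
    if (current.lang !== undefined) {
      return current.lang;
    }
    current = current.ancestor;
  }
  return "";
}

// The same structure as the HTML example above.
const htmlNode: SimpleNode = { tag: "html", lang: "en" };
const bodyNode: SimpleNode = { tag: "body", ancestor: htmlNode };
const paragraphNode: SimpleNode = { tag: "p", ancestor: bodyNode };
const strongNode: SimpleNode = { tag: "strong", lang: "fr", ancestor: paragraphNode };

console.log(effectiveLang(paragraphNode)); // "en", inherited from <html>
console.log(effectiveLang(strongNode)); // "fr", set on the element itself
```

This is why a single lang attribute on the root element is usually enough, with overrides only where the language actually changes.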

Specifying Language for External Pages

The lang attribute tells user agents what language the content of the current web page is in, but what if we need to link to an external page or resource?

We can specify the language of an externally linked resource using the hreflang attribute. As its name implies, it sets the language of a resource linked to via the href attribute, and can thus only be applied to elements that have this attribute, i.e. the <a>, <link>, and <area> elements.

<a href="https://adblockplus.org/ar/" hreflang="ar">adblockplus.org (Arabic)</a>

Controlling Translations

In some cases, we may want a section of the web page to always be displayed in a certain language, never translated. This is the idea behind the new HTML5.1 translate attribute.

The translate attribute can accept one of two values:

  • yes: The element’s contents should be translated
  • no: The element’s contents should not be translated

<html lang="de">
<p>Übersetze mich!</p> <!-- Translate me! -->
<p translate="no">Übersetze mich nicht</p> <!-- Do not translate me -->
</html>

Not currently supported

Unfortunately, this attribute is not currently supported by any browser. However, its effect can be simulated by using the notranslate class, which is respected by Google’s Web Page Translator. Take, for example, the following two paragraphs:

<html lang="de">
<p>Übersetze mich!</p> <!-- Translate me! -->
<p class="notranslate">Übersetze mich nicht</p> <!-- Do not translate me -->
</html>

If this page is translated to another language, only the first paragraph will be translated.

Figure: First paragraph translated, second paragraph not translated
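
As a rough sketch of the decision a translation tool has to make, the function below treats the notranslate class the same way as translate="no" and lets the nearest explicit setting win, defaulting to "translate" when nothing is set. The TranslatableNode type and the shouldTranslate function are hypothetical, and the precedence rules are an assumption made for this illustration rather than a description of how any particular translator works.

```typescript
// Hypothetical, simplified stand-in for a DOM element.
interface TranslatableNode {
  translate?: "yes" | "no"; // value of the translate attribute, if present
  classes?: string[]; // class names on the element
  ancestor?: TranslatableNode; // parent element; undefined for the root
}

// The nearest explicit setting wins; content is translated by default.
function shouldTranslate(node: TranslatableNode): boolean {
  let current: TranslatableNode | undefined = node;
  while (current !== undefined) {
    if (current.translate === "no") {
      return false;
    }
    if (current.classes !== undefined && current.classes.indexOf("notranslate") !== -1) {
      return false;
    }
    if (current.translate === "yes") {
      return true;
    }
    current = current.ancestor;
  }
  return true;
}

const page: TranslatableNode = {};
const firstParagraph: TranslatableNode = { ancestor: page };
const secondParagraph: TranslatableNode = { classes: ["notranslate"], ancestor: page };

console.log(shouldTranslate(firstParagraph)); // true
console.log(shouldTranslate(secondParagraph)); // false
```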

Text Direction

In many languages, the direction in which text is written is not left to right like it is in English. In languages such as Arabic, text is written (and read) from right to left.

To change the direction in which text is written, we can use the dir attribute, which accepts one of three values:

  • ltr: Left to Right
  • rtl: Right to Left
  • auto: Allow the user agent to decide which direction based on the text content

<html lang="ar" dir="rtl">
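
When pages are generated from templates, the dir value is often derived from the page language. The sketch below shows one way to do that. The directionFor and htmlOpenTag functions are hypothetical helpers, and the RTL_LANGUAGES list is a deliberately short, non-exhaustive assumption for illustration.

```typescript
type Direction = "ltr" | "rtl";

// Illustrative, non-exhaustive list of primary language subtags that are
// usually written right to left (Arabic, Persian, Hebrew, Urdu).
const RTL_LANGUAGES: readonly string[] = ["ar", "fa", "he", "ur"];

// Pick a direction from a language tag such as "ar" or "ar-EG".
function directionFor(languageTag: string): Direction {
  // Language tags are case-insensitive; only the primary subtag matters here.
  const lower = languageTag.toLowerCase();
  const dash = lower.indexOf("-");
  const primary = dash === -1 ? lower : lower.slice(0, dash);
  return RTL_LANGUAGES.indexOf(primary) !== -1 ? "rtl" : "ltr";
}

// Build the opening <html> tag with matching lang and dir attributes.
function htmlOpenTag(languageTag: string): string {
  return `<html lang="${languageTag}" dir="${directionFor(languageTag)}">`;
}

console.log(htmlOpenTag("ar")); // <html lang="ar" dir="rtl">
console.log(htmlOpenTag("en")); // <html lang="en" dir="ltr">
```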

Based on this root direction, most browsers will apply the corresponding CSS styles to switch the direction in which text is displayed, using the direction property.

The CSS direction property accepts one of two values: ltr or rtl.

html[dir="rtl"] {
  direction: rtl;
}

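The most visible effect of this property is similar to that of the text-align property: text starts from the appropriate side. It doesn’t reverse the letters or words of a run of text. It sets the base direction, which determines which side the text starts from and where direction-neutral characters such as punctuation are placed.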

Other relevant CSS properties for controlling text direction include:

  • writing-mode: This determines if text is laid out horizontally or vertically and the direction (See MDN)
  • text-orientation: This determines the orientation of each character (See MDN)

Alternates

For most sites that are translated into different languages, there are separate pages for each language. For example, there might be several versions of the homepage:

  • https://adblockplus.org/en/ for the English version
  • https://adblockplus.org/ar/ for the Arabic version
  • ...

For user agents to know about all these separate pages and classify them correctly as the same page, just translated into different languages, we can use the <link> element with the alternate relationship type. In the document <head> we can list all the alternate versions of the page:

<html lang="en">
  <head>
    <link rel="alternate" href="https://adblockplus.org/ar/" hreflang="ar">
    ...
  </head>
</html>

Note that we use the hreflang attribute in combination with the alternate type to set the language each alternate page is in.
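
Writing these tags by hand gets tedious as the number of languages grows, so they are usually generated. Here is a minimal sketch, assuming every translation lives at a URL of the form base/language/ like the homepage examples above. The alternateLinks function is a hypothetical helper.

```typescript
// Hypothetical helper: builds one <link rel="alternate"> tag per language,
// assuming each translation is served from `${baseUrl}/${language}/`.
function alternateLinks(baseUrl: string, languages: string[]): string[] {
  return languages.map(
    (language) =>
      `<link rel="alternate" href="${baseUrl}/${language}/" hreflang="${language}">`
  );
}

const alternateTags = alternateLinks("https://adblockplus.org", ["en", "ar"]);
console.log(alternateTags.join("\n"));
// <link rel="alternate" href="https://adblockplus.org/en/" hreflang="en">
// <link rel="alternate" href="https://adblockplus.org/ar/" hreflang="ar">
```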

Alternates for Social Media

When a link to a web page is shared, its language is typically determined from the og:locale meta tag.

<meta property="og:locale" content="en_US">

If there are multiple available locales, we can specify this using the og:locale:alternate meta tag.

<meta property="og:locale:alternate" content="ar_AR">
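
Note that these values use an underscore (en_US), while the lang and hreflang attributes use a hyphen (en-US). The sketch below converts between the two forms and builds the meta tags. The toOgLocale and ogLocaleTags functions are hypothetical helpers, and the conversion assumes a simple language-region tag.

```typescript
// Convert a language tag such as "en-US" to the "en_US" form.
// Assumes a simple language-region tag; anything else is rejected.
function toOgLocale(languageTag: string): string {
  const separator = languageTag.indexOf("-");
  if (separator === -1) {
    throw new Error(`Expected a tag with a region, such as "en-US": ${languageTag}`);
  }
  const language = languageTag.slice(0, separator).toLowerCase();
  const territory = languageTag.slice(separator + 1).toUpperCase();
  return `${language}_${territory}`;
}

// Build the og:locale tag for the current page plus one
// og:locale:alternate tag for every other available locale.
function ogLocaleTags(currentTag: string, alternateLanguageTags: string[]): string[] {
  const tags = [`<meta property="og:locale" content="${toOgLocale(currentTag)}">`];
  for (const languageTag of alternateLanguageTags) {
    tags.push(
      `<meta property="og:locale:alternate" content="${toOgLocale(languageTag)}">`
    );
  }
  return tags;
}

console.log(ogLocaleTags("en-US", ["ar-AR"]).join("\n"));
// <meta property="og:locale" content="en_US">
// <meta property="og:locale:alternate" content="ar_AR">
```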

Left, Right, Start, End

Because most of the web was originally written with only English in mind, a lot of CSS was written with the mindset that the start of a line is the left, and the end of the line is the right. But as the web becomes more internationally aware, things are changing.

For example, with Flexbox, the default “left” side of the box is called the “start”, because this can be on any of the four sides of the box itself. A lot of new CSS properties are starting to work this way, for example the new margin-inline-start property.

The margin-inline-start property corresponds to the inline “start” margin of an element, and can be equal to any of the four sides of the element depending on the direction of the document. For example, if the direction on an element is right-to-left, then the start margin will be equivalent to the right margin.

span {
  direction: rtl;
  margin-inline-start: 20px; /* Equivalent to margin-right */
}

Figure: Right margin on a span element

Similarly, if the writing-mode of an element is set as vertical and left-to-right, then the start margin will be equivalent to the top margin.

span {
  writing-mode: vertical-lr;
  margin-inline-start: 20px; /* Equivalent to margin-top */
}

Figure: Top margin on a span element

There are other properties that work in this way, for example the corresponding margin-inline-end, which works similarly to margin-inline-start but applies to the end of the element. Taking the first example from above, if the direction on an element is right-to-left, then the end margin will be equivalent to the left margin.

span {
  direction: rtl;
  margin-inline-end: 20px; /* Equivalent to margin-left */
}

Figure: Left margin on a span element
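
The mapping used in the three examples above can be summarised in a small lookup. This is only a sketch of the rule for the three writing modes listed. The inlineStartSide and inlineEndSide functions are hypothetical names, not part of any CSS API.

```typescript
type Direction = "ltr" | "rtl";
type WritingMode = "horizontal-tb" | "vertical-rl" | "vertical-lr";
type PhysicalSide = "top" | "right" | "bottom" | "left";

// Which physical side does margin-inline-start apply to?
function inlineStartSide(direction: Direction, writingMode: WritingMode): PhysicalSide {
  if (writingMode === "horizontal-tb") {
    return direction === "ltr" ? "left" : "right";
  }
  // In both vertical modes, lines of ltr text run from top to bottom.
  return direction === "ltr" ? "top" : "bottom";
}

// margin-inline-end always applies to the opposite side.
function inlineEndSide(direction: Direction, writingMode: WritingMode): PhysicalSide {
  const opposite: Record<PhysicalSide, PhysicalSide> = {
    top: "bottom",
    right: "left",
    bottom: "top",
    left: "right",
  };
  return opposite[inlineStartSide(direction, writingMode)];
}

console.log(inlineStartSide("rtl", "horizontal-tb")); // "right"
console.log(inlineStartSide("ltr", "vertical-lr")); // "top"
console.log(inlineEndSide("rtl", "horizontal-tb")); // "left"
```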


There are so many more considerations that are involved when creating a localised website, particularly one that accepts user input. Feel free to share any tips you have come across in the comments below.
