International brands have their work cut out for them. Building a consistent brand experience across multiple continents, for audiences that speak different languages, is no easy task, and translating individual pages from one language to another is time-consuming and resource-intensive.
Unfortunately, much of this work can go to waste if the right steps aren’t taken to help search engines understand how your site has been internationalized.
To prevent this, we’ve collected a list of “Do’s and Don’ts” to guide your internationalization efforts and ensure that your pages get properly indexed by search engines.
Do conduct language-specific keyword research
The direct translation of a keyword will not necessarily be what users are searching for in that language. Rather than simply taking the translation at face value, you will have more success if you take a look at your options in the Google Keyword Planner to see if there are other phrasings or synonyms that are a better fit.
Remember to update your location and language settings within the planner, listed just above the “keyword ideas” field.
Don’t index automatic translation
Automatic translation can be better than nothing as far as user experience goes in some circumstances, but users should be warned that the translation may not be reliable, and pages that have been automatically translated should be blocked from search engines in robots.txt. Automatic translations will typically look like spam to algorithms like Panda and could hurt the overall authority of your site.
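As a rough illustration of the robots.txt approach, the sketch below assumes that every automatically translated page lives under a hypothetical /auto/ directory. It uses Python’s standard urllib.robotparser module to confirm that the rule blocks those pages while leaving human-translated pages crawlable. The directory name, the URLs, and the helper is_blocked are assumptions made for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: all machine-translated pages are served under /auto/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /auto/
"""


def is_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if robots_txt disallows user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Machine-translated page (blocked) versus human-translated page (crawlable).
auto_page_blocked = is_blocked(ROBOTS_TXT, "https://example.com/auto/es/page1")
human_page_blocked = is_blocked(ROBOTS_TXT, "https://example.com/es/page1")
```

Keeping automatic translations in a single directory means one Disallow rule covers all of them.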
Do use different URLs for different languages
To make sure that Google indexes the alternate language versions of each page, these pages need to be located at different URLs.
Avoid using browser settings and cookies to change the content listed at the URL to a different language. Doing so creates confusion about what content is located at that URL.
Because Google’s crawlers are usually located in the United States, they will generally see only the US version of such a URL, meaning that the alternate language content will not get indexed.
Again, Google needs a specific web address to identify a specific piece of content. While different language versions of a page may convey the same information, they do so for different audiences, meaning they serve different purposes, and Google needs to see them as separate entities in order to properly connect each audience to the proper page.
We highly recommend using a pre-built platform or plugin, such as Shopify Plus for e-commerce or Polylang for WordPress, to ensure that your method for generating international URLs is consistent and systematic.
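If you generate URLs yourself, the point is that one rule should produce every language version. The sketch below assumes a language-subdirectory scheme (for example /es/page1), which is only one of several possible conventions. The function localized_url and the list of supported languages are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: each language version lives in its own subdirectory, e.g. /es/page1.
SUPPORTED_LANGUAGES = ("en", "es", "it")


def localized_url(base_url: str, language: str) -> str:
    """Return base_url with a language subdirectory prepended to its path."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language}")
    parts = urlsplit(base_url)
    # Normalize an empty path to "/" so the result is always /<language>/...
    path = parts.path if parts.path.startswith("/") else "/" + parts.path
    return urlunsplit(
        (parts.scheme, parts.netloc, f"/{language}{path}", parts.query, parts.fragment)
    )


# One distinct, predictable URL per language for the same page.
urls = {lang: localized_url("https://example.com/page1", lang) for lang in SUPPORTED_LANGUAGES}
```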
Don’t canonicalize from one language to another
The canonical tag is meant to tell search engines that two or more different URLs represent the same page. This doesn’t always mean the content is identical, since it could represent page alternates where the content has been sorted differently, where the thematic visuals are different, and other minor changes.
Alternate language versions of a page, however, are not the same page. A user searching for the Dutch version of a page would be very disappointed if they landed on the English version of the page. For this reason, you should never canonicalize one language alternate to another, even though the content on each page conveys the same information.
Do use “hreflang” for internationalization
You may be wondering how to tell search engines that two pages represent alternate language versions of the same content if you can’t use canonicalization to do so. This is what “hreflang” is for: it explicitly tells search engines that two or more pages are alternates of one another.
There are three ways to implement “hreflang”: with HTML tags, with HTTP headers, and in your Sitemap.
1. HTML Tags
Implementing “hreflang” with HTML tags is done in the <head> section, with code similar to this:
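<title>Title tag of the page</title>
<link rel="alternate" hreflang="en" href="https://example.com/page1/english-url" />
<link rel="alternate" hreflang="es" href="https://example.com/page1/spanish-url" />
<link rel="alternate" hreflang="it" href="https://example.com/page1/italian-url" />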
Where hreflang="en" tells search engines that the associated URL, https://example.com/page1/english-url, is the English alternate version of the page. URLs must be complete, including http or https and the domain name, not just the path. The two-letter string “en” is an ISO 639-1 language code. You can also set hreflang="x-default" for a fallback page that should be shown when none of the listed languages matches the user.
Each alternate should list all of the other alternates, including itself, and the set of links should be the same on every page. Any two pages that don’t both use hreflang to reference each other will not be considered alternates. This requirement exists because alternates may be located on different domains, and sites you do not own shouldn’t be able to declare themselves alternates of your pages.
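The reciprocity rule is easy to break as a site grows, so it can be worth checking automatically. The following is a minimal sketch, assuming you have already collected the hreflang annotations of each page into a dictionary. The function find_hreflang_problems and its input format are hypothetical and not part of any SEO tool.

```python
def find_hreflang_problems(pages: dict[str, dict[str, str]]) -> list[str]:
    """Check hreflang sets for self-references and return links.

    pages maps each page URL to the {hreflang value: alternate URL}
    mapping declared on that page.
    """
    problems = []
    for url, alternates in pages.items():
        targets = set(alternates.values())
        if url not in targets:
            problems.append(f"{url} does not list itself")
        for target in sorted(targets - {url}):
            if target not in pages:
                problems.append(f"{url} links to {target}, which was not collected")
            elif url not in pages[target].values():
                problems.append(f"{target} does not link back to {url}")
            elif pages[target] != alternates:
                problems.append(f"{url} and {target} declare different sets of alternates")
    return problems


EN = "https://example.com/page1/english-url"
ES = "https://example.com/page1/spanish-url"
FULL_SET = {"en": EN, "es": ES}

# The Spanish page forgot to reference the English page.
report = find_hreflang_problems({EN: FULL_SET, ES: {"es": ES}})
```

An empty list means every page lists itself and every pair of alternates references each other with the same set of links.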
In addition to a language code, you can add an ISO 3166-1 alpha-2 country code. For example, for the UK English version of a page, you would use “en-GB” in place of “en.” Google does advise having at least one version of the page without a country code. You can apply multiple country codes and a country-agnostic hreflang to the same URL.
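To illustrate how country codes and a country-agnostic value can point at the same URL, here is a small sketch that renders the tags from a list of pairs. The URLs (a UK page under /uk/ and a general English page) and the function render_hreflang_tags are assumptions made for this example.

```python
from html import escape


def render_hreflang_tags(alternates: list[tuple[str, str]]) -> str:
    """Render one <link> element per (hreflang value, absolute URL) pair."""
    lines = []
    for code, url in alternates:
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"URL must be absolute: {url}")
        lines.append(
            f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        )
    return "\n".join(lines)


# Hypothetical URLs: a British page, plus one page serving all other English speakers.
ALTERNATES = [
    ("en-GB", "https://example.com/uk/page1"),
    ("en", "https://example.com/page1"),         # country-agnostic English
    ("en-US", "https://example.com/page1"),      # same URL, explicit country code
    ("x-default", "https://example.com/page1"),  # fallback when nothing matches
]

tags = render_hreflang_tags(ALTERNATES)
```

A list of pairs is used rather than a dictionary keyed by URL, because one URL can carry several hreflang values.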
2. HTTP Header
As an alternative to HTML implementation, your server can send an HTTP Link Header. The syntax looks like this:
Link: <https://example.com/page1/english-url>; rel="alternate"; hreflang="en",
<https://example.com/page1/spanish-url>; rel="alternate"; hreflang="es",
<https://example.com/page1/italian-url>; rel="alternate"; hreflang="it"
The rules for how to use hreflang are otherwise the same as with HTML tags.
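If your server code builds this header, a helper along these lines keeps the syntax consistent. This is only a sketch: build_link_header is a hypothetical function, and how you attach the resulting header to a response depends on your server or framework.

```python
def build_link_header(alternates: dict[str, str]) -> str:
    """Build an HTTP Link header line from {hreflang value: absolute URL}."""
    parts = [
        f'<{url}>; rel="alternate"; hreflang="{code}"'
        for code, url in alternates.items()
    ]
    return "Link: " + ", ".join(parts)


header = build_link_header(
    {
        "en": "https://example.com/page1/english-url",
        "es": "https://example.com/page1/spanish-url",
        "it": "https://example.com/page1/italian-url",
    }
)
```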
3. Sitemap
Finally, you can use your XML sitemap to set alternates for each URL. Each <url> entry lists its alternates as xhtml:link elements alongside the <loc> tag.
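A minimal sketch of one such entry for the English page follows, assuming the example.com URLs used in the earlier examples. The XML is held in a Python string so that the hypothetical helper alternates_by_loc can parse it with the standard library and confirm its structure.

```python
import xml.etree.ElementTree as ET

SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://example.com/page1/english-url</loc>
    <xhtml:link rel="alternate" hreflang="en" href="https://example.com/page1/english-url"/>
    <xhtml:link rel="alternate" hreflang="es" href="https://example.com/page1/spanish-url"/>
    <xhtml:link rel="alternate" hreflang="it" href="https://example.com/page1/italian-url"/>
  </url>
</urlset>
"""

NAMESPACES = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "xhtml": "http://www.w3.org/1999/xhtml",
}


def alternates_by_loc(sitemap_xml: str) -> dict[str, dict[str, str]]:
    """Map each <loc> URL to the {hreflang value: href} pairs in its <url> entry."""
    root = ET.fromstring(sitemap_xml.encode("utf-8"))
    result = {}
    for url in root.findall("sm:url", NAMESPACES):
        loc = url.findtext("sm:loc", default="", namespaces=NAMESPACES).strip()
        result[loc] = {
            link.get("hreflang"): link.get("href")
            for link in url.findall("xhtml:link", NAMESPACES)
        }
    return result


entries = alternates_by_loc(SITEMAP)
```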
Note that the English version of the page is listed both within the <loc> tag and as an alternate.
Keep in mind that this is not complete. For it to be complete you will also need separate <url> sections for the Spanish and Italian pages, each of them listing all of the other alternates as well.
Don’t rely on the “lang” attribute or URL
Google explicitly does not use the lang attribute, the URL, or anything else in the code to determine the language of the page. The language is determined only by the language of the content itself.
Needless to say, this means that your page content should be in the correct language. But it also means:
- The main content, navigation, and supplementary content should all be in the same language.
- Side-by-side translations should be avoided. Don’t display translations on the page; just make it easy for users to switch languages.
- If your site offers any kind of automatic translation, make sure that this content is not indexable.
- Avoid mixing languages if at all possible, and if it is necessary, make sure that the primary language of the page dominates any others in substance.
Do allow users to switch languages
For any international business, it’s a good idea to allow users to switch languages, usually from the main navigation. For example, Amazon allows users to switch languages from the top right corner of the site.
Do not force the user to a specific language version of the page based on their location. Automatic redirection prevents both users and search engines from accessing the version of the site that they need to access. Google’s bots will never be able to crawl alternate language versions of a page if they are always redirected to the US version of the site based on their location.
Turning to Amazon for our example once again, we are not prevented from accessing amazon.co.jp, but we do have the option of switching to English.
Don’t create duplicate content across multiple languages
While you should not canonicalize alternate language versions of one page to another, if you use alternate URLs for pages meant for different locations but the language and content are identical, you should use the canonical tag. For example, if the American and British versions of a page are identical, one should consistently canonicalize to the other. Use hreflang as discussed above to list them as alternates with the same language but for different locations.
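The rule above can be sketched as a small decision function: a page canonicalizes to the preferred version only when its content is identical to it, and otherwise stays its own canonical. The URLs, page bodies, and the function choose_canonicals are hypothetical, and a real site would likely compare normalized content rather than raw strings.

```python
def choose_canonicals(pages: dict[str, str], preferred_url: str) -> dict[str, str]:
    """Map each URL to its canonical URL.

    pages maps URL -> page content. A page whose content is identical to the
    preferred page canonicalizes to it; every other page is self-canonical.
    """
    preferred_body = pages[preferred_url]
    return {
        url: preferred_url if body == preferred_body else url
        for url, body in pages.items()
    }


US = "https://example.com/us/page1"
UK = "https://example.com/uk/page1"
NL = "https://example.com/nl/page1"

canonicals = choose_canonicals(
    {US: "Product overview", UK: "Product overview", NL: "Productoverzicht"},
    preferred_url=US,
)
```

Here the identical British page canonicalizes to the American one, while the Dutch page, being a different language version, remains its own canonical.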
Use these guidelines to make sure users from all of your target audiences will be able to find your pages in the search results, no matter where they are located or what language they speak.