Webmaster Central Blog
Official news on crawling and indexing sites for the Google index
Working with multilingual websites
venerdì, marzo 19, 2010
Webmaster Level: Intermediate
A multilingual website is any website that offers content in more than one language. Examples of multilingual websites might include a Canadian business with an English and a French version of its site, or a blog on Latin American soccer available in both Spanish and Portuguese.
Usually, it makes sense to have a multilingual website when your target audience consists of speakers of different languages.
If your blog on Latin American soccer aims to reach the Brazilian audience, you may choose to publish it only in Portuguese. But if you’d like to reach soccer fans from Argentina also, then providing content in Spanish could help you with that.
Google and language recognition
Google tries to determine the main languages of each one of your pages.
You can help to make language recognition easier if you stick to only one language per page and avoid side-by-side translations.
Although Google can recognize a page as being in more than one language, we recommend using the same language for all elements of a page: headers, sidebars, menus, etc.
Keep in mind that Google ignores all code-level language information, from “lang” attributes to Document Type Definitions (DTD). Some web editing programs create these attributes automatically, and therefore they aren’t very reliable when trying to determine the language of a webpage.
Someone who comes to Google and does a search in their language expects to find localized search results, and this is where you, as a webmaster, come in:
if you’re going to localize, make it visible in the search results
with some of our tips below.
The anatomy of a multilingual site: URL structure
There's no need to create special URLs when developing a multilingual website. Nonetheless, your users might like to identify what section of your website they’re on just by glancing at the URL. For example, the following URLs let users know that they’re on the English section of this site:
http://example.ca
/en/
mountain-bikes.html
http://
en.
example.ca/mountain-bikes.html
While these other URLs let users know that they’re viewing the same page in French:
http://example.ca/
fr
/mountain-bikes.html
http://
fr
.example.ca/mountain-bikes.html
Additionally, this URL structure will make it easier for you to analyze the indexing of your multilingual content.
If you want to create URLs with non-English characters, make sure to use UTF-8 encoding. UTF-8 encoded URLs should be properly escaped when linked from within your content. Should you need to escape your URLs manually, you can easily find an
online URL encoder
that will do this for you. For example, if I wanted to translate the following URL from English to French,
http://example.ca
/fr/
mountain-bikes
.html
It might look something like this:
http://example.ca
/fr/
vélo-de-montagne
.html
Since this URL contains one non-English character (é), this is what it would look like properly escaped for use in a link on your pages:
http://example.ca
/fr/v
%C3%A9
lo-
de-montagne
Crawling and indexing your multilingual website
We recommend that you do not allow automated translations to get indexed. Automated translations don’t always make sense and they could potentially be viewed as spam. More importantly, the point of making a multilingual website is to reach a larger audience by providing valuable content in several languages. If your users can’t understand an automated translation or if it feels artificial to them, you should ask yourself whether you really want to present this kind of content to them.
If you’re going to localize, make it easy for Googlebot to crawl all language versions of your site. Consider cross-linking page by page. In other words, you can provide links between pages with the same content in different languages. This can also be very helpful to your users. Following our previous example, let’s suppose that a French speaker happens to land on
http://example.ca
/en/
mountain-bikes.html
; now, with one click he can get to
http://example.ca
/fr/
vélo-de-montagne.html
where he can view the same content in French.
To make all of your site's content more crawlable, avoid automatic redirections based on the user's perceived language. These redirections could prevent users (and search engines) from viewing all the versions of your site.
And last but not least, keep the content for each language on separate URLs - don't use cookies to show translated versions.
Working with character encodings
Google directly extracts character encodings from HTTP headers, HTML page headers, and content. There isn’t much you need to do about character encoding, other than watching out for conflicting information - for example, between content and headers. While Google can recognize different character encodings, we recommend that you use UTF-8 on your website whenever possible.
If your tongue gets twisted...
Now that you know all of this, your tongue may get twisted when you speak many languages, but your website doesn’t have to!
For more information, read our
post on multi-regional sites
and stay tuned for our next post, where we'll delve into special situations that may arise when working with global websites. Until then, don't hesitate to drop by the Help Forum and
join the discussion
!
Written by Xavier deMorales, Google Search Quality
Hey!
Check here if your site is mobile-friendly.
Etichette
accessibility
10
advanced
195
AMP
13
Android
2
API
7
apps
7
autocomplete
2
beginner
173
CAPTCHA
1
Chrome
2
cms
1
crawling and indexing
158
encryption
3
events
51
feedback and communication
83
forums
5
general tips
90
geotargeting
1
Google Assistant
3
Google I/O
3
Google Images
3
Google News
2
hacked sites
12
hangout
2
hreflang
3
https
5
images
12
intermediate
205
interstitials
1
javascript
8
job search
2
localization
21
malware
6
mobile
63
mobile-friendly
14
nohacked
1
performance
17
product expert
1
product experts
2
products and services
63
questions
3
ranking
1
recipes
1
rendering
2
Responsive Web Design
3
rich cards
7
rich results
10
search console
35
search for beginners
1
search queries
7
search results
140
security
12
seo
3
sitemaps
46
speed
6
structured data
33
summit
1
TLDs
1
url removals
1
UX
3
verification
8
video
6
webmaster community
24
webmaster forum
1
webmaster guidelines
57
webmaster tools
177
webmasters
3
youtube channel
6
Archive
2020
nov
ott
set
ago
lug
giu
mag
apr
mar
feb
gen
2019
dic
nov
ott
set
ago
lug
giu
mag
apr
mar
feb
gen
2018
dic
nov
ott
set
ago
lug
giu
mag
apr
mar
feb
gen
2017
dic
nov
ott
set
ago
giu
mag
apr
mar
feb
gen
2016
dic
nov
ott
set
ago
giu
mag
apr
mar
gen
2015
dic
nov
ott
set
ago
lug
mag
apr
mar
feb
gen
2014
dic
nov
ott
set
ago
lug
giu
mag
apr
mar
feb
gen
2013
dic
nov
ott
set
ago
lug
giu
mag
apr
mar
feb
gen
2012
dic
nov
ott
set
ago
lug
giu
mag
apr
mar
feb
gen
2011
dic
nov
ott
set
ago
lug
giu
mag
apr
mar
feb
gen
2010
dic
nov
ott
set
ago
lug
giu
mag
apr
mar
feb
gen
2009
dic
nov
ott
set
ago
lug
giu
mag
apr
mar
feb
gen
2008
dic
nov
ott
set
ago
lug
giu
mag
apr
mar
feb
gen
2007
dic
nov
ott
set
ago
lug
giu
mag
apr
mar
feb
gen
2006
dic
nov
ott
set
ago
Feed
Follow @googlewmc
Give us feedback in our
Product Forums
.
Subscribe via email
Enter your email address:
Delivered by
FeedBurner