Google has long told webmasters to treat its crawler, Googlebot, just as they would treat any user accessing their website from the United States. By default, Googlebot requests pages without setting an Accept-Language HTTP request header and uses US IP addresses. As a result, not all content variants of locale-adaptive pages may be indexed completely.
This is now changing: Google may crawl your site from IP addresses outside the US, and with language settings other than US English.
Google has announced that Googlebot now supports locale-aware crawling, which is a huge change for internationalized sites.
What does this mean?
Some sites that offer internationalized content do so without sending the user to a special URL. Google has always preferred that you set up specific URLs or ccTLDs for content tailored to different countries or languages, but many sites simply serve content dynamically on their .com, based on the visitor's IP address or browser language settings.
Google is now going to support sites that dynamically serve internationalized content based on IP address or language. It will do so using two methods:
- Geo-distributed crawling, where Googlebot starts to use IP addresses that appear to come from outside the USA, in addition to the US-based IP addresses it currently uses.
- Language-dependent crawling, where Googlebot starts to crawl with an Accept-Language HTTP header in the request.
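
To make the language-dependent case concrete, here is a minimal Python sketch of how a locale-adaptive page might pick a content variant from the Accept-Language header. The supported languages, the default, and the function names are assumptions made for illustration; this is not how Google or any particular site implements it. A request with no header, which is how Googlebot crawls by default, falls back to the default variant.

```python
# Hypothetical site configuration: these values are illustrative assumptions.
SUPPORTED_LANGUAGES = ("en", "de", "fr")
DEFAULT_LANGUAGE = "en"


def parse_accept_language(header):
    """Return the language tags in an Accept-Language header, best first."""
    weighted = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # a tag without an explicit q-value has quality 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # ignore tags with a malformed q-value
        weighted.append((quality, tag.strip().lower()))
    # The sort is stable, so tags with equal quality keep their header order.
    weighted.sort(key=lambda item: item[0], reverse=True)
    return [tag for quality, tag in weighted if quality > 0]


def choose_language(header, supported=SUPPORTED_LANGUAGES, default=DEFAULT_LANGUAGE):
    """Pick the content variant for a request, or the default if none matches."""
    if not header:
        # No Accept-Language header at all: serve the default variant.
        return default
    for tag in parse_accept_language(header):
        primary = tag.split("-")[0]  # "de-de" -> "de"
        if primary in supported:
            return primary
    return default


if __name__ == "__main__":
    print(choose_language(None))
    print(choose_language("de-DE,de;q=0.9,en;q=0.8"))
```

With this logic, a crawler that never sends the header only ever sees the default variant, which is why the other language versions of such a page may go unindexed.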
As these new crawling configurations are enabled automatically for pages Google detects to be locale-adaptive, you may notice changes in how the content on your site is crawled and displayed in Google search results without you altering your server settings.
These new configurations do not alter Google's recommendation to use separate URLs with rel=alternate hreflang annotations for each locale. Google still supports and recommends separate URLs, as they remain the best way for users to interact with and share your content, and also the best way to maximize indexing and improve ranking of all variants of your content.
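
As a rough illustration of the separate-URL approach, the Python sketch below generates rel=alternate hreflang link elements for a page's variants. The example.com URLs, the locale codes, and the `hreflang_links` helper are hypothetical; substitute your own URL structure.

```python
from html import escape


def hreflang_links(variants):
    """Build one <link rel="alternate"> element per locale variant.

    `variants` maps an hreflang code to the URL of that variant.
    """
    return "\n".join(
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        for code, url in variants.items()
    )


if __name__ == "__main__":
    # Hypothetical variants of a single page, one URL per locale.
    page_variants = {
        "en-us": "https://example.com/en-us/",
        "de-de": "https://example.com/de-de/",
        "fr-fr": "https://example.com/fr-fr/",
    }
    print(hreflang_links(page_variants))
```

The same set of link elements would go in the head of every variant, so that each page lists all of its alternates, itself included.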