Is Your Site Configured For the International Googlebot?

Google has improved the way it crawls, indexes and ranks sites with international content. Understanding these changes is crucial for those creating content for a global audience.

Every once in a while, a major shift happens in the SEO tectonic crust. In January, no one seemed to notice the tremor, but there was a change in the foundation of international SEO best practices when Google improved the way Googlebot looks at content that changes based on location (IP address) or a user's preferred language settings (the Accept-Language HTTP header). More and more sites are choosing to dynamically change the content presented to a visitor based on their country or language.

Google admits to having had issues finding and indexing content served this way. According to Search Console Help, Google may not crawl, index, or rank international content because Googlebot's default IP addresses are based in the U.S. Additionally, the crawler historically sent HTTP requests without setting an Accept-Language header.

The same help article explains that Google now does both: crawling from non-U.S. IP addresses and sending the Accept-Language header. For SEOs whose clients create content for non-English-speaking or international audiences, it's important to understand these changes and how to check site configurations to see how they affect international search results.

Googlebot Crawling from Non-U.S. IP Addresses

Googlebot now crawls not just from its typical U.S.-based IP addresses, but also from IP addresses in other countries. This allows Google to detect whether a page's content changes for users in a particular location and to assess whether that newly discovered version of the page may be more relevant for that country's search results.

As you can imagine, this greatly improves the search experience for international users by ensuring the version of a page that was intended for their particular country is getting presented in search results.

How Can I Check My Site’s Configuration?

If your site or your client's site dynamically serves different content based on the user's IP, you can check it by using an international proxy service. Most crawlers – such as Screaming Frog, seen below – allow for proxy configuration, which helps automate finding SEO issues at scale from the perspective of international users.

[Image: Screaming Frog crawl-from-proxy configuration]
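If you'd rather script the check than run a full crawl, the same idea can be sketched in a few lines of Python using only the standard library. The proxy address here is a placeholder – substitute the endpoint of whatever international proxy service you use.

```python
from urllib.request import Request, build_opener, ProxyHandler

def fetch(url, proxy=None):
    """Fetch a page, optionally through a proxy (e.g. an international
    exit node), and return the response body as text."""
    handlers = [ProxyHandler({"http": proxy, "https": proxy})] if proxy else []
    opener = build_opener(*handlers)
    req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with opener.open(req, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")

def is_locale_adaptive(body_direct, body_proxied):
    """True when two fetches of the same URL returned different content,
    i.e. the site adapts its pages to the requester's location."""
    return body_direct != body_proxied

# Example (placeholder URL and proxy -- replace with real values):
# direct = fetch("https://example.com/")
# abroad = fetch("https://example.com/", proxy="http://de.proxy.example:8080")
# print(is_locale_adaptive(direct, abroad))
```

Comparing full response bodies is a blunt instrument (timestamps or session tokens can differ between any two fetches), so in practice you may want to compare only stable elements such as the `<title>` or `<h1>`.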

Accept-Language Header

As more sites use the Accept-Language header to dynamically change the language of a page, Google's new locale-adaptive crawling now sends that header too. When Googlebot requests a page on your site, it can specify a preferred language – exactly as you can configure your own browser to prefer certain languages, as seen in the image below.

[Image: Chrome language preferences]
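You can imitate this behavior yourself. A minimal sketch (the URL is a placeholder): request the same page with different Accept-Language values, the way a browser – or now Googlebot – would, and see whether the response changes.

```python
from urllib.request import Request, urlopen

def fetch_with_language(url, language):
    """Fetch `url` sending the given Accept-Language header, the way a
    locale-adaptive crawler or a configured browser would."""
    req = Request(url, headers={"Accept-Language": language})
    with urlopen(req, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")

# Example (placeholder URL -- replace with a page you want to test):
# english = fetch_with_language("https://example.com/", "en-US,en;q=0.9")
# japanese = fetch_with_language("https://example.com/", "ja-JP,ja;q=0.9")
# print("locale-adaptive" if english != japanese else "static")
```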

How Can I Check My Site’s Configuration?

With Merkle's Locale-Adaptive Pages Testing Tool, you can specify the languages you'd like to check and up to 10 URLs, choose between Normal, Google, and Bing user-agents, and run the test.

[Image: Locale-adaptive Accept-Language header testing tool]

As you can see in the results for cloud.google.com, while all of the content changes dynamically based on the Accept-Language header, only the Japanese version is properly configured.

[Image: Locale-adaptive Accept-Language content comparison results]
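The core of what such a tool does can be sketched with a small, self-contained function (this is a rough illustration, not Merkle's actual implementation): given the response bodies fetched for several Accept-Language values, report which languages actually received distinct content.

```python
def distinct_versions(bodies):
    """Given a mapping of language code -> response body, report which
    non-English languages received content different from the English
    ("en") baseline, i.e. which ones are actually locale-adaptive."""
    baseline = bodies["en"]
    return {lang: body != baseline for lang, body in bodies.items() if lang != "en"}

# Example with canned bodies (real use would fetch each URL once per language):
bodies = {
    "en": "<h1>Cloud products</h1>",
    "ja": "<h1>クラウド プロダクト</h1>",  # Japanese version differs
    "de": "<h1>Cloud products</h1>",       # German falls back to English
}
print(distinct_versions(bodies))  # -> {'ja': True, 'de': False}
```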

Conclusion

While this change may not have had a huge impact on SEO as it exists today, its ramifications for the future are massive. Even while accommodating dynamically served content (locale-adaptive crawling), Google still prefers separate URLs with proper annotations for different content.

[Image: Google documentation stating it prefers separate URLs with annotations]
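For reference, the "separate URLs with annotations" approach Google describes looks like this: each language/country version lives at its own URL, and each version declares the alternates with hreflang link elements (the URLs below are placeholders).

```html
<!-- Illustrative hreflang annotations; URLs are placeholders. -->
<link rel="alternate" hreflang="en-us" href="https://example.com/en-us/" />
<link rel="alternate" hreflang="ja-jp" href="https://example.com/ja-jp/" />
<link rel="alternate" hreflang="x-default" href="https://example.com/" />
```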

It's important to understand the apparent contradiction here. Why does Google prefer separate URLs? Could it be that the more sites dynamically change their content, the harder it is for Google to understand what content exists?

The more sites cater to users by changing content dynamically, the more crawl configurations Google has to create and mimic in order to understand the content being presented. At scale, that is undoubtedly troubling for the search giant.
