Google And Bing Jointly Increase The Allowed Size Limits For Sitemap Files To Address Webmasters' Needs


The sitemaps protocol previously stated that each sitemap file should contain no more than 50,000 URLs and be no larger than 10MB.

While most sitemaps are indeed less than 10MB, there are times when search engines encounter sitemaps larger than that limit. Usually this is caused by sitemap files having a list of very long URLs that inflate their file sizes.

"While most sitemaps are under this 10 MB file limit, these days, our systems occasionally encounter sitemaps exceeding this limit. … Most often this is caused when sitemap files list very long URLs or if they have attributes listing long extra URLs (as alternate language URLs, Image URLs, etc), which inflates the size of the sitemap file," said Fabrice Canel, principal program manager at Bing.

To address this issue, Google and Bing have jointly announced an increase in the size limit for sitemap files and sitemap index files. Webmasters can still compress their sitemap files with gzip to reduce their size, but each file must be no larger than 50MB when uncompressed.

The limit of 50,000 URLs per sitemap file has not changed, but the file size limit has increased fivefold, from the previously allowed 10MB to 50MB.
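For webmasters who want to verify a file against the new limits, a check could look something like the minimal Python sketch below. It assumes the sitemap is already available as raw bytes, that 50MB means 52,428,800 bytes, and that the file uses the standard sitemaps.org namespace. The function name `check_sitemap` and the shape of its result are illustrative only and are not part of any Google or Bing tool.

```python
import gzip
import xml.etree.ElementTree as ET

# Limits described in the article; the byte value assumes 50MB = 50 * 1024 * 1024.
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024

# Standard sitemaps.org namespace, written in ElementTree's {uri}tag form.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def check_sitemap(data: bytes) -> dict:
    """Report the URL count and uncompressed size of a sitemap (hypothetical helper)."""
    # Gzip files start with the magic bytes 0x1f 0x8b; the limit applies
    # to the uncompressed size, so decompress before measuring.
    if data[:2] == b"\x1f\x8b":
        data = gzip.decompress(data)

    root = ET.fromstring(data)
    url_count = len(root.findall(f"{SITEMAP_NS}url"))
    size = len(data)

    return {
        "urls": url_count,
        "bytes": size,
        "within_limits": url_count <= MAX_URLS and size <= MAX_BYTES,
    }
```

Calling `check_sitemap(open("sitemap.xml.gz", "rb").read())` on a local file (the file name here is only an example) would report whether it stays within both limits. The sketch reads the whole file into memory, which is acceptable at these sizes but would not suit much larger inputs.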

The updated file size limit is reflected in the sitemap protocol on www.sitemaps.org.

In a separate update to how its crawler works, Google has taken another step toward dropping support for crawling the web as a feature phone.

According to the company, "most websites don't provide feature-phone-compatible content in WAP/WML any more." Because of this, Google says it has "made changes in how we crawl feature-phone content."

However, some websites still provide content for feature phones through dynamic serving based on the visitor's user-agent. Because Google will no longer crawl websites with its feature-phone user-agents, those websites need to configure their desktop and smartphone pages to include a self-referential alternate URL link for feature-phone devices.

This is a change from Google's previous guidance, which was to use only the "vary: user-agent" HTTP header. Google has updated its documentation on making feature-phone pages accordingly.
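As a rough illustration of what a self-referential alternate link might look like, the Python sketch below builds one for a given page URL. It assumes the tag takes the form of a `link` element with `rel="alternate"` and `media="handheld"`, which is my understanding of the guidance and should be checked against Google's updated documentation. The function name `handheld_alternate_link` is hypothetical.

```python
from html import escape


def handheld_alternate_link(page_url: str) -> str:
    """Build a self-referential alternate link tag for feature phones.

    Assumption: the tag uses rel="alternate" and media="handheld";
    confirm the exact attributes in Google's documentation.
    """
    # Escape the URL so characters such as & and " are safe inside the attribute.
    href = escape(page_url, quote=True)
    return f'<link rel="alternate" media="handheld" href="{href}" />'


print(handheld_alternate_link("https://example.com/page?a=1&b=2"))
```

The link is self-referential because the `href` points to the same URL as the page that carries the tag, so the output would be placed in the `head` of both the desktop and the smartphone version of that page.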

Eliminating the feature-phone crawler will not affect how Google crawls or indexes smartphone content, as the change affects only feature phones such as old Nokia handsets. And because Google will no longer crawl the web as a feature phone, crawl error reports for feature phones are no longer available.

"Without the feature-phone Googlebot, special sitemaps extensions for feature-phone, the Fetch as Google feature-phone options, and feature-phone crawl errors are no longer needed," said Google.