Complete Guide to Crawl Budget for eCommerce Sites

You’ve been adding all sorts of new and improved content to your site and submitting it for indexing — but none of it is moving the SEO needle! What gives?

When you’re a digital marketer, it’s frustrating to put so much time and energy into optimizing your site, especially when you’ve checked off every box on the content optimization list. 

But the issue may not be your content at all. Instead, it may be your crawl budget.

If search engines can’t crawl and index your pages, the hard work you put in will be all for naught, at least when it comes to SEO. Inefficient use of crawl budget could be keeping your new and improved content from earning the higher organic traffic it deserves.

In this guide, we’ll help you identify (and resolve) existing crawl budget issues for your site by explaining: 

  • What a crawl budget is and how it works
  • Why it’s so important for your eCommerce site
  • And which simple strategies you can use to better optimize it

What is Crawl Budget?

Basically, crawl budget is the number of pages on your website that a search engine bot will crawl (and potentially index) within the time period Google has determined is appropriate for your site.

Despite its importance, crawl budget isn’t one of the popular buzzwords in SEO. It doesn’t help that how much crawl budget Google allocates to each site is a complete mystery. How can you optimize for something that you can’t pin down exact measurements for?

While we don’t know exactly how much Google crawls on each site, we do know the two factors that most impact crawl budget: crawl rate and crawl demand.

Crawl rate (also known as crawl rate limit or crawl limit) refers to the maximum number of simultaneous fetches a search engine bot will make on your site (and how much time must pass between those fetches) while still providing a quality experience for users. Your site’s crawl limit can change over time, depending on how quickly and efficiently bots can crawl your URLs.

Crawl demand refers to how much a search engine wants to crawl a specific URL, based on how popular that URL is and how stale its copy in the index has become.

When you understand these factors, you can maximize your crawl budget by prioritizing the most important pages on your site — ensuring that your budget isn’t being wasted on low-quality pages or pages you don’t want to be indexed.

Why is Crawl Budget Important for eCommerce Sites?

Effective crawl budget optimization is correlated with better search engine rankings — which, in turn, means more traffic to your site. 

Search engine bots are designed to act as much like humans as possible by searching out the best content on your site. Optimizing your crawl budget will help these crawlers (and, eventually, your audience) find what they’re looking for within your site more easily.

eCommerce businesses tend to have large websites, with potentially thousands of pages containing specific product SKUs, product categories, and more. In addition, these pages are constantly being updated and re-indexed, with new URLs generated every time new products are added to the inventory.

If your team isn’t on top of your crawl budget, Googlebot might be crawling the least-important pages on your site, and your technical SEO efforts will take a hit. Instead, you want to make sure that the precious limited time the bots are spending crawling your website is being used effectively — so you can direct customers to the pages where they’re most likely to make a purchase.

How to Optimize Your Crawl Budget: 6 Tips for Online Businesses

A number of factors go into optimizing your crawl budget. But, before you start analyzing those, you need to first understand the lay of the land: how your site is currently being crawled by search engines. 

You can get an idea of how search engine bots move through your pages by using eCommerce site crawlers. With more data in hand, it’s easier to make informed judgments for your next steps.

Below, we share six ways our SEO team improves crawl budget for our clients — so that you can use them to update and improve your site, too.

1. Cut the cruft and index only your most important content.

Indexing pages is critical for your eCommerce site’s success. Without being indexed, your web pages cannot appear in organic search results, and your customers will never find you.

Does this mean that every single page on your site should be indexed, so that each and every page is added to search engine results? After all, the more indexed pages you have, the more content a spider can crawl, and the better for your site, right?

Wrong.

In this case, more is not merrier. When bots have to crawl through too many pages and index too many confusing pieces, your crawl budget will be wasted. Unimportant and outdated pages (especially those you’ve forgotten or didn’t even know existed) can distract the bots and prevent those more important pages from being indexed. 

Some pages to watch for: duplicate content, thin tag and category pages, old developer test pages, error pages, and expired or out-of-stock product pages (more on several of these below).

One of the first things our experts do when auditing a site for a new client is to assess this cruft — the index bloat, the extra pages, etc. You can use our tool, The Cruft Finder, to identify where your superfluous, messy code or index bloat might be, so you can take steps (using your robots.txt file, noindex tags, canonical URLs, etc.) to remedy it.
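Once you’ve decided which sections to block, a quick script can confirm that those low-value URLs are actually disallowed for crawling. Below is a minimal sketch using Python’s standard library; the domain and example URLs are hypothetical placeholders, and keep in mind that robots.txt controls crawling while noindex tags control indexing.

```python
# A minimal sketch: check whether known cruft URLs are blocked for Googlebot
# by your live robots.txt. Domain and URLs below are hypothetical placeholders.
from urllib.robotparser import RobotFileParser

SITE = "https://www.example-store.com"  # assumption: swap in your own domain

parser = RobotFileParser()
parser.set_url(f"{SITE}/robots.txt")
parser.read()  # fetches and parses the live robots.txt file

# Hypothetical examples of low-value URLs you may not want crawled
cruft_urls = [
    f"{SITE}/search?q=red+shoes",
    f"{SITE}/cart",
    f"{SITE}/category/shoes?sort=price&color=red&size=9",
]

for url in cruft_urls:
    if parser.can_fetch("Googlebot", url):
        print(f"Still crawlable: {url}")
    else:
        print(f"Blocked by robots.txt: {url}")
```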


Indexing only the most relevant and optimized content is a great way to allow the bots to work smarter, not harder, while crawling your site. (In one of our clients’ cases, this approach even boosted website sessions by more than 200%.)

2. Have a well-organized sitemap.

After we deal with cruft and index bloat, we head toward fixing up the sitemap. 

Just like a road map and set of directions, the more organized and clear your sitemap is, the easier it is for bots to crawl your site. You want to avoid sending them down dead-end roads or in endless looping roundabouts. 

Remember: A well-organized eCommerce sitemap not only makes it easier for bots to crawl your most important pages; it also makes it easier for your customers to navigate your website, find the products they need, and discover other products they didn’t know they needed.
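If you build or audit your XML sitemap by hand, it helps to remember how simple the format really is: a list of your most important URLs and when they last changed. Here’s a minimal sketch of generating one with Python’s standard library; the URLs and dates are hypothetical placeholders, and most eCommerce platforms can generate this file for you.

```python
# A minimal sketch: build a simple XML sitemap from a list of priority URLs,
# giving bots a clean map of the pages you most want crawled.
# URLs and lastmod dates below are hypothetical placeholders.
import xml.etree.ElementTree as ET

pages = [
    ("https://www.example-store.com/", "2024-01-15"),
    ("https://www.example-store.com/category/shoes", "2024-01-12"),
    ("https://www.example-store.com/product/trail-runner-2", "2024-01-10"),
]

urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
for loc, lastmod in pages:
    url_el = ET.SubElement(urlset, "url")
    ET.SubElement(url_el, "loc").text = loc
    ET.SubElement(url_el, "lastmod").text = lastmod

ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)
```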

3. Review your log files.

This step is one of the most important pieces of analyzing and understanding your crawl budget.

Conducting a log file analysis is an excellent way to see which pages are being crawled and where bots actually end up after crawling your internal links. We’ve used this to find error pages, developer tests, redirect chains, and dead-end or circuitous patterns on client websites. 

Your log files are not only helpful in cleaning up your sitemap, but they can also help you identify areas of your site that bots are accessing but shouldn’t be, important pages bots should be crawling often but aren’t, and even sections of your site you never knew existed. 
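If you have access to raw server logs, even a short script can give you a first pass at this analysis. The sketch below counts Googlebot hits per URL path; it assumes a standard combined-format access log named access.log, so adjust the pattern and file name to match your server.

```python
# A minimal sketch of a log file review: count Googlebot hits per URL path.
# Assumes a combined-format access log; adjust the regex to your log format.
import re
from collections import Counter

LOG_FILE = "access.log"  # assumption: your exported server log file

# ... "GET /path HTTP/1.1" status size "referrer" "user-agent"
line_re = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[^"]+" \d+ \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

hits = Counter()
with open(LOG_FILE, encoding="utf-8", errors="replace") as f:
    for line in f:
        match = line_re.search(line)
        if match and "Googlebot" in match.group("agent"):
            hits[match.group("path")] += 1

# The paths crawled most often (and the important ones missing entirely)
# show where your crawl budget is actually going.
for path, count in hits.most_common(20):
    print(f"{count:6d}  {path}")
```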

If you can’t get access to your log files, you can also use the Crawl Stats Report from Google Search Console (GSC). While this report is more of a sampling of data (rather than a robust collection of files), it’s a great starting point for determining where bots are going or what they’re missing on your site.

The information you get from your log files can also help you optimize your internal linking structure and organize both your HTML and XML sitemaps (if you have them).

4. Prune low-performing and duplicate content.

While it’s important to continually produce high-quality content to boost your SEO, you can get similar results by removing low-performing content, too.

Duplicate content is an easy starting point. This can include category or tag pages, especially if, over the years, a company has created hundreds or thousands of tags. Typically, some won’t be used anymore or are too similar to other categories to be worth keeping. We recommend consolidating these pages as much as possible. 

You may also reconsider whether having category or tag pages indexed is in your best interest. Your log files will reveal whether these pages get enough hits to justify indexing.
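If you’re working from a crawl export, a quick script can surface consolidation candidates before you dig in manually. This is a minimal sketch that assumes a hypothetical CSV export with url, title, and word_count columns; adjust the column names and thresholds to whatever your crawler produces.

```python
# A minimal sketch: flag duplicate titles and thin pages from a crawl export.
# Assumes a hypothetical crawl_export.csv with url, title, and word_count
# columns; adjust to match your own crawler's export.
import csv
from collections import defaultdict

titles = defaultdict(list)
thin_pages = []

with open("crawl_export.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        titles[row["title"].strip().lower()].append(row["url"])
        if int(row["word_count"] or 0) < 200:  # threshold is a judgment call
            thin_pages.append(row["url"])

duplicates = {t: urls for t, urls in titles.items() if len(urls) > 1}

print(f"{len(duplicates)} duplicate title groups, {len(thin_pages)} thin pages")
for title, urls in list(duplicates.items())[:10]:
    print(title, "->", ", ".join(urls))
```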

Out-of-stock product pages are another common issue we see at Inflow. When older products are still driving traffic, many eCommerce companies avoid removing those pages. But, without optimizing those pages to convert that traffic, you’re just wasting crawl budget.

If you’re committed to keeping a high-traffic, low-conversion out-of-stock product page, we encourage you to add some other form of conversion opportunity: for example, showing your audience similar products that are currently in stock or providing another call to action that leads them to other pages on your site.

Remember: You don’t want to give the bots more work than necessary when crawling and indexing your eCommerce site. Giving the search engine spiders fewer pages to crawl can lead to an overall increase in content-driven revenue, as it did for this client.

Thin, low-performing, and duplicate content not only takes extra time for bots to crawl, but it will often also affect how your SEO content ranks in the SERPs. Removing these pages from your site (or at least from the indexing process) can go a long way in improving your crawl budget.

Start auditing your site content and identifying top performers with our eCommerce Content Audit Toolkit.


5. Make efforts to improve site performance and page speed.

If the bots’ efforts to crawl your site are hindered by slow load times, this will eat up your crawl budget, too. 

Page load time impacts crawl budget, ranking, and usability. A few tweaks in this area will have big ripple effects across your entire eCommerce SEO strategy.

A good rule of thumb: Ensure that your site loads in under two seconds, even when large numbers of users are accessing your site at once.

A few ways you can improve page load times:

  • Lazy-load images, waiting to load them until they are needed for display in the browser.
  • Use CSS sprites. These are a collection of images within one image, which reduces how many requests are made on the server and streamlines load times.
  • Use the HTTP/2 protocol (rather than HTTP/1.1), which allows for faster site speed and performance.
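To spot-check the two-second rule of thumb above, you can time the raw HTML response for a few key templates. This is a rough sketch that only measures server response and HTML download (not full page rendering), so treat it as an early warning rather than a complete picture; the URLs are hypothetical placeholders.

```python
# A rough spot check against the "under two seconds" rule of thumb: time the
# HTML response for a few key page templates. This does not measure full page
# rendering. URLs below are hypothetical placeholders.
import time
from urllib.request import urlopen

urls = [
    "https://www.example-store.com/",
    "https://www.example-store.com/category/shoes",
    "https://www.example-store.com/product/trail-runner-2",
]

for url in urls:
    start = time.perf_counter()
    with urlopen(url, timeout=10) as response:
        response.read()  # download the HTML body
    elapsed = time.perf_counter() - start
    flag = "OK" if elapsed < 2.0 else "SLOW"
    print(f"{flag:4s} {elapsed:.2f}s  {url}")
```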

6. Focus on creating shareable, high-quality content, and update it often.

The more external links pointing to your site, the greater its crawl demand, and the more efficiently your crawl budget is going to be used. Make sure that your eCommerce content strategy includes publishing new content, pruning ineffective content, and updating aging but popular pieces.

If you’re creating helpful, informative content that others want to read and share, you’re going to naturally receive those coveted backlinks. Customers (and other websites) are going to share your blog posts, your products, and your other web pages.

Updating product descriptions, adding new blog posts, and keeping evergreen content current are some of the keystones of a solid content strategy that’ll help maximize your crawl budget and provide value to your audience.

Maximize Your Site’s Crawl Budget Now

Your website’s crawl budget is integral to your SEO campaigns’ success. 

By taking time to review current crawl paths and using that information to organize, prune, update, and speed up your site, you’re going to maximize whatever crawl budget Google allocates to you. In turn, your optimized crawl budget will help improve your search rankings and user experience.

Have more questions about optimizing your crawl budget, or want an expert to take a look for you? Our team of eCommerce SEO experts is always here to help. 

Request a free proposal anytime, or check out any of our other free crawl budget resources: