Spiders, crawlers, servers, hits, caching and whatnot can be quite daunting for anyone doing SEO from the content marketing perspective. Still, you really can't skip technical SEO, at least not the items with a big impact on your rankings. One of these is what is referred to as the crawl budget, which I will explain in layman's terms for a quick understanding in (hopefully) no more than 15 minutes.
So What is Crawl Budget?
Think of it this way. Google's crawler, Googlebot, is a bot that visits and scans the content of a website, going from page to page and learning how the pages are interlinked. The information the crawler captures during its visit is collected and stored alongside the data of many other websites, much like a library's catalogue. Then, when a user searches for a certain topic (search term), results drawn from that collected data are shown so the user can pick what they would like to read more about.
The internet is very big, with millions of websites to scan in multiple languages, so the crawler has to be smart about how it scans the web. Instead of scanning everything on every single page, the crawler prioritizes what to scan and collects that information back into the catalogue.
In other words, if your crawl budget is low, your chances of showing up in the search results are also low, due to low indexing and caching rates. If your crawl budget is high, you have a better chance of showing up, since there is much more information about your website in the catalogue to serve to searchers.
Can You Check How Googlebot is Crawling Your Site?
Absolutely. Simply register your site with a Google Search Console (formerly Google Webmasters) account if you haven't already done so. Then open the 'Crawl Stats' report from the side menu. In this report, you can see Googlebot insights such as 'Pages crawled per day' and 'Time spent downloading a page'.
From this report, you can gauge how healthy your crawl budget is. For example, if you have a 1,000-page website but Googlebot is crawling only 10 pages per day, you can assume that either you have a very low crawl budget or something is very wrong on the technical side of your website, and you may want to have a system admin or developer take a closer look.
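To see why a rate like that is alarming, a rough back-of-envelope calculation helps. This sketch assumes Googlebot spreads its crawls evenly across all pages, which it does not do in practice (popular pages get re-crawled far more often), but it gives a feel for the scale of the problem:

```python
# Back-of-envelope: how long would a full crawl of the site take at the
# observed rate? Numbers taken from the example above.
total_pages = 1_000
pages_crawled_per_day = 10  # from the Crawl Stats report

days_for_full_crawl = total_pages / pages_crawled_per_day
print(f"At this rate, a full crawl takes roughly {days_for_full_crawl:.0f} days")
```

Three-plus months before every page has been seen once means new or updated content can sit unindexed for a very long time.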
Once you've ruled out any problems with the website's set-up and configuration, you may want to start looking into ways to increase your crawl budget. How exactly can you do that? Let's first quickly go through the factors Google uses to determine crawl budget. In the end, it's the top two that you want to keep in mind.
Top factor #1: How popular your website is on the internet. In other words, your link metrics.
Top factor #2: The quality of your page content. Is it high-quality enough to warrant frequent crawling?
Other factors: There are other factors such as duplicate content issues, faceted navigation, and soft error pages. See the full list at the Google Webmaster Central Blog.
How To Increase Your Crawl Budget
#1 Check your server logs.
You're looking for how many 404 errors show up. These pages are eating into your crawl budget, and they shouldn't be. Error pages don't convert, and having search engines crawl them is a waste of effort that sends a negative signal about your website.
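If your server writes the standard Apache/Nginx access log format, a short script can tally the status codes for you. A minimal sketch, assuming that log format; the sample lines are made up for illustration:

```python
import re
from collections import Counter

# The status code in common/combined log format follows the quoted request,
# e.g.: 66.249.66.1 - - [date] "GET /page HTTP/1.1" 200 5123
STATUS_RE = re.compile(r'" (\d{3}) ')

def count_status_codes(lines):
    """Tally HTTP status codes across access-log lines."""
    counter = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counter[match.group(1)] += 1
    return counter

# Fabricated sample lines; in practice, read them from your access log file.
sample = [
    '66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET /blog/post-1 HTTP/1.1" 200 5123',
    '66.249.66.1 - - [10/Oct/2023:13:55:40 +0000] "GET /old-page HTTP/1.1" 404 312',
    '66.249.66.1 - - [10/Oct/2023:13:55:44 +0000] "GET /moved HTTP/1.1" 301 0',
]
print(count_status_codes(sample))  # one of each status in this sample
```

A high share of 404s in the result is your cue that crawl budget is leaking into dead pages. You could also filter the lines by Googlebot's user agent first to look at the crawler's traffic specifically.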
#2 Check your webpage status.
Besides 404 errors, any status code other than 200 or 301 is a warning sign. You can use the Screaming Frog SEO Spider tool to help you identify these pages if you don't have direct access to your server logs.
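For a quick spot check on a handful of URLs, you don't necessarily need a full crawler. Here is a rough sketch using Python's standard library; the URL list, user agent string, and the 200/301 rule of thumb from above are all placeholders to adapt:

```python
from urllib.request import Request, urlopen
from urllib.error import HTTPError, URLError

OK_CODES = {200, 301}  # per the rule of thumb above; anything else deserves a look

def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if unreachable.

    Note: urlopen follows redirects automatically, so a 301 will be
    reported as the final destination's status unless you install a
    non-redirecting opener.
    """
    req = Request(url, method="HEAD", headers={"User-Agent": "status-checker"})
    try:
        with urlopen(req, timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:   # 4xx/5xx responses raise HTTPError
        return err.code
    except URLError:           # DNS failure, refused connection, etc.
        return None

def needs_attention(status):
    return status not in OK_CODES

# Example run (requires network access; URLs are placeholders):
# for url in ["https://example.com/", "https://example.com/missing"]:
#     status = fetch_status(url)
#     print(url, status, "FLAG" if needs_attention(status) else "ok")
```

For anything beyond a few dozen URLs, a dedicated crawler like Screaming Frog remains the more practical choice.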
#3 Build Your Link Juice!
According to Google, an important factor is how popular your website is on the internet. This is measured, more or less, by the websites pointing back to you (backlinks). The more backlinks your website has, the more link juice it has to propagate to your important pages, which in turn signals to crawlers that these pages are worth crawling. Link building can be done in many ways, whether it's publishing a business update, hosting social media activities, or much more. It's a whole other topic to explore; if you'd like to know more, feel free to contact our enablers here.
#4 Stop Link Juice Leaking to Low Value Pages
What the heck does this mean? Well, besides putting in the effort to build more link juice, you can also stop the valuable link juice you have earned from flowing into low-value pages that do little for your business. Use the 'nofollow' attribute to stop crawlers from following a link, and block these low-value pages in your robots.txt file.
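In practice those two techniques look like this. The paths below (a login page, tag archives, internal search) are just common examples of low-value pages; substitute your own:

```
# robots.txt at the root of your site: block crawling of low-value sections
User-agent: *
Disallow: /search
Disallow: /tag/
```

```html
<!-- rel="nofollow" tells crawlers not to pass link equity through this link -->
<a href="/login" rel="nofollow">Log in</a>
```

Note the difference: robots.txt stops pages from being crawled at all, while nofollow only affects how an individual link is treated, so the two are complementary rather than interchangeable.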