Fixing JS Crawling Issues: A Practical Guide

Author
Stephan

Introduction

When using Next.js or any other JavaScript framework, search engine crawlers can struggle to crawl and accurately parse your website. In this guide, we first discuss ways to diagnose possible issues with how crawlers read your website (using Chrome, the Google Mobile-Friendly Test and, optionally, ScreamingFrog), and then look at how to fix the most common problems.

Why does it matter if search engine crawlers can process your website?

JavaScript-heavy websites are much harder for search engine bots to crawl and analyse. If the search engine bot cannot successfully read your web page, the page will not be indexed and you will not receive organic traffic (as shown in Google Search Console).

Search engines use crawlers to crawl the web, understand web pages and rank the right pages for the right search queries. Search engine crawlers often do not execute JavaScript code (or at least not in the first pass of their crawl). The web is huge, and analysing every website exhaustively is impossible; executing JavaScript is an additional burden for the crawler. To be on the safe side, assume that the search engine crawler does not execute your JavaScript code.

On JavaScript sites, it is not uncommon to see that the metadata (e.g. the robots meta tag) or, worse, the main text of the web page depends on the execution of JavaScript.

Example of a bad meta title:

<title>{"Buy the newest games, Updated: " + new Date().getFullYear()}</title>
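
A safer pattern is to produce the title on the server or at build time, so it arrives as plain HTML. Here is a minimal sketch, assuming a Next.js pages-router page (the component name is illustrative):

import Head from "next/head";

// Sketch: the dynamic part of the title is computed server-side at build time
// (getStaticProps), so crawlers receive the finished title as plain HTML and
// never need to execute client-side JavaScript for it.
export async function getStaticProps() {
  return { props: { year: new Date().getFullYear() } };
}

export default function GamesPage({ year }) {
  return (
    <Head>
      <title>{`Buy the newest games, Updated: ${year}`}</title>
    </Head>
  );
}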

The following is referred to as a JavaScript link:

<a href={"yourwebsite.com/page=" + (getCurrentPage() + 1)}>Next page</a>

The internal linking depends on the execution of JavaScript. The search engine crawler likely cannot follow this link and move around your website.
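
A crawl-safe alternative is a plain anchor whose href is already present in the HTML. In Next.js, the Link component renders such an anchor on the server; a minimal sketch (the component name and route are illustrative):

import Link from "next/link";

// Sketch: <Link> renders a normal <a href="..."> tag in the server HTML,
// so the crawler can follow the link without running any JavaScript.
export function NextPageLink({ currentPage }) {
  return <Link href={`/page/${currentPage + 1}`}>Next page</Link>;
}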

A quick fix for these JavaScript issues is to use a pre-rendering service like prerender.io. A JavaScript website will also exhaust the crawl budget of a search engine crawler faster; pre-rendering pages with prerender.io helps reduce the crawl budget spent on every page. We have an exhaustive list of common JS errors and an overview of JavaScript frameworks that are great for SEO.
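
As a rough illustration, prerender.io is typically wired in as server middleware that detects crawler user agents and serves them a pre-rendered snapshot. A minimal sketch, assuming an Express server and the prerender-node middleware (the token is a placeholder):

const express = require("express");
const prerender = require("prerender-node");

const app = express();

// Serve pre-rendered HTML snapshots to known crawler user agents;
// regular visitors still get the normal JavaScript-driven page.
app.use(prerender.set("prerenderToken", "YOUR_PRERENDER_TOKEN"));

app.listen(3000);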

Diagnose JavaScript issues with the JavaScript checker

If you want to check the difference between the raw HTML and the rendered version of a website, you can check out our JavaScript checking tool. We scrape both versions of a website and let you know if we find any differences between them.

Differences in the meta information (title, description, robots and canonicals) can pose a serious problem. Specifically, a search engine crawler will likely not be able to read the meta information if it requires JavaScript rendering. For meta information and (internal) links, it is best to avoid relying on JavaScript altogether.
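
If you want a quick manual check of the raw (unrendered) HTML, you can fetch it yourself and look for the critical tags. A minimal sketch for Node 18+ (the URL is a placeholder):

// check-raw-html.mjs: fetch the unrendered HTML and report whether the
// critical SEO tags are already present before any JavaScript runs.
const url = process.argv[2] || "https://yourwebsite.com/";

const html = await (await fetch(url)).text();

for (const marker of ["<title", 'name="description"', 'name="robots"', 'rel="canonical"']) {
  console.log(marker, html.includes(marker) ? "found in raw HTML" : "missing from raw HTML");
}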

Diagnose JavaScript crawling errors using Chrome

If you want to get an idea of what Googlebot sees when it visits your website, you can disable JavaScript in your browser (in Chrome, open DevTools, press Cmd/Ctrl+Shift+P, run "Disable JavaScript" and reload the page). You might be surprised to see that your page is now largely empty. An empty page is obviously extremely bad: it will likely never be indexed because it is missing content, and it is not showing any headings like H1, H2, etc. Even if only some parts of your website still show, this is bad for you, because you will very likely miss out on a big chunk of organic keywords you would otherwise rank for.

If you have identified potentially problematic areas of your website, you should confirm your findings using https://search.google.com/test/mobile-friendly/. The Google Mobile-Friendly Test gives you a more realistic indication of what Googlebot sees when it crawls your website.

In Chrome, you should also check your (internal) links (href) and all canonicals (rel="canonical"). To give some context: canonicals tell Google which version of a webpage should be indexed when multiple similar pages exist. For example, if you have five identical versions of the very same product page, you should likely set the canonical of four of them to the target page.
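
Ideally the canonical is part of the server-rendered HTML as well. A minimal sketch using next/head (the page name and URL are placeholders):

import Head from "next/head";

// Sketch: the canonical link is emitted as part of the server-rendered HTML,
// so the crawler sees it without executing JavaScript.
export default function ProductPage() {
  return (
    <Head>
      <link rel="canonical" href="https://yourwebsite.com/product" />
    </Head>
  );
}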

You can again confirm your findings using the Mobile-Friendly Test. If the URL shows up in the rendered HTML, you are likely fine.

Diagnose JavaScript crawling errors with ScreamingFrog

Some JavaScript crawling issues can be picked up with ScreamingFrog. The software crawls your website and reports critical issues. It is a fantastic tool with many more functions that are out of the scope of this post. For a quick first check, look at "Indexability" in the left window and "Issues" on the right.

ScreamingFrog also allows you to look at your whole website as a graph. If you click on "Visualisations" and "Crawl Tree Graph", you can see how the crawler moved through your site (and where it fell off) and what errors it encountered. Typical errors are 404s, 500s, or 301/302 redirects. Not all of these are caused by JavaScript rendering issues, but they are definitely worth checking out.

How to fix missing page content

You might find that the page content is completely absent after you have disabled JavaScript. The most common reason is that the rendering of the text depends on JavaScript execution. This applies not only to the body text but to all elements on your website. Please read more about what to include on your website here (Link to speeding up next js). Below is an example of a page that Googlebot likely cannot render:
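
As an illustration (the component and API endpoint are made up for this example), a page like the following only receives its main text after client-side JavaScript has run, so the raw HTML a crawler downloads contains an empty article:

import { useEffect, useState } from "react";

// Sketch of the problem: the body text is fetched and rendered in the browser
// (useEffect), so the HTML delivered to the crawler contains an empty <article>.
export default function ArticlePage() {
  const [body, setBody] = useState("");

  useEffect(() => {
    fetch("/api/article") // hypothetical endpoint, for illustration only
      .then((res) => res.json())
      .then((data) => setBody(data.text));
  }, []);

  return <article>{body}</article>;
}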

Is this you?

💸 You have been spending thousands of dollars on buying backlinks in the last months. Your rankings are only growing slowly.

❌ You have been writing more and more blog posts, but traffic is not really growing.

😱 You are stuck. Something is wrong with your website, but you don't know what.



Let the SEO Copilot give you the clicks you deserve.