Beginner’s guide to performing a Technical SEO Audit (Part 1 of 2)
Want to perform a Technical SEO Audit but don’t know where to start?
Step this way… we’ll tell you about some great tools and let you know the key areas to focus on, whether you’re performing audits for yourself, investigating issues for clients or just simply want to find out a little bit more!
Part one of this blog series is going to focus on gathering and analysing your data, while in part two we’ll look at the key on-site factors that you need to take into consideration within an audit.
What is a Technical SEO Audit?
Simply put, a Technical SEO audit is a process which brings to light any problems that might be causing issues for search engines when they crawl your website, and it will also highlight any possible improvement opportunities. This involves looking at the title & meta description tags, sitemap.xml & robots.txt files, URL structure, page speed, page status codes and duplicate content to name but a few key areas.
Helpful Data Capturing Tools
The first and most important step, before doing anything else, is to gather your data for analysis; for this you need to crawl the entire website.
There are a few tools out there to do this, and you can even write your own, but the easiest and best one by far is the wonderful Screaming Frog SEO Spider. What’s even better is that the basic version of the tool is free to use and will crawl up to 500 URLs, which is great if you just want to get to grips with what the tool can do, or if your site is small and you don’t need a deep analysis. Once you get the hang of the tool, or if you are auditing larger sites, you can choose to pay the yearly subscription and get the full version. The premium version has many more brilliant options and features, but for now we’ll just focus on the free version.
Once the tool is downloaded, it’s just a case of putting in the website address you want to look at, and then clicking start and letting the crawler do its thing, which might take a while depending on how large your website is. When the crawl is complete you’ll have most of the important data to hand to help you within your audit process.
Make sure that you have access to your site in Google Search Console. This is invaluable when conducting an audit because, as well as notifying you about any penalties or errors, it gives you access to a variety of tools and checkers which will greatly speed up the audit process.
1. Data Analysis
Once you have your data, it’s time to check how accessible your website is, to be sure search engines and users alike can successfully crawl and access all your important website content.
Did the crawler manage to access all your pages and content successfully? If there is a strangely low number of pages, or even none at all, it could mean that your pages are being blocked from search engines via a robots.txt file. This file is stored in the root of a website and tells search engines which pages or directories they can and can’t crawl. A common problem is a robots.txt file that is set to disallow the entire website. This is normal during the development process, so that the development site isn’t indexed, but sometimes the file doesn’t get updated when the website goes live, and so Google can’t start crawling or indexing your website.
The robots.txt file is designed to tell search engines about any sections or pages that you would rather they didn’t spend time crawling because they are not relevant or are low quality, such as search pages or admin and login sections.
Below is an example of a robots.txt file stopping search engines from crawling a website:
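As a minimal sketch of what such a file looks like, the snippet below holds a blanket-disallow robots.txt in a string and checks it with Python’s standard-library urllib.robotparser. The example.com URLs are placeholders, and Python’s parser is only an approximation of how Googlebot reads the file.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that blocks every crawler from the whole site
# (typical for a development site).
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(BLOCK_ALL.splitlines())

# Placeholder URLs: with "Disallow: /" every path is blocked.
for url in ("https://example.com/", "https://example.com/blog/post"):
    status = "allowed" if parser.can_fetch("Googlebot", url) else "blocked"
    print(url, status)
```

If a live site still serves a file like this, the fix is to remove or narrow the Disallow rule.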
If you want to disallow a folder but still want search engines to crawl a specific page in that folder, you can combine the Disallow with an Allow directive.
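A rough illustration of combining the two directives follows. The folder and page names are hypothetical. Note one assumption about ordering: Python’s parser applies the first rule that matches, whereas Google applies the most specific matching rule regardless of order, so the Allow line is listed first here to get the same answer from both.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical folder and page names, for illustration only.
RULES = """\
User-agent: *
Allow: /private/public-page.html
Disallow: /private/
"""

_parser = RobotFileParser()
_parser.parse(RULES.splitlines())


def is_allowed(path, agent="*"):
    """Return True if the rules above let `agent` crawl `path`."""
    return _parser.can_fetch(agent, "https://example.com" + path)


for path in ("/private/public-page.html", "/private/other.html", "/about"):
    print(path, is_allowed(path))
```

Only the single page inside the disallowed folder stays crawlable; everything else under /private/ is blocked.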
Robots.txt Tester (Search Console)
If you wish to check whether certain URLs are accessible according to your robots.txt file, you can use the robots.txt Tester within Search Console (Crawl > robots.txt Tester). This tool will also flag any errors in the file itself.
Fetch as Google (Search Console)
Want to check if your robots.txt file is affecting how your page is displayed? Just go to Crawl > Fetch as Google and enter your URL, making sure you select Fetch and Render. Wait for it to complete and then click on the link, which will show how Googlebot and the user each see your page. If they are noticeably different, it’s most likely due to certain CSS files or folders being blocked, which will be detailed in the list below the render.
It is important to make sure that the views for Googlebot and the user look as similar as possible.
A sitemap.xml contains all the URLs for Google to follow (a bit like a road map) and should show the structure of your site. It can enable search engines to find pages that might otherwise be quite hard to find or not reachable through links. Like the robots.txt file, this is also kept in the root and is normally available at /sitemap.xml.
This file needs to be in a valid XML format and should contain all your important pages. It should be kept as up to date as possible and should always be submitted to Google within Search Console (Crawl > Sitemap) so that indexing can start.
It’s also the place where you can find how many URLs are indexed and whether there are any errors. If the number of URLs submitted and the number indexed do not match, it can suggest that there are problems, as Google hasn’t found all the URLs useful to index. This can be due to issues like non-canonical URLs being used, or URLs that 301 redirect or return a 404.
A sitemap.xml should only contain canonical URLs, not duplicate URLs or ones that 301 redirect or return a 404.
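To check that a sitemap is valid XML and see exactly which URLs it lists, something along these lines can help. It is a sketch using only Python’s standard library; the sample sitemap and its example.com URLs are made up, while the namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Made-up sample; in practice read the bytes of your own sitemap.xml.
SAMPLE_SITEMAP = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/services/</loc></url>
</urlset>"""


def list_sitemap_urls(xml_bytes):
    """Return every <loc> value in a sitemap.

    Raises xml.etree.ElementTree.ParseError if the XML is not well formed.
    """
    root = ET.fromstring(xml_bytes)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)]


print(list_sitemap_urls(SAMPLE_SITEMAP))
```

The resulting list can then be compared against your crawl to spot redirecting, missing or duplicate URLs.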
Websites that have generated sitemaps should also be carefully checked. While plugins for WordPress websites like Yoast are fantastic for automatically generating sitemap files, they will often include all files and sections, including ones that may not be linked within the website or are not useful (like banner URLs or thank you pages). Thankfully, tools like Yoast provide an array of options so that you can remove sections and pages from the sitemap, but they need to be removed manually via the admin, so checking what has been added to the sitemap first is vital.
Generating an XML file (XML-Sitemaps or Screaming Frog)
Screaming Frog can also create a static sitemap for you from the URLs it has crawled by going to Sitemaps > Create XML Sitemap.
A free tool which can generate a static sitemap for you can be found at XML-Sitemaps.com. Just enter your website address and click start. There is a maximum of 500 URLs, but this should be fine for a small to medium website. Bear in mind that the sitemap should be kept up to date: if the website URLs are changed, a new sitemap file must be generated and submitted each time.
Checking Sitemap URLs (Screaming Frog)
A great feature in Screaming Frog is the ability to check that all the links within a sitemap are valid. Simply go to Mode > List and click on Upload List > From a file and select your .xml file, and it’ll crawl each of the available links and let you know of any errors with the file. This is very useful if you have large sitemap files to check.
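If you would rather do the same kind of check on exported data, the idea can be sketched in a few lines of Python. The function name, the URLs and the status codes below are all hypothetical; the status codes would come from whatever crawl you have run.

```python
def flag_sitemap_problems(urls, status_by_url):
    """Return (url, status) pairs for sitemap entries that did not return 200.

    `status_by_url` maps URL -> status code from a crawl. A URL that the
    crawl never reached is reported with a status of None.
    """
    problems = []
    for url in urls:
        status = status_by_url.get(url)
        if status != 200:
            problems.append((url, status))
    return problems


# Hypothetical data for illustration.
sitemap = ["https://example.com/", "https://example.com/old", "https://example.com/gone"]
crawl = {"https://example.com/": 200, "https://example.com/old": 301}

print(flag_sitemap_problems(sitemap, crawl))
```

Anything this returns is a candidate for removal from the sitemap or for fixing on the site.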
1.1.3 Response Codes
When you carried out your website crawl with Screaming Frog you will have seen a column called Status code containing different numbers, such as 200, 404, 301 or 302. A status code is what a web server sends back when a page is requested, and I’ll run through the most common ones.
200 (OK) – The most common status code you will see. It says the request has succeeded and the page has been found.
301 (Moved Permanently) – You’ll see this next to a URL in Screaming Frog when it’s been moved or deleted, and a redirect has been added pointing to its new destination. This redirect tells Google to pass the link juice from the old URL to the new one, and eventually replace the old URL with the new one in the index.
302 (Found) – This is also known as a temporary redirect, which essentially says this page has been temporarily moved but could come back at a later date. In the past 302 redirects did not pass on link quality, but Google has since confirmed that they now do. However, tests have since shown that this is not necessarily the case, so it’s better to always ensure you are using 301 redirects and not any 302s throughout the site.
404 (Not Found) – The page has not been found. It’s important to keep track of any 404s you find in your crawl, and investigate where they are coming from. Sometimes the 404 can simply be caused by an incorrectly spelled URL on a link or nav item, or by a link somewhere on the site that points to a deleted page.
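For a quick overview of a crawl, the codes above can be grouped and labelled with their standard reason phrases. This is a small sketch using Python’s built-in http.HTTPStatus; the crawl rows are invented for the example.

```python
from collections import defaultdict
from http import HTTPStatus


def label(code):
    """Return e.g. '301 Moved Permanently' using the standard reason phrase.

    Raises ValueError for codes that HTTPStatus does not know about.
    """
    return f"{code} {HTTPStatus(code).phrase}"


def group_by_status(crawl_rows):
    """Group (url, status_code) pairs into {status_code: [urls]}."""
    grouped = defaultdict(list)
    for url, code in crawl_rows:
        grouped[code].append(url)
    return dict(grouped)


# Invented crawl rows for illustration.
rows = [("/", 200), ("/old-page", 301), ("/sale", 302), ("/missing", 404), ("/about", 200)]

for code, urls in sorted(group_by_status(rows).items()):
    print(label(code), urls)
```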
What should I do?
You should investigate your 404s and existing redirects to decide the best action or improvement.
For visitor experience and SEO purposes you should make sure that any 404s within your website are fixed and that a 301 redirect is created where a page has moved or been deleted. This passes on the link quality, notifies search engines and shows visitors the new and correct page.
Updating internal 301 redirects is also important, as having redirects within your website gives Google more URLs to crawl, meaning that more important pages may not be found. The 301 redirect might also be part of a redirect chain, where one redirect leads to another and so on, all of which uses up crawl budget and makes it harder to crawl your website (Google also does not go beyond a certain number of redirect hops). The best course of action is to make sure that any internal links pointing at 301 redirects are updated to the correct URL.
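To make the idea of a redirect chain concrete, here is a small sketch that follows a URL through a table of redirects and reports where it ends up and how many hops it took. The redirect table is hypothetical, and the max_hops default of 10 is an arbitrary safety limit chosen for this example, not Google’s actual limit.

```python
def resolve_redirect(url, redirects, max_hops=10):
    """Follow `url` through the `redirects` mapping (source -> destination).

    Returns (final_url, hops). Raises ValueError on a redirect loop or
    when the chain is longer than `max_hops`.
    """
    seen = {url}
    hops = 0
    while url in redirects:
        url = redirects[url]
        hops += 1
        if url in seen:
            raise ValueError(f"redirect loop at {url}")
        if hops > max_hops:
            raise ValueError("too many redirect hops")
        seen.add(url)
    return url, hops


# Hypothetical redirect table: /a -> /b -> /c is a two-hop chain.
redirects = {"/a": "/b", "/b": "/c"}

print(resolve_redirect("/a", redirects))
```

Any internal link whose target takes one or more hops should be updated to point straight at the final URL.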
Important Note on Site Migrations
If a new version of your website goes live, always make sure 301 redirects are in place for all old URLs so that existing link quality is passed across, the URL in Google is updated and users can find the new URL. Site migrations without 301 redirects in place can have disastrous consequences, with rankings disappearing and URLs dropping out of the index altogether.
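One simple way to sanity-check a migration plan is to list the old URLs that would be left stranded. The sketch below assumes you have three things to hand: the old site’s URLs, your planned redirect map and the new site’s URLs. All names and paths are hypothetical.

```python
def missing_redirects(old_urls, redirect_map, new_urls):
    """Return old URLs that neither exist on the new site nor have a redirect."""
    still_live = set(new_urls)
    return [
        url for url in old_urls
        if url not in redirect_map and url not in still_live
    ]


# Hypothetical migration data.
old_site = ["/", "/services.html", "/contact.html", "/team.html"]
planned = {"/services.html": "/services/", "/contact.html": "/contact/"}
new_site = ["/", "/services/", "/contact/"]

print(missing_redirects(old_site, planned, new_site))
```

Each URL in the result needs either a 301 redirect or a deliberate decision to let it return a 404.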
Crawl Errors (Search Console)
Within the Crawl Errors section you’ll find all the 404s for your website, including any for mobile and other devices. This is a good place to check for any old website URLs that might not have redirects in place, as Screaming Frog just looks at links within your site and not at what may be linked from other sites or indexed in Google. Just be aware that it isn’t a live view of the website!
Get a list of 301s and 404s (Screaming Frog)
In Screaming Frog you can export lists of internal redirects and 404 errors (as well as any redirect chains) by going to Bulk Export > Response Codes and selecting the preferred option. This will show you where the issue is coming from, along with any relevant anchor text or alt text attached to it, making it even easier to find.
1.1.4 Site Structure
A good site architecture is beneficial for both users and search engines, as it provides a clear structure that makes your website easier for search engines to crawl and ensures that users can access all the website content. Make sure that all the important pages are accessible from the main nav and are not buried many clicks away. Within the site’s hierarchy, all the important pages should be nearer to the root of the site, and any pages that are related to each other should be logically grouped.
Tree View (Screaming Frog)
There is a ‘View’ dropdown on the right of the crawl where you can switch to seeing your website in tree structure, which helps to better visualise your current site structure. Are your important pages nearer to the root of the website?
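A rough stand-in for the tree view is to count how many folders deep each URL sits. The sketch below does this from the URL path alone, which is an assumption worth stating: path depth is only a proxy, and the number of clicks needed to reach a page can be quite different. The URLs and the max_depth threshold are made up.

```python
from urllib.parse import urlparse


def path_depth(url):
    """Count the non-empty path segments in a URL (the homepage is 0)."""
    return len([segment for segment in urlparse(url).path.split("/") if segment])


def deep_pages(urls, max_depth=3):
    """Return the URLs that sit more than `max_depth` folders from the root."""
    return [url for url in urls if path_depth(url) > max_depth]


# Made-up URLs for illustration.
urls = [
    "https://example.com/",
    "https://example.com/services/seo/",
    "https://example.com/blog/2015/04/technical-seo-audit",
]

for url in urls:
    print(path_depth(url), url)
```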
1.1.5 Page Speed
Page speed is how long it takes for your browser to fully display the content on a page. Google uses page speed as a ranking factor so it’s important that your website is as fast as possible. It also benefits the user experience as the bounce rate of visitors is likely to be higher on slower loading pages.
Page Speed Analyser (PageSpeed Insights)
Google Developers PageSpeed Insights is a great tool for seeing some basic issues that your website may be having, along with suggestions for improvement.
The speed performance tool from GTMetrix is another fantastic free tool which goes into more detail about the page speed issues on your website. It will highlight the exact resources that are causing the problems and rate each issue.
1.1.6 Mobile Site
On April 21st 2015, Google began rolling out an algorithm update which uses mobile-friendliness as a ranking signal within the mobile search results.
This means that any websites which are not optimised for mobile can see some significant ranking and visibility changes. Ensuring your website is mobile friendly is now a very high priority, especially if a lot of your website’s traffic is coming from these devices. The Devices report in Google Analytics is a helpful guide for seeing the number of mobile visitors coming to your website.
Check out our mobile website optimisation tips for more information on getting your website mobile ready. Making sure your website is ready for mobile-first indexing is also key: provide Google with the same content, schema markup and optimisation on mobile as on desktop.
Mobile Usability (Search Console)
For a rough check of whether your website or certain pages are not optimised for mobile, go to Search Traffic > Mobile Usability, which will list the main pages with issues.
Mobile Friendly Testing Tool (Google)
The Mobile Friendly Testing Tool has been created by Google to give an indication of the mobile friendliness of your website.
Want to see how many of your pages are indexed within Google? One of the easiest ways of doing this is through a Google search.
You can use the “site:” command within the search bar, which will bring back the rough number and list of pages indexed.
As mentioned earlier, it’s very useful for comparing the number of pages that are indexed versus the numbers crawled by Screaming Frog and within your Sitemap. Ideally they should be roughly the same, meaning that all the pages are indexed correctly.
If the index count is quite a lot smaller, it means there might be indexing problems or that some important pages are not accessible to search engines, for example because they are disallowed within the robots.txt file. If the index count is quite a lot larger, this can point to duplicate URLs being available for the same page. Google treats this as duplicate content, so it should be avoided. We will go over how issues like this can be avoided in part two.
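The comparison described above boils down to a little arithmetic, sketched here. The 10% tolerance is an assumption picked for the example rather than any official threshold, and the counts are invented; the "site:" figure is itself only a rough estimate.

```python
def compare_index_count(indexed, crawled, tolerance=0.10):
    """Compare the indexed page count with the crawled page count.

    Returns 'under-indexed', 'over-indexed' or 'roughly equal', treating
    anything within `tolerance` (a fraction of `crawled`) as equal.
    """
    if crawled <= 0:
        raise ValueError("crawled must be a positive number of pages")
    ratio = indexed / crawled
    if ratio < 1 - tolerance:
        return "under-indexed"  # possible blocked or inaccessible pages
    if ratio > 1 + tolerance:
        return "over-indexed"   # possible duplicate URLs
    return "roughly equal"


# Invented counts for illustration.
print(compare_index_count(indexed=95, crawled=100))
print(compare_index_count(indexed=40, crawled=100))
print(compare_index_count(indexed=300, crawled=100))
```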
So there we go: a beginner’s guide to starting a Technical SEO audit, beginning with basic data capture and analysis. We hope this has been helpful. This is just one small part of the many things you can look at and tools you can use if you want to fully audit your site, and it is entirely dependent on your website and its requirements. Look out for part two, when we will move on to the on-site ranking factors and duplicate content issues that every website audit should be watching out for.
If you would like to know more, or to find out how we can help you, get in touch with us today!
Photo by Giorgio Montersino, available under a Creative Commons Attribution-ShareAlike 2.0 Generic license.