Managing the technical SEO aspect of a website is massively important. If not properly managed or considered, then your site may well struggle to perform as expected. Having these foundations in place is key. Even if you have the greatest content around, it won’t perform if it can’t be found.
There are quite a few areas that must be attended to when you take technical SEO into consideration. This means that there are a lot of things that can go wrong. It’s not too uncommon to see a relatively simple issue cause massive problems with a site, as technical SEO simply wasn’t considered at the time.
Seeing as I spend a fair bit of time running through sites, whether it’s part of a full technical audit or just me being nosey, I’ve seen my fair share of technical SEO issues. Here’s a quick look at some of the more common ones, the problems they can create for your site, and what can be done to sort them out – or stop them from becoming an issue in the first place.
Poorly Implemented Site Migrations
Though not something you’re likely to spot from a cursory glance, botched migrations are a rather common issue when it comes to technical SEO. Website migrations take place for a number of reasons, whether it’s a switch from HTTP to HTTPS, a restructuring of the website on the same domain, or an entire move from one domain to another.
A large amount of planning is required going into a website migration, as this is where the majority of the work will take place. Sites can suffer massively if a migration is poorly planned and executed.
In Modestos Siotos’ superb website migration guide, there are a few examples of botched migrations and the effects they had on the search visibility of the sites that went through them. There are also examples of migrations that went well, showing that massive benefits can be made when the migration is undertaken with proper planning and execution.
In terms of migrations that went well, here’s a recent example:
All pages were mapped out and redirected properly, content spread across several pages was consolidated into fewer pages, and performance improvements were made, resulting in a near-instant improvement in search visibility and organic traffic.
In terms of the most common issues, there’s one real problem that most sites have trouble with: setting up proper redirects.
One of the key parts of the planning stage while preparing to migrate a site would be ensuring that redirects are planned out and set up properly. If you’re making structural changes to a site which will see the changing of URLs, they’ll need to be mapped out prior to the new site going live.
If they’re not, then you may well see your site’s visibility absolutely tank.
If redirects aren’t properly planned out, users attempting to access older URLs may be taken through to 404 pages, or to pages that aren’t relevant to what they were originally looking for.
This is something I’ve seen a few times, where proper redirects weren’t set up. One mistake I’ve unfortunately seen more than once would be where all URLs from the migrating site were redirected to the homepage of the new version.
Users need to be redirected to the corresponding, most relevant page on the new site. Simply blanket redirecting everything to one page means users aren’t being taken to a page that meets their initial intent.
It’s also common to see redirect chains – where instead of URL A redirecting straight to URL B, it passes through a number of URLs before getting there. This is often caused by legacy redirects – ones already in place on the site prior to the migration. It’s important to ensure that these are accounted for as part of the redirect mapping process, so that every old URL points directly at its final destination (a quick way of flattening chains is sketched below).
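If you’re holding your redirect map as a simple spreadsheet or dictionary of old URL to new URL, one pre-launch sanity check is to flatten any chains so that every entry points straight at its final destination. Here’s a minimal sketch in Python – the redirect_map below is hypothetical placeholder data, not a real site:

```python
# Minimal sketch: collapse redirect chains in a redirect map so every old URL
# points straight at its final destination. Assumes the map is already loaded
# as a plain dict of old URL -> new URL (placeholder example data below).
redirect_map = {
    "/old-page": "/interim-page",
    "/interim-page": "/new-page",
    "/legacy-offer": "/old-page",
}

def flatten(url, seen=None):
    """Follow the map until we reach a URL that isn't redirected again."""
    seen = seen or set()
    target = redirect_map.get(url)
    if target is None or target in seen:  # final destination, or a loop
        return url
    seen.add(url)
    return flatten(target, seen)

flattened = {old: flatten(new) for old, new in redirect_map.items()}
for old, final in flattened.items():
    print(f"{old} -> {final}")
```

Running this on the example data above would show /legacy-offer resolving straight to /new-page rather than hopping through two intermediate URLs.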
In terms of addressing redirects on a site, one thing you can do is audit the redirects that have been set up. My preferred way of doing this involves taking a list of former URLs – from a list of existing redirects, Google Analytics, Google Search Console, an older crawl of the site and so on – and popping them into Screaming Frog’s list mode.
From there, click “Always Follow Redirects” within the Configuration section, crawl the URLs, then export the Redirect Chains report. There, you’ll get a list of where these URLs go, including any potential redirect chains and the status codes for each redirect.
It’s a great way to identify long redirect chains and older URLs redirecting to 404 pages, and it’s covered in more detail in our recent article on handy Screaming Frog features.
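If you’d rather script the same check than run it through Screaming Frog, here’s a rough sketch using Python and the third-party requests library – the URLs below are placeholders standing in for your own list of legacy URLs:

```python
# Rough sketch: take a list of legacy URLs and report where they actually end
# up, flagging redirect chains and URLs that resolve to a 404.
# Requires the third-party 'requests' library; URLs are placeholders.
import requests

legacy_urls = [
    "https://www.example.com/old-page",
    "https://www.example.com/old-category/old-product",
]

for url in legacy_urls:
    response = requests.get(url, allow_redirects=True, timeout=10)
    hops = [r.url for r in response.history]  # every redirect step taken
    chain = " -> ".join(hops + [response.url]) if hops else response.url
    flag = ""
    if response.status_code == 404:
        flag = "  [redirects to a 404]" if hops else "  [404]"
    elif len(hops) > 1:
        flag = "  [redirect chain]"
    print(f"{response.status_code}  {chain}{flag}")
```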
All in all, you really have to make sure that your migrations are done with an enormous amount of prior preparation. The aforementioned migration guide is a fantastic resource, and there’s also a migration checklist well worth checking out.
Canonical Tags
A canonical tag essentially lets search engines know what the preferred URL of a page is. They’re most associated with addressing duplicate content issues, by canonicalising duplicate variants of a URL through to the preferred version – the one that should be treated as the page to index.
- Not Using Canonical Tags: by far the most common problem with canonical tags is sites simply not using them at all. If you haven’t already, set canonical tags up across every page of the site. They’re generally considered best practice, and if you’re running a site that does have potential content duplication issues – such as an ecommerce site with internal filters – then proper canonical tags are imperative.
- Incorrect Canonical Tags: we also see sites using incorrect URLs within their canonical tags, such as HTTPS sites referencing the HTTP version of the URL within the tag. Each URL referenced within a canonical tag should be final – it shouldn’t be broken, it shouldn’t redirect, and it shouldn’t point towards an incorrect version of the page. For example, just earlier I saw a site’s homepage canonicalising to a /welcome URL which didn’t actually exist.
With canonical tags like this, it’s likely that search engines will just ignore them entirely.
- Multiple Canonical Tags: not as common as the others, but sometimes pages will have multiple canonical tags in play at one time. This should absolutely be avoided, considering that Google have stated that they’ll just ignore all canonicals on the page and decide the canonical URL themselves.
Source: https://webmasters.googleblog.com/2013/04/5-common-mistakes-with-relcanonical.html
Overall, it’s key to ensure that canonical tags are set up properly on your site. To get a better idea of a site’s situation regarding canonical tags, you can give the site a crawl in a tool like Screaming Frog or Sitebulb which will help identify any issues with the tags themselves.
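For a quick spot-check outside of a full crawl, a short script can pull a handful of pages and flag missing, multiple, or broken canonical tags. This is only a sketch, using Python with the third-party requests and beautifulsoup4 packages, and the page URLs are placeholders:

```python
# Sketch: flag pages with no canonical tag, multiple canonical tags, or a
# canonical URL that doesn't return a 200. Placeholder URLs throughout.
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

pages = [
    "https://www.example.com/",
    "https://www.example.com/category/widgets",
]

def rel_values(link):
    # BeautifulSoup may return rel as a list or a string depending on parser
    rel = link.get("rel") or []
    return rel if isinstance(rel, list) else rel.split()

for page in pages:
    soup = BeautifulSoup(requests.get(page, timeout=10).text, "html.parser")
    canonicals = [
        link.get("href")
        for link in soup.find_all("link")
        if "canonical" in [value.lower() for value in rel_values(link)]
    ]

    if not canonicals:
        print(f"{page}: no canonical tag")
    elif len(canonicals) > 1:
        print(f"{page}: multiple canonical tags {canonicals}")
    else:
        target = urljoin(page, canonicals[0])  # handle relative hrefs
        status = requests.get(target, timeout=10).status_code
        note = "" if status == 200 else f" (canonical URL returns {status})"
        print(f"{page}: canonical -> {target}{note}")
```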
You can also delve into the new version of Search Console. The Index Coverage report will show you several different warnings regarding canonical tags, such as whether Google have ignored them, whether there are alternative/duplicate pages with canonical tags, and whether Google have chosen a different canonical entirely. We’ve got a guide to the new Search Console which looks at the new canonical warnings.
Sitemap Files
A sitemap file is where you list the pages of your site, providing the URLs that you’d like search engines to crawl and index, along with information about the structure of your site.
It’s also a very common thing to get wrong, or to ignore entirely. In terms of the issues that you’ll find with sitemap files, these are the more common ones:
- Incorrect URLs: A sitemap file may well have the wrong URLs listed within it. By wrong, this could mean that they’re outdated and redirect to other pages, or that they use the wrong protocol. Tying into a later point on HTTPS configuration, it’s fairly common to see HTTPS sites include HTTP versions of URLs within their sitemap files. It’s always fun seeing this when crawling a sitemap in Screaming Frog.
It’s key to ensure that the proper URLs are being listed within your site’s sitemap file. The only case where redirecting URLs would be viable in a sitemap would be during a migration, where a separate sitemap is created for older URLs which helps search engines to identify the redirects more quickly.
- Including Bloat: It’s also common for unnecessary URLs to be included within the sitemap file. This could mean unnecessary duplicates of pages, such as /home for the homepage, or the recent issue affecting Yoast users where an attachments child sitemap was created, resulting in a large number of thin attachment pages being indexed. If possible, ensure that your sitemap isn’t filled with unnecessary junk – a quick spot-check along these lines is sketched below.
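Here’s the spot-check mentioned above: a rough Python sketch that pulls a sitemap, reads out each <loc>, and flags URLs that use HTTP or no longer return a 200. The sitemap URL is a placeholder, it assumes a standard urlset sitemap rather than a sitemap index, and it needs the third-party requests library:

```python
# Sketch: fetch sitemap.xml, extract every <loc>, and flag HTTP URLs or URLs
# that don't return a 200. Placeholder sitemap URL; urlset sitemaps only.
import requests
import xml.etree.ElementTree as ET

sitemap_url = "https://www.example.com/sitemap.xml"
ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

root = ET.fromstring(requests.get(sitemap_url, timeout=10).content)
urls = [loc.text.strip() for loc in root.findall(".//sm:loc", ns) if loc.text]

for url in urls:
    issues = []
    if url.startswith("http://"):
        issues.append("HTTP protocol")
    response = requests.get(url, allow_redirects=False, timeout=10)
    if response.status_code != 200:
        issues.append(f"returns {response.status_code}")
    if issues:
        print(f"{url}: {', '.join(issues)}")
```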
If you want more information on sitemaps, our head of tech Michelle has put together a complete guide on XML sitemaps.
Robots.txt Issues
The robots.txt file provides rules for search engines, largely regarding which areas of the site they shouldn’t be crawling. For such a small text file, it’s an enormously powerful tool that plays a massive role in the crawlability of a site, as well as helping out with the crawl optimisation + crawl budget of a site.
Considering that one rogue rule can prevent key pages or even an entire site from being crawled, it’s incredibly important to ensure that the file is set up properly. This also extends to using rules in the file to ensure that unnecessary URLs aren’t being crawled, such as parameters generated through faceted navigation.
The one rule to avoid at all costs is Disallow: /, which will prevent your site from being crawled entirely. Thankfully this isn’t something that crops up often in audits, though there are other issues that tend to come up.
The first issue is the blocking of resources within the robots.txt file. Resource URLs include the likes of JavaScript (.js), CSS (.css) and image URLs – all of which are vital to how a page renders and functions. It’s not too uncommon to see the folder in which these are hosted blocked within the robots.txt file via a Disallow rule. One recent example was the /wp-includes/ folder on a WordPress site – this was being disallowed via the robots.txt file, meaning that CSS + JS files were being blocked.
You’ll be able to identify this by running a Fetch & Render of any page on the site within Search Console – you’ll get screenshots showing how both Google and users will see your site. There’s also Merkle’s fantastic Fetch & Render tool which can be used without Search Console access to the site.
In terms of looking into potential robots.txt issues, you can look at the Robots.txt Tester tool in Search Console. If you feel a page is being blocked via the robots file, you can pop it into there and confirm as such. This can also be done in Screaming Frog, where you can even customise the robots.txt file to your liking, test individual URLs, and run a crawl of the site with your new robots rules.
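If you want to test URLs against a live robots.txt file outside of Search Console or Screaming Frog, Python’s standard library includes a basic robots.txt parser. A minimal sketch with placeholder URLs is below – note that it’s a simple parser that won’t necessarily match Google’s handling of every directive (wildcards in particular), so treat it as a rough check:

```python
# Minimal sketch: test whether specific URLs are blocked by a site's
# robots.txt for a given user agent. Placeholder site and URLs.
from urllib import robotparser

parser = robotparser.RobotFileParser()
parser.set_url("https://www.example.com/robots.txt")
parser.read()  # fetches and parses the live robots.txt

urls_to_test = [
    "https://www.example.com/important-category/",
    "https://www.example.com/wp-includes/js/jquery/jquery.js",
]

for url in urls_to_test:
    allowed = parser.can_fetch("Googlebot", url)
    print(f"{'allowed' if allowed else 'BLOCKED'}: {url}")
```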
HTTPS Configuration
Over the past few years, we’ve seen an enormous push for sites to move to HTTPS. It’s been known as a ranking signal since 2014 – a rather minor one, but a ranking signal nonetheless. We’ve also seen updates to Chrome, such as the upcoming marking of non-HTTPS sites as “not secure” with the release of Chrome 68.
Considering its importance and how much Google have been pushing for websites to move towards HTTPS, it’s still rather common to see sites either not making the switch or implementing it poorly.
As for sites that haven’t made the switch to HTTPS yet, it’s recommended that they do so at some point in the near future.
Regarding technical issues that I’ve seen when it comes to HTTPS, these are the ones which stand out:
- Not Implementing Redirects: there are plenty of cases where the site resolves under both HTTP and HTTPS, without the proper redirects having been put in place. This essentially leaves both versions live as separate, duplicate entities.
- HTTP Resources Being Served: While the page itself may be resolving under HTTPS, the resources on the page – such as images, scripts and stylesheets – may well be served via HTTP. When this is the case, the page won’t appear as fully secure due to the serving of non-secure resources.
When this happens, the green padlock will be missing from the address bar, and a warning that the connection isn’t fully secure appears upon clicking the icon to the left of the URL.
You can check which resources aren’t being served securely by going into the Security tab within Chrome DevTools. You’ll get an initial security overview, and can then view the requests in the Network panel, showing you the exact resources being served via HTTP. It’s crucial to make sure that all resources are being served securely.
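If you want to check for this across more than a page or two rather than going page by page in DevTools, a short script can flag HTTP resource references in a page’s HTML. A rough sketch using Python with the third-party requests and beautifulsoup4 packages – the page URL is a placeholder, and it only inspects the raw HTML, so it won’t catch resources added by JavaScript:

```python
# Rough mixed-content check: fetch an HTTPS page and list any resources still
# referenced over plain HTTP in the HTML source. Placeholder page URL.
import requests
from bs4 import BeautifulSoup

page = "https://www.example.com/"
soup = BeautifulSoup(requests.get(page, timeout=10).text, "html.parser")

insecure = []
for tag in soup.find_all(["img", "script", "link", "iframe", "source"]):
    for attr in ("src", "href"):
        value = tag.get(attr)
        if value and value.startswith("http://"):
            insecure.append(value)

if insecure:
    print(f"{len(insecure)} insecure resource reference(s) found on {page}:")
    for url in insecure:
        print(f"  {url}")
else:
    print(f"No insecure resource references found in the HTML of {page}")
```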
If you’re conducting a switch from HTTP to HTTPS, Aleyda Solis has put together a fantastic HTTPS migration checklist and Patrick Stox has his own guide on properly securing a site.
In closing, these are just a few of the more common technical issues that cause problems for websites. They happen to startups and to some of the biggest names around – everyone is prone to technical hiccups, brainfarts, and fully-fledged meltdowns.
If you would like some advice regarding your website or any aspect of technical SEO, please don’t hesitate to contact us!