How I improved your Google Lighthouse SEO score with a lot of research and one quick PR

Why has Google Lighthouse been penalising us for canonical links on different domains? I set out to solve this conundrum once and for all.

⚠️ This post is over two years old and may contain some outdated technical information. Please proceed with caution!

I cross-post blog content regularly on DEV.to, and I like to cross-post articles I write for companies to my personal blog site. This means that after a post is published in its original location, I publish it on another domain, word for word. This helps me get my content to more people. Cross-posting is perfectly legitimate, and when you do cross-post, you need to let Google know that your content is a duplicate of the original.

How to reference duplicate content

When cross-posting content to other domains, reference the original with a canonical link in the <head> of the web page, like so:

<link rel="canonical" href="https://originaldomain.com/link/to/original-post">
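To check which canonical URL a cross-posted page actually declares, you can read it back out of the HTML. Here's a minimal sketch using only Python's standard library. The names `CanonicalLinkParser` and `find_canonical_hrefs` and the sample page are made up for illustration, and this is not how Lighthouse itself does it.

```python
from html.parser import HTMLParser


class CanonicalLinkParser(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonical_hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel holds space-separated tokens; compare them case-insensitively
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attributes.get("href"):
            self.canonical_hrefs.append(attributes["href"])


def find_canonical_hrefs(html):
    """Return every canonical href declared in the given HTML string."""
    parser = CanonicalLinkParser()
    parser.feed(html)
    parser.close()
    return parser.canonical_hrefs


# Hypothetical cross-posted page, reusing the canonical link from above
page = """
<html>
  <head>
    <title>My cross-posted article</title>
    <link rel="canonical" href="https://originaldomain.com/link/to/original-post">
  </head>
  <body><p>Same words, different domain.</p></body>
</html>
"""

print(find_canonical_hrefs(page))
```

A page should declare one canonical link at most, so a list with more than one entry is worth a second look.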

If you run regular Google Lighthouse checks, you might have noticed that you were penalised for pointing your canonical link to a different domain where the original content lives! But hold up. Isn't this what canonical links are for?

A screenshot of a Google Lighthouse SEO report showing a score of 92% due to a canonical link pointing to a different domain error.

Canonical links on the same domain

Now, there are perfectly valid reasons to include canonical links that point to the same domain. Google Search Console tells us:

A canonical URL is the URL of the best representative page from a group of duplicate pages, according to Google.

Why might you have duplicate pages on your site? Take for example an e-commerce site, where your search page URLs might exist in duplicate forms. Without a specified canonical link, Google will choose one (at random, maybe) as canonical.

For the following URL examples, you should set your canonical link to https://shop.com/search.

https://shop.com/search?brand&page=1&filters=cat:T+Shirts
https://shop.com/search?brand&page=2&filters=cat:T+Shirts
https://shop.com/search?brand&page=3&filters=cat:T+Shirts
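If you generate these pages from code, one way to work out the canonical URL is to drop the query string. This is a small sketch in Python that assumes the query parameters never change what the page is fundamentally about, which holds for this example but won't hold for every site. The function name `canonical_for` is made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_for(url):
    """Drop the query string and fragment, keeping scheme, host and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


search_urls = [
    "https://shop.com/search?brand&page=1&filters=cat:T+Shirts",
    "https://shop.com/search?brand&page=2&filters=cat:T+Shirts",
    "https://shop.com/search?brand&page=3&filters=cat:T+Shirts",
]

for url in search_urls:
    # Each of the three URLs maps to https://shop.com/search
    print(canonical_for(url))
```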

The investigation begins

I was confused. And I was ashamed that my Lighthouse SEO scores were lower than I thought they should have been! (Oh, gamification, how you taunt me!) And so, I took to Twitter to investigate. Here's a thread started by Tamas, reaching out to Martin — a Developer Advocate at Google.

Tamas Piros asks on Twitter: Legit question by @whitep4nth3r - I hope you can help @g33konaut. In this article, it's stated that canonical links can be added cross domain. However, there's a lighthouse test that fails such tests (
Martin Splitt replies to Tamas: While that is absolutely acceptable for Google Search (as documented in the link you shared), Lighthouse is vendor-agnostic and Bing and Yahoo seem to be unhappy about cross-domain canonicals. See this comment explaining that in Lighthouse.

Martin suggests that Yahoo and Bing don't like cross-domain canonical links, a claim that was referenced in a comment in the Lighthouse source code. I wasn't happy with this! And so I continued down the rabbit hole, and found a light at the end of the tunnel.

Bing Webmaster to the rescue

I found the Bing Webmaster Guidelines, and the guidance for canonical links in 2021 stated:

Do not reuse content from other sources. It is critical that content on your page must be unique in its final form. If you choose to host content from a third party, either use the canonical tag (rel="canonical") to identify the original source or use the alternate tag (rel="alternate").

Given that Bing was recommending rel="canonical", and Yahoo uses results crawled from Bing, I opened an issue on the Google Lighthouse repo with my findings.

I opened a PR to Google Lighthouse

After a few months of discussion on the issue, we concluded that the advice for canonical links was indeed outdated, and that all the major indexers began to support cross-origin canonical URLs in 2009. This was great news! And so, I opened a pull request to remove the cross-origin check for canonical URLs, which was merged to main in November 2021.
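To make the idea of a cross-origin check concrete, here's a simplified illustration in Python. It is not Lighthouse's actual code (Lighthouse is written in JavaScript). The function names, the `reject_cross_origin` flag and the blog URL are all hypothetical, and the sketch only compares scheme, host and port.

```python
from urllib.parse import urlsplit


def origin(url):
    """Return (scheme, host, port). Default ports are not normalised here."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)


def canonical_problems(page_url, canonical_url, reject_cross_origin):
    """List the problems a hypothetical audit would report for a canonical link."""
    problems = []
    if reject_cross_origin and origin(page_url) != origin(canonical_url):
        problems.append("canonical URL points to a different origin")
    return problems


page_url = "https://myblog.example/blog/cross-posted-article"  # hypothetical
canonical_url = "https://originaldomain.com/link/to/original-post"

# With the check switched on, a cross-posted page is flagged
print(canonical_problems(page_url, canonical_url, reject_cross_origin=True))

# With the check removed, the same page reports no problems
print(canonical_problems(page_url, canonical_url, reject_cross_origin=False))
```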

Your scores are now improved

A few more months of waiting in excitement, and the code change is now available to everyone in Chromium browsers! Your SEO scores for pages that use canonical links that point to a different domain are now improved.

A screenshot of Google Lighthouse showing the same article with a canonical link, but this time with an SEO score of 100%.

What's my one piece of advice after all of this? Question everything. You never know what it might lead to. You could end up improving the web for everyone!

Salma Alam-Naylor

I'm a live streamer, software engineer, and developer educator. I help developers build cool stuff with blog posts, videos, live coding and open source projects.
