
How I improved your Google Lighthouse SEO score with a lot of research and one quick PR

I cross-post blog content regularly on DEV.to, and I like to cross-post articles I write for companies to my personal blog site. This means that after a post is published in its original location, I publish it on another domain, word for word. This helps me get my content to more people. Cross-posting is perfectly legitimate, and when you do cross-post, you need to let Google know that your content is a duplicate of the original.

How to reference duplicate content

When cross-posting content to other domains, mark each copy as a duplicate by adding a canonical link that points to the original post in the head tag of the web page, like so:

<link rel="canonical" href="https://originaldomain.com/link/to/original-post">

If you run regular Google Lighthouse checks, you might have noticed that you were penalised for pointing your canonical link to a different domain where the original content lives! But hold up. Isn't this what canonical links are for?

A screenshot of a Google Lighthouse SEO report showing a score of 92% due to a canonical link pointing to a different domain error.

Canonical links on the same domain

Now, there are perfectly valid reasons to include canonical links that point to the same domain. Google Search Console tells us:

"A canonical URL is the URL of the best representative page from a group of duplicate pages, according to Google."

Why might you have duplicate pages on your site? Take for example an e-commerce site, where the same search page can be reached through many URLs that differ only in their query parameters. Without a specified canonical link, Google will pick one of them as canonical for you, and it may not be the one you would have chosen.

For the following URL examples, you should set your canonical link to https://shop.com/search.

https://shop.com/search?brand&page=1&filters=cat:T+Shirts
https://shop.com/search?brand&page=2&filters=cat:T+Shirts
https://shop.com/search?brand&page=3&filters=cat:T+Shirts
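
As a minimal sketch, assuming all of these search result pages are rendered from one shared template, the head of every variant would carry the same tag pointing at the clean URL. The comment and surrounding head element are illustrative; only the link tag matters.

```html
<!-- Rendered on every variant of the search page, whatever the query string -->
<head>
  <link rel="canonical" href="https://shop.com/search">
</head>
```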

The investigation begins

I was confused. And I was ashamed that my Lighthouse SEO scores were lower than I thought they should have been! (Oh, gamification, how you taunt me!) And so, I took to Twitter to investigate. Here's a thread started by Tamas, reaching out to Martin — a Developer Advocate at Google.

Martin suggests that Yahoo and Bing don't like cross-domain canonical links, and that this reasoning was referenced in the source code for Lighthouse. I wasn't happy with this! And so I continued down the rabbit hole, and found a light at the end of the tunnel.

Bing Webmaster to the rescue

I found the Bing Webmaster Guidelines, and the guidance for canonical links in 2021 stated:

"Do not reuse content from other sources. It is critical that content on your page must be unique in its final form. If you choose to host content from a third party, either use the canonical tag (rel="canonical") to identify the original source or use the alternate tag (rel="alternate")."

Given that Bing was recommending rel="canonical", and Yahoo uses results crawled from Bing, I opened an issue on the Google Lighthouse repo with my findings.

I opened a PR to Google Lighthouse

After a few months of discussion on the issue, we concluded that the advice for canonical links was indeed outdated, and that all the major indexers had supported cross-origin canonical URLs since 2009. This was great news! And so, I opened a pull request to remove the cross-origin check for canonical URLs, which was merged to main in November 2021.

Your scores are now improved

A few more months of waiting in excitement, and the code change is now available to everyone in Chromium browsers! Your SEO scores for pages that use canonical links that point to a different domain are now improved.

A screenshot of Google Lighthouse showing the same article with a canonical link, but this time with an SEO score of 100%.

What's my one piece of advice after all of this? Question everything. You never know what it might lead to. You could end up improving the web for everyone!
