A Story about Referrals and Google Analytics (with a happy ending)

On Referrals in GA when reports just don't match up

I have a client who publishes a niche directory that's been around for 24 years; I've worked with them for over 14. Over time their customers have become more tech savvy (who hasn't, with the growth of the internet and digital marketing?). More and more often, those customers ask why the number of referrals they see in their Google Analytics doesn't match the numbers we report.

Long ago, when this site had its beginnings, modern analytics didn't exist. Sites relied on log analysis, and there were several popular packages, like Webalizer, for getting reports from your logs with some basic charts and graphs.

Today we have Google Analytics, which grew out of Urchin, software that you could run on your own web server to get reports. Google bought Urchin in 2005, and it eventually became what we now know as Google Analytics.

So back to the customer's question about their referrals report: why do the referrals we show from our event tracking, where we track visits sent to a customer's site, sometimes not match what the customer sees in their GA referral report?

Disclaimer: This isn't a one-size-fits-all case. It's a description of what happened in this particular situation, and I hope it serves as inspiration for digging into your own.

This is a story of two Google Analytics accounts and one question: where did the missing visits go? Were they actually missing, or was something else in play?

Here's the customer's GA report, showing the Referrals section for a 12-month period. It attributes 2,821 visits to our site.

Here's our report, from my client's Event Tracking in GA. There are two different reports because we have listings as well as banners on the site and they're in different buckets.

Banner clicks in GA

This adds up to over 7,609 visits sent to the customer's site.

Big differences in these numbers. The customer wanted to know why our numbers were higher. So what is going on? That's what I dug in to find out.

So here's what Google has to say about the referrals in GA. https://support.google.com/analytics/answer/1009614?hl=en

“If your site uses redirects, the redirecting page becomes the landing page's referrer.”

Something was causing GA to miss the bulk of the referrals, or at least they weren't showing up as coming from my client's site. I suspected they were ending up in the Direct or Other sections of the report.

My first stop was to export a spreadsheet with all of their listings which contained the URLs we linked to. There are 20. Next I took these URLs and ran them through a header checker to see what happens, what's the server response, etc.

I use httpstatus.io for this; it's a quick, easy tool. Paste in the URLs you want to check and you'll get a report that looks like this:

You can also use Screaming Frog for something like this, but for a quick test I like the visual results httpstatus gives. Wow, that's a mess of redirects!

Every URL in the list except one had multiple redirects. These are called redirect chains. As Google notes, a redirect causes the redirecting page to become the landing page's referrer. Let's break down what happens.

http://domain.com/category/my-page is the link we used on the client's site
But the customer's site is now secure and so it redirects to
https://domain.com/category/my-page (note the https)
And then, since they don't have “category” in their URL path anymore, that redirects to the current location:

It's no wonder they didn't have complete data on referrals from my client's site.
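If you'd rather script this check than paste URLs into a web tool, the hop-by-hop idea can be sketched in Python. This is only a rough sketch, not how httpstatus.io or Screaming Frog work internally. The function names (fetch_head, trace_chain) and the example.com URLs are made up for illustration, including the final URL of the chain, and some servers answer HEAD requests differently than GET requests.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so each hop can be inspected."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def fetch_head(url, timeout=10):
    """Send one HEAD request; return (status_code, Location header or None)."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status, response.headers.get("Location")
    except urllib.error.HTTPError as err:
        # With redirects disabled, 3xx (and 4xx/5xx) responses surface here.
        return err.code, err.headers.get("Location")


def trace_chain(url, fetch=fetch_head, max_hops=10):
    """Follow redirects one hop at a time; return a list of (url, status)."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        url = urljoin(url, location)  # the Location header may be relative
    return hops


if __name__ == "__main__":
    # Hypothetical URL; replace with the links exported from your listings.
    for hop_url, hop_status in trace_chain("http://example.com/category/my-page"):
        print(hop_status, hop_url)
```

Any URL that comes back with more than one hop is a candidate for updating to its final destination.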

Why do the redirects cause this?

Several reasons. The GA JavaScript doesn't load on a redirect, because a redirect isn't a page. 301 redirects typically happen at the server or software level, before a page is loaded or created, so they happen before the GA JavaScript code can be triggered.

Since the redirects happen at that level, the referrer is the URL before it in the chain and not the originating site. In this case, all Google Analytics could see was traffic that seemed to be coming from the customer's own site. The traffic was counted but wasn't attributed to my client's site. It wasn't missing, just in a different bucket.

So what's the solution in this case? The customer is going to update their URLs in our system so they don't go through redirects, and they're going to use UTM tagging to gain further insight into those visits. That's a win all the way around.
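To make the UTM tagging concrete, here is a small Python sketch that appends the standard utm_source, utm_medium and utm_campaign parameters to a destination URL. The function name add_utm and the example values are assumptions for illustration; use whatever naming scheme fits your own reporting.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_utm(url, source, medium, campaign):
    """Return url with UTM parameters appended, keeping any other query values."""
    parts = urlsplit(url)
    # Keep existing parameters, but drop old utm_* values so they aren't duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith("utm_")
    ]
    query += [
        ("utm_source", source),
        ("utm_medium", medium),
        ("utm_campaign", campaign),
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Hypothetical destination URL and tag values.
    print(add_utm("https://example.com/my-page", "niche-directory", "referral", "listing"))
```

Tagging listings and banners with different campaign values would let the customer see both buckets separately in their own reports.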

One thing to be aware of with UTM tagging: if your site has redirects, make sure they pass on the query string. Otherwise you'll lose that data. For Apache, that means using QSA in your rewrite rules, like this:

RewriteRule ^/?page\.html$ /another-page.html [NC,QSA,R=301,L]

This instructs the server to preserve the query string, which GA can then pick up on the destination page. While it's not a GA issue, having a self-referencing canonical tag on your pages can also help Google handle those various query strings.
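One way to confirm that a redirect really keeps the query string is to compare the tagged URL you link to with the URL the visitor finally lands on. The Python sketch below does only that comparison. The function name lost_utm_params and the example URLs are hypothetical, and you would get the final URL from your browser's address bar or a header checker.

```python
from urllib.parse import parse_qsl, urlsplit


def lost_utm_params(tagged_url, final_url):
    """Return the utm_* parameters from tagged_url that are missing or changed in final_url."""
    sent = {
        key: value
        for key, value in parse_qsl(urlsplit(tagged_url).query)
        if key.startswith("utm_")
    }
    kept = dict(parse_qsl(urlsplit(final_url).query))
    return {key: value for key, value in sent.items() if kept.get(key) != value}


if __name__ == "__main__":
    # Hypothetical before/after URLs for a redirect that drops the query string.
    lost = lost_utm_params(
        "http://example.com/page.html?utm_source=niche-directory&utm_medium=referral",
        "https://example.com/another-page.html",
    )
    print(lost or "all UTM parameters survived the redirect")
```

An empty result means the tags made it through; anything else points at a redirect rule that needs attention.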

This is a good reason to check your analytics and run an audit on your site to make sure you're not causing yourself grief. It's especially important for paid placements: using the correct, direct URL is how you make sure you know what's performing.

In this case, I was able to offer value to my client's customer and help them both get better results. Solving problems like this is all in a day's work for digital marketing and SEO.
