Geek guide to removing referrer spam in Google Analytics


Referrer spam occurs when your website receives fake referral traffic from spam bots and this fake traffic gets recorded in your Google Analytics reports.


What is a bot?

A bot is a software program developed to perform repetitive tasks with a high degree of accuracy and speed.

Bots are generally used for web indexing (indexing the contents of websites). But they are also used for malicious purposes like:

  1. committing click fraud
  2. harvesting email addresses
  3. scraping website contents
  4. spreading malware
  5. artificially inflating website traffic

Thus, depending upon how a bot is used, it can be a good bot or a bad bot.


Good bots and bad bots

An example of a good bot is ‘Googlebot’, which is used by Google to crawl and index web pages on the internet.

The majority of bots (whether good or bad) don’t execute JavaScript, but some do.

Bots which execute JavaScript (like your Google Analytics tracking code) show up as hits in GA reports and skew the traffic data (direct traffic, referral traffic) and any metric based on sessions, like bounce rate, conversion rate etc.

Bots which don’t execute JavaScript (like Googlebot) do not skew your traffic data or metrics. But their visits are still recorded in your server logs. They still consume your server resources. They still eat your bandwidth and can negatively affect your website’s performance.

Good bots obey robots.txt directives but bad bots don’t. Bad bots can create fake user accounts, send spam emails, harvest email addresses and bypass CAPTCHAs.

Bad bots use various methods to disguise themselves so that they can’t be easily detected by security measures. They can pretend to be a web browser (like Chrome, Internet Explorer etc.). They can pretend to be traffic coming from a legitimate website.

Nobody can say for sure which bad bots can skew your GA data and which can’t. So for you,

all bad bots are a threat to data integrity.


Spam bots

When a bad bot is used for spamming purposes, it is known as a spam bot.

These spam bots crawl hundreds of thousands of websites every day and send out HTTP requests to those websites with fake referrer headers. They create and send fake referrer headers to avoid being detected as bots.

The fake referrer header contains the URL of the website the spammer wants to promote and/or build backlinks for.
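To make this concrete, here is a minimal sketch (in Python, using only the standard library) of how a spam bot forges the referrer header on an ordinary HTTP request. The target URL and the promoted site below are placeholders for illustration, not real spammer domains:

import urllib.request

# The bot simply sets the Referer header to whatever site the spammer wants to promote,
# and pretends to be a normal browser via the User-Agent header.
request = urllib.request.Request(
    "http://www.example.com/",  # placeholder for the website being spammed
    headers={
        "Referer": "http://site-the-spammer-promotes.example/",  # fake referrer
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",  # pretends to be a browser
    },
)
urllib.request.urlopen(request)  # this hit lands in the target's server log with the fake referrer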

When your website receives an HTTP request from a spam bot with a fake referrer header, it is immediately recorded in your server log. If your server log is publicly accessible, i.e. it can be crawled and indexed by Google, then Google treats the referrer value in your server log as a backlink, thus influencing the search engine ranking of the website being promoted by the spammers.

This is what spammers may be thinking when they decide to use spam bots for SEO. But I am sure Google is smart enough to detect that what it is crawling is a log file and not a real web page, and thus devalues all backlinks found in server logs.

These spambots have the ability to execute JavaScript and are thus able to evade the bot filtering methods used by Google Analytics. Because of this ability, you can see traffic from such spambots in your Google Analytics ‘Referrals’ reports.



If the spambot is using a botnet (a network of infected computers spread across a particular geographic location or around the world), then it can access your website via hundreds of different IP addresses, thus making IP blacklisting or rate limiting (limiting the rate of traffic sent or received) pretty much useless.

The ability of a spambot to skew your website traffic is directly proportional to the size of the botnet the spambot is using.

The bigger the botnet, the more IP addresses a spam bot can use to access your website without being blocked by your firewall and other traditional safety mechanisms.

Not all spambots send referrer headers.

In that case, traffic from such bots won’t appear as referral traffic in your GA reports. It would appear as direct traffic instead, making it even more difficult to detect.

Whenever a referrer is not passed, the traffic is treated as direct traffic by Google Analytics.

A spambot can also create dozens of fake referrer headers.

So if you block one referrer, it may send your website another fake referrer. So whether you block the spammy referrer via a GA view filter or via .htaccess, there is no guarantee that you have completely blocked the spambot.

As you know by now, not all bad bots are spambots. Some bad bots are really really bad.


Really bad Spam bots

The objective of really bad spam bots is not just to skew your website traffic or scrape content or email addresses. Their objective is also to infect your computer with malware and make your machine a part of their botnet.

Once your computer becomes a part of a botnet, it is then used to forward spam, viruses and other malicious programs to other computers on the internet.

There are hundreds of thousands of computers all over the world which are used by real people and which are part of a botnet.

There is a good chance that your computer is a part of a botnet and you don’t know about it.

So if you decide to block a botnet, you will most likely block the traffic coming from real people.

There is also a good chance that as soon as you visit a suspicious looking website in your referrals report, your machine gets infected with malware.

So do not visit any suspicious looking website in your ‘Referrals’ reports without adequate protection (anti virus/anti malware programs installed on your machine). Use a separate machine just for visiting such websites or ask your system administrator to deal with this issue.


Smart Spam Bots

Some spam bots can send fake traffic even without visiting your website. They do that by reproducing the HTTP requests which come from the GA tracking code and by using your web property ID. Not only can they send you fake traffic, but also a fake referrer.

For example, they may send a BBC URL as the fake referrer. Now BBC is a legitimate website, and when you see that referrer in your report, you won’t even think twice that the traffic apparently coming from it could be fake and that no one actually visited your website from the BBC.

These smart bad bots don’t need to visit your website at all or execute any JavaScript. Since they don’t visit your website, their visits are not recorded in your server log.

Since their visits are not recorded in your server log, you cannot block them by any of the usual means (IP blocking, user agent blocking, referrer blocking etc.).
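To illustrate the idea, here is a minimal sketch of such a ‘ghost’ hit, assuming the Universal Analytics Measurement Protocol. The property ID, client ID and referrer below are placeholders; the only real thing is the Google Analytics collection endpoint:

import urllib.parse, urllib.request

# A ghost hit is nothing more than an HTTP request sent straight to the GA collection
# endpoint. No page on the target website is ever requested.
payload = urllib.parse.urlencode({
    "v": "1",                             # Measurement Protocol version
    "tid": "UA-XXXXXX-1",                 # the scraped web property ID (placeholder)
    "cid": "555",                         # an arbitrary client ID
    "t": "pageview",                      # hit type
    "dp": "/",                            # fake page path
    "dr": "http://spammy-site.example/",  # fake referrer that shows up in the Referrals report
})
urllib.request.urlopen("https://www.google-analytics.com/collect", payload.encode())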

Related Article: What I learned from trying to fix the Ghost Referrer Spam in Google Analytics

Smart spam bots crawl your website in search of web property IDs. People who don’t use Google Tag Manager leave their Google Analytics tracking code hard-coded on their web pages.

The hard-coded Google Analytics tracking code contains your web property ID. This ID is scraped by smart spam bots and could be shared with other spam bots. There is no guarantee that the bot which scraped your web property ID and the bot which sent you fake traffic are the same bot.
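As a rough illustration of how little effort that scraping takes, here is a sketch that pulls hard-coded Universal Analytics property IDs out of a page with a simple regular expression (the URL is a placeholder):

import re
import urllib.request

# Download a page and look for hard-coded property IDs of the form UA-XXXXXX-X.
html = urllib.request.urlopen("http://www.example.com/").read().decode("utf-8", "ignore")
property_ids = re.findall(r"UA-\d{4,10}-\d{1,4}", html)
print(property_ids)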

You can fix this issue to an extent by using Google Tag Manager.

Use GTM to deploy the Google Analytics tracking code on your website. If your web property ID has already been scraped, then I am afraid you are too late to fix this issue. All you can do now is use another web property ID or wait for Google to fix this issue from their end.


Not every website is equally affected by spam bots

This is because spam bots are designed to detect and exploit websites’ vulnerabilities.

They attack the weak and they attack often. So if your website is hosted on some cheap shared hosting platform or is using a custom CMS/shopping cart, you are more likely to get attacked.

Often, custom CMS/shopping cart solutions are not rigorously tested to find and fix application vulnerabilities. So it is wise to use reputable hosting providers, CMS and shopping cart solutions.

If you are running affiliate marketing campaigns on a large scale, your website is more likely to be assaulted by spam bots. So choose your affiliates wisely.

I used GoDaddy for hosting my websites. It is not that GoDaddy is cheap or some third-class web host, but as long as I used their service, my website was always under constant threat from bad bots deploying malware and was compromised often.

I spent months fighting malware on my website when it was hosted on GoDaddy.

This prompted me to write this article on finding and fixing malware: Malware Removal Checklist for WordPress – DIY Security Guide.

It may help you in avoiding referral spam. Website security is not something which really excites me, but GoDaddy made me learn every trick in the book to fight malware.

When I changed my hosting provider, all the attacks stopped. Now I am not saying that all websites hosted by GoDaddy are vulnerable, but this was certainly the case in my situation.

So if your website is often attacked by bad bots, then changing your web host just might help you.


Follow the steps below to detect and fix referrer spam:

Step-1: Go to the Referrals report in your GA account and then sort the report by bounce rate in descending order:

[Image: Google Analytics referrer spam report]

You can also download the Google Analytics report which can automatically find all the Referrer Spam on your website.

Step-2: Look for referrers with 100% or 0% bounce rate and 10 or more sessions. They are most likely spammy referrers.

Step-3: If one of your suspicious-looking referrers belongs to a known list of spammy websites, then it is a spammy referrer and you don’t need to check the website to make sure.

An exhaustive list of spammy referrers can be found here:

Step-4: If you cannot confirm the identity of a suspicious-looking referrer, then you need to take the risk and visit the website to make sure it is a legitimate website and that it is actually linking out to your website.

Make sure that you have anti-virus/anti-malware software installed on your machine before you visit such websites, as they may infect your machine as soon as you visit them.

Step-5: Once you have confirmed the identity of the bad bots, the next step is to block them ASAP from visiting your website again. You can do this by blocking the spammy referrer through a custom advanced filter, as shown below:

[Image: block spam bots]

[Image: block spam bots filter]


Do not exclude the referrer spam website from your referral traffic via the ‘Referral exclusion list’. This will not solve your problem.

It will just hide the problem, as the traffic from the spambot will then appear as direct traffic in your GA reports and you will no longer be able to measure the impact of the spambot on your website traffic.

Once a spam bot visit is recorded as a hit by Google Analytics, your traffic data is skewed for good. You cannot revert it.

So what can you do then?

#1 Create an annotation on your chart and write a note explaining what caused the unusual traffic spike. You then need to discount this traffic from your analysis.

[Image: bot traffic]

[Image: bot traffic annotation]


#2 Block the referrer used by the spambot

Access your .htaccess file (or web.config if you use IIS) and add the following code:

RewriteEngine On
Options +FollowSymlinks
RewriteCond %{HTTP_REFERER} ^https?://([^.]+\.)*semalt\.com [NC]
RewriteRule .* - [F]

This code will block all http and https referrals from semalt.com and all of its subdomains.


#3 Block the IP address used by the spam bot

Access your .htaccess file and add a code like the one below:

RewriteEngine On
Options +FollowSymlinks
Order Deny,Allow
Deny from 192.0.2.1

Note: Do not copy-paste this code into your .htaccess file as it is. The IP address 192.0.2.1 is just a placeholder used to show you how to block an IP address in an .htaccess file; replace it with the rogue IP address actually hitting your website.

Spambots can come from many different IP addresses, so you need to keep adding the IP addresses used by the spambots affecting your website.

Block only those rogue IP addresses which are affecting your website.

Don’t try to block all known rogue IP addresses, as this will make your .htaccess file very large and hard to manage and will impact your web server’s performance.

If your blacklisted IP address list keeps getting bigger and bigger, then you have got serious security issues. Contact your web host or system administrator. Search Google to find lists of blacklisted IP addresses.

You should automate this process by writing a script which can automatically find and ban known rogue IPs, as in the sketch below.
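Here is a minimal sketch of such a script. It assumes an Apache access log in the common/combined format (client IP as the first field); the log path and the request threshold are assumptions you would adapt to your own setup:

from collections import Counter

LOG_FILE = "/var/log/apache2/access.log"  # adjust to your server's access log path
THRESHOLD = 1000                          # requests per log period you consider suspicious

# Count the number of requests made by each IP address.
hits = Counter()
with open(LOG_FILE) as log:
    for line in log:
        ip = line.split(" ", 1)[0]        # first field of the common/combined log format
        hits[ip] += 1

# Print ready-to-paste .htaccess lines for the IPs that crossed the threshold.
for ip, count in hits.most_common():
    if count >= THRESHOLD:
        print(f"# {count} requests")
        print(f"Deny from {ip}")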


#4 Block the IP address range used by a spam bot

If you are sure that a particular range of IP addresses is being used by spam bots then you can block the whole IP address range like the one below:

RewriteEngine On
Options +FollowSymlinks
Order Allow,Deny
Allow from all
Deny from 192.0.2.0/24

Here ‘192.0.2.0/24’ is a CIDR range (again, just a placeholder).

CIDR (Classless Inter-Domain Routing) is a method used for representing a range of IP addresses.

Blocking by CIDR is more efficient than blocking by individual IP addresses, as it takes up less space in your .htaccess file.

Note: You can convert a CIDR to an IP range and vice versa via this tool:
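If you prefer to do the conversion programmatically, Python’s standard ipaddress module can expand a CIDR block into its address range (the CIDR below is just the documentation placeholder used above):

import ipaddress

network = ipaddress.ip_network("192.0.2.0/24")  # placeholder CIDR block
print(network.num_addresses)                    # 256
print(network[0], "-", network[-1])             # 192.0.2.0 - 192.0.2.255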


#5 Block the rogue user agents used by spambots

Go through your server log files once a week and find and ban malicious user agents (user agents used by spambots). Blocked user agents cannot access your website. You can block a rogue user agent like the one below:

RewriteEngine On
Options +FollowSymlinks
RewriteCond %{HTTP_USER_AGENT} Baiduspider [NC]
RewriteRule .* - [F,L]

A simple search on Google will give you a big list of websites which maintain records of known rogue user agents. Use these records to identify the rogue user agents hitting your website.

You should write a script to automate this process, like the sketch below. Maintain a database of all known rogue user agents and then use your script to automatically identify and block those user agents by looking them up in the database. Keep your database up to date, as new rogue user agents keep popping up and old ones keep disappearing.
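Here is a minimal sketch of that idea. It keeps the known rogue user agents in a plain text file (one entry per line, standing in for a real database), checks which of them actually appear in your access log, and prints ready-to-use .htaccess conditions for just those. The file paths are assumptions:

ROGUE_UA_FILE = "rogue_user_agents.txt"   # one known rogue user-agent substring per line
LOG_FILE = "/var/log/apache2/access.log"  # adjust to your server's access log path

with open(ROGUE_UA_FILE) as f:
    rogue_agents = [line.strip() for line in f if line.strip()]

# Find the rogue user agents that actually show up in your log.
seen = set()
with open(LOG_FILE) as log:
    for line in log:
        for agent in rogue_agents:
            if agent.lower() in line.lower():
                seen.add(agent)

# Build the .htaccess conditions (entries are treated as regex fragments by Apache).
conditions = [f"RewriteCond %{{HTTP_USER_AGENT}} {agent} [NC,OR]" for agent in sorted(seen)]
if conditions:
    conditions[-1] = conditions[-1].replace(" [NC,OR]", " [NC]")  # the last condition must not use OR
    print("RewriteEngine On")
    print("\n".join(conditions))
    print("RewriteRule .* - [F,L]")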

Block only those rogue user agents which are affecting your website. Don’t try to block all known rogue user agents. This will make your .htaccess file very large and hard to manage and will impact your web server’s performance.


#6 Use the Google Analytics ‘Bot filtering’ feature ‘Exclude hits from known bots and spiders’ (under Reporting View Settings)

[Image: bot filtering setting in Google Analytics]

#7 Monitor your server logs at least once a week

Fighting bad bots starts at the server level. If you can stop them from visiting your website in the first place, you won’t need to exclude them from your GA reports later.


#8 Use a firewall – A firewall acts as a filter between your computer/web server and the internet and can protect your website from bad bots. If you work for a large organization, you are most likely already using a firewall.


#9 Take the help of your system administrator – Protecting your client’s/company’s website from malicious mischief is a 24/7 activity and is not really your job. Your system administrator, or whoever is in charge of network security, is the best person to deal with bad bot attacks. So whenever you discover a new bad bot, inform him/her.


#10 Use Google Chrome to surf the web – If you are not using a firewall, then the second best option for surfing the web is to use Google Chrome.

Chrome detects malware-deploying websites faster than any other web browser or malware scanner I know of.

So if you use Chrome, your machine is less likely to get infected when you visit a suspicious-looking website from your GA ‘Referrals’ reports.


#11 Use custom alerts to monitor unusual spikes in traffic, especially direct and referral traffic. If you are using custom alerts in GA, you can quickly detect and fix bad bot issues and thus minimize their impact.


#12 Invest in penetration testing – If bot traffic is considerably skewing your website traffic in spite of regular IP address blocking, rogue referrer blocking or changing your web host, then consider investing in penetration testing or a bot protection service.


Another article you will find useful: One tip that will Skyrocket your Analytics Career

