If your Google Analytics property is getting referrer spam, ghost spam or any other type of fake traffic, or you would like to know whether it is getting such traffic, then this article is for you.
In this article, I will show you how to minimise or even completely eliminate the negative impact of fake traffic on your GA reports.
Introduction to fake traffic in Google Analytics
In the context of Google Analytics, fake traffic is defined as one or more fake hits sent to your GA property.
A ‘hit’ is a user interaction with your website that results in data being sent to your Google Analytics property. A hit can be a ‘pageview’, ‘screenview’, ‘event’, ‘transaction’ etc.
A fake hit is one which is generated by a program or a bot, rather than by a living, breathing human being who interacted with your website.
At present, it is possible to fake any GA hit :(
That means a spammer can send fake referral traffic, fake organic traffic, fake direct traffic, fake traffic from social media etc.
With adequate knowledge of the Measurement Protocol, it is possible to inflate, deflate or even delete the sales data in any GA property.
A spammer/hacker just needs your GA property ID to do their dirty magic. They can then practically rewrite your analytics data from any location around the world without any GA account access.
This is a big data security risk which many people are not aware of. Even using the premium version of Google Analytics does not protect you from being hacked/spammed.
Who Could Possibly Benefit From Sending Fake Traffic?
Affiliates are most likely to benefit from sending fake traffic as they get a commission. Internet marketers (particularly SEOs) can also benefit from sending fake traffic.
It is not very hard to artificially inflate organic search traffic in GA and then boast about one’s marketing efforts in front of a client or boss.
In fact, any person who can benefit financially, in any shape or form, by sending fake traffic can send fake hits to your GA property.
It Is All About ‘Bots’
A bot is a program that is developed to perform repetitive tasks with a high degree of accuracy and speed.
Bots are generally used for web indexing (indexing the contents of websites).
But they are also widely used for malicious purposes like:
Commit click fraud (for increasing advertising revenue or depleting competitors’ advertising budgets)
Harvest email addresses (for mass spamming)
Create fake user accounts
Submit spam comments
Scrape website contents (for creating spam websites to host Google AdSense ads)
Spread malware (for advertising and getting ransom from webmasters)
Scrape Google Analytics IDs for sending fake traffic
Send fake website traffic etc.
Thus depending upon how a bot is used, we can have a good bot and we can have a spam bot.
An example of a good bot is ‘Googlebot’, which is used by Google to crawl and index web pages on the internet.
Good bots obey robots.txt directives but spambots don’t.
Spambots can use various methods to disguise themselves so that they can’t be easily detected by any security measure.
They can pretend to be a web browser (like Chrome, Internet Explorer, etc). They can pretend to be traffic coming from a legitimate website.
Not all spambots are developed to send fake traffic to Google Analytics.
But whether or not they skew your analytics data, they can still eat your website bandwidth and can negatively affect your website performance.
In a worst-case scenario, they can be used to hack your website or infect it with malware.
In the context of Google Analytics, there are two types of spambots:
Spambots which visit websites
Spambots which do not visit websites
Spam Bots which visit websites (the first generation bots)
These bots actually visit websites in order to send fake traffic (mainly fake referral traffic). They can crawl hundreds or thousands of websites every day and send HTTP requests to those websites with a fake referrer header.
They create and send fake referrer headers to avoid being detected as bots. The fake referrer header contains the URL of the website which the spammer wants to promote and/or build backlinks to.
For example, spambots may use ‘bbc.co.uk’ as a fake referrer.
Because the BBC is a legitimate website, you won’t think twice when you see that referrer in your report. It will not occur to you that the traffic could be fake and that no one actually visited your website from the BBC.
When your website receives an HTTP request from a spam bot with a fake referrer header, it is immediately recorded in your server log.
Some SEOs use such spambots for link building purposes.
They spam under the belief that if a server log is publicly accessible (i.e. it can be crawled and indexed by Google), then Google treats the referrer value in the server log as a backlink, thus positively influencing the search engine ranking of the website being promoted.
But I am confident that Google is smart enough to detect that what it is crawling is a log file and not a real web page, and thus to devalue all backlinks from server logs.
The first generation spambots have the ability to execute JavaScript and are thus able to avoid the bot filtering methods used by Google Analytics.
Because of this ability, you can see traffic from such spambots in your Google Analytics ‘Referrals’ reports.
For example, bots from buttons-for-website.com actually visit your website and send HTTP requests to it with a fake referrer header.
Spambots which do not visit websites (the second generation bots)
These spambots (like darodar.com) can send fake traffic even without visiting your website.
They do that by sending raw fake hit data (commonly known as ghost traffic) directly to the Google Analytics servers via the Measurement Protocol.
All they need is your GA property ID.
They can procure property IDs in two ways that I know of:
Through spam bots which crawl websites and scrape GA property IDs.
By randomly generating property IDs
People who do not use Google Tag Manager leave their Google Analytics tracking code hard coded on their web pages.
The hard coded Google Analytics tracking code contains your web property ID. This ID can be scraped by spambots and could be shared with other spambots.
There is no guarantee that the bot which scraped your web property ID and the bot which sent you fake traffic is the same bot.
For this reason, there is no guarantee that your GA property won’t get any fake traffic just because your property ID does not end in ‘1’ (a common misconception).
I have seen many GA properties receiving fake traffic even when their property IDs do not contain the number ‘1’ at the end.
You can fix this issue to an extent by using Google Tag Manager (GTM), which hides the property ID, at least from the page source code.
Since these spambots do not visit your website, their visit is not recorded in your server log.
Since their visit is not recorded in your server log, you cannot block them by any traditional method: IP blocking, user agent blocking, referrer blocking etc.
Following are some of the common characteristics of fake GA traffic:
Browser size (visible browser viewport size) is (not set)
Spammers can fake almost any field of a GA hit, but in practice they rarely bother to set the ‘browser size’. For a genuine visit, this dimension gets its value (the visible browser viewport size) from the actual browser.
The hostname is different from your domain name or (not set)
The bounce rate is either very high (close to 100% or 100%) or very low (close to 0%).
The number of goal conversions or transactions attributed to fake hits is almost always zero.
There is almost always a pattern that fake traffic follows. For example, they may all have the same geolocation, browser version, screen resolution, request URI etc.
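The characteristics above can be turned into a rough checklist. Below is a minimal sketch in Python, assuming you have exported report rows into dictionaries. The field names, the sample rows, the bounce rate thresholds and the ‘three signals’ cut-off are all assumptions made for illustration, not anything defined by Google Analytics.

```python
# Hypothetical set of hostnames you recognise as your own.
VALID_HOSTNAMES = {"www.abc.com"}


def spam_signals(row, valid_hostnames=VALID_HOSTNAMES):
    """Return the fake-traffic characteristics that a report row shows."""
    signals = []
    if row["browser_size"] == "(not set)":
        signals.append("browser size is (not set)")
    if row["hostname"] not in valid_hostnames:
        signals.append("unrecognised hostname")
    # Thresholds are assumptions: "close to 100%" or "close to 0%".
    if row["bounce_rate"] >= 99.0 or row["bounce_rate"] <= 1.0:
        signals.append("extreme bounce rate")
    if row["conversions"] == 0:
        signals.append("no conversions")
    return signals


def is_suspicious(row, min_signals=3):
    """Flag a row only when several characteristics appear together."""
    return len(spam_signals(row)) >= min_signals


# Made-up rows, shaped like an export of a GA report.
rows = [
    {"source": "darodar.com", "hostname": "(not set)",
     "browser_size": "(not set)", "bounce_rate": 100.0, "conversions": 0},
    {"source": "bbc.co.uk", "hostname": "www.abc.com",
     "browser_size": "1360x650", "bounce_rate": 42.5, "conversions": 3},
]

for row in rows:
    print(row["source"], is_suspicious(row), spam_signals(row))
```

No single characteristic proves that traffic is fake (plenty of real visits have zero conversions), which is why the sketch asks for several signals at once before flagging a row.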
Hostnames and their role in blocking ghost traffic
In the URL https://www.optimizesmart.com/, the part ‘www.optimizesmart.com’ is called the ‘hostname’.
When a user (or a spam bot which crawls websites) really visits your website from another website, the hostname (in most cases) points to your domain name.
But when a fake visit is recorded for your website by Google Analytics, the hostname is usually either blank (i.e. not set) or points to a domain name other than yours.
For example, if a user clicks a link on a page hosted on the ‘bbc.co.uk’ website and then visits your website, say ‘www.abc.com’, then Google Analytics will record and report the hostname as ‘www.abc.com’.
But in the case of a fake visit (one that only seems to be coming from the ‘bbc.co.uk’ website), Google Analytics either does not report the hostname or reports a hostname other than your website’s.
This happens because spammers generally target Google Analytics properties at random.
They generally do not know your website name (hostname) unless they are specifically targeting your website or using spam bots which crawl websites. So they either fake the hostname or leave the hostname value field blank.
Whenever Google Analytics is not able to track the hostname, it reports it as ‘(not set)’.
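To see why a ghost hit ends up without a hostname, it helps to look at the hit payload itself. The sketch below parses a Measurement Protocol style payload and reads the hostname from the ‘dh’ (document host name) or ‘dl’ (document location) parameter. It is a simplified model of my own: how Google Analytics actually derives the reported hostname is an assumption here, and the payloads and the ‘cid’ value are made up.

```python
from urllib.parse import parse_qs, urlsplit


def reported_hostname(payload: str) -> str:
    """Simplified model: read the hostname from 'dh', else from 'dl'."""
    params = parse_qs(payload)  # blank values are dropped by default
    if "dh" in params:
        return params["dh"][0]
    if "dl" in params:
        host = urlsplit(params["dl"][0]).hostname
        if host:
            return host
    return "(not set)"


# A hit sent by a real browser carries the page location.
genuine = ("v=1&tid=UA-12345-1&cid=555&t=pageview"
           "&dl=https%3A%2F%2Fwww.abc.com%2Fpricing")

# A ghost hit that only fakes the referrer and leaves the host out.
ghost = ("v=1&tid=UA-12345-1&cid=555&t=pageview"
         "&dp=%2F&dr=http%3A%2F%2Fbbc.co.uk")

print(reported_hostname(genuine))
print(reported_hostname(ghost))
```

A spammer who does not know which website a property ID belongs to has nothing sensible to put in these fields, so the hostname comes out either empty or wrong.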
That means that if you include in your GA view only the traffic from hostnames which you recognise, you can greatly minimise the impact of ghost traffic on your reports.
Any website where you are using your GA property ID (example: ‘UA-12345-1’) is a valid hostname. This can also include the domain name where you may have hosted your shopping cart. You need to identify all such valid hostnames.
One valid hostname which we always use is the one pointing to our own website.
You definitely want to keep all the traffic coming from the hostname that points to your website in your GA view.
How to block ghost traffic from Google Analytics
There are two methods to block ghost traffic from skewing your Google Analytics data:
#1 Exclude all traffic from your GA reporting view where the browser size is ‘(not set)’.
#2 Only allow that traffic to be recorded in your GA reporting view where hostname is your domain name.
#1 Exclude all traffic from your GA reporting view where the browser size is ‘(not set)’.
Follow the steps below:
Step-1: Navigate to the admin section of your GA reporting view.
Step-2: Click on the ‘Filters’ link under the ‘View’ column:
Step-3: Click on the ‘+Add Filter‘ button:
Step-4: Create a new custom filter of type ‘Exclude’ which matches the hits whose browser size is ‘(not set)’.
Step-5: Make sure this is your first filter. You can change the filter order by clicking on the ‘Assign Filter Order’ button:
#2 Only allow that traffic to be recorded in your GA reporting view where hostname is your domain name.
Follow the steps below:
Step-1: Navigate to your main GA view (the view that you regularly use for analysing your website traffic).
Step-2: Navigate to the ‘Network’ report (under ‘Audience’ > ‘Technology’):
Step-3: Click on the ‘Hostname’ primary dimension:
Step-4: Set the date range of your report to the last 3 months.
Step-5: Make a note of all of the ‘hostnames’ whose traffic you want to include in your GA view:
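Step-6: Convert the list of your hostnames into a regular expression which matches only those hostnames. You can then use it as the pattern of an ‘Include’ filter on the hostname, so that only traffic from the hostnames you recognise is recorded in your GA reporting view.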
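A minimal sketch of this conversion in Python is shown below. The hostnames are hypothetical placeholders (‘www.abc.com’ is the example website used earlier, the other two are invented), so substitute the ones you noted down from your own ‘Hostname’ report.

```python
import re


def hostname_include_pattern(hostnames):
    """Build one anchored regex that matches only the given hostnames."""
    escaped = sorted(re.escape(h.lower()) for h in set(hostnames))
    # ^ and $ stop look-alikes such as 'www.abc.com.evil.ru' from matching.
    return "^(" + "|".join(escaped) + ")$"


# Hypothetical list of valid hostnames.
valid_hostnames = ["www.abc.com", "abc.com", "checkout.abc.com"]
pattern = hostname_include_pattern(valid_hostnames)
print(pattern)

for candidate in ["www.abc.com", "(not set)", "darodar.com"]:
    print(candidate, bool(re.search(pattern, candidate)))
```

Escaping the dots matters: an unescaped ‘.’ matches any character, so ‘www.abc.com’ would also match ‘wwwxabc.com’. Test the pattern against your own report before using it in a filter, because an ‘Include’ filter that is too strict silently drops real traffic.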
Follow the steps below to find referrer spam in Google Analytics:
Step-1: Navigate to ‘Referrals’ report in your GA view:
Step-2: Change the date range of the ‘Referrals’ report to the last two months.
Step-3: Sort the report by bounce rate in descending order or you can use the following regex (not foolproof) to filter out all the spam referrers in the ‘Referrals’ report:
Step-4: Look for referrers with 100%, close to 100% or 0% bounce rate and 10 or more users/sessions. They are most likely spam referrers:
Note: Exhaustive list of spam referrers can be found here: https://perishablepress.com/blacklist/ultimate-referrer-blacklist.txt
Step-5: If you cannot confirm the identity of a suspicious-looking referrer, then you need to take the risk and visit the website to check whether it is legitimate and whether it actually links out to your website.
Make sure that you have anti-virus/anti-malware software installed on your machine before you visit such websites, as they may infect your machine as soon as you visit them.
Use the Google Chrome web browser to visit suspicious-looking websites. Chrome detects ‘malware deploying websites’ faster than any other web browser and malware scanner I know.
So if you use Chrome, your machine is less likely to get infected when you visit any suspicious-looking website listed in your GA ‘Referrals’ reports.
Step-6: Make a note of all of the spam referrers whose traffic you want to block from your Google Analytics view:
Step-7: Convert the list of your spam referrers into regular expressions. For example, if following is the list of the spam referrers you discovered:
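Here is a rough sketch of that conversion in Python, using a few referrers from the list of widely known spam referrers in this article. It splits the result into several patterns because, as far as I know, a GA view filter pattern is limited to 255 characters. Treat that limit as an assumption and check it against your own account.

```python
import re

MAX_PATTERN_LENGTH = 255  # assumed limit of a GA filter pattern


def referrer_patterns(domains, max_length=MAX_PATTERN_LENGTH):
    """Join escaped domains with '|' into patterns no longer than max_length."""
    patterns, current = [], ""
    for domain in sorted(set(domains)):
        part = re.escape(domain)
        candidate = part if not current else current + "|" + part
        if len(candidate) > max_length and current:
            patterns.append(current)  # current pattern is full, start a new one
            current = part
        else:
            current = candidate
    if current:
        patterns.append(current)
    return patterns


spam_referrers = ["buttons-for-website.com", "darodar.com",
                  "ilovevitaly.com", "priceg.com"]

for pattern in referrer_patterns(spam_referrers):
    print(len(pattern), pattern)
```

The patterns are deliberately not anchored, so they also match subdomains of the listed referrers. The price is that a short entry can match a longer, unrelated domain, so review the matches in your ‘Referrals’ report before you rely on them.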
Do not use the ‘Referral Exclusion List’ to block spam referrers. This will not solve your problem. It will just hide the problem, as the traffic from the spambot will then appear as direct traffic in your GA reports and you will no longer be able to measure the impact of spambots on your website traffic.
Use Google Analytics ‘Bot Filtering’ feature
Google Analytics has a bot filtering feature which is set up at the view level. You should enable this feature so that GA excludes all hits from known bots and spiders.
Follow the steps below:
Step-1: Navigate to ‘Admin’ section of your main GA reporting view and then click on the ‘view settings’ link:
Step-2: Scroll down the page and then select the checkbox ‘Exclude all hits from known bots and spiders‘:
Use Annotation on your Google Analytics charts to exclude bot traffic from your analysis
Create an annotation on your chart and write a note explaining what caused the unusual traffic spike. This way you can easily discount this traffic from your analysis.
If you are using custom alerts in GA, you can quickly detect and fix bad bot issues and thus minimise their impact.
List of widely known spam referrers
If one of the suspicious-looking referrers belongs to the list of websites mentioned below then it is a spam referrer and you do not need to visit the website to confirm that:
buttons-for-website.com
blackhatworth.com
7makemoneyonline.com
ilovevitaly.com
ilovevitaly.co
ilovevitaly.ru
iloveitaly.ro
priceg.com
prodvigator.ua
resellerclub.com
savetubevideo.com
screentoolkit.com
kambasoft.com
socialseet.ru
superiends.org
vodkoved.ru
o-o-8-o-o.ru
iskalko.ru
luxup.ru
myftpupload.com
websocial.me
ykecwqlixx.ru
slftsdybbg.ru
seoexperimenty.ru
darodar.com
econom.co
edakgfvwql.ru
adcash.com
adviceforum.info
hulfingtonpost.com
europages.com.ru
gobongo.info
cenoval.ru
cityadspix.com
cenokos.ru
ranksonic.info
lomb.co
lumb.co
54.186.60.77
srecorder.com
see-your-website-here.com
76brighton.co.uk
paparazzistudios.com.au
powitania.pl
sharebutton.net
tasteidea.com
descargar-musica-gratis.net
torontoplumbinggroup.com
searchenginewatch.com
Introduction to Botnet
A botnet is a network of infected computers spread in a particular geo-location or all around the world.
If a spam bot is using a botnet then it can access your website via hundreds of different IP addresses thus making IP blacklisting or rate-limiting (rate of traffic sent or received) pretty much useless.
The ability of a spam bot to skew your website traffic is directly proportional to the size of the botnet it is using.
The bigger the botnet, the more IP addresses a spambot can use to access your website without being blocked by a firewall or other traditional safety mechanisms.
Many spambots are designed to infect your computer with malware, to make your machine a part of their botnet.
Once your computer becomes part of a botnet, it is then used to forward spam, viruses and other malicious programs to other computers on the internet.
There are thousands of computers all over the world which are used by real people and which are part of a botnet.
Your own computer could be part of a botnet without you knowing about it. So if you decide to block a botnet, you will most likely also block traffic coming from real people.
If spambot traffic is considerably skewing your website traffic in spite of regular IP address blocking, then consider investing in ‘penetration testing’ or a ‘bot protection service’.
Not every website is equally affected by spambots
This is because spam bots are designed to detect and exploit a website’s vulnerabilities. They attack the weak and they attack often.
So if your website is hosted on some cheap shared hosting platform or is using a custom CMS/shopping cart, you are more likely to get attacked.
Custom CMSs and shopping carts are often not rigorously tested to find and fix the application’s vulnerabilities. So it is wise to use a reputable hosting provider, CMS and shopping cart solution.
If you are running affiliate marketing campaigns on a large scale, your website is more likely to be attacked by spambots.
So choose your affiliates wisely.
I used ‘GoDaddy’ for hosting my websites. It is not that GoDaddy is cheap or some third-class web hosting but as long as I used their service, my website was always under a constant threat from bad bots deploying malware and was compromised often.
Website security is not something which really excites me. But GoDaddy made me learn every trick in the book to fight malware. When I changed my hosting provider, all the attacks stopped.
Now I am not saying that all websites hosted by GoDaddy are vulnerable. But this was certainly the case in my situation.
So if your website is often attacked by bad bots, then changing your web host may help you.
Also, consider using a firewall. It acts as a filter between your computer/web server and the internet and can protect your website from spambots.
If you work for a large organization, you are most likely already using a firewall.
How to block traffic from spam bots which visit your website
Since the spam bot’s visit is recorded in your server log, you can block such bots through the .htaccess file (or equivalent).
Following are the various methods you can use to block traffic from spam bots which visit your website:
Block the referrer used by spambot
Block the IP addresses used by the spambot
Block the IP address range used by spambot
Block the user agents used by spambots
Method #1: Block the referrer used by Spambots
For example, if you want to block traffic from searchenginewatch.com then follow the steps below:
Step-1: Access your .htaccess file
Step-2: Add code which blocks all HTTP and HTTPS referrals from searchenginewatch.com and all subdomains of “searchenginewatch.com”.
A spambot can also create dozens of fake referrer headers. So if you block one referrer, it may send your website another fake referrer.
So whether you block the spammy referrer with a GA view filter or with .htaccess, there is no guarantee that your website has completely blocked traffic from a spambot.
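Since the list of referrers to block tends to grow, it can help to generate the rules rather than write them by hand. Below is a minimal sketch in Python that prints mod_rewrite rules for a list of referrer domains. The helper names are my own, and the output is only one common way of writing such rules, so test it on a staging site before you put it in a live .htaccess file.

```python
import re


def referrer_regex(domain):
    """Regex for HTTP/HTTPS referrals from a domain and all its subdomains."""
    return r"^https?://([^.]+\.)*" + re.escape(domain)


def htaccess_referrer_block(domains):
    """Return .htaccess lines that answer 403 Forbidden to the given referrers."""
    domains = sorted(set(domains))
    if not domains:
        # A RewriteRule without conditions would block every request.
        raise ValueError("at least one referrer domain is required")
    lines = ["RewriteEngine On"]
    for index, domain in enumerate(domains):
        # Every condition except the last one is chained with OR.
        flags = "[NC,OR]" if index < len(domains) - 1 else "[NC]"
        lines.append(
            f"RewriteCond %{{HTTP_REFERER}} {referrer_regex(domain)} {flags}"
        )
    lines.append("RewriteRule .* - [F]")
    return "\n".join(lines)


print(htaccess_referrer_block(["searchenginewatch.com"]))
```

The ‘[NC]’ flag makes the match case-insensitive and ‘[F]’ returns a 403 Forbidden response. The guard against an empty list is there on purpose, because a rule with no conditions would lock everybody out of your website.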
Method #2: Block the IP Addresses Used by Spam Bots
Access your .htaccess file and then add code like the one below:
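# Apache 2.2 syntax (on Apache 2.4 it needs mod_access_compat)
RewriteEngine On
Options +FollowSymLinks
# No space after the comma, otherwise Apache rejects the directive
Order Deny,Allow
# Placeholder address: replace it with an IP address taken from your server log
Deny from 234.45.12.33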
Note: Do not copy-paste this code into your .htaccess, it won’t work.
This is just an example to show you how to block an IP address in .htaccess file. Spambots can come from many different IP addresses.
So you would need to add all the IP addresses used by the spambots and which are affecting your website.
Tip:
Block only those rogue IP addresses which are affecting your website.
Do not try to block all known rogue IP addresses, as this will make your .htaccess file very large and hard to manage and will impact your web server performance.
If your list of blacklisted IP addresses keeps getting bigger and bigger, then you have got serious website/network security issues.
Contact your web host or system administrator. Search Google to find a list of blacklisted IP addresses.
You should automate this process by writing a script which can automatically find and ban known rogue IPs.
Method #3: Block the IP Address Range Used by Spam Bots
If you are sure that a particular range of IP addresses is being used by spam bots then you can block the whole IP address range like the one below:
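RewriteEngine On
Options +FollowSymLinks
# Evaluate Allow first and Deny last, so that Deny wins for the blocked range
Order Allow,Deny
Allow from all
# Block every address in the CIDR range
Deny from 76.149.24.0/24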
Here 76.149.24.0/24 is a CIDR range.
CIDR is a method used for representing a range of IP addresses. Here the range covers the 256 addresses from 76.149.24.0 to 76.149.24.255. Blocking by CIDR is more efficient than blocking individual IP addresses, as a single rule covers the whole range and takes less space on your server.
Note: You can convert a CIDR to an IP range and vice versa via this tool: http://www.ipaddressguide.com/cidr
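If you would rather do the conversion yourself, Python’s standard ‘ipaddress’ module can handle it. The sketch below expands the CIDR range used above, converts an IP range back into CIDR notation and collapses a handful of individual addresses into the smallest set of ranges. The individual addresses passed to ‘collapse’ are made up for the example.

```python
import ipaddress

# CIDR to IP range
network = ipaddress.ip_network("76.149.24.0/24")
print(network[0], "to", network[-1], "=", network.num_addresses, "addresses")

# Check whether an address seen in your server log falls inside the range
print(ipaddress.ip_address("76.149.24.17") in network)

# IP range to CIDR
first = ipaddress.ip_address("76.149.24.0")
last = ipaddress.ip_address("76.149.24.255")
cidrs = [str(n) for n in ipaddress.summarize_address_range(first, last)]
print(cidrs)


def collapse(ips):
    """Merge individual IP addresses into as few CIDR ranges as possible."""
    networks = [ipaddress.ip_network(ip) for ip in ips]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]


# Four neighbouring addresses become a single /30 rule.
print(collapse(["76.149.24.0", "76.149.24.1", "76.149.24.2", "76.149.24.3"]))
```

Collapsing only merges addresses which are genuinely adjacent. It never widens a range to cover addresses you did not list, so it will not block real visitors who merely share a network with a spambot.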
Method #4: Block the user agents used by spambots
Go through your server log files once a week and find and ban malicious user agents (user agents used by spambots). Blocked user agents cannot access your website.
You can block rogue user agents like the one below:
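RewriteEngine On
Options +FollowSymLinks
# Match the user agent string, ignoring case
RewriteCond %{HTTP_USER_AGENT} Baiduspider [NC]
# The substitution must be a plain hyphen (-); F returns 403 Forbidden
RewriteRule .* - [F,L]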
A simple search on Google will turn up several websites which maintain records of known rogue user agents. Use these records to identify rogue user agents on your website.
You should write a script to automate this process. Maintain a database of all known rogue user agents and then use your script to automatically identify and block user agents.
Keep your database up to date, as new rogue user agents keep popping up and old ones keep disappearing.
Block only those rogue user agents which are affecting your website. Do not try to block all known rogue user agents.
Otherwise, this will make your .htaccess file very large and hard to manage and will impact your web server performance.
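As a starting point for such a script, here is a minimal sketch in Python which scans access log lines for spam referrers and rogue user agents and counts the offending IP addresses. It assumes your server writes the Apache ‘combined’ log format. The sample log lines, IP addresses and helper names are all made up for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumes the Apache 'combined' log format.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def referrer_is_spam(referrer, spam_domains):
    """True if the referrer host is a listed domain or one of its subdomains."""
    try:
        host = urlsplit(referrer).hostname or ""
    except ValueError:  # malformed referrer value
        host = ""
    return any(host == d or host.endswith("." + d) for d in spam_domains)


def agent_is_rogue(agent, rogue_agents):
    """True if the user agent contains a listed name (case-insensitive)."""
    agent = agent.lower()
    return any(name.lower() in agent for name in rogue_agents)


def offending_ips(lines, spam_domains, rogue_agents):
    """Count requests per IP that used a spam referrer or a rogue user agent."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip lines in another format
        if (referrer_is_spam(match["referrer"], spam_domains)
                or agent_is_rogue(match["agent"], rogue_agents)):
            hits[match["ip"]] += 1
    return hits


# Made-up sample lines; in practice read them from your access log file.
SAMPLE = [
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 "http://buttons-for-website.com/" "Mozilla/5.0"',
    '203.0.113.7 - - [10/Oct/2023:13:55:40 +0000] "GET /about HTTP/1.1" 200 1200 "http://www.buttons-for-website.com/x" "Mozilla/5.0"',
    '198.51.100.23 - - [10/Oct/2023:13:56:01 +0000] "GET / HTTP/1.1" 200 2326 "-" "Baiduspider/2.0"',
    '192.0.2.44 - - [10/Oct/2023:13:57:12 +0000] "GET /pricing HTTP/1.1" 200 5120 "https://www.bbc.co.uk/news" "Mozilla/5.0"',
]

for ip, count in offending_ips(SAMPLE, ["buttons-for-website.com"], ["Baiduspider"]).most_common():
    print(ip, count)
```

The script only reports candidates. Review the list before you ban anything, since the same IP address can be shared by a spambot and by real visitors.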
Take the help of a system administrator
Protecting your client/company’s website from malicious mischief is a 24/7 activity and is not really your job.
Your system administrator or whoever is in charge of network security is the best person to deal with spambot attacks. So whenever you discover a new spam bot, inform them.
About the Author
Himanshu Sharma
Founder, OptimizeSmart.com
Over 15 years of experience in digital analytics and marketing
Author of four best-selling books on digital analytics and conversion optimization
Nominated for Digital Analytics Association Awards for Excellence
Runs one of the most popular blogs in the world on digital analytics
Consultant to countless small and big businesses over the decade