As usual, first thing in the morning I was looking into our website’s analytics.
Everything looked fine. Users, landing pages, traffic numbers.
But then I suddenly noticed something strange.
A landing page from a different domain showed up in our analytics reports.
I panicked for a couple of seconds.
What is this?
Has our site been hacked!?
Has our traffic been redirected?
How can a site from a different domain show up in our analytics?
I quickly ran through alerts, logs, WordFence status.
wpHelp24.com seemed to be clean.
Then I clicked on wptilt.com/wordpress-support/ and saw that it was an exact copy of a landing page from our site.
Here is wpHelp24.com:
Here is a stolen copy of wpHelp24.com:
Both look pretty much the same.
And then I understood.
Someone had copied our site and put it on their own domain.
They had stolen everything: pictures, HTML, CSS, all code and functionality.
And they did not even bother to remove our analytics tracking code.
So we could see the traffic to the stolen copy of wpHelp24.com in our Google Analytics reports.
Even more ridiculous was that they did not remove our Olark chat code.
So if a visitor to the stolen copy of our WordPress site wanted to chat, they would end up chatting with us.
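You can turn this accident into a routine check. Below is a minimal sketch in Python, assuming you have exported an analytics report as rows of hostname and pageviews. The names `OWN_HOSTS` and `foreign_hostnames` and the example domains are made up for illustration; they are not part of any analytics tool.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

# Hypothetical list of the hostnames that legitimately serve your site.
OWN_HOSTS = {"example.com", "www.example.com"}


def foreign_hostnames(
    rows: Iterable[Tuple[str, int]], own_hosts: Set[str] = OWN_HOSTS
) -> Dict[str, int]:
    """Sum pageviews per hostname that is not one of your own.

    `rows` is assumed to be (hostname, pageviews) pairs taken from an
    exported analytics report.
    """
    totals: Counter = Counter()
    for hostname, pageviews in rows:
        host = hostname.strip().lower()
        # Skip empty hostnames and anything served from your own domains.
        if host and host not in own_hosts:
            totals[host] += pageviews
    return dict(totals)


if __name__ == "__main__":
    report = [("example.com", 120), ("copycat.example", 7)]
    print(foreign_hostnames(report))
```

Any hostname in the result is worth a look. It may be a harmless cache or translation service, or it may be a copy of your site that still carries your tracking code.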
How someone can copy and steal your WordPress content
Stealing someone’s site is easier than you think.
If you google “data scraping tools,” you’ll get 1.6 million results.
If you have at least some technical ability, you can install an app, copy someone’s site and put it on your own domain.
There is a free Chrome extension which lets you scrape sites.
There is even a WordPress plugin for scraping.
If you have no idea how to use any of these tools, no problem.
You can hire a freelancer who will do all the work for a small fee.
Quite disturbing, isn’t it?
Guys, we have no other choice but to learn how to protect our sites.
Otherwise scrapers and other human parasites will flourish off of our hard work and effort.
5 steps to prevent your WordPress site from being copied
- Use Absolute URLs
- Use CloudFlare to prevent content scraping and hotlinking
- Use one of these three WordPress Plugins to protect your site
- Monitor if someone has copied your WordPress site
- File a DMCA complaint to take the offender down
Use Absolute URLs
What is an absolute URL?
In an absolute URL you write out the entire web address of the page you are linking to, including the domain.
In a relative URL you omit the domain and use only the part of the web address that follows it.
Here is an example of an absolute URL:
And here is an example of a relative URL:
A lot of WordPress sites use relative URLs because they are easy to code and implement.
However, if your site uses relative URLs, it becomes easy prey.
A scraper will copy it, put it on a new domain and it will just work.
No additional work required.
Great for a scraper.
Sucks for you.
So, implement absolute URLs everywhere:
- Internal Links
A scraped site with absolute URLs will not work out of the box on a new domain.
The scraper will have to replace your domain with their own in all URLs.
This can lead to a lot of manual work for a scraper.
If you use absolute URLs and prevent hotlinking, all pictures on the copied site will break.
We’ll talk about how to stop hotlinking later.
Let’s be honest.
Absolute URLs do not shield your site from being copied, but they make it so much harder to make it usable on a new domain.
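If your templates or old posts still contain relative links, a small script can rewrite them. Here is a minimal sketch in Python, assuming your HTML uses quoted `href` and `src` attributes. `BASE_URL` and `make_urls_absolute` are hypothetical names chosen for this example, and a regular expression is a shortcut here, not a full HTML parser.

```python
import re
from urllib.parse import urljoin

# Hypothetical base address; replace it with your own domain.
BASE_URL = "https://example.com/blog/"

# Matches href="..." and src="..." attributes with double or single quotes.
ATTR_PATTERN = re.compile(r'\b(href|src)=(["\'])(.*?)\2', re.IGNORECASE)


def make_urls_absolute(html: str, base_url: str = BASE_URL) -> str:
    """Rewrite relative href/src values in an HTML fragment as absolute URLs."""

    def replace(match: re.Match) -> str:
        attr, quote, value = match.groups()
        # Leave in-page anchors and non-HTTP schemes untouched.
        if value.startswith(("#", "mailto:", "tel:", "javascript:")):
            return match.group(0)
        # urljoin keeps URLs that are already absolute unchanged.
        return f"{attr}={quote}{urljoin(base_url, value)}{quote}"

    return ATTR_PATTERN.sub(replace, html)


if __name__ == "__main__":
    print(make_urls_absolute('<a href="/pricing/">Pricing</a>'))
```

After this rewrite every link and image points at your domain, so a copy that is dropped onto another domain keeps sending visitors and image requests back to you.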
On wpHelp24.com we use absolute URLs.
I went to the scraped copy of wpHelp24.com to check how it works.
The scraper did not replace our absolute URLs with theirs.
This means that if a visitor clicks a link on the scraped site, they will land on wpHelp24.com.
If you want to learn more, here is an entire video by Moz about absolute and relative URLs.
And don’t forget to use plenty of internal links.
They get you backlinks from the websites which stole your content.
If you are lucky, you can even steal their audience.
Use CloudFlare to protect your WordPress site from being copied
CloudFlare is a freemium service.
It puts a firewall around your website, stops bad bots and improves speed.
We are on a paid plan, but CloudFlare’s free plan includes most of the things I’ll describe below.
Sign up for CloudFlare or log in to your account.
Go to ScrapeShield.
And make sure all three protection shields are on.
Email obfuscation hides the email addresses on your pages from bots that harvest them.
Hotlink Protection prevents direct image loading from your site to someone else’s site.
Turn Server-side Excludes on, then put your sensitive content between the <!--sse--> and <!--/sse--> tags.
It will hide your content from automatic scrapers and bad bots.
Hotlinking means the scraped site displays your images by loading them directly from your server.
The scrapers steal both your images and your bandwidth.
To stop this, turn CloudFlare’s Hotlink Protection on.
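The general idea behind hotlink protection is a check of the Referer header that the browser sends with each image request. Here is a minimal sketch of that idea in Python. It is not CloudFlare’s implementation, and `ALLOWED_HOSTS` and `is_hotlink` are hypothetical names used only for this example.

```python
from typing import Optional, Set
from urllib.parse import urlparse

# Hypothetical list of hosts that are allowed to embed your images.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def is_hotlink(referer: Optional[str], allowed_hosts: Set[str] = ALLOWED_HOSTS) -> bool:
    """Return True when an image request comes from a page on a foreign host."""
    if not referer:
        # Some browsers and privacy tools omit the header; treat it as a direct visit.
        return False
    host = urlparse(referer).hostname
    if host is None:
        # The header could not be parsed as a URL; this sketch lets it through.
        return False
    return host not in allowed_hosts


if __name__ == "__main__":
    print(is_hotlink("https://copycat.example/wordpress-support/"))
```

A server using a check like this would refuse the image or serve a placeholder when the function returns True. Requests without a Referer header are allowed here on purpose, so that visitors who open an image directly are not blocked.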
CloudFlare did not prevent wpHelp24.com from being copied.
However, it does add an extra layer of protection.
WordPress plugins to prevent your WordPress content from being copied
Chances are that several scraper bots have already subscribed to your RSS feed.
As soon as you publish something new, they republish it on their sites.
This is not dangerous for big and established websites because search engines index their content immediately after being published.
But if you have a new site, copied content can hurt your search engine rankings.
Here is how this happens.
After you publish new content on a new site, Google will need several days to find and index it.
In the meantime a bad bot copies your new content and publishes it on its site.
If Google indexes the copied content before it finds and indexes your original content, you are in trouble.
Google might not be able to determine whose content piece is the original.
It might decide that your content is the duplicate.
Then you get the blame and low rankings.
Scraper gets the fame and all the traffic.
Let’s talk about how to prevent this.
Use the RSS feature in the WordPress SEO by Yoast plugin
I assume you have the WordPress SEO by Yoast plugin installed.
If you don’t, install it; it can really help you improve your search engine rankings.
In your WordPress dashboard go to Settings > Advanced > RSS and make sure the Yoast plugin adds links to your RSS feed.
It should look like this:
Yoast SEO plugin does not prevent your content from being copied, but it adds links back to your blog.
So both readers and search engines know whom this content belongs to.
Google will find the links to the original article and will find it much easier to decide which content is the duplicate.
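To make the idea concrete, here is a minimal sketch in Python of what such an attribution footer does to a feed item. It is not the plugin’s code. The function name `add_feed_attribution`, its parameters and the footer wording are assumptions made for this example.

```python
from html import escape


def add_feed_attribution(
    item_html: str, post_title: str, post_url: str, site_name: str, site_url: str
) -> str:
    """Append a footer to a feed item that links back to the original post and site."""
    footer = (
        f'<p>The post <a href="{escape(post_url, quote=True)}">{escape(post_title)}</a> '
        f'appeared first on <a href="{escape(site_url, quote=True)}">{escape(site_name)}</a>.</p>'
    )
    return item_html + "\n" + footer


if __name__ == "__main__":
    print(
        add_feed_attribution(
            "<p>Article text</p>",
            "How to protect your site",
            "https://example.com/protect-your-site/",
            "Example Blog",
            "https://example.com/",
        )
    )
```

A bot that republishes the feed item as is will also republish the footer, so the copy carries two links pointing back to the original.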
If you need more protection than just links back to your site, let’s look at the next plugin.
WordPress ScrapeBreaker Plugin
WordPress ScrapeBreaker Plugin does what its name says.
It protects your site from frames and server-side scraping.
Frames let someone display your content inside a frame on their own site, as if it were theirs.
ScrapeBreaker will monitor if your content is present on a different domain, and if so, will redirect all visitors back to your site.
Server-side scraping is when a server script monitors your site, automatically scrapes your content and inserts it on another site.
ScrapeBreaker will stop automatic scraping of your content.
We are testing ScrapeBreaker on our site now and will update this article as soon as we have the first results.
WP Content Protection plugin
WP Content Protection plugin claims to do a lot.
Here is the list of its free features:
- Disables right click context menu on all content (except href links)
- Disables text selection (globally) on PC and mobile devices
- Disables text and image drag/drop/save on PC and mobile devices
- Basic image protection (image link URL’s are automatically removed)
- Copy methods disabled from onscreen keyboard and shortcut context key
- Secures your uploads directory and sub-directories from public access
- Disables right click and save function on default video and audio embeds
- Disables keyboard copy controls (CTRL A, C, X) – Windows only
- Disables ‘Source view’, ‘Save Page’, and ‘Print’ key functions
Find out if your WordPress site is being copied
We use Copyscape.
It lets you find duplicate content on the web.
Simply paste the URL of the page you want to check into the search box and press Go.
Copyscape will list all web pages which have the same or similar content as your page.
The second tool we use is Google Alerts.
It is great for early detection.
Go to Google Alerts and create alerts for content you want to monitor.
- paste a passage of your content into the search box,
- choose what types of websites should be monitored,
- provide an email to which the results should be sent.
You can create as many alerts as you like and adjust the settings to be notified on a daily, weekly, or “as it happens” basis.
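Once one of these tools flags a suspicious page, you may want a rough number for how much of your text it really contains. Here is a minimal sketch in Python that compares two texts by their overlapping five-word sequences. This is a generic technique, not how Copyscape or Google Alerts work internally, and the function names `shingles` and `similarity` are chosen for this example.

```python
import re
from typing import Set, Tuple


def shingles(text: str, size: int = 5) -> Set[Tuple[str, ...]]:
    """Split text into overlapping sequences of `size` lowercase words."""
    words = re.findall(r"\w+", text.lower())
    if not words:
        return set()
    if len(words) < size:
        # Very short texts are treated as a single sequence.
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(original: str, suspect: str, size: int = 5) -> float:
    """Return the share of word sequences the two texts have in common (0.0 to 1.0)."""
    a, b = shingles(original, size), shingles(suspect, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    mine = "Stealing someone's site is easier than you think."
    theirs = "Stealing someone's site is easier than you think."
    print(similarity(mine, theirs))
```

A value close to 1.0 suggests a near-verbatim copy, while unrelated pages should score close to 0.0. Where to draw the line is a judgment call, so treat the number as a hint for a manual check.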
How to remove the stolen content from the web
If you want to remove your content from the offending websites, there are several things you can do.
First, ask the webmaster of the offending site to remove it.
This almost never works, but you should always start with this step.
Use the contact form or try to find the contact email to ask the offender to remove your content from the site.
If there is no contact form or email on the offending website, use a WHOIS lookup tool to find out who owns the domain and send them an email.
If you get no response, move on to the next step.
Send a DMCA complaint to the offender’s hosting company
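The Digital Millennium Copyright Act (DMCA) is a US copyright law that criminalizes the production and dissemination of technology, devices and services meant to circumvent copy protection, and that gives copyright owners a notice-and-takedown procedure against providers hosting infringing content.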
You can read more about it here: Digital Millennium Copyright Act.
First find out who is hosting the offending website.
You can use a tool like Whoishostingthis.com.
Simply paste the URL into the search box and it will tell you who is hosting the offender.
Then search the hosting company’s website for Infringement Notification or DMCA complaint.
Most large hosting companies provide a form for sending an infringement notification.
Fill it out and send it.
Here are links to infringement notifications of several major hosting companies:
Send a DMCA complaint to Google and Bing
You can send a DMCA complaint to Google, Bing or any other search engine and ask them to remove the stolen content from the search results.
Here is a YouTube video on how this works on Google.
Here is a link to Google’s legal page for Removing Content From Google.
And an article which explains how the removal process works: How To Issue A DMCA Takedown Notice To Google.
Filing a DMCA complaint is a long and tedious process, so you might consider using a third party for help.
One of the best-known takedown companies is dmca.com.
Do-it-yourself takedowns of stolen content cost $10 per month.
Professional takedowns start from $199.
There is no perfect solution to protect your content.
However, if you use all of the above methods, you will be pretty safe.
Guys, if you have something to add, have a question or have other ideas on how to protect a WordPress site from being copied, please use the comments.
We always respond and do our best to help.