What’s the problem?
If you only had a bricks-and-mortar business, what would you do if it was visited by 1m thieves every day? Or if two thirds of visitors to your premises were actually sent by your competitors, to deliberately block your legitimate customers? How about if the competition managed to undercut your prices every time, no matter what you did? What about finding out your customers were actually not your consumers, but third-party agents who marked up your prices before selling your products on?
If I told you that your online business was likely to be suffering similar events, chances are you wouldn’t even have realised it was happening. Many organisations are suffering the wholesale theft of web site content (known as web or screen scraping, data harvesting, or web data extraction), and don’t even know about it.
As the first IT security consultancy to identify web scraping as a business threat and develop a solution, we have been asked by all manner of businesses to help them with scrapers. If you thought the scenarios above were a bit far-fetched, they aren’t. One of our clients didn’t know that its site was being scraped a million times a day, or that 60 per cent of the traffic on its site came from web scrapers, which slowed or blocked access and caused legitimate customers to abandon transactions.
Another client hired us because it suspected that third parties were scraping its site and making purchases on behalf of consumers, which meant our client was losing sell-on opportunities and was no longer in control of the brand relationship.
When we developed our anti-web scraping solution back in 2006, nobody advertised their web scraping services, but do a search now and the majority of the 7m+ hits returned are companies offering to scrape sites for a small fee. And that number is only going to get bigger.
Some competitors scrape sites for price-sensitive information and adjust their own prices accordingly, to ensure they have the competitive edge. Other scrapers are wholesale thieves who either use your content themselves, possibly even posing as you to customers, or sell it on to others to use however they want. Others are agents or touts who sell on your goods or services at a mark-up.
Depending on your budget and approach to customer service, there are a number of options you could consider:
- Implement a CAPTCHA request for visitors before they complete transactions. However, the only person it inconveniences is the customer, as scrapers can easily bypass the system;
- Interrogate every visitor to your web site and check their IP address against a list of known offenders, assuming you have that information available, and then decide if you want them on your site;
- Increase your IT spend and buy more servers to allow the scrapers to carry on but without slowing down your legitimate customers; or
- Get an expert in.
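The second option above, checking each visitor’s IP address against a list of known offenders, might be sketched as follows. This is a minimal illustration in Python, not a description of any particular product: the `KNOWN_OFFENDERS` list and the `is_known_offender` function are hypothetical names, and the addresses are taken from ranges reserved for documentation.

```python
import ipaddress

# Hypothetical list of known offenders. These are documentation-only
# address ranges used purely for illustration; a real list would have to
# come from your own logs or a trusted source.
KNOWN_OFFENDERS = [
    ipaddress.ip_network("203.0.113.0/24"),    # a whole offending range
    ipaddress.ip_network("198.51.100.17/32"),  # a single offending address
]


def is_known_offender(visitor_ip: str) -> bool:
    """Return True if the visitor's address falls inside any listed network."""
    address = ipaddress.ip_address(visitor_ip)
    return any(address in network for network in KNOWN_OFFENDERS)


if __name__ == "__main__":
    for ip in ("203.0.113.45", "192.0.2.1"):
        verdict = "known offender" if is_known_offender(ip) else "not listed"
        print(f"{ip}: {verdict}")
```

A check like this only tells you whether an address is on the list. What you then do with the visitor is still your decision, and the check is only as good as the list behind it.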
There are subtleties and nuances in the reasons for scraping, and your decision on how to handle the problem may change depending on who is scraping you, what they are taking, when, and why. A good outcome would be a solution that allows you a flexible approach.
Sometimes you might want to let scrapers on to your site and possibly direct them to some irrelevant or misleading information, which is especially useful if it is the competition that is scraping you. Or it might be that you want to do a deal with the people behind the scraping and actually turn it into a revenue-earning situation for you both. Or you might simply want to block the scrapers and have done with the problem.
However you want to handle web scrapers, it is important for your brand and your customers that you find out if your online business is being scraped and put a stop to it.
Martin Zetterlund is founding partner of Sentor