Sentor

What's web scraping?

What's the problem?

Online businesses that depend on sharing information on their web sites should take the growing threat of web scraping seriously. This article explains why scraping is a serious threat.

Web scraping defined

Web scraping (also called screen scraping or data scraping) means copying data from a web site - manually or with a script or program - with the intention of using it for business gain, often on another web site.

Web scraping can sometimes be benevolent and totally acceptable. The web bots that search engines use to index the Internet are one example. On this web site we concern ourselves with malicious web scraping, which is carried out for commercial gain, and with how to block it.

Malicious scraping

Malicious web scraping is in fact systematic theft of intellectual property. Let's illustrate this with an example: an online directory company. An online directory publishes its intellectual property - for example directory information such as names and addresses - on the Internet. The information is free for all to use as long as they comply with the terms and conditions of the directory. This works fine until someone - a web scraper - decides to abuse the service and systematically download large amounts of information for personal gain. The result is that the directory loses control over the information it has invested time and money to gather, maintain and make available as part of its service offering.

Why web scraping hurts your business

When you are the victim of web scraping, someone steals your content and can use it for personal gain - often in direct competition with your own business. Imagine this: a scraper downloads the intellectual property on your company website and adds it to his own site. Suddenly you have competition from a company that offers the same services as you do, in addition to whatever services it offered before, but that had no costs whatsoever for gathering the material. In this case it's not hard to see why scraping can be a serious threat.

Scrapers stealing bandwidth

Web scraping is often performed by automated tools capable of downloading large numbers of web pages per second, which uses considerable bandwidth. It is not uncommon for ten percent of the traffic on a website to come from malicious scrapers. This often has two results. (1) Legitimate users get a slower website than they're used to. (2) Noticing the decrease in performance, the company being scraped may feel forced to invest in more server capacity and bandwidth - but this just helps to serve the scrapers!
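To make the idea of "large numbers of web pages per second" concrete, here is a minimal Python sketch of how a site owner might spot unusually fast clients in request data and estimate their share of traffic. It is an illustration only, not a description of any particular product or service. The function names flag_fast_clients and traffic_share, the thresholds, and the input format (pairs of client ID and timestamp, sorted by time) are all assumptions made for this example.

```python
from collections import defaultdict, deque


def flag_fast_clients(requests, max_requests=20, window_seconds=1.0):
    """Return the set of client IDs that sent more than max_requests
    requests inside any sliding window of window_seconds.

    requests: iterable of (client_id, timestamp_in_seconds) pairs,
    assumed to be sorted by timestamp. The thresholds are illustrative.
    """
    recent = defaultdict(deque)  # client_id -> timestamps still inside the window
    flagged = set()
    for client_id, ts in requests:
        window = recent[client_id]
        window.append(ts)
        # Drop timestamps that have fallen out of the sliding window.
        while ts - window[0] >= window_seconds:
            window.popleft()
        if len(window) > max_requests:
            flagged.add(client_id)
    return flagged


def traffic_share(requests, clients):
    """Return the fraction of all requests that came from the given clients."""
    requests = list(requests)
    if not requests:
        return 0.0
    hits = sum(1 for client_id, _ in requests if client_id in clients)
    return hits / len(requests)
```

A request rate alone does not prove malicious intent: search engine bots can also be fast, so a real deployment would need further checks before blocking anyone.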

Who's threatened?

All online businesses dependent on the sharing of information on their websites are threatened by scraping. Examples of these include:
  • Directories
  • Dating sites
  • Property sites
  • Betting sites
Are you unsure whether your company may be at risk? Contact Sentor, a leading expert on scraping risk assessments and the blocking of malicious scraping.

Wikipedia on Web Scraping

"In some instances, plagiarized content may be used as an illicit means to increase traffic and advertising revenue. The typical scraper website generates revenue using Google AdSense, hence the term 'Made for AdSense' or MFA website."

© Sentor 2008.