Sentor

Simplifying screen scraping in 2010

2010-01-29

With the turn of the new year, many people will take a fresh approach to aspects of their life and work. In 2010 they may also need to rethink their attitude to data and screen scraping, with some predicting that it will become a more widespread problem.

As Frank Jennings, a partner at law firm DMH Stallard, has commented: "Scraping in its various forms (screen, web, data) will increase over the next few years as traditional content distribution dies out and we are left with online-only, as is being witnessed with music CDs."

A recent poll of Next Gen Market Research Group members may indicate how important screen scraping is to businesses today. Of those surveyed, 19 per cent said they use either web or screen scraping. While this was low compared with the share of respondents using other analytical techniques or methodologies, such as data mining (59 per cent) and web analytics (52 per cent), almost a third said they would like to use either web or screen scraping in the future.

This meant that, of all the techniques cited, screen scraping was the third most desired, behind social network analysis and blog mining, each of which 38 per cent of members wanted to use.

Indeed, Next Gen market researcher Tom H C Anderson indicated that screen scraping could even rise in comparative importance. Writing in his blog, Mr Anderson said: "I believe we will see social network analysis as well as screen/web scraping surpass blog mining in the near future."

This kind of screen scraping may not prove damaging to a company, and the organisations being scraped might not be irked by it.

However, there are cases where scraping could prove to be more contentious. One issue that emerged this month surrounded search engines' ability to scrape data from other websites.

Companies such as Google could find themselves exempt under UK law from any liability for copyright infringement if an amendment to the Digital Economy Bill proposed by Conservative peer Lord Lucas is passed, reports have indicated.

"Every provider of a publicly accessible website shall be presumed to give a standing and non-exclusive licence to providers of search engine services to make a copy of some or all of the content of that website, for the purpose only of providing said search engine services," Lord Lucas said in his recommendation.

However, if the amendment were passed, it would not necessarily mean that search engines could scrape as much data as they wanted from any site they wished. Under the amendment, publishers could still choose to block search engines trying to scrape them.
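The reports do not say how a publisher would signal that it wants to block a search engine. One widely used convention, assumed here purely for illustration, is a robots.txt file that well-behaved crawlers consult before fetching pages. The sketch below uses Python's standard urllib.robotparser module to show how such a check might look; the crawler name "ExampleSearchBot", the file contents and the example.com URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a publisher might serve to opt out of one crawler
# while leaving most of the site open to others.
# "ExampleSearchBot" is a made-up crawler name, not a real search engine.
ROBOTS_TXT = """\
User-agent: ExampleSearchBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleSearchBot", "OtherBot"):
        for path in ("/listings", "/private/report"):
            url = "https://example.com" + path
            print(agent, path, may_fetch(agent, url))
```

A file like this is only a request: it stops crawlers that choose to honour it, which is why firms worried about less scrupulous scrapers may look at monitoring as well.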

Of course, for those firms unhappy about scraping activity, there are ways to stop it. One option is to head to Sentor for help.

The company has plenty of experience in the scraping world, having appeared as an expert witness and found stolen data in the past. It offers consulting services, as well as an automated anti-scraping surveillance network, that firms can use to learn more and perhaps prepare themselves for the predicted rise in screen scraping. As the saying goes, better to be safe than sorry.

Facts about web scraping

Like the evil one, data scraping has many names. Below is a list of expressions that all mean much the same as "data scraping".

  • Web scraping
  • Screen scraping
  • Page scraping
  • HTML scraping
  • Scrapping
© Sentor 2009