Does spam CAPTCHA your attention?
August 4th, 2008 | Griff Published in Technical
We all hate spam. It’s a daily frustration for most of us, and it poses an even bigger problem for businesses with an online presence, which can be both the victim of the spammers and their unknowing accomplice.
E-mail harvesting and website hacks are the main ways for spammers to get to you. For example, a plain text e-mail address on a webpage can be harvested by e-mail “bots” then added to mailing lists and sold to the highest bidder.
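To see why a plain text address is such an easy target, here is a minimal Python sketch of the kind of pattern matching a harvesting bot might run over a page. The pattern, the function name `harvest_addresses` and the sample page are illustrative assumptions, not taken from any real bot.

```python
import re

# Deliberately simple pattern for e-mail-like strings (an assumption for
# illustration; real harvesters may use looser or stricter variants).
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest_addresses(page_text):
    """Return the unique e-mail-like strings in page_text, in order of first appearance."""
    seen = []
    for match in EMAIL_PATTERN.findall(page_text):
        if match not in seen:
            seen.append(match)
    return seen


# Hypothetical page content using reserved example.com addresses.
sample_page = "<p>Contact sales@example.com or support@example.com. Again: sales@example.com</p>"
print(harvest_addresses(sample_page))
```

A dozen lines is all it takes, which is why any address left in the page source should be treated as already public.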
To stop this, developers quickly began displaying the address as a static JPEG or GIF image rather than as text. But the bot programmers responded with Optical Character Recognition (OCR), which reads the address from the image much as the human eye does.
In response, developers got smart and started to use contact forms. That was a good idea: only humans can fill in a web form, right? Wrong! It didn’t take long for the bots to get smarter. E-mail bots can now fill in those contact forms themselves.
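The reason a bot can do this is that, from the server’s point of view, a form submission is nothing more than a list of field names and values. The Python sketch below only builds such a request body and sends nothing; the field names and the helper `build_form_body` are hypothetical.

```python
from urllib.parse import parse_qs, urlencode


def build_form_body(fields):
    """Encode a dict of form fields the way a browser does for a simple POST."""
    return urlencode(fields)


# Hypothetical field names; a real form defines its own.
body = build_form_body({
    "name": "A. Bot",
    "email": "victim@example.com",
    "message": "Buy now",
})
print(body)

# The server decodes the same string back into fields and cannot tell
# whether a person or a script produced it.
print(parse_qs(body))
```

Nothing in that body says whether a person typed it, which is the gap a CAPTCHA tries to close.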
So, what’s wrong with that? If you get spammed through your website form, can’t you just delete the spam when it hits your mailbox? That will work for maybe 1 or 2 spams a day, but what about 20? 200? Once the bots have found your contact form, you will start getting hit with more offers for fake Rolex watches and Viagra than you can deal with.
What is worse, this will affect your e-mail server. If your e-mail server is a shared server from your ISP, you could find yourself in breach of their terms of service and have your site shut down. Worse still, those forms are often filled in using already harvested addresses, meaning any automated replies could be sent to innocent victims using YOUR mail server, from YOUR e-mail address. This could result in you, and even your ISP, being blacklisted, which may again put you in breach of contract. In the worst case, your ISP or third parties may seek compensation for system downtime, loss of revenue and the time spent on remedial work.
So what can be done? How can you tell the bots from real people?
It’s actually relatively easy to spot a robot. The Turing Test has been around for a while, and so far no one has made a computer that can reliably fool a human into thinking it is a person. What is needed, then, is a reverse Turing Test, which challenges anyone using the form with a question that only a human should be able to answer. You will be glad to know that many different kinds of test have been developed, and they are collectively referred to as CAPTCHA: “Completely Automated Public Turing test to tell Computers and Humans Apart”.
You may already have used a CAPTCHA. They typically take the form of a graphical rendering of a word, which the user must type in to pass the test. Early versions were easily broken using the same OCR methods as with e-mail harvesting, but the tests have quickly been made harder by deforming the words and adding extra graphics, making them hard for the OCR to decode.
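The server-side half of such a test can be sketched in a few lines of Python. This is only an outline under stated assumptions: the names `SERVER_SECRET`, `new_challenge` and `verify` are hypothetical, the image rendering and distortion step is left out entirely, and no particular CAPTCHA product is claimed to work this way.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; in practice it would come from configuration.
SERVER_SECRET = secrets.token_bytes(32)

# Characters that are easy to confuse (0/O, 1/I/L) are left out.
ALPHABET = "ABCDEFGHJKMNPQRSTUVWXYZ23456789"


def _sign(answer):
    """Return a keyed hash of the normalised answer."""
    normalised = answer.strip().upper().encode("utf-8")
    return hmac.new(SERVER_SECRET, normalised, hashlib.sha256).hexdigest()


def new_challenge(length=6):
    """Return (answer, token).

    The answer would be drawn as a distorted image for the user; the token
    would travel in a hidden form field so the server need not store the answer.
    """
    answer = "".join(secrets.choice(ALPHABET) for _ in range(length))
    return answer, _sign(answer)


def verify(user_input, token):
    """Check the typed answer against the token using a constant-time comparison."""
    return hmac.compare_digest(_sign(user_input), token)


answer, token = new_challenge()
print(verify(answer, token))      # the correct answer is accepted
print(verify("WRONG!", token))    # anything else is rejected
```

A real deployment would also need to expire each token and refuse to accept it twice; otherwise a bot could solve one challenge and replay it indefinitely.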
As a result, some CAPTCHA systems are very hard to read and have serious usability and accessibility issues, but systems such as reCAPTCHA give very usable results and offer audio versions of the code for increased accessibility.
Many ISPs are now making it mandatory for CAPTCHA systems to be installed on any web form they host, and with the risks discussed above, it’s clear why. No ISP wants to be blacklisted or accused of spamming, and they certainly don’t want the performance of their servers compromised either.
This is why CAPTCHA systems are here to stay, and anyone not taking on the new technology will be left behind … still clearing their inbox, probably.








