CAPTCHAs (Completely Automated Public Turing tests to tell Computers and Humans Apart) are being used to transform old texts riddled with smudges, crooked type, and other distortions into searchable files. Optical character recognition (OCR) software often cannot decipher the damaged words in such texts, so human transcribers are enlisted.
Carnegie Mellon University researcher Luis von Ahn devised a method to recruit people solving CAPTCHAs on sites such as Ticketmaster and Facebook to correct such textual errors, replacing the randomly generated CAPTCHAs with words in need of clarification. Von Ahn estimates that his system, reCAPTCHA, is used by 70% to 90% of sites that have CAPTCHAs.
Two distinct OCR programs scan a photographic image of the text, and reCAPTCHA labels as suspicious any word that the two programs decipher differently or that does not appear in an English dictionary. Each suspicious word is converted to a CAPTCHA and paired with a second CAPTCHA whose correct reading is already known. Several Web users seeking entry to secure sites are then shown both words and asked to decipher each one. Answers for the unknown word are compared with the OCR guesses and the context analysis, and once the system is satisfied that an answer is correct, the process ends for that word.
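The flagging and checking steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not reCAPTCHA's actual implementation: the function names, the half-vote weight given to each OCR guess, and the `min_score` threshold are all hypothetical, and the context analysis mentioned in the article is left out.

```python
from collections import Counter


def find_suspicious(ocr_a, ocr_b, dictionary):
    """Return indices of words the two OCR passes disagree on,
    or that are not in the dictionary."""
    suspicious = []
    for i, (a, b) in enumerate(zip(ocr_a, ocr_b)):
        if a.lower() != b.lower() or a.lower() not in dictionary:
            suspicious.append(i)
    return suspicious


def resolve_word(responses, control_truth, ocr_guesses, min_score=3.0):
    """Pick a reading for an unknown word, or return None if undecided.

    responses: list of (control_answer, unknown_answer) pairs, one per user.
    Only users who typed the known control word correctly get a vote.
    Assumption: each OCR guess counts as half a vote (illustrative weight).
    """
    votes = Counter()
    for guess in ocr_guesses:
        votes[guess.lower()] += 0.5
    for control_answer, unknown_answer in responses:
        if control_answer.strip().lower() == control_truth.lower():
            votes[unknown_answer.strip().lower()] += 1
    if not votes:
        return None
    word, score = votes.most_common(1)[0]
    return word if score >= min_score else None


if __name__ == "__main__":
    dictionary = {"the", "quick", "brown", "fox"}
    ocr_a = ["the", "qu1ck", "brovvn", "fox"]
    ocr_b = ["the", "quick", "brovvn", "fox"]
    print(find_suspicious(ocr_a, ocr_b, dictionary))

    responses = [
        ("morning", "quick"),
        ("morning", "quick"),
        ("mornin", "quack"),  # failed the control word, so ignored
        ("morning", "quick"),
    ]
    print(resolve_word(responses, "morning", ["qu1ck", "quick"]))
```

Pairing each unknown word with a control word is what lets the system trust an answer it cannot verify directly: a user who reads the known word correctly is presumed to be a human reading carefully, so that user's reading of the unknown word is counted.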
From The New York Times
Abstracts Copyright © 2011 Information Inc., Bethesda, Maryland, USA