Using Machine Learning to Break Visual Human Interaction Proofs (HIPs)

Machine learning is often used to automatically solve human tasks. In this paper, we look for tasks where machine learning algorithms are not as good as humans with the hope of gaining insight into their current limitations. We studied various Human Interactive Proofs (HIPs) on the market, because they are systems designed to tell computers and humans apart by posing challenges presumably too hard for computers. We found that most HIPs are pure recognition tasks which can easily be broken using machine learning. The harder HIPs use a combination of segmentation and recognition tasks. From this observation, we found that building segmentation tasks is the most effective way to confuse machine learning algorithms. This has enabled us to build effective HIPs (which we deployed in MSN Passport), as well as design challenging segmentation tasks for machine learning algorithms.

[1]  Henry S. Baird,et al.  BaffleText: a Human Interactive Proof , 2003, IS&T/SPIE Electronic Imaging.

[2]  Patrice Y. Simard,et al.  Best practices for convolutional neural networks applied to visual document analysis , 2003, Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings..

[3]  Jitendra Malik,et al.  Recognizing objects in adversarial clutter: breaking a visual CAPTCHA , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[4]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[5]  Kris Popat,et al.  Human Interactive Proofs and Document Image Analysis , 2002, Document Analysis Systems.

[6]  K. S. Baird,et al.  Anatomy of a versatile page reader , 1992, Proc. IEEE.

[7]  A. M. Turing,et al.  Computing Machinery and Intelligence , 1950, The Philosophy of Artificial Intelligence.