The University of Sheffield
Department of Computer Science

Simon Turner Undergraduate Dissertation 2016/17

Cracking CAPTCHAs

Supervised by R. Clayton

Abstract

CAPTCHAs are a key part of preventing spam by non-human users (bots) on the web. They aim to distinguish humans from bots by presenting challenges that a human can complete easily but a bot cannot. Many existing CAPTCHA variants have already been shown to be ineffective at this task, but little research has addressed more recent CAPTCHAs, such as those produced by Google's reCAPTCHA project. These newer CAPTCHA schemes rely on software being unable to solve image or audio recognition tasks, but advances in machine learning have shown that similar tasks are increasingly feasible for a machine.

This project aims to use modern machine learning techniques, combined with browser automation software, to prove or disprove the hypothesis that "Google's reCAPTCHA test is unsolvable by software." To test this hypothesis, the project sets out to construct a piece of software that can solve Google's reCAPTCHA.