Cracking Maths CAPTCHAs

Following on from a previous blog post about cracking weak CAPTCHA implementations, this time I played around with solving a more novel type of CAPTCHA: one that offers a mathematical challenge rather than the distorted words that we've all grown accustomed to.

About maths CAPTCHAs

The most popular maths CAPTCHA (with over 300,000 active installs) comes in the form of a WordPress plugin that challenges the requester with a simple textual calculation.

The decision to use a textual challenge by default, whilst good for accessibility, weakens the CAPTCHA's resilience. In an attempt to mitigate this, the CAPTCHA has the following tricks up its sleeve:

  • Calculations consist of both numbers and words.
  • Part of the calculation is missing, either a component or the result.

From the user's perspective, they are presented with a calculation that mixes numbers and words and has one piece missing, and they are asked to provide that missing piece.

CAPTCHA used on a WordPress login

Solving the CAPTCHA automatically

Because the CAPTCHAs use a textual challenge, they can be easily parsed and then solved by a script. To prove this I have written a simple Python script to solve maths CAPTCHAs.

Usage

$> python maths_captcha_solver.py "http://example.org/wp-login" "container_class_name"
Found CAPTCHA: 3 - = 1
Solution: 2

Process

The process for cracking such a simple CAPTCHA consists of the following short steps:

  1. Extract the CAPTCHA's text from the web-page.
  2. Convert numeric words to numbers.
  3. Extract the parts of the calculation.
  4. Pass the first component, the second component and the result into the function that handles the calculation's operator.
  5. The function then returns the missing part of the calculation.
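Steps 2 to 5 can be sketched in a few lines of Python. This is a minimal illustration rather than the script mentioned above: the names `WORDS`, `OPERATORS`, `to_number` and `solve` are hypothetical, the list of number words and operator symbols is an assumption about what the plugin emits, and division is assumed to be exact.

```python
import re

# Assumed vocabulary: the plugin's actual list of number words may differ.
WORDS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}

# Assumed operator symbols, mapped onto a canonical ASCII form.
OPERATORS = {"+": "+", "-": "-", "−": "-", "*": "*", "×": "*", "/": "/", "÷": "/"}


def to_number(token):
    """Step 2: convert a digit string or a number word to an int."""
    token = token.strip().lower()
    if token in WORDS:
        return WORDS[token]
    return int(token)


def solve(challenge):
    """Steps 3 to 5: split the calculation and return its missing piece."""
    left, _, result = challenge.partition("=")
    op_match = re.search(r"[+\-−*×/÷]", left)
    if op_match is None:
        raise ValueError("no operator found in %r" % challenge)
    op = OPERATORS[op_match.group()]

    # An empty slot means that part of the calculation is the missing one.
    first = left[:op_match.start()].strip()
    second = left[op_match.end():].strip()
    result = result.strip()
    a = to_number(first) if first else None
    b = to_number(second) if second else None
    r = to_number(result) if result else None
    if [a, b, r].count(None) != 1:
        raise ValueError("expected exactly one missing part in %r" % challenge)

    if r is None:  # result missing: evaluate a op b
        return {"+": a + b, "-": a - b, "*": a * b, "/": a // b}[op]
    if b is None:  # second component missing
        return {"+": r - a, "-": a - r, "*": r // a, "/": a // r}[op]
    # first component missing
    return {"+": r - b, "-": r + b, "*": r // b, "/": r * b}[op]


print(solve("3 - = 1"))
print(solve("8 + eight ="))
```

Keying the inverse operation on which slot is empty keeps each operator's logic in one place; a real solver would also need to guard against division by zero and number words outside the assumed table.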

Circumventing the CAPTCHA

Specific to the aforementioned WordPress plugin is the ability to circumvent its challenges by submitting the same variables as a previously solved CAPTCHA. The process is as follows:

  1. The challenge is presented to the user, accompanied by the inputs shown in the HTML below.
  2. The user/script replaces the values of the input elements with those of a previously solved CAPTCHA.
  3. The user/script submits the CAPTCHA.
    • The server will blindly accept the CAPTCHA as being correct. It does this because it doesn't keep any information about the specific CAPTCHA it serves to each user; instead it puts that information in the HTML so that it is posted back to it.

There is a slight caveat to the process: the values being resubmitted need to be renewed every 24 hours, because the server-side 'password' (used to create the 'cptch_result' value) changes once a day.

<p class="cptch_block">
  <!-- The resulting hash of the solution, salt and secret server-side 'password' -->
  <input type="hidden" value="vZ7n" name="cptch_result">

  <!-- The salt used when producing the hash above -->
  <input type="hidden" value="1426868778" name="cptch_time">

  <input type="hidden" value="Version: 4.0.9"> 8 + eight =

  <!-- Solution to the CAPTCHA provided by the user -->
  <input type="text" name="cptch_number" class="cptch_input" id="cptch_input">
</p>
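To illustrate why a stateless check like this accepts replayed values, here is a small self-contained simulation. It is a sketch under assumptions, not the plugin's code: HMAC-SHA256 stands in for whatever hash the plugin really uses, and `SECRET`, `make_token`, `issue_challenge` and `verify` are hypothetical names. Only the field names 'cptch_result', 'cptch_time' and 'cptch_number' come from the HTML above.

```python
import hashlib
import hmac

# Hypothetical server-side 'password'; per the caveat above it changes daily.
SECRET = b"server-side-password"


def make_token(solution, salt, secret=SECRET):
    """Stand-in for the plugin's hash of solution, salt and secret."""
    message = f"{salt}:{solution}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def issue_challenge(solution, salt):
    """The hidden inputs the server embeds in the form."""
    return {"cptch_result": make_token(solution, salt), "cptch_time": str(salt)}


def verify(form, secret=SECRET):
    """Stateless check: everything it needs arrives in the posted form."""
    expected = make_token(form["cptch_number"], form["cptch_time"], secret)
    return hmac.compare_digest(expected, form["cptch_result"])


# A CAPTCHA that was solved earlier ("8 + eight =", answer 16).
solved = issue_challenge(solution=16, salt=1426868778)
solved["cptch_number"] = "16"

# A fresh challenge with a different answer that we never solve.
fresh = issue_challenge(solution=2, salt=1426955178)

# Overwrite the fresh hidden inputs with the previously solved ones.
replayed = {**fresh, **solved}

print(verify(replayed))                              # accepted while the secret is unchanged
print(verify(replayed, secret=b"next-day-password")) # rejected once the secret rotates
```

Because `verify` never records which challenge it issued or whether a token has already been used, any previously valid triple keeps passing until the secret changes.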