Story · 25 May 2026 · 7 min read

13 minutes against OWASP Juice Shop: anatomy of a real Pwnkemon scan

A pentest tool that hasn't been pointed at a real, known-vulnerable target is just marketing. So here's Pwnkemon pointed at one: OWASP Juice Shop, the security community's standard intentionally-vulnerable web app. We ran a Standard-tier scan against the public Heroku deployment, didn't pre-load any hints, and let the agent decide what to look for.

Total time: 13 minutes. Total cost: under $1 in compute. Result: 11 confirmed findings, 2 of them Critical. The highlights are a SQL injection that took the application offline and an exposed JWT key that could let an attacker forge admin sessions.

This post walks through what the agent did, in order, and what made it into the final report.

The setup

Scan target: juice-shop.herokuapp.com, the public OWASP Juice Shop instance. Tier: Standard, the default tier, which unlocks DAST tooling (nuclei, katana) on top of the network + code scanners that ship in every tier. No pre-configured hints. No auth credential. Same flow any Pwnkemon user gets on signup.

The agent had access to its full primitive set:

nmap, TCP port discovery + service version detection
httpx (ProjectDiscovery), HTTP/TLS fingerprinting
nuclei (ProjectDiscovery), ~13,000 templates covering CVEs, exposures, misconfigurations
katana (ProjectDiscovery), headless Chromium crawler for JS-rendered surfaces
http_request / http_path_probe / tls_*, primitives the agent uses to craft custom probes
A live CVE database lookup

The agent picks which tools to run, in what order, based on what it finds. We don't script the chain. That's the whole point.

What the agent did, in order

Reading from the scan's audit trail (every Pwnkemon scan produces one, retrievable from the dashboard):

DNS lookup + TLS certificate inspection on the target hostname, established the cert chain, the Heroku-fronted infrastructure, and the absence of HSTS.
nmap port discovery against the resolved host, found 80 and 443 open. Service-version pass identified the application as Express on Node.js, fronted by Heroku's router.
httpx fingerprint, pulled the default landing page, identified the app stack as Juice Shop / Express, captured the response headers (no HSTS, no CSP, no X-Frame-Options).
katana crawl at depth 2, enumerated the JS-rendered SPA surface, returning ~80 URLs including the /rest/* API endpoints, the /api/* admin paths, and the static-asset directories.
nuclei scan against high-priority URLs with the exposure and misconfig tags, this is where the directory listings on /ftp/, /encryptionkeys/, and /support/logs got flagged.
Targeted HTTP probes the agent crafted itself based on the crawl results, checking the admin-config endpoint, the Prometheus metrics endpoint, and submitting SQL injection payloads against the search and login endpoints.
TLS protocol enumeration via nmap --script ssl-enum-ciphers, mapped the supported cipher suites + protocol versions, surfaced the lack of TLS 1.3 and the presence of non-forward-secret RSA cipher suites.
Verification iteration, the agent went back and re-issued benign requests after the SQL injection probes, confirming the application was still in an unrecoverable 503 state and distinguishing “the request errored” from “we crashed the entire app.”

Then it called complete_scan, the triage LLM wrote the report, the box agent submitted the result, and the scan was done.
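The header observations from the fingerprinting step (no HSTS, no CSP, no X-Frame-Options) come down to a simple presence check. Here is a minimal Python sketch, assuming the response headers are available as a plain dictionary. The function name and the header list are illustrative, not Pwnkemon's implementation:

```python
# Headers the scan reported as absent on the target.
EXPECTED_HEADERS = (
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Frame-Options",
)


def missing_security_headers(response_headers: dict[str, str]) -> list[str]:
    """Return the expected security headers that a response does not set.

    HTTP header names are case-insensitive, so the comparison is done
    in lowercase.
    """
    present = {name.lower() for name in response_headers}
    return [h for h in EXPECTED_HEADERS if h.lower() not in present]
```

A response carrying only Content-Type would come back with all three names, which matches what the scan recorded for this target.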

The headline findings

Critical: SQL Injection → full DoS

The /rest/products/search endpoint accepts a q= parameter that's interpolated directly into a SQLite query. Submitting q='))-- was enough to crash the Node.js process. The Heroku app stayed at 503 for the rest of the scan: one unauthenticated GET request took the entire application offline for every user.
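To make "interpolated directly into a query" concrete, here is a small Python sketch using the standard-library sqlite3 module. The table, columns, and query shape are assumptions for illustration, not Juice Shop's actual code, and the toy does not reproduce the crash. It only shows that the same payload rewrites the query when interpolated and stays inert when bound as a parameter:

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a tiny in-memory catalogue (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE products (name TEXT, description TEXT, deleted_at TEXT)"
    )
    db.executemany(
        "INSERT INTO products VALUES (?, ?, ?)",
        [
            ("Apple Juice", "fresh", None),
            ("Old Stock", "withdrawn", "2020-01-01"),
        ],
    )
    return db


def search_interpolated(db: sqlite3.Connection, q: str) -> list[str]:
    # UNSAFE: the search term becomes part of the SQL text itself.
    sql = (
        f"SELECT name FROM products WHERE ((name LIKE '%{q}%' "
        f"OR description LIKE '%{q}%') AND deleted_at IS NULL) ORDER BY name"
    )
    return [row[0] for row in db.execute(sql)]


def search_bound(db: sqlite3.Connection, q: str) -> list[str]:
    # SAFE: the search term travels as data through ? placeholders.
    sql = (
        "SELECT name FROM products WHERE ((name LIKE ? "
        "OR description LIKE ?) AND deleted_at IS NULL) ORDER BY name"
    )
    pattern = f"%{q}%"
    return [row[0] for row in db.execute(sql, (pattern, pattern))]
```

In this toy, the interpolated version turns '))-- into a closing quote, two closing parentheses, and a comment that discards the rest of the statement, so the deleted_at filter disappears. The bound version just searches for a product whose name contains those five characters.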

The agent didn't just find this and move on. It tried the same pattern against /rest/user/login (same result), and then went back to confirm the outage was global and persistent, not just a transient error. That confirmation step is what makes the finding actionable: it documents the exploit as both a data-extraction primitive (UNION-based dumping is possible) AND a one-request DoS primitive, and it makes the criticality unambiguous.
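The distinction between a one-off error and a persistent outage can be sketched as a small classifier over follow-up probes. This is an illustrative Python sketch, not the agent's actual logic. The function name, the labels, and the three-probe minimum are all assumptions:

```python
def classify_outage(statuses: list[int], min_probes: int = 3) -> str:
    """Classify follow-up probe results after a suspicious request.

    statuses: HTTP status codes from benign requests, in the order sent.
    Returns "healthy", "transient", "persistent", or "inconclusive".
    """
    if len(statuses) < min_probes:
        # Too few probes to separate a blip from a real outage.
        return "inconclusive"
    failures = [code >= 500 for code in statuses]
    if not any(failures):
        return "healthy"
    if all(failures):
        # Every benign request failed: the app never came back.
        return "persistent"
    # Some probes failed and some succeeded.
    return "transient"
```

Only the "persistent" outcome supports the claim that a single request took the service down, which is why the follow-up probes have to be benign and spread over time.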

Critical: Persistent service outage

Same root cause, separate severity axis. The SQLi above isn't just an attack on data; it's an attack on availability. Any unauthenticated visitor with one HTTP request can take the app offline indefinitely. That's a censorship vector. That's an extortion vector. It deserves a separate finding row, not a footnote.

High: JWT signing key publicly readable

The directory /encryptionkeys/ is served by a misconfigured static handler that returns a full directory listing. Inside is jwt.pub, the RSA public key the application uses to verify its JWTs. A public key is not secret by design, but the agent flagged the exposure and connected it to a known attack: if the verifier accepts whatever algorithm the token header names, an attacker can mount the RS256-to-HS256 algorithm-confusion attack, using the public key as an HMAC secret to forge admin tokens. Libraries and configurations that don't pin the expected algorithm are vulnerable. Combined with the SQLi-derived password hashes, this gives an attacker two independent unauthenticated paths to administrative access.
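Why does a public key matter here? The sketch below shows the confusion in miniature, using only the Python standard library. The key bytes are a placeholder, the helper names are hypothetical, and the "naive" verifier stands in for any verifier that lets the token header choose the algorithm. It is not Juice Shop's code:

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims: dict, key: bytes) -> str:
    """Build an HS256 token; anyone holding `key` can do this."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = hmac.new(key, signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url(signature)}"


def naive_verify(token: str, key: bytes) -> bool:
    """UNSAFE: trusts the token's own header to pick the algorithm."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    alg = json.loads(b64url_decode(header_b64)).get("alg")
    if alg == "HS256":
        signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
        expected = hmac.new(key, signing_input, hashlib.sha256).digest()
        return hmac.compare_digest(expected, b64url_decode(signature_b64))
    return False  # RS256 branch omitted: it needs a crypto library


def pinned_alg_ok(token: str, expected_alg: str = "RS256") -> bool:
    """Defensive check: reject tokens that name any other algorithm."""
    header_b64 = token.split(".")[0]
    return json.loads(b64url_decode(header_b64)).get("alg") == expected_alg
```

A token signed with sign_hs256 using the public key bytes passes naive_verify when the server feeds it that same key, and fails pinned_alg_ok. Pinning the expected algorithm on the server side is what closes the gap.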

High: Admin configuration endpoint exposed

/rest/admin/application-configuration returns the full app config, no auth required, including LLM chatbot configuration, feature flags, and internal port numbers. Reconnaissance gold.

High: Prometheus metrics + log directory + FTP listing

Three separate findings, same root cause: missing auth on directories the static file server happily exposes. /metrics reveals operational telemetry (1,894 5xx responses recorded). /support/logs is a directory listing of application log files with stack traces and user activity. /ftp/ is a directory listing including a file marked “CONFIDENTIAL: do not distribute” containing acquisitions strategy.
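An exposed /metrics endpoint is easy to mine. As a rough Python sketch, the function below totals 5xx responses from Prometheus text-format output. The metric name http_requests_count and the status_code label are assumptions chosen for illustration, and the sample values in the usage note are invented. Real exporters name these differently:

```python
import re

# Matches sample lines such as: http_requests_count{status_code="503"} 7
SAMPLE_LINE = re.compile(
    r'^(?P<name>[A-Za-z_:][\w:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)'
)
STATUS_LABEL = re.compile(r'status_code="(\w+)"')


def count_5xx(metrics_text: str, metric: str = "http_requests_count") -> int:
    """Sum the samples of `metric` whose status_code label starts with 5."""
    total = 0
    for line in metrics_text.splitlines():
        if line.startswith("#"):
            continue  # HELP and TYPE comment lines
        match = SAMPLE_LINE.match(line)
        if not match or match.group("name") != metric:
            continue
        status = STATUS_LABEL.search(match.group("labels"))
        if status and status.group(1).startswith("5"):
            total += int(float(match.group("value")))
    return total
```

Given two samples labelled 500 and 503 with values 3 and 7, the function returns 10. An attacker reading the live endpoint gets the same operational picture the operators do.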

What didn't make the report (and why)

The agent ran nuclei against the target with the exposure and misconfig tags. Nuclei produced more raw findings than ended up in the report, about a dozen low-impact informational signals that the triage step deduplicated or dropped. Examples:

Detection-only templates that fingerprint the framework (“this is Express”), rolled into the Server Technology Disclosure low-severity finding rather than each getting their own row.
Generic “missing security header” templates for headers that aren't HSTS or CSP, the actually-impactful ones were consolidated into one medium-severity HSTS finding.
Information-level templates flagging public files (robots.txt, favicon, etc) with no security relevance, dropped.

This deduplication is the point. A raw nuclei dump on Juice Shop produces ~30 lines of output. A useful security report produces 11. The difference is the LLM doing the work a human pentester would otherwise do: which of these are actually worth your time?
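The mechanical half of that consolidation can be sketched in a few lines of Python. This assumes each raw finding has already been labelled with a report group, or with None to drop it. That labelling is the judgment call the triage step makes, and it is not shown here. The field names and template names are hypothetical:

```python
from collections import defaultdict

SEVERITY_ORDER = ["info", "low", "medium", "high", "critical"]


def consolidate(raw_findings: list[dict]) -> list[dict]:
    """Merge raw scanner hits into one report row per group.

    Findings whose "group" is None are dropped. Each row takes the
    highest severity among its members and lists their templates.
    """
    groups: dict[str, list[dict]] = defaultdict(list)
    for finding in raw_findings:
        if finding["group"] is None:
            continue
        groups[finding["group"]].append(finding)

    report = []
    for title, members in groups.items():
        top = max(members, key=lambda f: SEVERITY_ORDER.index(f["severity"]))
        report.append(
            {
                "title": title,
                "severity": top["severity"],
                "evidence": [m["template"] for m in members],
            }
        )
    return report
```

Two framework-fingerprint hits grouped under "Server Technology Disclosure" collapse into one row, while a robots.txt hit with group None disappears from the report entirely.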

Attack chains the agent identified

Findings in isolation aren't the whole story. The report's value is in connecting them:

JWT key → algorithm confusion → admin takeover. /encryptionkeys/jwt.pub gives the attacker the RSA public key. Sign a JWT with it using HS256 (treating the public key as an HMAC secret), impersonate the admin account, hit /rest/admin/*. No credentials needed.
SQLi → database dump → offline cracking. The /rest/products/search injection allows UNION-based extraction of every table. Pull the user table, get email + password hash pairs, crack offline, log in.
FTP directory listing → sensitive document exfiltration. The /ftp/acquisitions.md file is in clear text under a public path. Anyone with a browser has it.
Single SQLi request → total outage. The DoS primitive is independent of the data-extraction one. Even an attacker who couldn't exfiltrate anything can keep the service down indefinitely.

What this demonstrates

Three things, in order of importance:

1. The DAST toolchain is real. nmap, nuclei, httpx, katana are the same tools a paid DAST product runs. Same versions. Same templates. The differentiation isn't that we run different scanners, it's that an LLM agent decides which to run, in what order, on what evidence.

2. The agent reasons, not just executes. The SQLi finding's severity-as-DoS-as-well framing wasn't from a scanner template. The JWT-key-to-admin chain wasn't in any tool's output. Both came from the triage step looking at the raw findings and asking: given these primitives, what would a competent attacker actually do?

3. The price floor is the point. The same scan, done by a human pentest consultancy, would have been billed at $200/hr and taken three business days. The same scan, done by Acunetix or Burp Suite Enterprise, would have produced a longer list of less-prioritised findings and required you to read every one to decide which to fix first. The Pwnkemon scan cost under $1 of compute, ran in 13 minutes, and produced a report you can hand to a developer.

Try it yourself

Juice Shop is public. Anyone can scan it. Sign in, verify juice-shop.herokuapp.com as a target (you don't need to own it, OWASP authorises Juice Shop for unrestricted security testing), and run a Standard scan. The report you get back will look very similar to the one above.

Then point Pwnkemon at your infrastructure. The agent doesn't care whether the target is intentionally vulnerable or your production app: the same primitives chain, the same triage step runs, and the same report shape comes out.

See pricing for plans, or scan tiers for the full per-tier toolchain breakdown.