Prancer Blog / SwarmHack Deep Dive
Inside SwarmHack: Anatomy of a Single-Command Kill Chain
How one swarmhack spawn command orchestrates 32 AI agents through reconnaissance, exploitation and lateral movement against a segmented Docker lab.
SwarmHack Team · 2026-04-15 · 8 min
TL;DR
- How a single swarmhack spawn command runs a full external + internal pentest
- The lab topology that proves segmentation is no defense against autonomous agents
- The numbers: 11 findings · 35 crown jewels · 6m 7s — reproducible across 11 consecutive runs
SwarmHack is a penetration testing tool, not a vulnerability scanner. The distinction matters: a scanner enumerates every weakness and produces a CVSS-sorted report. A pentest finds one way in, exploits it deep, pivots, escalates, reaches the crown jewels — then stops.
This post is the first of a three-part series walking through a real validated engagement against a multi-host Docker lab.
1. One Command, Full Kill Chain
swarmhack spawn --target http://localhost:8880 \
--token $PRANCER_TOKEN --customer $PRANCER_CUSTOMER
That command — and only that command — produces:
| Metric | Value |
| -------- | ------- |
| Total findings | 11 |
| Crown jewels extracted | 35 |
| Targets compromised | 2 (external + internal via pivot) |
| Wall-clock time | 6 minutes 7 seconds |
| Human intervention | Zero |
| Reproducibility | 11/11 consecutive runs identical |
Each agent stops on first confirmed vulnerability, then shifts into deep exploitation — extracting data, harvesting credentials, mapping crown jewels — instead of scanning every endpoint for the same class. Fewer findings, higher confidence, deeper impact.
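The stop-on-first-confirmation behavior can be sketched in a few lines. This is a minimal illustration, not SwarmHack's implementation: `first_confirmed`, `run_agent`, `fake_probe` and `fake_exploit` are hypothetical names, and the probe and exploit steps are stand-ins.

```python
from typing import Callable, Iterable, Optional


def first_confirmed(
    endpoints: Iterable[str],
    probe: Callable[[str], bool],
) -> Optional[str]:
    """Return the first endpoint where probe() confirms the vulnerability."""
    for endpoint in endpoints:
        if probe(endpoint):
            return endpoint  # stop here; remaining endpoints are not probed
    return None


def run_agent(endpoints, probe, exploit):
    """Probe until the first confirmation, then hand off to deep exploitation."""
    hit = first_confirmed(endpoints, probe)
    return exploit(hit) if hit is not None else []


# Hypothetical stand-ins for a real probe and exploitation routine.
probed = []


def fake_probe(endpoint: str) -> bool:
    probed.append(endpoint)
    return endpoint == "/ping.php"


def fake_exploit(endpoint: str) -> list:
    return [f"crown jewel from {endpoint}"]


jewels = run_agent(
    ["/search.php", "/ping.php", "/admin.php"], fake_probe, fake_exploit
)
```

In this sketch `/admin.php` is never probed: once `/ping.php` confirms, the agent spends its remaining effort on exploitation rather than on more scanning.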
2. Lab Architecture
A purpose-built Docker Compose lab with enforced network segmentation: one bridge network for internet-facing traffic, and a second one marked internal: true so Docker drops all outbound packets.
<diagram title="Network topology — two-network segmentation">
ATTACKER
│
┌────────┴────────┐ external_net (172.20.0.0/24)
│ │
┌──┴───────────────┐ │
│ TARGET A │ │ ← dual-homed pivot
│ 172.20.0.10 │ │ Apache + PHP + MySQL + SSH
│ 172.20.1.10 ────┼─┼──┐
└──────────────────┘ │ │
│ │ internal_net (172.20.1.0/24)
│ │ Docker drops all outbound
│ │
┌──┴──┴────────────┐
│ TARGET B │ ← unreachable from outside
│ 172.20.1.20 │ DVWA, no port mapping
└──────────────────┘
</diagram>
- external_net: standard bridge, reachable from the host via ports 8880 (HTTP), 2222 (SSH), 33060 (MySQL).
- internal_net (internal: true): Docker blocks every packet that tries to leave. The only path in is *through* Target A.
That's the segmentation a real network gives you. The whole point of this engagement is showing how an autonomous tool routes around it.
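To make the reachability argument concrete, here is a small model of the two networks using Python's standard `ipaddress` module. The subnets and addresses come from the topology above; the `NETWORKS` / `HOSTS` layout and the function names are assumptions made for illustration, and the model ignores port mappings entirely.

```python
import ipaddress

# Values taken from the lab description; the data layout itself is an assumption.
NETWORKS = {
    "external_net": {"subnet": ipaddress.ip_network("172.20.0.0/24"), "internal": False},
    "internal_net": {"subnet": ipaddress.ip_network("172.20.1.0/24"), "internal": True},
}
HOSTS = {
    "target_a": ["172.20.0.10", "172.20.1.10"],  # dual-homed
    "target_b": ["172.20.1.20"],                 # internal only
}


def networks_of(host: str) -> set:
    """Names of the networks a host has an address on."""
    addrs = [ipaddress.ip_address(a) for a in HOSTS[host]]
    return {
        name
        for name, net in NETWORKS.items()
        if any(a in net["subnet"] for a in addrs)
    }


def directly_reachable(host: str) -> bool:
    """A host is reachable from outside only via a non-internal network."""
    return any(not NETWORKS[n]["internal"] for n in networks_of(host))


def pivots_to(host: str) -> list:
    """Externally reachable hosts that share a network with `host`."""
    return sorted(
        other
        for other in HOSTS
        if other != host
        and directly_reachable(other)
        and networks_of(other) & networks_of(host)
    )
```

Under this model `directly_reachable("target_b")` is false and `pivots_to("target_b")` names only Target A, which is the route the rest of the engagement takes.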
3. The Targets
3.1 Target A — the dual-homed front door
Ubuntu image with Apache 2.4.41, PHP, MySQL (root, no password), and OpenSSH. Vulnerable endpoints span CMDI, XSS, CSRF, SQLi, XXE, session fixation, and exposed .env / .git/config.
| Endpoint | Vulnerability | Parameter |
| ---------- | -------------- | ----------- |
| /ping.php | OS Command Injection (CWE-78) | host (POST) |
| /search.php | Reflected XSS (CWE-79) | q (POST) |
| /login.php | Session Fixation (CWE-384) | username, password |
| /admin.php | XXE, SQLi, CSRF | cmd (POST), XML body |
| /.env | Credential disclosure | Stripe key, SSH creds, DB creds |
The .env file is the seed of the entire kill chain:
DB_HOST=localhost
SECRET_KEY=sk_live_4eC39HqLyjWDarjtT1zdp7dc
STRIPE_API_KEY=sk_test_51ABCDeFgHiJkLmNoPqRsTuVwXyZ
INTERNAL_API=http://172.20.1.20/api/v1
SSH_USER=pentest
SSH_PASS=pentest123
Two SSH accounts exist: root:toor and pentest:pentest123. The pentest user has sudo NOPASSWD: ALL — a trivial privesc once shell access is obtained. The lab mirrors patterns we see in real production environments.
3.2 Target B — the internal-only DVWA
vulnerables/web-dvwa:latest on internal_net only. No port mapping, no route out. The only way to reach it is to pivot through Target A.
4. Phase 1: External Web Scan (T+0s → T+30s)
The GOAP planner runs WebCrawler first (sequential, 90s timeout, max 100 pages), then launches 20+ exploit agents in parallel via tokio::JoinSet with a 25-slot semaphore. A GlobalRateLimiter caps execution at 50 concurrent requests / 50 req/s — enough to stay deterministic, slow enough to avoid tripping a WAF.
<diagram title="Phase 1 — parallel agent fan-out">
SwarmHack → :8880
│
├─ WebCrawler (sequential, 90s timeout)
│ Maps: /, /ping.php, /search.php, /login.php, /admin.php, /info.php
│
└─ 20+ agents in parallel (tokio::JoinSet, 25-slot semaphore)
├─ CMDI → ping.php 'host' → marker SWMHK12019CK confirmed
├─ XXE → admin.php → file:///etc/hostname resolved
├─ CSRF → admin.php → cross-origin POST accepted
├─ SQLi → admin.php 'cmd' → tautology accepted
├─ XSS → search.php 'q' → <script>alert(1)</script> reflected
├─ VulnComp→ Apache/2.4.41 → CVE-2021-44790, CVE-2021-44224
├─ AuthByp → admin.php → accessible without auth
└─ SessFix → login.php → PHPSESSID not regenerated
</diagram>
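The bounded fan-out can be approximated in Python with `asyncio`. This is an analogue of the JoinSet-plus-semaphore pattern described for this phase, not the Rust code itself: the agent names and the sleep that stands in for real work are hypothetical, and the rate limiter is left out.

```python
import asyncio

MAX_PARALLEL_AGENTS = 25  # the 25-slot semaphore from the post


async def run_agents(agent_names, work_seconds=0.01):
    """Run every agent concurrently, never more than 25 at once."""
    slots = asyncio.Semaphore(MAX_PARALLEL_AGENTS)
    running = 0
    peak = 0

    async def run_one(name):
        nonlocal running, peak
        async with slots:  # wait for a free slot
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(work_seconds)  # placeholder for real agent work
            running -= 1
            return name, "done"

    results = await asyncio.gather(*(run_one(n) for n in agent_names))
    return dict(results), peak


agents = [f"agent-{i}" for i in range(40)]  # hypothetical agent names
results, peak = asyncio.run(run_agents(agents))
```

All 40 hypothetical agents complete, but the recorded peak never exceeds the slot count, which is the property the semaphore exists to guarantee.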
5. The Star Finding: Command Injection in ping.php
| Field | Value |
| ------- | ------- |
| Severity | critical |
| CWE | CWE-78 (OS Command Injection) |
| MITRE ATT&CK | T1059, T1190, T1059.004 |
| Payload | ; echo SWMHK$(expr 7777 + 4242)CK |
| Detection | Marker SWMHK12019CK found in response body |
| Crown jewels | 15 |
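The arithmetic-marker payload (expr 7777 + 4242 = 12019) is a strong confirmation for CMDI: the string SWMHK12019CK never appears in the request, so it can only show up in the response if the server executed the injected shell command and computed the sum. A page that merely reflects its input would echo the unevaluated $(expr 7777 + 4242) text instead, which removes reflection as a source of false positives.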
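A minimal sketch of the marker check, assuming the payload shape shown in the table above. The function names and the two sample response bodies are made up for illustration; only the payload string and the expected marker come from the finding.

```python
def build_marker_payload(a: int, b: int, prefix: str = "SWMHK", suffix: str = "CK"):
    """Return (payload, expected_marker) for an arithmetic-marker CMDI probe."""
    payload = f"; echo {prefix}$(expr {a} + {b}){suffix}"
    expected = f"{prefix}{a + b}{suffix}"
    return payload, expected


def is_confirmed(response_body: str, expected_marker: str) -> bool:
    """True only if the computed marker appears in the response."""
    return expected_marker in response_body


payload, marker = build_marker_payload(7777, 4242)

# A server that merely reflects the input echoes the unevaluated payload.
reflected = f"ping: bad address '{payload}'"
# A vulnerable server runs the shell command, so the computed sum shows up.
executed = "ping output omitted\nSWMHK12019CK\n"

reflected_hit = is_confirmed(reflected, marker)  # False: no shell evaluation
executed_hit = is_confirmed(executed, marker)    # True: the sum was computed
```

The check only ever searches for the computed value, so the request text itself can never satisfy it.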
Deep exploitation extracted 15 crown jewels in a single pass — credentials, kernel info, network interfaces, the .env file (which seeds Phase 2), and six ready-to-use exfiltration payloads:
| # | Category | Value |
| --- | ---------- | ------- |
| 1–3 | credential | www-data, uid=33, 26 accounts (root + pentest with shells) |
| 4 | api_key | .env contents — Stripe + SSH + DB credentials |
| 5–9 | system_info | Kernel, env vars, 3 interfaces (eth0 / eth1 / lo), FS layout |
| 10–15 | post_exploitation | Bash + nc reverse shells, curl/HTTP exfil, DNS exfil |
Kill-chain significance: the .env extraction is the bridge. SSH_USER=pentest, SSH_PASS=pentest123, and eth1: 172.20.1.10 are all the autonomous engine needs to pivot from web exploit to internal compromise. That's the topic of Part 2.
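As a rough illustration of how the extracted file turns into pivot inputs, the sketch below parses a subset of the .env lines shown earlier. The `parse_env` and `pivot_inputs` helpers are hypothetical, and reading the internal host from INTERNAL_API is an assumption of this example rather than something the post states the engine does.

```python
from urllib.parse import urlparse

# A subset of the .env contents shown in section 3.1.
ENV_TEXT = """\
DB_HOST=localhost
INTERNAL_API=http://172.20.1.20/api/v1
SSH_USER=pentest
SSH_PASS=pentest123
"""


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def pivot_inputs(env: dict) -> dict:
    """Pick out the values that matter for the pivot."""
    return {
        "ssh_user": env.get("SSH_USER"),
        "ssh_pass": env.get("SSH_PASS"),
        "internal_host": urlparse(env.get("INTERNAL_API", "")).hostname,
    }


inputs = pivot_inputs(parse_env(ENV_TEXT))
```

Here `inputs` holds the SSH username, the SSH password, and the internal address 172.20.1.20.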
6. Confidence is Calibrated, not Guessed
Every finding ships with a confidence floor derived from evidence quality — not from a language model:
| Agent class | Confidence | Basis |
| ------------- | ----------- | ------- |
| Privilege Escalation | 1.00 | Chain fully validated |
| CMDI | 0.99 | Marker arithmetic proof |
| XSS | 0.90 | Payload reflected unencoded |
| SQLi | 0.75 | Tautology heuristic |
| CSRF | 0.70 | Cross-origin POST accepted |
| XXE | 0.60 | In-band heuristic only |
A reviewer can replay any finding: same payload, same detection logic, same response — same verdict. Across 11 consecutive scans, SwarmHack produced exactly 11 findings and 35 crown jewels every single time.
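One way to picture a confidence floor that does not depend on a model is a fixed lookup keyed by agent class. The values below are copied from the table above; the dictionary layout and the helper names are assumptions for illustration.

```python
# Floors copied from the table above; the lookup structure is an assumption.
CONFIDENCE_FLOOR = {
    "Privilege Escalation": 1.00,
    "CMDI": 0.99,
    "XSS": 0.90,
    "SQLi": 0.75,
    "CSRF": 0.70,
    "XXE": 0.60,
}


def confidence_for(agent_class: str) -> float:
    """Same agent class in, same confidence out; no model call involved."""
    return CONFIDENCE_FLOOR[agent_class]


def at_least(threshold: float) -> list:
    """Agent classes whose floor meets the threshold, highest first."""
    ranked = sorted(CONFIDENCE_FLOOR.items(), key=lambda item: item[1], reverse=True)
    return [name for name, floor in ranked if floor >= threshold]


high_confidence = at_least(0.90)
```

Because the mapping is static, replaying a finding cannot change its score, which matches the replay property described above.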
What's Next
In Part 2 we follow the credentials extracted here through the SSH lateral-movement loop, the privilege-escalation synthesizer, and the SSH tunnel that puts the autonomous engine inside the isolated network — all without a human touching the keyboard.
In Part 3 we explain why this whole architecture is intentionally not an LLM, and what that means for cost, compliance, and reproducibility.