Testing

The toolchain

Pentu is a hybrid. The AI is the brain — it decides what to test and interprets the results — but a real penetration-testing toolbox does the heavy lifting. This is the same toolchain a human tester reaches for, driven by an agent that knows when to use each.

How it fits together

Some tools run automatically as a parallel pass while the agent signs up and explores, so they add no extra wall-clock time. Others the AI invokes on demand, when its own probing calls for it: "this parameter errored on a quote → confirm it with sqlmap," "this is a webhook field → plant an SSRF callback." Everything stays in scope, rate-limited, and non-destructive. A finding that a tool confirms is treated as proven, because the tool's output is the evidence.
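The two modes described above can be sketched roughly as follows. This is an illustration only, not Pentu's actual orchestration code: which tools belong to the automatic pass, the signal names, and the helpers run_tool, explore, follow_ups, and assess are all assumptions made for the example, and the tool runs are stubbed out rather than launched.

```python
import asyncio

# Assumed split: the source does not say which tools run in the automatic pass.
BASELINE_TOOLS = ["nuclei", "testssl.sh", "httpx"]

# Hypothetical signal names, mapped to the follow-ups quoted in the text.
ON_DEMAND = {
    "sql_error_on_quote": "sqlmap",
    "webhook_field": "oast_callback",
}


async def run_tool(name: str, target: str) -> dict:
    """Stand-in for a scanner run; a real runner would spawn a rate-limited process."""
    await asyncio.sleep(0.01)
    return {"tool": name, "target": target, "findings": []}


async def explore(target: str) -> list[str]:
    """Stand-in for the agent's own exploration; returns the signals it noticed."""
    await asyncio.sleep(0.01)
    return ["sql_error_on_quote", "webhook_field"]


def follow_ups(signals: list[str]) -> list[str]:
    """Pick the on-demand tool for each recognised signal, ignoring unknown ones."""
    return [ON_DEMAND[s] for s in signals if s in ON_DEMAND]


async def assess(target: str) -> list[dict]:
    # Baseline tools run concurrently with exploration, so they add no wall-clock time.
    baseline = [run_tool(name, target) for name in BASELINE_TOOLS]
    *reports, signals = await asyncio.gather(*baseline, explore(target))
    # Targeted tools run only where the exploration produced a matching signal.
    targeted = await asyncio.gather(
        *(run_tool(name, target) for name in follow_ups(signals))
    )
    return reports + list(targeted)
```

Calling asyncio.run(assess("https://app.example.test")) returns one report per baseline tool, followed by one per follow-up the signals triggered.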

The toolbox

nuclei
Thousands of community templates — known CVEs, exposed panels, default credentials, misconfigurations, token leaks. Runs against the public host and, with a session, the authenticated app.
testssl.sh
Full TLS audit — protocol versions, weak ciphers, certificate chain and expiry, forward secrecy.
nmap
Port and service discovery. CDN-aware: when your app sits behind Cloudflare or similar, Pentu skips the scan (the edge IPs aren't your server, and scanning them is pointless and abusive).
subfinder
Subdomain enumeration — surfaces forgotten dev / staging / admin hosts.
httpx
Fast tech-stack and server fingerprinting.
ffuf
Content discovery over a curated, high-signal wordlist — hidden routes, backups, admin, API docs — rate-limited so it looks like normal traffic.
nikto
Web-server and known-issue sweep.
retire.js
Known-vulnerable JavaScript libraries, from the scripts your app actually loads.
trufflehog
Secret detection in the front-end bundles — only verified leaks are surfaced.
sqlmap
SQL-injection confirmation in time-based-only mode — it measures response delay and never dumps or modifies data.
dalfox
Reflected / DOM XSS discovery and confirmation.
OAST collaborator
Our own phone-home server. Pentu plants a unique callback URL; if your backend fetches it, we log the hit. An out-of-band callback like this is the most reliable way to catch blind SSRF, where the response itself reveals nothing.
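To make the callback idea concrete, here is a minimal sketch of how a unique callback URL could be tied back to the place it was planted. It is not Pentu's implementation: the domain oast.example.test, the CallbackRegistry class, and the token-as-subdomain layout are assumptions chosen for illustration.

```python
import secrets

# Hypothetical collaborator domain; the source does not name the real host.
OAST_DOMAIN = "oast.example.test"


class CallbackRegistry:
    """Tracks planted callback URLs and matches incoming hits back to them."""

    def __init__(self) -> None:
        self._pending: dict[str, str] = {}  # token -> where the URL was planted
        self.hits: list[dict] = []

    def plant(self, injection_point: str) -> str:
        """Create a unique callback URL for one injection point."""
        token = secrets.token_hex(8)  # 16 random hex characters
        self._pending[token] = injection_point
        return f"http://{token}.{OAST_DOMAIN}/"

    def record_hit(self, host: str, source_ip: str):
        """Log a request seen by the collaborator; return None for unknown tokens."""
        token = host.split(".", 1)[0].lower()
        injection_point = self._pending.get(token)
        if injection_point is None:
            return None
        hit = {"injection_point": injection_point, "source_ip": source_ip}
        self.hits.append(hit)
        return hit
```

Because each token is unique, a hit on the collaborator identifies exactly which field caused the backend to make the request, even when the application's own response shows nothing.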

Why hybrid beats either alone

  • Breadth from the tools. Thousands of templates and a full TLS/port/subdomain sweep would take an AI far too long to reproduce by hand.
  • Judgment from the AI. Tools are noisy and context-blind. The agent decides which templates matter, chases the interesting hits, and folds tool output into its reasoning.
  • Proof from both. The agent's findings are backed by deterministic tool output, which gives the report reproducible evidence and matches the rigor of a human-led engagement.