Testing

The toolchain

Pentu is a hybrid. The AI is the brain — it decides what to test and interprets the results — but a real penetration-testing toolbox does the heavy lifting. This is the same toolchain a human tester reaches for, driven by an agent that knows when to use each.

How it fits together

Some tools run automatically as a parallel pass while the agent signs up and explores, so they add no extra wall-clock time. Others the AI invokes on demand, when its own probing suggests a lead: "this parameter errored on a quote → confirm it with sqlmap," "this is a webhook field → plant an SSRF callback." Every run stays in scope, rate-limited, and non-destructive. Findings a tool confirms are treated as proven: the tool's output is the evidence.
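The on-demand step can be pictured as a small lookup guarded by a scope check. The following is only a sketch under assumptions: the observation labels, the `FOLLOW_UPS` table, and the function names are hypothetical and are not Pentu's actual internals.

```python
from __future__ import annotations

from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Observation:
    url: str
    kind: str  # hypothetical label, e.g. "quote_error" or "webhook_field"


# Hypothetical mapping from what the agent noticed to the follow-up tool.
FOLLOW_UPS = {
    "quote_error": "sqlmap",
    "webhook_field": "oast_callback",
}


def in_scope(url: str, allowed_hosts: set[str]) -> bool:
    """Only hosts the customer explicitly listed count as in scope."""
    host = urlparse(url).hostname or ""
    return host in allowed_hosts


def pick_follow_up(obs: Observation, allowed_hosts: set[str]) -> str | None:
    """Return the tool to run for this observation, or None to do nothing."""
    if not in_scope(obs.url, allowed_hosts):
        return None  # never touch a host outside the agreed scope
    return FOLLOW_UPS.get(obs.kind)
```

Putting the scope check in front of the lookup means an out-of-scope URL can never reach a tool, whatever the agent observed about it.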

The toolbox

nuclei

Thousands of community templates — known CVEs, exposed panels, default credentials, misconfigurations, token leaks. Runs against the public host and, with a session, the authenticated app.

testssl.sh

Full TLS audit — protocol versions, weak ciphers, certificate chain and expiry, forward secrecy.

nmap

Port and service discovery. CDN-aware: when your app sits behind Cloudflare or a similar CDN, Pentu skips the scan, because the edge IPs aren't your server and scanning them is both pointless and abusive.
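One way such a CDN check could work is to compare the host's resolved addresses against the provider's published edge ranges. This is a minimal sketch, not Pentu's implementation: `should_port_scan` is a hypothetical name, and the three ranges are an illustrative subset of Cloudflare's published IPv4 list, which a real check would load in full and keep current.

```python
import ipaddress

# Illustrative subset of Cloudflare's published IPv4 ranges (assumption:
# a real check would fetch the provider's complete, current list).
CDN_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("104.16.0.0/13", "172.64.0.0/13", "173.245.48.0/20")
]


def should_port_scan(resolved_ips):
    """Return False if any resolved address sits inside a known CDN range."""
    for ip in resolved_ips:
        addr = ipaddress.ip_address(ip)
        if any(addr in net for net in CDN_RANGES):
            return False  # edge IP: scanning it says nothing about the origin
    return True
```

For example, `should_port_scan(["104.16.1.1"])` returns `False`, while an address outside the listed ranges, such as `203.0.113.10`, returns `True`.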

subfinder

Subdomain enumeration — surfaces forgotten dev / staging / admin hosts.

httpx

Fast tech-stack and server fingerprinting.

ffuf

Content discovery over a curated, high-signal wordlist — hidden routes, backups, admin, API docs — rate-limited so it looks like normal traffic.

nikto

Web-server and known-issue sweep.

retire.js

Known-vulnerable JavaScript libraries, from the scripts your app actually loads.

trufflehog

Secret detection in the front-end bundles — only verified leaks are surfaced.

sqlmap

SQL-injection confirmation in time-based-only mode — it measures response delay and never dumps or modifies data.
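The idea behind time-based confirmation can be shown with a simplified comparison of response times. This sketch is not sqlmap's own algorithm, which is more statistically careful; the function name, the `tolerance` parameter, and the sample timings are assumptions made for illustration. sqlmap itself exposes a `--technique` option in which `T` selects time-based blind testing; whether Pentu passes exactly that flag is an assumption.

```python
from statistics import median


def looks_time_based(baseline_s, delayed_s, injected_delay_s, tolerance=0.8):
    """True if delayed responses lag the baseline by roughly the injected delay.

    baseline_s: response times (seconds) for the unmodified request
    delayed_s:  response times for the request carrying a sleep payload
    """
    if not baseline_s or not delayed_s:
        return False  # no samples, no conclusion
    lag = median(delayed_s) - median(baseline_s)
    # Medians damp the effect of a single slow outlier on either side.
    return lag >= injected_delay_s * tolerance
```

With a 5-second payload, baseline samples around 0.2 s and delayed samples around 5.2 s count as a confirmation, while two sets of samples that both sit near 0.2 s do not. Only timing is observed, so no rows are read or written.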

dalfox

Reflected / DOM XSS discovery and confirmation.

OAST collaborator

Our own phone-home server. Pentu plants a unique callback URL; if your backend fetches it, we log the hit. Blind SSRF returns nothing in the response, so an out-of-band callback like this is the most reliable way to catch it.
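The callback mechanism comes down to issuing an unguessable token per probe and matching incoming requests back to it. The sketch below assumes the token is carried as a subdomain label; `new_callback`, `match_hit`, and the `oast.example` domain are hypothetical and do not describe Pentu's real collaborator.

```python
import secrets


def new_callback(collab_domain):
    """Create a unique token and the callback URL that embeds it."""
    token = secrets.token_hex(8)  # 16 hex characters, hard to guess
    return token, f"http://{token}.{collab_domain}/"


def match_hit(host_header, issued):
    """Map an incoming request's Host header back to the probe that planted it.

    issued: dict of token -> description of where the URL was planted
    """
    label = host_header.split(".", 1)[0].lower()
    return issued.get(label)  # None means the hit matches no probe we sent
```

A typical flow would be: call `new_callback("oast.example")`, record the token against the field it was planted in, then pass the Host header of each inbound request to `match_hit`. Because each token is unique, a hit points to exactly one input field.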

Why hybrid beats either alone

Breadth from the tools. Thousands of templates and a full TLS/port/subdomain sweep would take an AI far too long to reproduce by hand.
Judgment from the AI. Tools are noisy and context-blind. The agent decides which templates matter, chases the interesting hits, and folds tool output into its reasoning.
Proof from both. The agent's reasoning explains why a finding matters, and deterministic tool output backs it with reproducible evidence, matching the rigor of a human-led engagement.