WebAudit — Web Application Security Scanner
A Python scanner that audits web apps for misconfigs and common vulns, then generates a self-contained HTML report with severity ratings and remediation steps
A CLI tool that finds the security issues most production sites have and don't know about, then explains each one clearly enough to hand to a developer.
The Goal
Most security scanners are built for people who already know what the output means. I wanted something that would:
- Cover the highest-frequency, highest-impact web security issues
- Explain each finding in plain language, not just flag it
- Output a clean HTML report you could drop into an email or Jira ticket
- Work fast on a target I owned or had explicit permission to test
That's WebAudit.
How It Works
The tool is invoked from the command line with a target URL. It runs a series of independent check modules, collects all findings into a structured list, and prints color-coded results to the terminal. Optionally it writes a self-contained HTML report and/or a JSON file.
Each finding carries four fields: check category, status (FAIL / WARN / PASS), severity (CRITICAL / HIGH / MEDIUM / LOW), and a remediation recommendation. The scanner sets its User-Agent to WebAudit/1.0 Security Scanner — no attempt to hide what it is.
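A finding with those four fields could be modeled roughly like this. It is a minimal sketch rather than WebAudit's actual code: the names `Finding`, `Status`, `Severity`, and the field names are assumptions, and the example values are purely illustrative.

```python
from dataclasses import asdict, dataclass
from enum import Enum


class Status(str, Enum):
    FAIL = "FAIL"
    WARN = "WARN"
    PASS = "PASS"


class Severity(str, Enum):
    CRITICAL = "CRITICAL"
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"


@dataclass(frozen=True)
class Finding:
    """One scanner result (hypothetical field names)."""
    category: str          # which check produced it, e.g. "Security Headers"
    status: Status         # FAIL / WARN / PASS
    severity: Severity     # CRITICAL / HIGH / MEDIUM / LOW
    recommendation: str    # remediation text shown in the report


# Illustrative values only; the severity chosen here is not taken from the tool.
example = Finding(
    category="Security Headers",
    status=Status.FAIL,
    severity=Severity.HIGH,
    recommendation="Add a Strict-Transport-Security header.",
)
```

Keeping the structure frozen and enum-typed means a check cannot emit a misspelled status or mutate a finding after the fact, and `asdict(example)` gives a plain dictionary for the reporters.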
The architecture is a thin orchestrator (webaudit.py) that imports modular check packages from checks/ and hands results to reporter/ for output. Adding a new check is a matter of adding a module that returns the same finding structure.
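The orchestrator loop can be sketched in a few lines. This is an assumption-laden illustration, not the contents of webaudit.py: `run_checks`, the dictionary-shaped finding, and the choice to turn a crashed check into a WARN finding are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Finding = Dict[str, str]                      # assumed finding shape
Check = Callable[[str], Iterable[Finding]]    # a check takes the target URL


def run_checks(target: str, checks: List[Check]) -> List[Finding]:
    """Run each check independently and collect every finding in one list."""
    findings: List[Finding] = []
    for check in checks:
        try:
            findings.extend(check(target))
        except Exception as exc:
            # Assumed design choice: one failing check should not abort the scan.
            findings.append({
                "category": getattr(check, "__name__", "unknown"),
                "status": "WARN",
                "severity": "LOW",
                "recommendation": f"Check did not complete: {exc}",
            })
    return findings
```

Because every check has the same call shape, adding one is just a matter of appending another callable to the list passed in.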
Capabilities
Security Headers
Checks for CSP, HSTS, X-Frame-Options, X-Content-Type-Options, Referrer-Policy, and Permissions-Policy. These cost nothing to add, yet many production sites are missing them. Without a CSP there is no second line of defense against XSS: a single injection flaw can run arbitrary JavaScript in the browser of every user who loads the affected page. Without HSTS, an attacker on the same network can intercept a user's initial plain-HTTP request, strip TLS, and MITM the session.
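The core of a header check is a case-insensitive lookup against a fixed list. A minimal sketch, assuming the check receives the response headers as a mapping (the function name `missing_security_headers` is hypothetical):

```python
from typing import List, Mapping

REQUIRED_HEADERS = (
    "Content-Security-Policy",
    "Strict-Transport-Security",
    "X-Frame-Options",
    "X-Content-Type-Options",
    "Referrer-Policy",
    "Permissions-Policy",
)


def missing_security_headers(headers: Mapping[str, str]) -> List[str]:
    """Return the required headers that are absent from a response."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    present = {name.lower() for name in headers}
    return [h for h in REQUIRED_HEADERS if h.lower() not in present]
```

With requests, the `headers` attribute of a response can be passed in directly. Presence is all this sketch tests; it does not judge whether a policy value is actually strict enough.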
Exposed Sensitive Files
Probes a curated list of paths that developers routinely leave accessible: .env, .git/HEAD, wp-config.php, phpinfo.php, backup SQL files. For each HTTP 200 response, it flags the finding with appropriate severity — a live .env containing cloud keys is CRITICAL. The robots.txt check also parses Disallow entries and surfaces them, since hiding paths in robots.txt works about as well as hiding a house key under the doormat.
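The robots.txt part is simple enough to show. This is a sketch of how Disallow entries might be pulled out of a fetched file; `disallowed_paths` is a hypothetical name and the parsing is deliberately lenient.

```python
from typing import List


def disallowed_paths(robots_txt: str) -> List[str]:
    """Collect the unique, non-empty Disallow paths from robots.txt text."""
    paths: List[str] = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()        # drop trailing comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow":
            value = value.strip()
            if value and value not in paths:       # empty Disallow means "allow all"
                paths.append(value)
    return paths
```

Each surfaced path is a hint about what the site owner did not want crawled, which is exactly why it is worth listing in the report.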
SSL/TLS
Checks certificate validity, days until expiry (flags anything under 30 days as HIGH), and TLS version. TLS 1.0 and 1.1 should be disabled everywhere.
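The expiry threshold can be sketched with the standard library alone. The function names are hypothetical; the input is assumed to be the `notAfter` string that `ssl.SSLSocket.getpeercert()` returns for a validated certificate.

```python
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> int:
    """Whole days between `now` and a certificate's notAfter timestamp."""
    expires = ssl.cert_time_to_seconds(not_after)   # e.g. "Jan 11 00:00:00 2030 GMT"
    now = time.time() if now is None else now
    return int((expires - now) // 86400)


def expiry_severity(days_left: int) -> Optional[str]:
    """HIGH when fewer than 30 days remain, otherwise no finding."""
    return "HIGH" if days_left < 30 else None
```

Passing `now` explicitly keeps the function testable without patching the clock. An already expired certificate yields a negative day count, which this sketch also reports as HIGH.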
Cookie Security
Inspects all cookies for missing Secure, HttpOnly, and SameSite flags. A session cookie without HttpOnly is readable by any JavaScript on the page.
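One way to inspect a cookie is to parse its Set-Cookie value with `http.cookies` from the standard library. This is a sketch, assuming one header value per call; `missing_cookie_flags` is a hypothetical name, and the real tool may read cookies from the HTTP client's cookie jar instead.

```python
from http.cookies import SimpleCookie
from typing import Dict, List

# Morsel attribute key -> label used in the report
FLAGS = (("secure", "Secure"), ("httponly", "HttpOnly"), ("samesite", "SameSite"))


def missing_cookie_flags(set_cookie_header: str) -> Dict[str, List[str]]:
    """Map each cookie name to the security attributes it lacks."""
    jar = SimpleCookie()
    jar.load(set_cookie_header)
    result: Dict[str, List[str]] = {}
    for name, morsel in jar.items():
        # Unset attributes are stored as empty strings, which are falsy.
        missing = [label for key, label in FLAGS if not morsel[key]]
        if missing:
            result[name] = missing
    return result
```

The `samesite` attribute is recognized by `http.cookies` from Python 3.8 onward, which matches the version floor listed under Tech & Tools.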
XSS Reflection
Injects a harmless unique string into GET parameters and checks whether it reflects verbatim in the response body without encoding. Not a full XSS testing workflow — it catches the cases where there is literally no output encoding at all.
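The probe has three parts: build a marker, inject it into a parameter, and look for it in the response. The sketch below is an illustration under assumptions: the function names are hypothetical, and the marker format (a random token followed by a few HTML-significant characters) is my choice here so that any output encoding changes the string and breaks the match.

```python
import secrets
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def make_marker() -> str:
    """Unique, non-executable probe string containing HTML-significant characters."""
    return f"wa{secrets.token_hex(4)}<\"'>"


def inject(url: str, param: str, marker: str) -> str:
    """Return `url` with query parameter `param` set to the marker."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != param]
    query.append((param, marker))
    return urlunsplit(parts._replace(query=urlencode(query)))


def reflects_unencoded(body: str, marker: str) -> bool:
    """True only if the marker appears verbatim, i.e. with no output encoding."""
    return marker in body
```

If the application HTML-encodes the value, the angle brackets and quotes come back as entities and the verbatim match fails, which is the intended behavior for this lightweight probe.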
Open Redirect
Tests common redirect parameter names (?redirect=, ?url=, ?next=, etc.) and checks whether the server responds with a redirect to the externally supplied URL. Low-effort finding, used frequently in phishing chains where the attacker builds a URL on a legitimate domain that bounces the victim somewhere malicious.
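The decision logic for a single probe can be sketched as a pure function. It assumes the request was sent with redirects disabled (for example `allow_redirects=False` in requests) so the status code and Location header can be inspected directly; `is_open_redirect` and the canary host are hypothetical.

```python
from typing import Optional
from urllib.parse import urlsplit

REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def is_open_redirect(status: int, location: Optional[str], canary_host: str) -> bool:
    """True if the response redirects to the external host we supplied."""
    if status not in REDIRECT_STATUSES or not location:
        return False
    # urlsplit also handles scheme-relative targets such as //canary.example/
    host = urlsplit(location).hostname
    return host is not None and host.lower() == canary_host.lower()
```

Matching on the parsed hostname rather than a substring avoids a false hit when the canary URL is merely echoed inside the query string of a same-site redirect.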
Tech & Tools
- Python 3.8+: requests, urllib3, argparse
- Modular check architecture under checks/: headers, exposed_files, ssl_check, cookies, xss
- reporter/html_reporter: generates a fully self-contained HTML report (no external CSS or JS dependencies)
- JSON output for feeding findings into other tools or SIEMs
- Security score (0–100) calculated by deducting points by severity: CRITICAL −25, HIGH −15, MEDIUM −7
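The scoring rule is plain arithmetic and can be sketched as follows. Two assumptions are baked in and labeled in the code: no LOW deduction is listed above, so the sketch deducts nothing for LOW, and the result is clamped at zero so it stays in the 0 to 100 range. `security_score` is a hypothetical name.

```python
from typing import Iterable

# Deductions stated in the write-up. LOW is not listed, so it is treated as 0 here.
DEDUCTIONS = {"CRITICAL": 25, "HIGH": 15, "MEDIUM": 7}


def security_score(severities: Iterable[str]) -> int:
    """Start at 100 and subtract points for each failed finding's severity."""
    total = sum(DEDUCTIONS.get(sev, 0) for sev in severities)
    return max(0, 100 - total)   # assumed clamp so the score never goes negative
```

For example, one CRITICAL, one HIGH, and one MEDIUM finding give 100 − 25 − 15 − 7 = 53.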
What I Learned
On web security in practice: Security headers are genuinely low-hanging fruit. They're free, they mitigate real attack classes, and a large percentage of production sites skip them. The exposed files check has caught real issues: developers push code and leave .env files, .git directories, or phpinfo.php pages behind without realizing it.
On scanner design: Separating the check modules from the reporter and the orchestrator made the codebase easy to extend. Each check returns the same data shape; the reporter doesn't care where the findings came from. If I were starting from scratch, I'd change one thing: I'd make the check interface explicit (an abstract base class or protocol) rather than relying on convention.
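An explicit interface could look something like this, using `typing.Protocol` (available from Python 3.8). It is a sketch of the idea, not code from the project: `Check`, `run`, `name`, and `HeadersCheck` are all hypothetical.

```python
from typing import Dict, List, Protocol, runtime_checkable

Finding = Dict[str, str]   # assumed finding shape


@runtime_checkable
class Check(Protocol):
    """What the orchestrator expects from every check module."""
    name: str

    def run(self, target: str) -> List[Finding]:
        ...


class HeadersCheck:
    """Satisfies Check structurally; no inheritance required."""
    name = "headers"

    def run(self, target: str) -> List[Finding]:
        return []   # real logic would fetch `target` and inspect its headers
```

A protocol keeps the convention-based flexibility (modules need not import a base class) while letting a type checker, or an `isinstance` check at load time, reject a module that forgot to define `run`.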
On XSS and redirect testing scope: Reflection testing and open redirect checks are genuinely useful for fast triage, but they only find the obvious cases. A clean result here doesn't mean the app is safe; it only means the app passed a lightweight probe. I was deliberate about not overstating what these checks prove.
On output format: The HTML report being fully self-contained turned out to matter more than I expected. You can email it, attach it to a ticket, or serve it from a static host with no dependencies. Building it that way required inlining styles in the generator rather than linking to a stylesheet, which was slightly tedious but worth it.
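Inlining the styles amounts to embedding a style block in the generated markup instead of linking a stylesheet. A stripped-down sketch of that idea, with hypothetical names (`render_report`, `STYLE`, `ROW`) and an assumed dictionary-shaped finding; the real reporter is certainly more elaborate:

```python
from html import escape
from typing import Dict, List

# All CSS lives in the document itself, so the file has no external dependencies.
STYLE = "body{font-family:sans-serif}.FAIL{color:#b00020}.WARN{color:#b26a00}.PASS{color:#1b5e20}"
ROW = ('<tr class="{status}"><td>{category}</td><td>{status}</td>'
       "<td>{severity}</td><td>{recommendation}</td></tr>")


def render_report(target: str, findings: List[Dict[str, str]]) -> str:
    """Build a single self-contained HTML page from a list of findings."""
    # Escape every value: findings can contain text echoed back by the target.
    rows = "".join(
        ROW.format(**{k: escape(str(v)) for k, v in f.items()}) for f in findings
    )
    return (
        '<!DOCTYPE html><html><head><meta charset="utf-8">'
        f"<title>WebAudit report: {escape(target)}</title>"
        f"<style>{STYLE}</style></head><body>"
        f"<h1>WebAudit report: {escape(target)}</h1>"
        "<table><tr><th>Check</th><th>Status</th><th>Severity</th><th>Recommendation</th></tr>"
        f"{rows}</table></body></html>"
    )
```

Escaping matters more than usual here: a security report that renders attacker-influenced strings unescaped would carry the same class of bug it is meant to find.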
