Web Application Penetration Testing Checklist — What Gets Tested and Why

TLDR: The most common question I get from founders before booking a web app assessment is “what exactly will you be testing?” It’s a fair question — you’re granting access to your production application and you should understand exactly what that involves. This post is the complete answer: the actual checklist I use on every engagement, organised by category, with a plain-English explanation of what each item looks for.

Why Methodology Matters

An unstructured pen test produces inconsistent results. Without a defined methodology, coverage depends entirely on the tester’s mood, time pressure, and whatever catches their attention that day. Critical areas get missed not through negligence but through the absence of a process that ensures they’re always covered.

A repeatable checklist means two things: consistency across engagements, and a defensible answer to “did you check X?” for every X. It’s also the basis for a remediation cycle: when critical findings are fixed and a retest is conducted, the same checklist confirms the fix and checks that the change hasn’t opened new gaps elsewhere.

What follows is the actual framework I work through on every web application engagement. It maps closely to the OWASP Web Security Testing Guide (the industry’s most referenced methodology), adapted based on real-world experience of what actually surfaces findings.

The Checklist

Information Gathering

Before active testing begins, reconnaissance builds a map of what’s actually exposed.

Subdomain enumeration — identifying all subdomains associated with the target domain reveals staging environments, internal tools, legacy applications, and API endpoints that aren’t linked from the main application but are still publicly accessible. Staging environments in particular are frequently less hardened than production and sometimes share a database.

Technology fingerprinting — identifying the tech stack (frameworks, server software, CDN, authentication providers) from HTTP headers, response bodies, and observable behaviour. Knowing the stack focuses testing on vulnerability classes relevant to that environment and surfaces version-specific known CVEs.

Error message analysis — deliberately triggering errors to see what the application reveals: stack traces, file paths, database error strings, framework versions. Verbose errors in production are findings in themselves and a reconnaissance goldmine.

Directory and file discovery — systematic requests for common paths (/admin, /backup, /api/v1/docs, /.env, /phpinfo.php) reveal configuration files, forgotten endpoints, exposed documentation, and admin interfaces that weren’t meant to be public.

Authentication

Brute force protection — testing whether login, password reset, and OTP endpoints enforce lockout or rate limiting after repeated failed attempts. Without this, credential stuffing and brute force are unconstrained.
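
To make the expected behaviour concrete, here is a minimal sketch of the kind of control this test looks for: a sliding-window limiter keyed by username or IP. The class name, thresholds, and injectable clock are illustrative assumptions, not a description of any particular framework.

```python
import time
from collections import defaultdict, deque


class LoginRateLimiter:
    """Sliding window: block a key after max_attempts failures within window_seconds."""

    def __init__(self, max_attempts=5, window_seconds=300.0, clock=time.monotonic):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        self._failures = defaultdict(deque)

    def _prune(self, key):
        # Drop failures that have aged out of the window.
        cutoff = self.clock() - self.window_seconds
        attempts = self._failures[key]
        while attempts and attempts[0] <= cutoff:
            attempts.popleft()
        return attempts

    def is_blocked(self, key):
        return len(self._prune(key)) >= self.max_attempts

    def record_failure(self, key):
        self._prune(key).append(self.clock())
```

During a test, the absence of anything equivalent shows up as the hundredth wrong password getting exactly the same response as the first.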

Password reset flow — checking token expiry (does the link stop working after 15–60 minutes?), single-use enforcement (can the same link be used twice?), user enumeration via differential responses, and token predictability.
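
As a rough illustration of what a passing implementation tends to look like, the sketch below issues unpredictable tokens, stores only their hash, expires them, and consumes them on first use. The ResetTokenStore name, the 15-minute default, and the in-memory dictionary are assumptions for the example; a real application would use its own datastore.

```python
import hashlib
import secrets
import time


class ResetTokenStore:
    """Hypothetical in-memory store for password reset tokens."""

    def __init__(self, ttl_seconds=900, clock=time.time):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._pending = {}  # sha256(token) -> (user_id, expires_at)

    @staticmethod
    def _digest(token):
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def issue(self, user_id):
        token = secrets.token_urlsafe(32)  # 32 random bytes from the OS CSPRNG
        self._pending[self._digest(token)] = (user_id, self.clock() + self.ttl_seconds)
        return token

    def redeem(self, token):
        # pop() removes the entry, so a second use of the same link fails.
        entry = self._pending.pop(self._digest(token), None)
        if entry is None:
            return None
        user_id, expires_at = entry
        if self.clock() > expires_at:
            return None
        return user_id
```

Returning the same response whether or not the submitted email exists is a separate control and is not shown here.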

MFA implementation — testing OTP reuse, rate limiting on OTP submission, and whether MFA can be bypassed through account recovery flows, alternative login paths, or API endpoints that skip the second factor check.

Session token analysis — examining token length, entropy, and structure. A session token derived from a timestamp, username hash, or sequential counter is guessable. Tokens should come from a cryptographically secure random number generator and carry at least 128 bits of entropy.
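
A small sketch of both sides of this check, under stated assumptions: new_session_token shows one way to produce a 128-bit random token with Python's standard library, and looks_sequential is a deliberately crude, hypothetical heuristic for spotting counter-style tokens in a captured sample. Real entropy analysis needs far larger samples and better statistics than this.

```python
import secrets


def new_session_token():
    # 16 random bytes = 128 bits from the OS CSPRNG, hex-encoded to 32 characters.
    return secrets.token_hex(16)


def looks_sequential(tokens):
    """Heuristic: True if every token parses as hex and consecutive values differ by a constant step."""
    try:
        values = [int(token, 16) for token in tokens]
    except ValueError:
        return False  # not hex, so this heuristic has nothing to say
    if len(values) < 3:
        return False
    steps = {later - earlier for earlier, later in zip(values, values[1:])}
    return len(steps) == 1
```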

Logout and session invalidation — capturing a session token, logging out, and attempting to reuse the captured token. If the server accepts it, logout is cosmetic. Covered in depth in our broken authentication post.

Authorisation

IDOR testing across all object references — every endpoint that accepts an object identifier (numeric IDs, UUIDs, filenames) is tested with identifiers belonging to a different user’s objects. Can User A access User B’s invoices, documents, orders? This is the most consistently found vulnerability class across all assessments.
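
The fix for an IDOR is almost always the same shape: look the object up, then confirm it belongs to the caller before returning it. The sketch below assumes a hypothetical Invoice record and an in-memory INVOICES table purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    invoice_id: int
    owner_id: int
    total_cents: int


class Forbidden(Exception):
    pass


# Hypothetical data: user 1 owns invoice 101, user 2 owns invoice 102.
INVOICES = {
    101: Invoice(101, owner_id=1, total_cents=4900),
    102: Invoice(102, owner_id=2, total_cents=1900),
}


def get_invoice(current_user_id, invoice_id):
    invoice = INVOICES.get(invoice_id)
    # Same error for "missing" and "not yours", so valid IDs cannot be enumerated.
    if invoice is None or invoice.owner_id != current_user_id:
        raise Forbidden("invoice not found")
    return invoice
```

The vulnerable version is this function without the owner_id comparison, which is exactly what swapping 101 for 102 in the request exposes.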

Privilege escalation — horizontal — same permission level, different user (the IDOR scenario above). Can one standard user access another standard user’s data?

Privilege escalation — vertical — lower permission accessing higher permission functionality. Can a standard user call admin endpoints, access admin dashboards, or modify other users’ accounts?

API endpoint authorisation — separately testing API endpoints rather than assuming frontend-level access controls carry through. Authentication at the route level does not guarantee authorisation at the data level.

Role-based access control validation — if the application has multiple user roles (admin, manager, viewer, etc.), testing each role’s access against every sensitive function to confirm the boundaries hold.
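
One way to keep role testing systematic is to write down the intended permission matrix, record what each role could actually do during testing, and diff the two. The function and the example roles below are illustrative assumptions rather than a fixed tool.

```python
def find_rbac_violations(expected, observed):
    """Return sorted (role, action) pairs that were allowed in testing but are not in the intended matrix.

    expected: dict mapping role -> set of actions the role should be able to perform
    observed: dict mapping role -> set of actions the role could perform during testing
    """
    violations = []
    for role, allowed_actions in observed.items():
        permitted = expected.get(role, set())
        for action in allowed_actions - permitted:
            violations.append((role, action))
    return sorted(violations)


expected = {"admin": {"view", "edit", "delete"}, "viewer": {"view"}}
observed = {"admin": {"view", "edit", "delete"}, "viewer": {"view", "delete"}}
violations = find_rbac_violations(expected, observed)
```

Here the viewer role being able to delete is the single violation, which is a vertical privilege escalation finding.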

Input Validation

SQL injection — testing all input fields, URL parameters, HTTP headers, and cookies for SQL injection. This includes blind injection techniques where the application doesn’t return error messages, and time-based inference for databases that return no observable difference on successful injection.
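
For readers who have not seen the difference side by side, here is a small self-contained sketch using Python's built-in sqlite3 module. The table and function names are made up for the example; the point is only the contrast between string concatenation and a parameterised query.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users (name, secret) VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: user input is spliced directly into the SQL text.
    query = f"SELECT name, secret FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterised: the driver passes the value separately from the SQL text.
    return conn.execute(
        "SELECT name, secret FROM users WHERE name = ?", (name,)
    ).fetchall()


payload = "x' OR '1'='1"
```

With this payload the unsafe function returns every row in the table, while the safe one treats the whole string as a name and finds nothing.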

XSS — stored, reflected, and DOM-based — systematically testing every input that is reflected back to any user, in any context. DOM-based XSS requires JavaScript source review, not just HTTP traffic analysis, to identify dangerous sinks. Full methodology in our XSS post.

XXE (XML External Entity) — if the application processes XML input (file uploads, API requests, SOAP interfaces), testing for external entity injection that can read local files, perform server-side request forgery, or cause denial of service.

Command injection — in any feature that interacts with system-level operations (file conversion, image processing, DNS lookups, ping functionality), testing whether user-controlled input reaches a system call without sanitisation.
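
A sketch of the defensive pattern for a ping-style feature, assuming a Unix-like ping that accepts -c: validate the input against a strict hostname pattern, then build an argument list rather than a shell string. The function only builds the argument vector; passing it to subprocess.run without shell=True is left to the caller.

```python
import re

# One DNS label: starts and ends with an alphanumeric, hyphens allowed in between.
_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
HOSTNAME_RE = re.compile(_LABEL + r"(?:\." + _LABEL + r")*")


def build_ping_argv(host):
    """Return an argv list for a single ping, or raise ValueError for anything that is not a plain hostname."""
    if len(host) > 253 or not HOSTNAME_RE.fullmatch(host):
        raise ValueError("invalid hostname")
    # A list keeps the host as one argument; no shell ever parses it.
    return ["ping", "-c", "1", host]
```

Because every label must start with an alphanumeric character, inputs that begin with a hyphen (and would be read as an option) are rejected along with shell metacharacters.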

Path traversal — in file download, upload, or access features, testing whether directory traversal sequences (../../../etc/passwd) allow access to files outside the intended directory.
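
The usual server-side defence is to resolve the requested path and confirm it still sits under the intended directory. A minimal sketch with pathlib follows; the function name is an assumption for illustration.

```python
from pathlib import Path


def resolve_inside(base_dir, user_path):
    """Resolve user_path relative to base_dir and refuse anything that lands outside it."""
    base = Path(base_dir).resolve()
    # Joining with an absolute user_path discards base, so the check below catches that too.
    candidate = (base / user_path).resolve()
    if candidate != base and base not in candidate.parents:
        raise ValueError("path escapes base directory")
    return candidate
```

Checking the resolved path, rather than searching the raw input for "../", also covers encoded or nested variants once the framework has decoded them.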

Business Logic

Workflow bypass — testing whether multi-step processes can be manipulated by skipping steps, replaying requests out of sequence, or submitting requests directly to later-stage endpoints without completing earlier ones. Common in checkout flows, approval processes, and onboarding sequences.

Price and parameter manipulation — modifying numeric values in payment requests, cart data, or quantity fields to test whether server-side validation enforces business rules. Can a price be submitted as a negative number? Can a quantity be set to zero?
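
What the server should be doing can be sketched in a few lines: ignore any price the client sends, look prices up server-side, and bound the quantity. The catalogue, SKU names, and the limit of 100 are hypothetical values chosen for the example.

```python
# Hypothetical server-side price list, in cents.
CATALOGUE = {"sku-basic": 1900, "sku-pro": 4900}
MAX_QUANTITY = 100


def compute_order_total(items):
    """Compute the total from server-side prices; any client-supplied 'price' key is ignored."""
    total = 0
    for item in items:
        sku = item.get("sku")
        quantity = item.get("quantity")
        if sku not in CATALOGUE:
            raise ValueError("unknown sku")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(quantity, bool) or not isinstance(quantity, int):
            raise ValueError("quantity must be an integer")
        if not 1 <= quantity <= MAX_QUANTITY:
            raise ValueError("quantity out of range")
        total += CATALOGUE[sku] * quantity
    return total
```

A request carrying a negative price alongside a valid SKU is then charged the catalogue price, and zero or negative quantities are rejected outright.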

Rate limiting on critical functions — beyond authentication, verifying that business-critical operations (account creation, voucher redemption, file exports, API calls with side effects) have appropriate limits.

Mass assignment vulnerabilities — testing whether submitting additional fields in request bodies beyond those the form displays causes unexpected server-side behaviour. A user registration endpoint that accepts an isAdmin: true field it wasn’t designed to expose is a mass assignment finding.
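
The standard defence is an explicit allowlist of fields the endpoint will accept, with privileged attributes set only by server code. The field names and build_new_user function below are assumptions for illustration.

```python
ALLOWED_REGISTRATION_FIELDS = frozenset({"email", "display_name", "password"})


def build_new_user(payload):
    """Build a user record from a request body, rejecting any field outside the allowlist."""
    unexpected = sorted(set(payload) - ALLOWED_REGISTRATION_FIELDS)
    if unexpected:
        raise ValueError(f"unexpected fields: {unexpected}")
    user = {field: payload[field] for field in ALLOWED_REGISTRATION_FIELDS if field in payload}
    user["is_admin"] = False  # privileged attributes are only ever set server-side
    return user
```

Silently dropping unknown fields is also a reasonable design; rejecting them is shown here because it makes probing attempts visible in logs.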

Configuration

Security headers — checking for presence and correct configuration of CSP, HSTS, X-Frame-Options, X-Content-Type-Options, and Referrer-Policy. Tested with securityheaders.com and manual header review.
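
The presence check is easy to automate. The sketch below takes response headers as a plain dictionary (however you captured them) and lists which of the five headers named above are absent. It checks presence only; whether a CSP is actually restrictive still needs manual review.

```python
RECOMMENDED_HEADERS = (
    "content-security-policy",
    "strict-transport-security",
    "x-frame-options",
    "x-content-type-options",
    "referrer-policy",
)


def missing_security_headers(response_headers):
    """Return the recommended headers that are absent, comparing names case-insensitively."""
    present = {name.lower() for name in response_headers}
    return [header for header in RECOMMENDED_HEADERS if header not in present]


example_response = {
    "Content-Type": "text/html",
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "X-Content-Type-Options": "nosniff",
}
missing = missing_security_headers(example_response)
```

For the example response this reports Content-Security-Policy, X-Frame-Options, and Referrer-Policy as missing.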

HTTPS enforcement — verifying HTTP requests are redirected to HTTPS universally, that HSTS is set with an appropriate max-age, and that mixed content isn’t loading insecure resources on HTTPS pages.

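CORS policy — reviewing whether the Access-Control-Allow-Origin header is scoped correctly. On authenticated endpoints, a policy that reflects any requesting origin while also sending Access-Control-Allow-Credentials: true lets an attacker’s site read responses using the victim’s cookies, which is exactly what the same-origin policy is meant to prevent. A wildcard (*) is less severe because browsers refuse to combine it with credentials, but it still exposes whatever the endpoint returns without cookies.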
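
A rough sketch of how the response to a CORS probe might be triaged: send a request with an Origin you control, then inspect what comes back. The function name and finding labels are my own for this example; the header names are the standard ones.

```python
def cors_findings(sent_origin, response_headers):
    """Classify the CORS headers returned for a request sent with an untrusted Origin."""
    headers = {name.lower(): value.strip() for name, value in response_headers.items()}
    allow_origin = headers.get("access-control-allow-origin")
    allow_credentials = headers.get("access-control-allow-credentials", "").lower() == "true"

    findings = []
    if allow_origin is None:
        return findings  # no CORS headers, so cross-origin reads stay blocked
    if allow_origin == "*":
        findings.append("wildcard origin")
    elif allow_origin == sent_origin:
        if allow_credentials:
            findings.append("untrusted origin reflected with credentials")
        else:
            findings.append("untrusted origin reflected")
    return findings
```

The reflected-with-credentials case is the one that usually ends up as a high-severity finding, because the attacker's page can read authenticated responses.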

Cookie attributes — confirming session and authentication cookies have HttpOnly, Secure, and SameSite flags set correctly. Missing these flags extends the impact of XSS and exposes cookies to network interception.
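
This check can be sketched with Python's standard http.cookies parser, which understands the HttpOnly, Secure, and SameSite attributes (SameSite from Python 3.8). The function name is an assumption, and the input is a single Set-Cookie header value as captured from a response.

```python
from http.cookies import SimpleCookie


def missing_cookie_flags(set_cookie_value):
    """Map each cookie name in a Set-Cookie value to the protective attributes it lacks."""
    cookie = SimpleCookie()
    cookie.load(set_cookie_value)
    problems = {}
    for name, morsel in cookie.items():
        missing = []
        if not morsel["httponly"]:
            missing.append("HttpOnly")
        if not morsel["secure"]:
            missing.append("Secure")
        if not morsel["samesite"]:
            missing.append("SameSite")
        if missing:
            problems[name] = missing
    return problems
```

Whether SameSite should be Lax or Strict depends on how the application is used, so the sketch only checks that it is set at all.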

Verbose error messages — confirming production returns sanitised errors externally while logging detailed errors internally. Stack traces in production responses are an information disclosure finding and fall under Security Misconfiguration in the OWASP Top 10 (A05 in the 2021 edition).

After the Test: Report, Remediation, Retest

The checklist is the testing phase. What follows is where the business value is realised.

The report documents every finding with: what it is, how it was found and reproduced, what an attacker could do with it, its severity (using CVSS scoring), and specific remediation guidance — not generic advice, but the exact fix for your implementation.

Remediation is your team’s work, but a good testing partner is available for clarification during this phase. Questions like “does this fix actually address the root cause?” should have an answer before the retest, not during it.

The retest confirms fixes are effective. For example, a fix that forces a password change but doesn’t invalidate existing sessions leaves a stolen session token usable. This sort of incomplete remediation shows up regularly, and the retest catches it before it matters.

How Long Does It Take — and What Do You Need to Provide

A focused web application test for a mid-sized SaaS product typically runs 3–5 days of active testing. Larger applications with complex business logic or many distinct user roles take longer.

To start, we need: the application URL and test environment details, a list of in-scope functionality, test accounts for each user role in the application, and a signed scope-of-engagement document. That’s it. We handle the rest.

Ready to Run Through This Checklist on Your Application?

If you’ve read this and found yourself wondering whether your application would pass — that’s a useful signal. The answer is usually “partially,” and knowing specifically where the gaps are is exactly what an assessment provides.

Get in touch to start with a scoping conversation — we’ll talk through your application, agree on scope, and tell you exactly what the engagement covers and costs. You can also explore our full services or learn more about how we work. More posts in this series are on the Kuboid blog.

Kuboid provides web application penetration testing for product teams and startups. Visit www.kuboid.in to learn more.