Two penetration testing firms will quote the same web application and land three times apart, and neither number is wrong. A quote is not a price tag, it is the visible output of four hidden variables multiplied together: how many tester-days the firm plans to spend, the day-rate of the people spending them, how deep the manual testing goes, and what is actually in scope. When one proposal says C$8,000 and another says C$24,000 for "a pentest of our app," the two firms are almost never describing the same test. This is the buyer-side method for reverse-engineering any proposal back into those variables so you compare tests, not prices.
The core question first: how do you compare two penetration testing quotes with very different prices, and why do vendors quote 3x differently for the same app? You recover the hidden math. Take each quote's total, divide by a realistic day-rate, and you get the approximate tester-days that quote is buying. Then you check what those days actually cover: how many user roles, whether testing is authenticated, whether business logic and authorization are in scope or only scanner-class bugs, and whether a retest is bundled. A firm quoting 3x more is usually selling 3x more testing depth, or the cheap firm is selling a scanner run dressed up as a pentest. Decode both, and the "expensive" quote often turns out to be the only real one.
This guide gives you a mechanical process, not a gut feeling. It uses public methodology standards as the yardstick: the OWASP Web Security Testing Guide (WSTG), which enumerates what a thorough web-app test should cover, and the Penetration Testing Execution Standard (PTES), which defines the seven phases a real engagement runs through. If a proposal cannot be mapped onto those, that tells you something.
TL;DR: how to compare pentest quotes in five moves
Recover tester-days from every quote. Divide the total by a plausible day-rate for that firm's tier to estimate how many tester-days you are buying. Two quotes 3x apart usually differ on days, not just rate.
Normalize scope before you compare price. Confirm both proposals cover the same assets, the same number of user roles, and the same authentication model. A quote for "the app" unauthenticated is not comparable to a quote testing three roles logged in.
Separate day-rate from tester-days. A senior boutique at a high day-rate over five days can cost the same as a junior team at a low rate over fifteen, and find far more. Rate alone tells you nothing.
Check depth: scanner-class versus business-logic. The single biggest price driver is whether the test hunts IDOR, broken authorization, and business-logic flaws by hand, or only reports what an automated scanner flags.
Run the red-flags checklist. No named methodology, no tester seniority, no retest, vague scope, or a findings list that reads like raw scanner output are the signals that a low number is a smaller test, not a better deal.
Key takeaways
A quote is four variables, not one number. Price equals tester-days times day-rate, scaled by testing depth and scope. You cannot compare two quotes until you separate all four.
The cheapest quote is frequently the smallest test. When a proposal is a third of the others, the usual reason is fewer tester-days, unauthenticated-only testing, or a scanner run with a report template around it. That is not a discount on the same product.
Day-rate in isolation tells you little. Public 2026 pricing guidance runs from roughly US$1,000 to US$3,500 per tester-day across smaller and mid-market firms, and US$4,000 or more for senior boutiques. A high rate over few days can be cheaper and deeper than a low rate over many.
Business-logic and authorization depth is what you are really buying. Automated tools catch known-class bugs. The flaws that cause real breaches (IDOR, broken access control, and business-logic abuse) are found by manual hours, and those hours are the line item cheap quotes cut first.
Transparency is a proxy for quality. A proposal that names in-scope assets, roles, the auth model, the methodology, tester seniority, and a bundled retest is describing a real engagement. One that hides those is asking you to buy on trust.
Methodology note
This guide is a buyer-side decoding framework, not a price list. The yardsticks are public standards: the OWASP Web Security Testing Guide v4.2 for web-application test coverage, and the Penetration Testing Execution Standard (PTES) for engagement phases. Day-rate ranges here are third-party market context from public 2026 pricing guides, used only to demonstrate the reverse-engineering math; they are not Stingrai figures. For a fuller cost breakdown, see how much penetration testing costs in 2026. This post does not quote Stingrai's own pricing; that lives on the pricing page.
Why do pentest quotes differ so much for the same app?
Because "the same app" hides a lot. Two firms looking at the same login page can build completely different tests behind it, and the proposal only shows you the total.

A price is the product of four things:
Tester-days. The person-days a human actually spends testing. A three-day and a twelve-day test of the same app are different products at the same day-rate.
Day-rate. What each day costs, which tracks tester seniority and firm tier. A senior tester with years of manual authorization-testing experience does not cost the same as a junior running a tool.
Depth of manual testing. Whether those days go into hunting business-logic and authorization flaws by hand, or into running a scanner and formatting its output.
Breadth of scope. How many assets, endpoints, roles, and authentication states are covered. A single unauthenticated crawl is cheap because it barely touches the app.
Two quotes 3x apart almost always differ on more than one of these at once. The cheap one might be five tester-days, unauthenticated, scanner-led. The expensive one might be twelve tester-days, three roles authenticated, manual business-logic testing, with a retest. Both are honest quotes, for different tests. Your job is to see which test you are buying.
Step 1: reverse-engineer tester-days from the price
You rarely get tester-days stated plainly, so recover them. The arithmetic is simple:
Estimated tester-days ≈ Total quote ÷ assumed day-rate for that firm's tier
Public 2026 pricing guidance puts day-rates roughly in these bands: independent and smaller firms around US$1,000 to US$2,000 per tester-day, established mid-market firms around US$1,500 to US$3,500, and senior boutiques or specialist teams from US$4,000 upward. Those are third-party market figures, not Stingrai rates, and they exist here only to run the math.
Apply it to a worked example. Say you hold two quotes for the same web app:
Quote A: C$8,000. At a low-tier day-rate near C$1,300, that is roughly six tester-days. At a mid-tier rate near C$2,600, barely three. Either way, a short engagement.
Quote B: C$24,000. At a senior day-rate near C$4,000, about six tester-days of senior time. At a mid-rate near C$2,600, closer to nine.
Now the "3x" gap has a shape. Quote B is not three times more expensive for the identical test; it is buying more senior tester-days, and the next steps tell you what those days are spent on. If a firm will confirm both the day-rate and the planned tester-days when you ask, the number stops being a mystery. If they will not, that is your first data point.
Step 2: normalize scope before you compare price
You cannot compare two totals until both cover the same ground. Before looking at price, line up each proposal on the same axes:
Assets. Exact URLs, subdomains, APIs, and environments. "The application" is not a scope. One firm may include the admin panel and public API; another only the marketing site.
User roles. How many distinct roles are tested, and are they tested against each other? Testing admin, standard user, and unauthenticated visitor, and checking whether one can reach another's data, is roughly three times the authorization work of testing one role.
Authentication model. Authenticated testing, where the tester logs in and exercises the app as real users do, finds far more than an unauthenticated scan of the front door. A cheap quote is often cheap because it never logs in.
Test type. Web app, external network, internal network, API, cloud, and mobile are different tests. Make sure both quotes cover the same surfaces.
The companion guide on how to scope a penetration test provides a rules-of-engagement checklist so every vendor quotes against an identical scope. Hand every firm the same written scope, and any remaining gap is about depth and seniority, which is exactly what you want to be paying for.
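One way to make the normalization mechanical is to record each proposal's scope on the same four axes and diff them. The sketch below assumes a hypothetical `ProposalScope` record and uses made-up hostnames and role names; it only reports where two scopes disagree, which is where the follow-up questions come from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposalScope:
    """The axes a proposal must match on before prices are comparable."""
    assets: frozenset[str]
    roles: frozenset[str]
    authenticated: bool
    test_types: frozenset[str]


def scope_differences(a: ProposalScope, b: ProposalScope) -> dict[str, tuple]:
    """Return each axis where the two scopes differ, as (a_value, b_value)."""
    diffs = {}
    for axis in ("assets", "roles", "authenticated", "test_types"):
        left, right = getattr(a, axis), getattr(b, axis)
        if left != right:
            diffs[axis] = (left, right)
    return diffs


# Hypothetical proposals: the hostnames and role names are placeholders.
quote_a = ProposalScope(
    assets=frozenset({"app.example.com"}),
    roles=frozenset({"unauthenticated"}),
    authenticated=False,
    test_types=frozenset({"web app"}),
)
quote_b = ProposalScope(
    assets=frozenset({"app.example.com", "api.example.com"}),
    roles=frozenset({"unauthenticated", "user", "admin"}),
    authenticated=True,
    test_types=frozenset({"web app", "api"}),
)

differences = scope_differences(quote_a, quote_b)
comparable = not differences

print("Comparable on price:", comparable)
for axis in differences:
    print("Scope differs on:", axis)
```

If the diff is empty, the totals can be compared directly. If it is not, each listed axis is something to fix in the written scope before looking at price again.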
Step 3: score the proposals side by side
Once you have recovered tester-days and normalized scope, put the proposals into one scorecard. Comparing on a single sheet is how a 3x price gap resolves into a clear decision.

Here is a synthetic three-quote comparison for one web application, the kind of table you should build from your own proposals:
| Line | Quote A (budget) | Quote B (mid) | Quote C (senior boutique) |
|---|---|---|---|
| Total price | C$8,000 | C$16,000 | C$24,000 |
| Recovered tester-days | ~4-6 | ~7-9 | ~6-9 (senior) |
| User roles tested | 1 | 2 | 3+ |
| Authentication model | Unauthenticated | Authenticated (1 role) | Authenticated, role-vs-role |
| Business-logic / IDOR testing | Not stated | Partial | Explicit, manual |
| Methodology named | No | OWASP WSTG | OWASP WSTG + PTES |
| Tester seniority stated | No | Mid | Senior, named certs |
| Retest included | No | Add-on | Bundled |
| Report sample offered | No | Yes | Yes |
Read down the columns, not across the price row. Quote A is cheapest because it is the smallest test: one role, unauthenticated, no stated methodology, no retest. Quote C costs 3x Quote A because it buys senior tester-days and spends them on role-vs-role authorization and manual business-logic testing, the classes of bug that actually cause breaches. The mid quote is a real middle. The price gap is not markup, it is scope and depth made visible.
The scorecard also surfaces the questions to send back to each vendor: if a cell is blank, ask. A firm that fills every cell without hesitation is describing a test it has actually planned.
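Turning blank cells into vendor questions can be sketched as follows. The row names and the sample entries are assumptions made for illustration; `None` stands for anything the proposal did not state, while an explicit "no" (such as `False` for a retest) counts as an answer.

```python
# Rows to fill in for every proposal. The names are illustrative.
SCORECARD_ROWS = [
    "tester_days",
    "user_roles",
    "auth_model",
    "business_logic",
    "methodology",
    "tester_seniority",
    "retest",
    "report_sample",
]


def open_questions(vendor: str, scorecard: dict) -> list[str]:
    """List one follow-up question per row the proposal left unstated."""
    return [
        f"{vendor}: please confirm {row.replace('_', ' ')}"
        for row in SCORECARD_ROWS
        if scorecard.get(row) is None  # None = not stated in the proposal
    ]


# Hypothetical entries for a budget-style proposal.
quote_a = {
    "tester_days": None,
    "user_roles": 1,
    "auth_model": "unauthenticated",
    "business_logic": None,
    "methodology": None,
    "tester_seniority": None,
    "retest": False,
    "report_sample": False,
}

questions_a = open_questions("Quote A", quote_a)
for question in questions_a:
    print(question)
```

Keeping "not stated" separate from "stated as no" matters here: the first is a question to send back, the second is already a data point for the comparison.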
Step 4: apply the driver-by-driver multiplier
To sanity-check why one quote is larger, decompose the difference into the drivers that stack on top of a base web-app test. Each driver adds tester-days, and tester-days drive price.

| Driver | What it changes | Rough effect on tester-days |
|---|---|---|
| Base: single-role authenticated web app | The reference engagement | 1x baseline |
| Additional user roles | Each role adds authorization paths to test, and role-vs-role IDOR checks | +25% to +50% per extra role |
| Authenticated vs unauthenticated | Logging in exposes the majority of the real attack surface | Unauthenticated-only can cut days sharply, and coverage with them |
| Business-logic depth | Manual hunting for logic abuse, IDOR, broken access control beyond scanner class | +30% to +100% depending on app complexity |
| Box color (black / gray / white) | White-box adds source and design review; gray-box adds credentials and docs | Varies; white-box adds review time but finds deeper flaws |
| Tester seniority | Senior testers cost more per day but find more per day | Higher day-rate, often fewer days for equal or better coverage |
| Retest | Re-verifying fixes after remediation | Adds a fixed block of days, often bundled by serious firms |
These are directional, not a formula, and the exact figures depend on your application. The point is that a 3x price difference can usually be explained by two or three of these drivers stacking. When a firm cannot tell you which drivers are in their number, they may not have scoped the test at all.
Step 5: run the proposal red-flags checklist
Some price gaps are legitimate. Others hide a test that is not really a pentest. Run every proposal through this checklist before you sign anything.

Scanner output masquerading as a pentest. If the sample report is a list of CVEs and TLS warnings with severity scores and no exploitation narrative, you are buying an automated scan with a cover page. A real pentest shows how findings were chained and what a tester actually reached.
No named methodology. A serious proposal references a recognized standard, such as OWASP WSTG for web coverage or PTES for engagement phases. "Our proprietary process" with no detail is a flag.
Vague scope. "We will test your application" without assets, roles, and auth model listed means the firm has not scoped the work, or is leaving room to test as little as possible.
No retest. If verifying that your fixes actually worked costs extra or is not offered, the engagement ends before you know whether you are safe. Serious firms bundle or clearly price a retest.
No tester seniority. If the proposal will not say who is testing or what certifications and experience they hold, you cannot judge the day-rate or the depth.
Business logic absent. No mention of IDOR, broken access control, or business-logic testing means the test likely stops at scanner-class bugs, which is where the cheapest quotes stop.
A price with no breakdown. A single number with no tester-days, no scope, and no methodology is not a proposal, it is a guess you are being asked to fund.
Hitting one flag is a question to ask. Hitting several means the low number is not a bargain, it is a different, smaller product.
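The checklist can be kept as a small lookup so every proposal is run through the same seven flags. In this sketch the flag keys are hypothetical names, and treating three or more flags as "several" is an assumption you should set to your own tolerance.

```python
RED_FLAGS = {
    "scanner_output": "Sample report reads like raw scanner output",
    "no_methodology": "No named methodology",
    "vague_scope": "Scope lists no assets, roles, or auth model",
    "no_retest": "No retest offered or priced",
    "no_seniority": "Tester seniority not stated",
    "no_business_logic": "No business-logic or authorization testing",
    "no_breakdown": "Single price with no breakdown",
}


def triage(flags_hit: set[str], several: int = 3) -> str:
    """Classify a proposal by how many red flags it hits.

    `several` is an assumed threshold, not a figure from any standard.
    """
    unknown = flags_hit - set(RED_FLAGS)
    if unknown:
        raise ValueError(f"unknown flags: {sorted(unknown)}")
    if not flags_hit:
        return "no flags"
    if len(flags_hit) < several:
        return "ask the vendor about each flag"
    return "treat as a different, smaller product"


# Hypothetical results for two proposals.
result_a = triage({"no_methodology", "no_retest", "no_seniority", "vague_scope"})
result_b = triage({"no_retest"})

print("Quote A:", result_a)
print("Quote B:", result_b)
```

Rejecting unknown flag names keeps the checklist consistent across reviewers, so two people scoring the same proposal are counting the same things.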
What this means for defenders
Buying a pentest well is a security control in itself, because a cheap test that misses your authorization flaws leaves them live for a real attacker to find. Turn the decoding process into buying habits:
Write the scope once, send it to everyone. Hand every vendor an identical rules-of-engagement scope so quotes differ only on depth, seniority, and days, not interpretation. The scoping checklist gives you the template.
Insist on business-logic and authorization testing in writing. Automated tools catch known-class bugs; the flaws that cause breaches are found by manual hours. Modern offensive-security firms pair senior testers with purpose-built AI agents such as Stingrai's Snipe, which is trained to hunt exactly those complex IDOR, broken-authorization, and business-logic classes rather than stopping at scanner-level findings, so more of the budget goes into depth instead of setup.
Demand a bundled retest. Verifying fixes is part of the engagement, not an upsell. A proposal that treats it as core describes a full loop from finding to fixed.
Read the sample report before you read the price. The report is the product. If it shows exploitation, chaining, and business-logic findings, the price is buying real testing. If it shows scanner output, no price is low enough.
Judge transparency as a quality signal. A CREST-accredited firm names in-scope assets, roles, methodology, tester seniority, and a retest by default, because that is what its process requires.
The firm that gives you the clearest breakdown is usually the one doing the most defensible work, because it has nothing to hide in the numbers. Learn to evaluate the report and choose the provider on substance, and the 3x price gap stops being confusing and starts being informative.
Frequently Asked Questions
How do I compare two penetration testing quotes with very different prices?
Reverse-engineer each quote into its hidden variables. Divide the total by a realistic day-rate to estimate tester-days, then normalize scope so both proposals cover the same assets, user roles, and authentication model. Finally check depth: whether the test hunts business-logic and authorization flaws by hand or only reports scanner output. Once you have separated tester-days, day-rate, depth, and scope, the price gap resolves into a clear comparison of how much real testing each quote buys.
Why do vendors quote 3x differently for the same app?
Because a quote is four variables multiplied together: tester-days, day-rate, testing depth, and scope, and "the same app" hides all of them. A 3x-cheaper quote is usually fewer tester-days, unauthenticated-only testing, a single role, or a scanner run presented as a pentest. The higher quote is typically buying more senior tester-days spent on manual authorization and business-logic testing. Neither number is wrong; they are quotes for different tests.
What is a normal penetration testing day-rate in 2026?
Public 2026 pricing guidance puts consultant day-rates roughly between US$1,000 and US$2,000 for independent and smaller firms, US$1,500 to US$3,500 for established mid-market firms, and US$4,000 or more for senior boutiques and specialist teams. These are third-party market ranges used to run the tester-days math, not Stingrai figures. Day-rate alone tells you little; a high rate over few days can be cheaper and deeper than a low rate over many. See how much penetration testing costs in 2026 for the fuller breakdown.
How do I recover tester-days from a pentest quote?
Divide the total price by an assumed day-rate for that firm's tier. A C$8,000 quote at a C$1,300 day-rate is roughly six tester-days; at a C$2,600 mid-tier rate it is closer to three. The estimate is approximate, but it converts an opaque total into a number you can compare, and it gives you a concrete question to send back to the vendor: confirm the planned tester-days and the day-rate behind this number.
What are the biggest red flags in a pentest proposal?
Scanner output presented as a pentest, no named methodology, vague scope with no assets or roles listed, no retest, no stated tester seniority, and no mention of business-logic or authorization testing. A single flag is a question to ask; several together mean the low price reflects a smaller, shallower test rather than a genuine discount on the same work.
What is the difference between scanner-class and business-logic testing?
Scanner-class testing finds known bug patterns, such as injection, misconfiguration, and outdated components, that automated tools reliably flag. Business-logic testing is manual and hunts flaws unique to how your application works: IDOR, broken access control, privilege escalation, and abuse of legitimate features. These logic and authorization flaws cause many real breaches and cannot be found by tooling alone, which is why they are the line item cheap quotes cut and the reason serious quotes cost more.
Should a penetration test include a retest?
Yes. A retest verifies that your remediation actually closed the findings, and without it the engagement ends before you know whether you are safe. Serious firms bundle a retest or price it clearly as part of the engagement. A proposal that omits a retest, or treats it purely as an upsell, is describing an incomplete loop from finding to fixed.
How does a CREST-accredited firm's quote differ?
A provider with firm-level CREST accreditation follows a defined methodology, so its proposal names in-scope assets, user roles, the authentication model, the standards it tests against, tester seniority, and a retest by default, because its process requires them. That transparency is itself a quality signal: the numbers are traceable to a plan. Stingrai is a CREST-accredited penetration testing provider, founded in 2021 and headquartered in Toronto with a London office.
References
OWASP. Web Security Testing Guide (WSTG) v4.2. December 2020. https://owasp.org/www-project-web-security-testing-guide/. Community standard enumerating web-application test categories and cases used as the coverage yardstick for a thorough web-app penetration test.
The Penetration Testing Execution Standard (PTES). PTES Technical Guidelines. http://www.pentest-standard.org/. Defines the seven phases of a penetration test: pre-engagement, intelligence gathering, threat modeling, vulnerability analysis, exploitation, post-exploitation, and reporting.
Stingrai. How Much Does Penetration Testing Cost in 2026. https://www.stingrai.io/blog/how-much-does-penetration-testing-cost-2026. Buyer breakdown of pentest cost drivers, day-rates, and pricing models referenced for the reverse-engineering math.
Stingrai. How to Scope a Penetration Test in 2026. https://www.stingrai.io/blog/how-to-scope-a-penetration-test-2026. Rules-of-engagement scoping checklist used to normalize scope across competing proposals.



