Technical proof
This page states what SiteShadow can currently claim, what evidence backs it, and where the boundaries are. It is written for security engineers evaluating detection quality, not as a marketing feature list.
SiteShadow combines rule-based checks, structural heuristics, dependency checks, cross-file project checks, and WASM-powered taint analysis. The numbers below are evidence-backed controlled benchmark and library counts, not estimates from public marketing copy.
"Full" means public taint and rule coverage is benchmark-backed for the listed language family. "Rules" means shipped rule/pattern coverage exists, but public full-taint claims remain qualified.
| Language / Surface | Status | Taint capability | Primary evidence | Evidence boundary |
|---|---|---|---|---|
| Python | Full | Source-to-sink taint + rules | OWASP, Juliet, MultiLang, heuristics, AI-security | Full for represented cases; framework-specific sanitizer coverage continues to expand. |
| JavaScript | Full | Source-to-sink taint + rules | MultiLang, heuristics, AI-security, React XSS cases | Full for represented cases; JSX and template-parser edge cases continue to expand. |
| TypeScript | Full | JavaScript-family analyzer + TS rules | AI-security and JS-family rule coverage | Full through the JavaScript-family analyzer; deeper type-aware framework modeling is ongoing. |
| Java | Full | Source-to-sink taint + rules | OWASP, Juliet, MultiLang | Full for represented cases; large-framework route graph coverage continues to expand. |
| C# | Full | Source-to-sink taint + rules | Juliet, MultiLang, C#/Razor checks | Full for represented cases; ASP.NET middleware and Razor fixture depth continue to expand. |
| Go | Full | Source-to-sink taint + rules | MultiLang and heuristic benchmark evidence | Full for represented cases; router and validation library models continue to expand. |
| Ruby | Full | Source-to-sink taint + rules | Language regression evidence and Ruby analyzer evidence | Full for represented cases; Rails/Sinatra and customer-like corpus coverage continue to expand. |
| PHP | Full | Source-to-sink taint + rules | Language regression evidence and PHP analyzer evidence | Full for represented cases; Laravel/Symfony/WordPress and customer-like corpus coverage continue to expand. |
| PowerShell | Full | Source-to-sink taint + rules | Language regression evidence and PowerShell analyzer evidence | Full for represented cases; enterprise module and shell-argument sanitizer coverage continue to expand. |
| Blazor | Rules | C#/Razor-oriented rules | Blazor rule family and C# analyzer evidence | Rules coverage today; dedicated Blazor benchmark pack is required before Full. |
| YAML / JSON / Dockerfile / Kubernetes / Terraform | Config | Pattern and structural checks, not taint-led | Rule benchmarks where represented | Config checks today; dedicated IaC/cloud benchmark suite is required before Full Config. |
The table below summarizes the claim level by vulnerability class. "Green" means the latest controlled evidence is passing for represented cases; it does not mean the class is exhaustively solved for every framework.
| Class | Claim level | Languages / surfaces | How SiteShadow detects it | Known gap |
|---|---|---|---|---|
| SQL injection | Green | Python, JS/TS, Java, C#, Go, Ruby, PHP; PowerShell where represented | Source-to-sink taint, raw DB driver heuristics, injection rules | More ORM/framework corpus evidence. |
| XSS | Green | JS/TS, Python, Java, C#, Ruby, PHP; Blazor rules | DOM sinks, template sinks, sanitizer recognition, framework checks | More template-engine and context-specific escaping fixtures. |
| Command injection | Green | Python, JS/TS, Java, C#, Go, Ruby, PHP, PowerShell | User-controlled command, shell, process, and eval sink tracking | Broader shell-argument sanitizer modeling across real projects. |
| Code injection (CWE-94) | Partial | Python, JS/TS, Java, C#, Go, Ruby, PHP, PowerShell | Taint into dynamic-code sinks: eval, new Function, and related dynamic-construction shapes | Some indirect-construction bypass variants — see Currently Known Gaps below. |
| SSRF | Green | Python, JS/TS, Java, C#, Go, Ruby, PHP, PowerShell where represented | User-controlled URL to HTTP clients, metadata endpoint patterns, AI-output-to-request flows | Allowlist, DNS rebinding, parser, and network validation modeling. |
| Secrets and credentials | Green | All scanned code/config languages | CWE-798, provider token patterns, config rules, duplicate-secret cross-file detection | Provider pattern drift needs continuous updates and sampled FP review. |
| Path traversal and file access | Green | Python, JS/TS, Java, C#, Go, Ruby, PHP, PowerShell | User-controlled path to read/write/open sinks with sanitizer handling | Upload-storage, normalization, and platform-specific fixtures. |
| Auth, access control, IDOR | Green | Framework-dependent across Python, JS/TS, Java, C#, Go | Route/auth heuristics, cross-file auth consistency, object-access indicators | More real app route graphs and middleware corpora. |
| AI/LLM security flows | Green | Python, JS/TS, Java; MCP tool/config rules | LLM output as tainted source to tools, HTTP, browser automation, shell, storage, email, chat sinks, and unsafe MCP tool access | Sink-library expansion and public AI-agent risk methodology. |
| IaC, container, configuration | Partial | YAML, JSON, Dockerfile, Kubernetes, Terraform patterns | Pattern and structural config checks | Dedicated IaC/cloud-provider benchmark suite not yet complete. |
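To make the source-to-sink shape in the SQL injection row concrete, here is a minimal sketch of a vulnerable/safe pair in Python. The function names, table, and payload are hypothetical illustrations and are not SiteShadow fixtures; the sketch uses only the standard-library `sqlite3` module.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with two illustrative users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 'admin')")
    conn.execute("INSERT INTO users VALUES ('bob', 'viewer')")
    return conn


def find_user_vulnerable(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable shape: the user-controlled `name` (source) is concatenated
    # into the SQL text that reaches execute() (sink).
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Safe shape: the value is bound as a parameter, so it never becomes
    # part of the SQL text.
    query = "SELECT name FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


payload = "x' OR '1'='1"  # hypothetical attacker input
conn = make_db()
leaked = find_user_vulnerable(conn, payload)  # the injected OR clause matches every row
blocked = find_user_safe(conn, payload)       # no user has this literal name
```

A scanner that reports `find_user_vulnerable` and stays silent on `find_user_safe` is behaving as a "Green" claim for represented cases would require.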
Active scanner gaps as of 2026-05-20. Each item is tracked as a defect with an owner, and is published here so customers see the same picture the release team does. Closed defects are removed from this table as their fixtures pass a clean rerun.
| Class | Language / surface | What the scanner misses today |
|---|---|---|
| Code injection (CWE-94) via Reflect.construct | JavaScript | Reflect.construct(Function, [tainted])(): taint is not flagged through this indirect-construction shape. |
SiteShadow uses public benchmark suites where strong scored benchmarks exist, and internal regression suites where public benchmarks are immature or unavailable. Internal regression suites are release gates, not independent third-party certifications.
| Evidence layer | Status | What it supports | Current limitation |
|---|---|---|---|
| OWASP Benchmark for Java | Release gate | Java web-application vulnerability detection against a public scored benchmark. Current result: +1.000 Youden (1,698 cases, 873 TP, 0 FN, 0 FP). | Java-focused; it does not validate PHP, Ruby, PowerShell, or JavaScript framework claims. |
| NIST SARD / Juliet CWE suites (Java) | Release gate | CWE-style regression evidence for supported imported Java cases. Current result: +1.000 Youden (6,444 cases, 6,264 TP, 0 FP). | Synthetic corpus; controlled results are not real-world customer-code FP/FN rates. |
| OWASP Benchmark for Python | Release gate | Python web-application vulnerability detection against a public scored benchmark. Current result: +1.000 Youden (1,230 cases, 452 TP, 0 FP). | Python-focused; it does not validate JavaScript, Ruby, Go, or other-language framework claims. |
| NIST SARD / Juliet CWE suites (C#) | Release gate | C# CWE-style regression evidence across path traversal, LDAP injection, XPath injection, command injection, and SQL injection. Current result: +1.000 Youden (2,516 cases, 2,412 TP, 0 FP). | Synthetic corpus; controlled results are not real-world customer-code FP/FN rates. |
| SecBench.js | Evaluation candidate | Server-side JavaScript package vulnerability evidence with executable examples. | Requires an adapter and result mapping before it becomes a release gate. |
| External vulnerable-app corpus | Planned | Realistic JavaScript, PHP, and Ruby smoke tests using projects such as OWASP Juice Shop, OWASP NodeGoat, DVWA, and OWASP RailsGoat. | These are training/demo applications, not scored SAST benchmarks until expected-finding maps are built. |
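The Youden figures in the table above can be checked from the published counts. The sketch below assumes the standard definition, Youden's J = true-positive rate minus false-positive rate, and assumes that every case not counted as TP, FN, or FP is a true negative; the function name is illustrative.

```python
def youden_index(tp: int, fn: int, fp: int, tn: int) -> float:
    """Return Youden's J = TPR - FPR for one benchmark run."""
    if tp + fn == 0 or fp + tn == 0:
        # The score is undefined without both vulnerable and safe cases.
        raise ValueError("need at least one vulnerable and one safe case")
    tpr = tp / (tp + fn)  # share of vulnerable cases that were flagged
    fpr = fp / (fp + tn)  # share of safe cases that were flagged
    return tpr - fpr


# OWASP Benchmark for Java row: 1,698 cases, 873 TP, 0 FN, 0 FP.
total, tp, fn, fp = 1698, 873, 0, 0
tn = total - tp - fn - fp  # assumption: the remaining 825 cases are true negatives
score = youden_index(tp, fn, fp, tn)
print(f"{score:+.3f}")  # prints +1.000
```

Because the false-positive rate needs safe cases in its denominator, a suite containing only vulnerable examples cannot produce this score at all.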
The release gates below cover languages and rule families whose public quality is anchored in internal regression suites rather than external scored benchmarks. Java, Python, and C# are covered by the external release-gate benchmarks listed in the table above; for JavaScript/TypeScript, that table lists external evidence only as an evaluation candidate and a planned corpus.
| Area | What it protects | How to read it publicly |
|---|---|---|
| PHP, Ruby, and PowerShell language regression | Source-to-sink cases, safe negative cases, framework-style sources, command, SSRF, path, and sanitizer behavior. | Supports Full status for represented taint and rule families, with framework-depth limitations still listed above. |
| AI/LLM security regression | LLM output and AI-agent dataflow into tools, shell, HTTP, browser automation, storage, email, and chat sinks. | Supports current AI/LLM rule-family claims, not a universal claim over all agent frameworks. |
| Rule, heuristic, cross-file, and taint regression | Release blocking checks for noisy rules, missed findings, multi-file behavior, and explainability fields. | Shows release discipline. It is not a substitute for public benchmarks or customer-like corpus measurement. |
This page summarizes two internal sources: SiteShadow's Detection Credibility Matrix and the authoritative benchmark rollup maintained for release review. Controlled benchmark and regression results are not the same as statistically measured customer-code false-positive or false-negative rates.
Suites include true-positive vulnerable fixtures and true-negative safe fixtures. A suite is not considered strong enough for a coverage claim when it only contains vulnerable examples.
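As a rough sketch of how such a suite could be scored, the snippet below tallies scanner results against expected labels and checks that both vulnerable and safe fixtures are present. The fixture identifiers and function names are hypothetical and do not describe SiteShadow's internal tooling.

```python
from collections import Counter
from typing import Dict, Set


def score_fixtures(fixtures: Dict[str, bool], flagged: Set[str]) -> Counter:
    """Tally TP/FN/FP/TN.

    fixtures maps a fixture id to True when the fixture is vulnerable;
    flagged is the set of fixture ids the scanner reported.
    """
    counts: Counter = Counter()
    for fixture_id, vulnerable in fixtures.items():
        reported = fixture_id in flagged
        if vulnerable:
            counts["TP" if reported else "FN"] += 1
        else:
            counts["FP" if reported else "TN"] += 1
    return counts


def supports_coverage_claim(fixtures: Dict[str, bool]) -> bool:
    """A suite needs both vulnerable and safe fixtures to back a claim."""
    return set(fixtures.values()) == {True, False}


suite = {"sqli_concat": True, "sqli_param": False}  # hypothetical fixture ids
counts = score_fixtures(suite, flagged={"sqli_concat"})
balanced = supports_coverage_claim(suite)
```

With only vulnerable fixtures, a scanner that flags everything would look perfect, which is why the safe half of the suite matters.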
Taint findings are evaluated as paths from source to sink, including intermediate propagation steps when the analyzer can observe them. Serious findings now expose source, sink, data path, rule id, confidence, vulnerable pattern, remediation, and benchmark/example link when available.
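One way to picture the fields listed above is as a single record per finding. The sketch below is an assumption about shape only: the class name, field names, and sample values are hypothetical and are not SiteShadow's actual output schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TaintFinding:
    """Hypothetical record mirroring the fields named in the text."""
    rule_id: str
    source: str
    sink: str
    confidence: str
    vulnerable_pattern: str
    remediation: str
    data_path: List[str] = field(default_factory=list)  # intermediate steps, if observed
    benchmark_link: Optional[str] = None                # present only when available

    def render_path(self) -> str:
        """Join source, intermediate steps, and sink into one readable path."""
        return " -> ".join([self.source, *self.data_path, self.sink])


finding = TaintFinding(
    rule_id="EXAMPLE-SQLI-001",  # illustrative id, not a real rule
    source="request.args['name']",
    sink="cursor.execute(query)",
    confidence="high",
    vulnerable_pattern="string concatenation into SQL text",
    remediation="bind the value as a query parameter",
    data_path=["query"],
)
rendered = finding.render_path()
```

Keeping the intermediate steps as an ordered list lets a reviewer follow the path step by step, and an empty list still renders a valid source-to-sink pair.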
Scanner and rule releases require evidence across the authoritative SiteShadow non-regression suites. Any regression requires severity classification, assigned remediation ownership, and release approval before shipping.
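The gate described above can be sketched as a simple check over open regressions. The record fields and function name are hypothetical; the sketch only encodes the three stated conditions (severity classified, owner assigned, release approved).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Regression:
    """Hypothetical record for one regression found in a non-regression suite."""
    fixture_id: str
    severity: Optional[str] = None  # e.g. "high"; None means not yet classified
    owner: Optional[str] = None     # None means no remediation owner assigned
    approved: bool = False          # explicit release approval


def release_blockers(regressions: List[Regression]) -> List[str]:
    """Return the fixture ids that still block shipping."""
    return [
        r.fixture_id
        for r in regressions
        if not (r.severity and r.owner and r.approved)
    ]


open_regressions = [
    Regression("php_ssrf_12", severity="high", owner="team-a", approved=True),
    Regression("ruby_path_03", severity="medium"),  # no owner, not approved
]
blockers = release_blockers(open_regressions)
can_ship = not blockers
```

A release with no regressions passes trivially, since the blocker list is empty.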
Coverage wording cannot exceed benchmark evidence. Where a language has rule coverage but incomplete benchmark-backed taint evidence, this page labels it as rules coverage instead of full taint support.
| Area | Limitation | What it means | Evidence status |
|---|---|---|---|
| Real-world FP/FN rate | No published customer-like corpus methodology yet. | Controlled benchmark rates must not be used as real-world production rates. | Customer-like sampling methodology not yet published. |
| IaC/container/cloud | No dedicated IaC benchmark suite yet. | Coverage exists where represented by rules, but full benchmark-backed IaC claims stay qualified. | Dedicated IaC benchmark suite pending. |
| Ruby/PHP/PowerShell corpus depth | Full status is backed by controlled language regression evidence, but real-world framework and enterprise-script variety is broader than the controlled fixtures. | Public claims can say Full for represented taint and rule coverage, while still qualifying framework and customer-like corpus depth. | Full in release-gated language regression evidence; ongoing framework and corpus expansion. |
| Framework modeling | Middleware, route graphs, sanitizers, and ORM behavior vary by framework. | Supported languages still need continuous framework fixture expansion. | Ongoing framework fixture expansion. |
| AI/LLM risk classes | AI-agent sink libraries evolve quickly. | AI/LLM coverage is benchmark-backed for current suites and needs continued sink-family expansion. | Current suites covered; sink-family expansion ongoing. |
| Runtime-only issues | Some vulnerabilities depend on deployed configuration, live identity data, secrets, network reachability, or production authorization state. | Static analysis can flag risky code paths, but it cannot prove every runtime policy, tenancy boundary, or environment-specific behavior. | Requires runtime telemetry, integration tests, or customer environment validation. |
| Business logic | Deep workflow abuse, fraud logic, approval bypasses, and policy mistakes can require domain context the scanner does not have. | SiteShadow may identify risky authorization or data-flow patterns, but it is not a complete substitute for threat modeling and application-specific review. | Heuristic coverage exists; domain-specific proof remains customer/application dependent. |
| Dependency and supply chain | SiteShadow detects selected dependency and configuration risks, but it is not a full SCA, SBOM, malware, or exploitability platform. | Use dedicated dependency/SBOM tooling alongside SiteShadow for package vulnerability inventory and transitive dependency governance. | Partial rule coverage only. |
| Generated, minified, and highly dynamic code | Generated/minified bundles and heavy reflection/metaprogramming can hide intent and reduce path explainability. | Findings may be less precise, and some source-to-sink paths may need source maps, original source, or manual review. | Not claimed as fully measured across all generated-code styles. |