SiteShadow Detection Coverage

Current Evidence Summary

SiteShadow combines multiple analysis layers with benchmark-backed evidence across code, configuration, package-CVE, and AI-risk surfaces. The numbers below are controlled benchmark and library counts, not estimates from public marketing copy.

2,011 security checks in public coverage

190 CWE mappings

31 security pattern checks

1,000 cross-language benchmark cases

2,817 expanded-language benchmark cases

305 SecBench.js addressable CVE cases

324 AI security benchmark cases

6,444 Juliet (Java) benchmark cases

Important: controlled benchmark FP/FN rates are not customer-code FP/FN rates. Controlled suites are designed test sets. Real-world false-positive and false-negative rates require a separate customer-like corpus methodology, which is listed as an open evidence item.

Supported Languages

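"Full" means benchmark-backed detection coverage for the listed language family. "Rules" means focused detection coverage exists, but broader language claims remain qualified. "Config" means pattern and structural checks for configuration surfaces; dedicated benchmark evidence is still required before a Full Config claim.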

Language / Surface	Status	Detection coverage	Primary evidence	Evidence boundary
Python	Full	Data-flow analysis and security rules	OWASP, Juliet, cross-language, and AI-risk evidence	Full for represented cases; framework-specific evidence continues to expand.
JavaScript	Full	Data-flow analysis and security rules	Cross-language, AI-risk, React XSS, and SecBench.js evidence	Full for represented cases; JSX and template-parser edge cases continue to expand.
TypeScript	Full	JavaScript-family coverage plus TypeScript rules	JavaScript-family and AI-risk evidence	Full through JavaScript-family coverage; deeper type-aware framework coverage is ongoing.
Java	Full	Data-flow analysis and security rules	OWASP, Juliet, and cross-language evidence	Full for represented cases; large-framework evidence continues to expand.
C#	Full	Data-flow analysis and security rules	Juliet, cross-language, and C#/Razor evidence	Full for represented cases; ASP.NET and Razor evidence continues to expand.
Go	Full	Data-flow analysis and security rules	Cross-language and pattern benchmark evidence	Full for represented cases; ecosystem-specific evidence continues to expand.
Ruby	Full	Data-flow analysis and security rules	Language-expansion evidence	Full for represented cases; Rails, Sinatra, and customer-like corpus evidence continue to expand.
PHP	Full	Data-flow analysis and security rules	Language-expansion evidence	Full for represented cases; Laravel, Symfony, WordPress, and customer-like corpus evidence continue to expand.
PowerShell	Full	Data-flow analysis and security rules	Language-expansion evidence	Full for represented cases; enterprise-script evidence continues to expand.
Blazor	Rules	C#/Razor-oriented checks	Blazor and C# evidence	Rules coverage today; dedicated Blazor benchmark evidence is required before Full.
YAML / JSON / Dockerfile / Kubernetes / Terraform	Config	Pattern and structural checks	Configuration evidence where represented	Config checks today; dedicated IaC/cloud benchmark evidence is required before Full Config.

Vulnerability-Class Coverage

The table below summarizes the claim level by vulnerability class. "Green" means the latest controlled evidence is passing for represented cases; it does not mean the class is exhaustively solved for every framework.

Class	Claim level	Languages / surfaces	How SiteShadow detects it	Current boundary
SQL injection	Green	Python, JS/TS, Java, C#, Go, Ruby, PHP; PowerShell where represented	Unsafe database-query detection	More ORM and framework corpus evidence.
XSS	Green	JS/TS, Python, Java, C#, Ruby, PHP; Blazor checks	Unsafe browser and template-output detection	More template and context-specific escaping evidence.
Command injection	Green	Python, JS/TS, Java, C#, Go, Ruby, PHP, PowerShell	Dangerous command-execution detection	Broader real-project command-safety evidence.
Code injection (CWE-94)	Partial	Python, JS/TS, Java, C#, Go, Ruby, PHP, PowerShell	Dynamic-code execution detection	More cross-framework dynamic-code evidence outside the represented SecBench.js classes.
SSRF	Green	Python, JS/TS, Java, C#, Go, Ruby, PHP, PowerShell where represented	Unsafe outbound-request detection	More validation and network-edge evidence.
Secrets and credentials	Green	All scanned code/config languages	Credential and provider-token detection	Provider pattern drift needs continuous updates and sampled FP review.
Path traversal and file access	Green	Python, JS/TS, Java, C#, Go, Ruby, PHP, PowerShell	Unsafe file-path detection	Upload-storage, normalization, and platform-specific evidence.
Auth, access control, IDOR	Green	Framework-dependent across Python, JS/TS, Java, C#, Go	Authorization and object-access checks	More real-app route and middleware corpus evidence.
AI/LLM security flows	Green	Python, JS/TS, Java; AI tooling and config patterns	AI-output risk detection across high-impact actions	Risk-library expansion and public AI-agent methodology.
IaC, container, configuration	Partial	YAML, JSON, Dockerfile, Kubernetes, Terraform patterns	Pattern and structural config checks	Dedicated IaC/cloud-provider benchmark suite not yet complete.
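
To make "represented cases" concrete, the sketch below shows what a vulnerable/safe pair for the SQL injection class might look like in Python. It is an illustration only and is not taken from a SiteShadow suite; the function names and the in-memory schema are hypothetical.

```python
import sqlite3


def find_user_vulnerable(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: user input is concatenated into the SQL text,
    # so a crafted name can change the structure of the query.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Safe: the input is bound as a parameter and is never parsed as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


def make_demo_db() -> sqlite3.Connection:
    # Hypothetical in-memory schema used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)]
    )
    return conn
```

A benchmark that scores both functions can measure missed findings (the vulnerable variant not flagged) and noisy findings (the safe variant flagged), which is why the pair matters more than either case alone.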

Current Evidence Boundaries

As of 2026-06-25, there is no active scanner-miss gap published for the benchmark-backed SecBench.js classes below. The items here are evidence boundaries: places where SiteShadow has strong measured coverage, but where we are not making a broader universal claim yet.

Boundary	Language / surface	Current honest read
Universal JavaScript package-CVE claims	JavaScript package CVEs beyond the represented SecBench.js classes	Represented code-injection, command-injection, and path-traversal checks are saturated today, including patched-version cases covered by the current benchmark. That is strong upstream package-CVE evidence, not a claim that every JavaScript package, framework, or CVE class is fully covered.

Evidence and Benchmarks

SiteShadow uses public benchmark suites where strong scored benchmarks exist, and controlled SiteShadow suites where public benchmarks are immature or unavailable. Controlled suites support release quality; they are not independent third-party certifications.

Status definitions:
Verified benchmark: current measured evidence supports a public coverage claim.
Tracked baseline: the suite is measured and monitored but is not yet a public proof layer.
In calibration: the suite is under active measurement.
Measured upstream CVE corpus: an external package-CVE suite has a reviewed runner and a current score, but the claim is limited to represented addressable classes.
Evaluation candidate: an external suite is under review and is not current proof.
Planned: the corpus is on the roadmap, but it is not evidence for a current claim yet.
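
One way to keep these statuses from drifting in release tooling is to encode them as an enumeration with an explicit claim scope. The sketch below is hypothetical and does not describe SiteShadow's internal tooling; the scope labels ("public", "limited", "none") are assumptions, and mapping "In calibration" to "none" is an interpretation of the definition rather than something the page states.

```python
from enum import Enum


class EvidenceStatus(Enum):
    VERIFIED_BENCHMARK = "Verified benchmark"
    TRACKED_BASELINE = "Tracked baseline"
    IN_CALIBRATION = "In calibration"
    MEASURED_UPSTREAM_CVE_CORPUS = "Measured upstream CVE corpus"
    EVALUATION_CANDIDATE = "Evaluation candidate"
    PLANNED = "Planned"


# Assumed scope labels, paraphrased from the status definitions.
CLAIM_SCOPE = {
    EvidenceStatus.VERIFIED_BENCHMARK: "public",
    # Limited to represented addressable classes.
    EvidenceStatus.MEASURED_UPSTREAM_CVE_CORPUS: "limited",
    EvidenceStatus.TRACKED_BASELINE: "none",
    EvidenceStatus.IN_CALIBRATION: "none",  # interpretation, see lead-in
    EvidenceStatus.EVALUATION_CANDIDATE: "none",
    EvidenceStatus.PLANNED: "none",
}


def claim_scope(status_label: str) -> str:
    # Look up the status by its published label; unknown labels raise ValueError.
    return CLAIM_SCOPE[EvidenceStatus(status_label)]
```

Raising on an unknown label is deliberate in this sketch: a misspelled status should fail loudly rather than silently support a claim.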

Evidence layer	Status	What it supports	Current limitation
OWASP Benchmark for Java	Verified benchmark	Java web-application vulnerability detection against a public scored benchmark. Current result: perfect recall on this benchmark suite. The scanner caught 100% of the planted vulnerabilities (1,698 cases, 873 TP, 0 FN), with per-CWE discrimination between vulnerable and safe cases (Youden +1.000). The 0 false positives on this controlled benchmark do not establish real-world precision or a low real-world false-positive rate.	Java-focused; it does not validate PHP, Ruby, PowerShell, or JavaScript framework claims.
NIST SARD / Juliet CWE suites (Java)	Verified benchmark	CWE-style evidence for supported imported Java cases. Current result: perfect recall on this suite. The scanner caught 100% of the planted vulnerabilities (6,444 cases, 6,264 TP), with per-CWE discrimination between vulnerable and safe cases (Youden +1.000). The 0 false positives on this controlled benchmark do not establish real-world precision or a low real-world false-positive rate.	Synthetic corpus; controlled results are not real-world customer-code FP/FN rates.
OWASP Benchmark for Python	Verified benchmark	Python web-application vulnerability detection against a public scored benchmark. Current result: perfect recall on this benchmark suite. The scanner caught 100% of the planted vulnerabilities (1,230 cases, 452 TP), with per-CWE discrimination between vulnerable and safe cases (Youden +1.000). The 0 false positives on this controlled benchmark do not establish real-world precision or a low real-world false-positive rate.	Python-focused; it does not validate JavaScript, Ruby, Go, or other-language framework claims.
NIST SARD / Juliet CWE suites (C#)	Verified benchmark	C# CWE-style evidence across path traversal, LDAP injection, XPath injection, command injection, and SQL injection. Current result: perfect recall on these suites. The scanner caught 100% of the planted vulnerabilities (2,516 cases, 2,412 TP), with per-CWE discrimination between vulnerable and safe cases (Youden +1.000). The 0 false positives on this controlled benchmark do not establish real-world precision or a low real-world false-positive rate.	Synthetic corpus; controlled results are not real-world customer-code FP/FN rates.
SecBench.js	Measured upstream CVE corpus	Server-side JavaScript package vulnerability evidence with executable examples. Current represented class checks are saturated for code-injection, command-injection, and path-traversal cases: 305 vulnerable cases, 305 TP, 0 FN, 0 patched-version FP in the current baselines (Youden +1.000).	Not a universal JavaScript proof. The claim is limited to represented package-CVE classes and patched-version cases covered by the current benchmark; broader JS package-CVE classes and application-framework corpora are separate evidence work.
External vulnerable-app corpus	Planned	Realistic JavaScript, PHP, and Ruby smoke tests using projects such as OWASP Juice Shop, OWASP NodeGoat, DVWA, and OWASP RailsGoat.	These are training/demo applications, not scored SAST benchmarks until expected-finding maps are built.
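
The Youden figures in the table can be related to the reported counts. The sketch below assumes Youden's J is computed as recall minus false-positive rate, and that a run with 0 FN and 0 FP leaves every non-vulnerable case as a true negative. The table reports per-CWE values; this sketch computes a single suite-level value, and the class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # vulnerable cases flagged
    fn: int  # vulnerable cases missed
    fp: int  # safe cases flagged
    tn: int  # safe cases left alone

    @property
    def recall(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def false_positive_rate(self) -> float:
        return self.fp / (self.fp + self.tn)

    @property
    def youden_j(self) -> float:
        # J = TPR - FPR; +1.0 is perfect separation, 0.0 is chance level.
        return self.recall - self.false_positive_rate


def from_perfect_run(total_cases: int, tp: int) -> Confusion:
    # Assumption: 0 FN and 0 FP, so every remaining case is a true negative.
    return Confusion(tp=tp, fn=0, fp=0, tn=total_cases - tp)


# OWASP Benchmark for Java row: 1,698 cases, 873 TP, 0 FN.
owasp_java = from_perfect_run(total_cases=1698, tp=873)
print(owasp_java.recall, owasp_java.youden_j)
```

Under these assumptions the OWASP Java row yields 825 safe cases and a J of 1.0. The same arithmetic shows why J says nothing about customer code: it depends entirely on which safe and vulnerable cases the suite contains.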

Additional Coverage Evidence

These controlled suites support public language and risk-class claims where mature external scored benchmarks are not available. They are proof of disciplined coverage, not independent third-party certifications.

Area	What it protects	How to read it publicly
PHP, Ruby, PowerShell, Go, and Blazor language coverage	Representative vulnerable and safe cases across common web and scripting risks.	Supports Full status for represented detection coverage (Rules-only for Blazor), with framework-depth limitations still listed above.
AI/LLM security coverage	Representative AI-output risk paths into high-impact actions.	Supports current AI/LLM risk claims, not a universal claim over all agent frameworks.
Quality coverage	Checks for missed findings, noisy findings, multi-file behavior, and explanation quality.	Shows coverage discipline. It is not a substitute for public benchmarks or customer-like corpus measurement.

This page summarizes SiteShadow's technical coverage matrix (the Detection Credibility Matrix) and the authoritative benchmark rollup maintained for release review. Controlled benchmark and regression results are not the same as statistically measured customer-code false-positive or false-negative rates.

Benchmark Methodology

Positive and negative cases

Suites include vulnerable cases and safe cases. A suite is not considered strong enough for a coverage claim when it only contains vulnerable examples.
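
The rule above can be expressed as a simple precondition. This is a minimal sketch under assumed names (`BenchmarkCase`, `supports_coverage_claim`); it captures only the stated necessary condition, not everything that makes a suite strong.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkCase:
    case_id: str
    is_vulnerable: bool  # expected label: True for vulnerable, False for safe


def supports_coverage_claim(cases: list[BenchmarkCase]) -> bool:
    # A suite needs at least one vulnerable and one safe case;
    # otherwise false positives (or false negatives) cannot be measured.
    has_vulnerable = any(c.is_vulnerable for c in cases)
    has_safe = any(not c.is_vulnerable for c in cases)
    return has_vulnerable and has_safe
```

A suite of only vulnerable cases would let a scanner that flags everything score perfect recall, which is why the safe cases are required.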

Explainable findings

Serious findings are expected to explain the affected code, risk class, confidence, and remediation guidance when that context is available.
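
As an illustration of those fields, a finding record might be shaped as follows. This is a hypothetical structure, not SiteShadow's actual output schema; the field names and the `explain` helper are assumptions. Optional fields model the "when that context is available" qualifier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Finding:
    risk_class: str
    file_path: str
    line: int
    affected_code: Optional[str] = None  # snippet, when available
    confidence: Optional[str] = None     # e.g. "high", when available
    remediation: Optional[str] = None    # guidance, when available


def explain(finding: Finding) -> str:
    # Build a one-line explanation from whichever context is present.
    parts = [f"{finding.risk_class} at {finding.file_path}:{finding.line}"]
    if finding.affected_code:
        parts.append(f"code: {finding.affected_code}")
    if finding.confidence:
        parts.append(f"confidence: {finding.confidence}")
    if finding.remediation:
        parts.append(f"fix: {finding.remediation}")
    return "; ".join(parts)
```

Keeping the optional fields explicit makes it visible when a finding was reported without the context a reviewer needs.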

Evidence discipline

Public claims are checked against the relevant benchmark evidence before they are updated.

Claim boundary

Coverage wording should be sharp, but cannot exceed benchmark evidence. Where evidence is narrow, this page labels it narrowly; where evidence is strong, the copy should say so plainly.

Current Limitations

Area	Limitation	What it means	Evidence status
Real-world FP/FN rate	No published customer-like corpus methodology yet.	Controlled benchmark rates must not be used as real-world production rates.	Customer-like sampling methodology not yet published.
IaC/container/cloud	No dedicated IaC benchmark suite yet.	Coverage exists where represented by rules, but full benchmark-backed IaC claims stay qualified.	Dedicated IaC benchmark suite pending.
Ruby/PHP/PowerShell corpus depth	Full status is backed by controlled language evidence, but real-world framework and enterprise-script variety is broader than any controlled suite.	Public claims can say Full for represented detection coverage, while still qualifying framework and customer-like corpus depth.	Full in controlled language evidence; ongoing framework and corpus expansion.
Framework modeling	Framework behavior varies widely across libraries and applications.	Supported languages still need continuous framework evidence expansion.	Ongoing framework evidence expansion.
AI/LLM risk classes	AI-agent sink libraries evolve quickly.	AI/LLM coverage is benchmark-backed for current suites and needs continued sink-family expansion.	Current suites covered; sink-family expansion ongoing.
Runtime-only issues	Some vulnerabilities depend on deployed configuration, live identity data, secrets, network reachability, or production authorization state.	Static analysis can flag risky code paths, but it cannot prove every runtime policy, tenancy boundary, or environment-specific behavior.	Requires runtime telemetry, integration tests, or customer environment validation.
Business logic	Deep workflow abuse, fraud logic, approval bypasses, and policy mistakes can require domain context the scanner does not have.	SiteShadow may identify risky authorization or data-flow patterns, but it is not a complete substitute for threat modeling and application-specific review.	Heuristic coverage exists; domain-specific proof remains customer/application dependent.
Dependency and supply chain	SiteShadow detects selected dependency and configuration risks, but it is not a full SCA, SBOM, malware, or exploitability platform.	Use dedicated dependency/SBOM tooling alongside SiteShadow for package vulnerability inventory and transitive dependency governance.	Partial rule coverage only.
Generated, minified, and highly dynamic code	Generated/minified bundles and heavy reflection/metaprogramming can hide intent and reduce explainability.	Findings may be less precise, and some paths may need source maps, original source, or manual review.	Not claimed as fully measured across all generated-code styles.

How to read this page: Green means the represented controlled cases are passing today. It does not mean every framework-specific variant, custom safeguard, business workflow, runtime policy, or production coding style is fully covered.