CWE-86 Invalid Characters in Identifiers
What this means
SiteShadow flagged identifiers (filenames, IDs, usernames, resource names) that contain unexpected characters, such as path separators, control characters, or quotes. These characters can bypass validation or change how the identifier is interpreted downstream.
Why it matters
Invalid characters can enable injection or path manipulation.
- Traversal and path tricks: ../, ..\, %2f, null bytes, etc.
- Header/URL manipulation when control characters like CRLF are present.
- Confusable bypasses: normalization issues allow "same but different" identifiers.
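To make the "same but different" problem concrete, here is a minimal sketch using Python's standard unicodedata module. The strings are illustrative assumptions, not values taken from a SiteShadow report.

```python
import unicodedata

plain = "admin"
# Illustrative lookalike: begins with FULLWIDTH LATIN SMALL LETTER A (U+FF41).
lookalike = "\uff41dmin"

# A naive comparison treats the two as different identifiers,
# so a blocklist entry for "admin" would not catch the lookalike.
naive_match = plain == lookalike

# After NFKC normalization both collapse to the same string.
normalized_match = unicodedata.normalize("NFKC", lookalike) == plain

print(naive_match, normalized_match)
```

Note that NFKC only folds compatibility forms. Cross-script homoglyphs (for example a Cyrillic "а") survive normalization unchanged, which is one reason a strict allowlist is still needed afterwards.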
Safer examples
1) Validate with allowlists (recommended)
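import re
# The pattern alone would accept "." and "..", so exclude them explicitly.
if identifier in {".", ".."} or not re.fullmatch(r"[a-zA-Z0-9_.-]{1,64}", identifier):
    raise ValueError("Invalid identifier")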
2) Normalize before validating
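URL-decode exactly once and normalize Unicode (for example to NFKC) before applying the allowlist, so that validation sees the same string later code will use. Decoding repeatedly, or decoding after validation, reopens the bypass.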
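One way this ordering might look in Python is sketched below. The function name canonicalize_identifier is hypothetical, the pattern is the allowlist from example 1, and the explicit check for "." and ".." is an added assumption for identifiers that may end up in paths. Treat it as a sketch under those assumptions rather than a complete validator.

```python
import re
import unicodedata
from urllib.parse import unquote

# Allowlist pattern from example 1.
ALLOWED = re.compile(r"[a-zA-Z0-9_.-]{1,64}")

def canonicalize_identifier(raw: str) -> str:
    """Decode once, normalize, then validate. Raises ValueError on rejection."""
    # Single decoding pass; errors="strict" makes invalid UTF-8 raise
    # UnicodeDecodeError, which is a subclass of ValueError.
    decoded = unquote(raw, errors="strict")
    # Fold compatibility forms (fullwidth letters, ligatures) to a canonical form.
    normalized = unicodedata.normalize("NFKC", decoded)
    # Validate the final form. "%" is not in the allowlist, so anything that is
    # still percent-encoded after one pass (double encoding) is rejected.
    if normalized in {".", ".."} or not ALLOWED.fullmatch(normalized):
        raise ValueError("Invalid identifier")
    return normalized
```

With this sketch, "report%2D2024" is accepted and returned as "report-2024", while "..%2fetc" and the double-encoded "%252e%252e%252f" both raise ValueError. Callers should use the returned value, not the raw input, in later operations.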
3) Reject control characters outright
Reject \r, \n, null bytes, and all other control characters in identifiers.
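A minimal sketch of such a check, assuming that "control character" means Unicode general category Cc (U+0000 to U+001F, U+007F, and U+0080 to U+009F). The helper names are hypothetical.

```python
import unicodedata

def has_control_chars(value: str) -> bool:
    """Return True if any character is in Unicode category Cc (control)."""
    return any(unicodedata.category(ch) == "Cc" for ch in value)

def require_no_control_chars(value: str) -> str:
    """Return value unchanged, or raise ValueError if it contains control characters."""
    if has_control_chars(value):
        raise ValueError("Identifier contains control characters")
    return value
```

Category Cc does not include invisible format characters such as zero-width joiners or bidirectional overrides (category Cf). If those matter for your identifiers, the check could be widened to reject Cf as well, although a strict ASCII allowlist already excludes both.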
How SiteShadow detects it (high level)
- Flags identifiers used in security-sensitive contexts (paths, headers, redirects) without strict allowlists.
- Detects suspicious character classes and normalization/decoding pitfalls.
References
- CWE-86: https://cwe.mitre.org/data/definitions/86.html
---