AD2042: NoUnicodeSymbolsPE
Summary
| Property | Value |
|---|---|
| ID | AD2042 |
| Name | NoUnicodeSymbolsPE |
| Category | Correctness |
| Severity | Warning |
| Applies to | PE (Windows) |
Description
PE binaries should not contain symbols whose names include suspicious Unicode characters, which can be used for Trojan Source attacks or visual obfuscation.
How It Works
The rule scans exported symbols and debug information for:
- Bidirectional control characters (RLO, LRO, etc.)
- Homoglyphs (non-ASCII characters that resemble ASCII characters)
- Zero-width characters
- Other potentially deceptive Unicode
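The checks listed above can be sketched as a small classifier over a single symbol name. This is a minimal illustration, not the rule's actual implementation: the names `BIDI_CONTROLS`, `ZERO_WIDTH`, `classify_char`, and `scan_symbol` are hypothetical, and treating every non-ASCII character as a possible homoglyph is a deliberately coarse assumption.

```python
from typing import List, Optional, Tuple

# Bidirectional control characters: embeddings, overrides, and isolates.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
}
# Zero-width characters that normally render as nothing.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def classify_char(ch: str) -> Optional[str]:
    """Return a finding label for ch, or None if it looks harmless."""
    if ch in BIDI_CONTROLS:
        return "bidi-control"
    if ch in ZERO_WIDTH:
        return "zero-width"
    if ord(ch) > 0x7F:
        # Assumption: any other non-ASCII character is flagged for review.
        return "non-ascii"
    return None


def scan_symbol(name: str) -> List[Tuple[int, str, str]]:
    """List (index, code point, label) for each suspicious character in name."""
    findings = []
    for index, ch in enumerate(name):
        label = classify_char(ch)
        if label is not None:
            findings.append((index, f"U+{ord(ch):04X}", label))
    return findings
```

Calling `scan_symbol` on a plain ASCII name returns an empty list, while a name containing Cyrillic 'а' (U+0430) yields one `non-ascii` finding at the position of that character.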
Why This Matters
Unicode-based attacks can make malicious code look legitimate to human reviewers while the compiler still builds what is actually in the file.
Trojan Source Attack
// What the reviewer sees:
if (access_granted) {
    safe_action();
}
// What the compiler actually processes:
if (access_granted) {
    malicious_action(); // Hidden from the reviewer by Unicode control characters
}
Dangerous Characters
| Character | Code Point | Risk |
|---|---|---|
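| RLO (Right-to-Left Override) | U+202E | Forces right-to-left display, so text appears reversed |
| LRO (Left-to-Right Override) | U+202D | Forces left-to-right display, overriding the natural direction |
| ZWNJ (Zero Width Non-Joiner) | U+200C | Invisible character inside a name |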
| Cyrillic 'а' | U+0430 | Looks like ASCII 'a' |
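To see why these characters are hard to catch by eye, the following sketch uses Python's standard `unicodedata` module to spell out the code point and official Unicode name of each character in a string. The helper name `describe` is hypothetical and used only for illustration.

```python
import unicodedata


def describe(text):
    """Return 'U+XXXX NAME' for every character in text."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"
        for ch in text
    ]


latin_name = "pay"          # all ASCII
mixed_name = "p\u0430y"     # middle letter is Cyrillic 'а' (U+0430)

# The two names may render identically but are different strings.
print(latin_name == mixed_name)
for entry in describe(mixed_name):
    print(entry)
```

Two names that look the same on screen compare as unequal, and the per-character listing shows which position holds the non-ASCII look-alike.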
Supply Chain Impact
| Stage | Risk |
|---|---|
| Code review | Malicious code is invisible to reviewers |
| Compilation | Compiler processes the real code, not what is displayed |
| Binary | Contains misleading symbols |
| Debugging | Symbol names are confusing |
Resolution
- Audit source code for suspicious Unicode characters
- Configure editors to reveal hidden characters
- Enable compiler warnings for Unicode issues where your toolchain provides them
- Rebuild from clean source files
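The audit step can be partly automated. The sketch below is one possible approach, assuming UTF-8 source files: it reports every character in Unicode general category `Cf` (format characters), which includes the bidirectional controls and zero-width characters. The function names `audit_text`, `audit_file`, and `main` are hypothetical, and the check does not detect homoglyphs.

```python
import unicodedata
from pathlib import Path


def audit_text(text):
    """Yield (line, column, code point, name) for each format character."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            # Category "Cf" covers invisible formatting characters such as
            # bidirectional overrides and zero-width joiners.
            if unicodedata.category(ch) == "Cf":
                name = unicodedata.name(ch, "<unnamed>")
                yield line_no, col_no, f"U+{ord(ch):04X}", name


def audit_file(path):
    """Audit one file, assuming it is UTF-8 encoded."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return list(audit_text(text))


def main(paths):
    """Print findings for each path; return 1 if any were found, else 0."""
    status = 0
    for path in paths:
        for line_no, col_no, code_point, name in audit_file(path):
            print(f"{path}:{line_no}:{col_no}: {code_point} {name}")
            status = 1
    return status
```

A wrapper script could call `main(sys.argv[1:])` and pass the result to `sys.exit`, so that a pre-commit hook or CI job fails when a format character is found.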