Background
AgentGate currently enforces policy at two surfaces:
- input surface: intercepts content entering the agent (prompts, context)
- runtime surface: intercepts the agent's tool call decisions
These two layers can see what the agent intends to do, but not what it actually does. A prompt injection attack can instruct an agent to call bash_tool, which internally reads ~/.env to exfiltrate API keys. The runtime surface only sees "agent called bash" — it cannot see inside.
The resource surface adds a third enforcement layer that intercepts real resource access at the syscall level.
Design Decisions
Threat model: Agent code is trusted. The threat is malicious input (prompt injection) that tricks the agent into performing dangerous operations via tool calls.
Interception mechanism: seccomp-unotify. The agent process is frozen at the syscall boundary, the SDK reads full context from /proc/{PID}/mem, queries the engine, and writes back the decision. This enables synchronous enforcement — block first, decide, then allow or deny.
Why not eBPF LSM: eBPF programs cannot block on the kernel execution path waiting for userspace. Of the mechanisms considered here, only seccomp-unotify supports synchronous, userspace-delegated syscall decisions.
Why not SELinux/AppArmor: Their rule systems are closed — there is no way to delegate decisions to an external PDP at runtime.
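The freeze, decide, respond cycle described above can be sketched in Go. This is a minimal illustration, not the SDK's implementation: it models only the decision step and makes no real seccomp calls. `Notification`, `Response`, `Engine`, and `Handle` are hypothetical names, and the fail-closed behavior (deny with EPERM when the engine errors) is an assumption rather than something this proposal specifies.

```go
package main

import (
	"fmt"
	"strings"
	"syscall"
)

// Decision is the engine's verdict for one intercepted syscall.
type Decision int

const (
	Allow Decision = iota
	Deny
)

// Notification is a simplified, hypothetical stand-in for what the SDK
// assembles after reading the frozen process's syscall arguments.
type Notification struct {
	ID      uint64 // cookie identifying the pending syscall
	PID     int
	Syscall string // e.g. "openat"
	Path    string // argument resolved from the target's memory
}

// Response carries the verdict back: either let the syscall continue
// or fail it with an errno.
type Response struct {
	ID       uint64
	Continue bool
	Errno    syscall.Errno
}

// Engine is the policy decision point queried while the process is frozen.
type Engine interface {
	Decide(n Notification) (Decision, error)
}

// Handle converts one notification into a response. Assumption: it fails
// closed, so an engine error is treated the same as a deny.
func Handle(e Engine, n Notification) Response {
	d, err := e.Decide(n)
	if err != nil || d != Allow {
		return Response{ID: n.ID, Errno: syscall.EPERM}
	}
	return Response{ID: n.ID, Continue: true}
}

// denyDotEnv is a toy engine used only for this demo.
type denyDotEnv struct{}

func (denyDotEnv) Decide(n Notification) (Decision, error) {
	if strings.HasSuffix(n.Path, "/.env") {
		return Deny, nil
	}
	return Allow, nil
}

func main() {
	e := denyDotEnv{}
	for _, n := range []Notification{
		{ID: 1, PID: 4242, Syscall: "openat", Path: "/home/user/.env"},
		{ID: 2, PID: 4242, Syscall: "openat", Path: "/tmp/build.log"},
	} {
		r := Handle(e, n)
		fmt.Printf("%s %s -> continue=%v errno=%d\n", n.Syscall, n.Path, r.Continue, int(r.Errno))
	}
}
```

In a real supervisor the `Response` would be written back through the seccomp notification file descriptor. Keeping the decision step free of kernel calls makes it testable in isolation.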
Context Model
Resource access events are linked to their cause via a four-level tree:
session_id → one conversation
  task_id → one user input (input surface event)
    attempt_id → one LLM tool call decision (runtime surface event)
      resource[] → all syscalls from that tool call (flat, includes child processes)
attempt_id is an existing field in SessionContext. Resource surface requests must carry the attempt_id of the tool_attempt that triggered them. This enables cross-surface CEL rules such as:
surface == "resource" &&
"prompt_injection" in context.taints
→ deny
Protocol Changes
RequestKind: Remove resource_egress. Merge into a single resource_access kind. Resource type distinction is handled by ResourceAction.resource_type (file / network / exec).
ActionContext: Add Resource *ResourceAction field.
type ResourceAction struct {
	ResourceType string   // "file" / "network" / "exec"
	Path         string   // file path
	Operation    string   // "read" / "write" / "exec"
	DestIP       string   // network
	DestPort     int
	DestDomain   string
	Protocol     string
	Binary       string   // exec
	Args         []string
}
validateDecisionRequest: Require attempt_id for resource_access requests.
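One possible shape for the validation change is sketched below. The layout of `DecisionRequest`, the trimmed-down `ResourceAction`, and the error values are assumptions, since the existing core types are not shown here. The extra check that `Action.Resource` is set is also an assumption that goes beyond the stated requirement.

```go
package main

import (
	"errors"
	"fmt"
)

// RequestKind identifies the surface event being evaluated.
type RequestKind string

const KindResourceAccess RequestKind = "resource_access"

// ResourceAction is abbreviated here; the full field list appears above.
type ResourceAction struct {
	ResourceType string
	Path         string
	Operation    string
}

// ActionContext gains the new optional Resource field.
type ActionContext struct {
	Resource *ResourceAction
}

// SessionContext already carries attempt_id per the proposal.
type SessionContext struct {
	SessionID string
	AttemptID string
}

// DecisionRequest is a hypothetical envelope for illustration.
type DecisionRequest struct {
	Kind    RequestKind
	Session SessionContext
	Action  ActionContext
}

var (
	errMissingAttemptID = errors.New("resource_access request requires attempt_id")
	errMissingResource  = errors.New("resource_access request requires action.resource")
)

// validateDecisionRequest enforces the new rule for resource_access
// requests and leaves every other kind untouched.
func validateDecisionRequest(r DecisionRequest) error {
	if r.Kind != KindResourceAccess {
		return nil
	}
	if r.Session.AttemptID == "" {
		return errMissingAttemptID
	}
	if r.Action.Resource == nil { // assumption: not stated in the proposal
		return errMissingResource
	}
	return nil
}

func main() {
	req := DecisionRequest{
		Kind:    KindResourceAccess,
		Session: SessionContext{SessionID: "s1"},
		Action: ActionContext{Resource: &ResourceAction{
			ResourceType: "file", Path: "/home/user/.env", Operation: "read",
		}},
	}
	fmt.Println("without attempt_id:", validateDecisionRequest(req))
	req.Session.AttemptID = "a1"
	fmt.Println("with attempt_id:", validateDecisionRequest(req))
}
```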
Noise Reduction
All filtering happens in the SDK, not core:
- Allowlist (adapter config, per-integration): matching paths are passed through transparently — engine never sees them, no context is produced.
- Decision cache (SDK local, session-scoped): previously allowed paths are not re-queried. deny decisions are never cached.
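The two filters could be combined roughly as follows. This sketch assumes glob patterns (via Go's `path.Match`) and exact-path cache keys, both of which are still open questions in this proposal. `Filter`, `NewFilter`, and `Check` are hypothetical names.

```go
package main

import (
	"fmt"
	"path"
	"sync"
)

// Decision is the engine's verdict.
type Decision int

const (
	Allow Decision = iota
	Deny
)

// Filter sits in the SDK in front of the engine for one session.
type Filter struct {
	allowlist []string            // glob patterns (assumed format)
	mu        sync.Mutex          // guards allowed
	allowed   map[string]struct{} // exact paths previously allowed
	query     func(p string) Decision
}

// NewFilter builds a session-scoped filter around an engine query function.
func NewFilter(allowlist []string, query func(string) Decision) *Filter {
	return &Filter{
		allowlist: allowlist,
		allowed:   make(map[string]struct{}),
		query:     query,
	}
}

// Check returns the decision for p, consulting the engine only when
// neither the allowlist nor the allow cache covers it.
func (f *Filter) Check(p string) Decision {
	// Allowlisted paths pass through; the engine never sees them.
	for _, pat := range f.allowlist {
		if ok, _ := path.Match(pat, p); ok {
			return Allow
		}
	}
	f.mu.Lock()
	_, hit := f.allowed[p]
	f.mu.Unlock()
	if hit {
		return Allow
	}
	d := f.query(p)
	if d == Allow { // deny decisions are never cached
		f.mu.Lock()
		f.allowed[p] = struct{}{}
		f.mu.Unlock()
	}
	return d
}

func main() {
	calls := 0
	engine := func(p string) Decision {
		calls++
		if p == "/home/user/.env" {
			return Deny
		}
		return Allow
	}
	f := NewFilter([]string{"/usr/lib/*"}, engine)
	for _, p := range []string{
		"/usr/lib/libc.so.6", "/tmp/a", "/tmp/a", "/home/user/.env", "/home/user/.env",
	} {
		fmt.Printf("%s allow=%v engine_calls=%d\n", p, f.Check(p) == Allow, calls)
	}
}
```

Because denies are not cached, a repeated denied access reaches the engine every time, so a policy change or approval can take effect without restarting the session.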
Architecture Boundary
Core changes are minimal — protocol fields only. The SDK encapsulates all seccomp-unotify mechanics. Adapter plugins reference the SDK and pass attempt_id when invoking tools.
Open Questions
- How does the adapter pass attempt_id to the SDK at tool invocation time (env var, shared memory, process injection)?
- Does OpenClaw natively expose a per-tool-call unique ID?
- Allowlist config format: glob vs regex?
- Decision cache key granularity: exact path vs prefix?
- approval_required semantics when the agent process is frozen: timeout policy, integration with existing approval flow?
Implementation Phases
- Phase 1 (protocol): ActionContext.Resource, resource_access RequestKind, attempt_id validation, CEL field support
- Phase 2 (SDK): seccomp-unotify wrapper, allowlist, decision cache, syscall → PolicyRequest translation, attempt_id context propagation
- Phase 3 (integration): OpenClaw plugin, end-to-end tests