Defense Scope
What Edictum defends against, what it does not, and how it fits alongside OS-level sandboxing, network policies, and LLM safety layers.
Right page if: you need to understand Edictum's threat model -- what it defends against, what is out of scope, and where it fits in a defense-in-depth stack.
Wrong page if: you need fail-closed behavior details (what happens when things break) -- see https://docs.edictum.ai/docs/security/fail-closed. For compliance framework mappings (OWASP Top 10 for LLMs, EU AI Act, SOC 2), see https://docs.edictum.ai/docs/security/compliance.
Gotcha: Edictum cannot undo WRITE/IRREVERSIBLE side effects after execution -- use preconditions to deny dangerous writes BEFORE they run. Sandbox path matching uses os.path.realpath() but is subject to TOCTOU race conditions with symlinks created between evaluation and execution.
Edictum enforces contracts on AI agent tool calls. It sits between the agent's decision to act and the action itself -- a deterministic enforcement point that the agent cannot negotiate, argue with, or bypass. Contracts are evaluated outside the LLM, on structured data (tool name, arguments, principal), not on natural language.
This page is honest about what that covers and what it does not.
What Edictum Defends Against
Edictum's enforcement model covers threats that manifest as tool calls -- the concrete actions an agent takes in the world.
| Threat | How Edictum handles it |
|---|---|
| Unauthorized tool execution | Preconditions deny tool calls that fail contract checks before execution. The tool never runs. |
| Data exfiltration via output | Postconditions with effect: redact strip sensitive patterns (SSNs, API keys, credentials) from tool results before they reach the agent. |
| Privilege escalation | Principal-based contracts enforce role-level permissions on every tool call. An intern principal cannot run deploy even if the agent tries. |
| Unauthorized sub-agent spawning | Contracts can restrict which tools are allowed to create sub-agents and under what conditions. |
| Secret leakage | The built-in deny_sensitive_reads precondition denies reads of .env, .ssh/, .aws/credentials, key files, and similar paths. Postconditions catch secrets that appear in tool output. |
| Rate abuse | Session contracts cap per-tool, per-session, and per-attempt counts. An agent stuck in a loop hits the attempt cap and is denied. |
| Contract tampering | Contract bundles are immutable YAML with version hashing. Edictum.reload() atomically swaps bundles; malformed YAML is rejected and the previous bundle stays in effect. |
| Sensitive file access | Sandbox contracts define allowlist boundaries for file paths. Anything outside the within boundaries is denied -- regardless of which command accesses it. |
These protections are deterministic. They do not depend on LLM behavior, prompt engineering, or model capabilities. A contract that denies rm -rf / will deny it whether the request comes from GPT-4, Claude, or a compromised prompt.
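As a rough illustration of an enforcement point that sits outside the LLM, the sketch below evaluates a structured tool call against a role table before the tool runs. ToolCall, ALLOWED_TOOLS_BY_ROLE, precondition, and guarded_execute are hypothetical names invented for this example; they are not Edictum's API or contract schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ToolCall:
    """Structured data only: no natural language is inspected."""
    tool: str
    args: dict
    role: str  # supplied by the application; assumed already authenticated

# Hypothetical role table for illustration, not Edictum's contract format.
ALLOWED_TOOLS_BY_ROLE = {
    "intern": {"read_file", "search"},
    "sre": {"read_file", "search", "deploy"},
}

def precondition(call):
    """Return (allowed, reason). Same input always gives the same answer."""
    allowed = ALLOWED_TOOLS_BY_ROLE.get(call.role, set())
    if call.tool not in allowed:
        return False, f"role {call.role!r} may not call {call.tool!r}"
    return True, "allowed"

def guarded_execute(call, tool_fn):
    """Run tool_fn only if the precondition passes; otherwise it never runs."""
    ok, reason = precondition(call)
    if not ok:
        return {"denied": True, "reason": reason}
    return {"denied": False, "result": tool_fn(**call.args)}
```

The decision depends only on the tool name, arguments, and principal, so the wording of the prompt that produced the call has no influence on the outcome.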
Out of Scope
Edictum operates at the tool-call layer. Threats that exist at other layers -- the network, the kernel, the LLM's text generation -- are outside its enforcement boundary.
Write side effects already completed
Postconditions run after the tool executes. For READ and PURE tools, postconditions can redact or deny the output because these tools have no side effects to undo (hiding a read result loses nothing). For WRITE and IRREVERSIBLE tools, the action has already happened by the time postconditions evaluate. Edictum falls back to warn because suppressing the result would not undo the write and would only remove context the agent needs to understand what it did.
What to use instead: Preconditions and sandbox contracts to deny dangerous writes before execution. For writes that must be allowed but monitored, use postconditions with effect: warn and route findings to your audit system.
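The fallback described in this section can be pictured as a small decision function. This is a sketch of the rule as stated on this page, not Edictum's implementation; effective_effect and NO_SIDE_EFFECTS are hypothetical names.

```python
# Side-effect classes named on this page for which output can be withheld safely.
NO_SIDE_EFFECTS = {"READ", "PURE"}

def effective_effect(side_effect, requested):
    """Downgrade redact/deny to warn when the tool has already changed state."""
    if requested in {"redact", "deny"} and side_effect not in NO_SIDE_EFFECTS:
        # The write already happened; hiding the result would not undo it.
        return "warn"
    return requested
```

Under this rule, a redact postcondition on a WRITE tool yields a warning for the audit trail, which is why dangerous writes need to be denied by a precondition instead.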
Kernel-level sandboxing
Edictum is an in-process library. It evaluates contracts in the same process as the agent. It does not enforce OS-level isolation -- it cannot prevent a tool from accessing memory, syscalls, or hardware resources that the process has access to.
What to use instead: gVisor, Firecracker, containers with seccomp profiles, or AppArmor/SELinux policies. These enforce boundaries at the kernel level where the process cannot escape.
Hallucinated text content
Edictum enforces contracts on actions (tool calls with structured arguments), not on words (the LLM's text output). If an agent hallucinates incorrect information in a text response without calling a tool, Edictum has no enforcement point.
What to use instead: LLM output filters, retrieval-augmented generation (RAG) for factual grounding, or content moderation APIs that operate on the text generation layer.
Network-level attacks
Edictum does not inspect network traffic, enforce TLS, or block connections. If a tool makes an HTTP request, Edictum can check the domain via sandbox contracts (allows.domains), but it cannot enforce network-level properties like encryption in transit, certificate pinning, or packet inspection.
What to use instead: Network policies (Kubernetes NetworkPolicy, cloud security groups), service meshes (Istio, Linkerd), or Web Application Firewalls (WAFs).
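To make the distinction concrete, here is a minimal sketch of a hostname allowlist check built on Python's fnmatch and urllib.parse. The function name domain_allowed and the pattern list are assumptions for illustration; Edictum's actual allows.domains matching may differ.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse

def domain_allowed(url, allowed_patterns):
    """Check only the hostname of a URL against glob-style patterns."""
    # urlparse lowercases the hostname; it is None when the URL has no scheme.
    host = urlparse(url).hostname or ""
    return any(fnmatchcase(host, pattern.lower()) for pattern in allowed_patterns)

# Both calls pass the hostname check, including the plain-HTTP one:
# the check says nothing about encryption in transit.
https_ok = domain_allowed("https://api.example.com/v1", ["*.example.com"])
http_ok = domain_allowed("http://api.example.com/v1", ["*.example.com"])
```

A check like this constrains where a tool call is aimed. Whether the connection is encrypted, or where the packets actually go, remains a matter for the network layer.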
Prompt injection on text responses
Edictum enforces contracts on tool-call execution, not on text that flows between the user and the LLM. If a prompt injection causes the agent to produce harmful text without calling a tool, Edictum does not intercept it. If the injection causes the agent to call a tool, Edictum evaluates that tool call against contracts -- the injection's influence stops at the enforcement point.
What to use instead: Input sanitization, prompt engineering defenses, LLM-layer safety filters.
Known Technical Limitations
Beyond the architectural boundaries above, Edictum has specific technical limitations that users should be aware of.
String-based boundary matching
Sandbox contracts match file paths and domains using string prefix comparison and fnmatch patterns. This is not semantic analysis. A path like /workspace/../etc/shadow is resolved via os.path.realpath() before comparison (so that traversal is caught), but the matching itself operates on strings, not on filesystem semantics.
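A minimal sketch of resolve-then-compare matching follows. It assumes a simple prefix check on the resolved path; is_within is a hypothetical helper, not Edictum's function, and the real matching logic may differ in detail.

```python
import os
from typing import Sequence

def is_within(path: str, roots: Sequence[str]) -> bool:
    """Resolve the path, then compare it to each allowed root as a string."""
    resolved = os.path.realpath(path)  # collapses ".." and follows symlinks
    for root in roots:
        root_resolved = os.path.realpath(root)
        # Appending the separator keeps "/workspace-evil" from matching "/workspace".
        if resolved == root_resolved or resolved.startswith(root_resolved + os.sep):
            return True
    return False

traversal_allowed = is_within("/workspace/../etc/shadow", ["/workspace"])
sibling_allowed = is_within("/workspace-evil/x", ["/workspace"])
```

The traversal is rejected because resolution happens before comparison. The sibling-directory case shows why string matching needs care: without the trailing separator, a plain prefix test would accept it.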
Heuristic command parsing
The bash command classifier extracts the first whitespace-delimited token from a command string to identify the command name. This is heuristic, not AST-based. Complex shell constructs like VAR=val command, command substitution ($(cmd)), or chained commands (cmd1 && cmd2) may not be fully parsed. The first token is checked against allows.commands, but subsequent tokens in a chain are not individually validated.
The shell operator detection checks for ${, $(, |, ;, &&, ||, backticks, and other constructs, but does not detect bare $VAR expansions (without braces). A command like echo $AWS_SECRET_ACCESS_KEY classifies as READ because echo is in the read allowlist and $ without { or ( does not trigger operator detection. This means environment variable values can be exfiltrated through commands classified as safe for postcondition purposes. Use sandbox command allowlists (allows.commands) rather than relying on side-effect classification for security-critical enforcement.
Sandbox contracts mitigate this by also checking file paths extracted from the full argument string against within/not_within boundaries -- even if the command token is not fully parsed, the path restrictions still apply.
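The gap described in this section can be reproduced with a few lines of Python. The sketch below is an illustrative model of first-token classification and operator detection, not Edictum's classifier: READ_COMMANDS, the operator list, the UNCLASSIFIED label, and has_bare_expansion are all assumptions made for the example.

```python
import re

READ_COMMANDS = {"echo", "cat", "ls"}  # hypothetical read allowlist
SHELL_OPERATORS = ("${", "$(", "|", ";", "&&", "||", "`")

def first_token(command):
    """Heuristic command name: the first whitespace-delimited token."""
    parts = command.split()
    return parts[0] if parts else ""

def has_shell_operator(command):
    """Substring scan for the operators listed above; bare $VAR is not among them."""
    return any(op in command for op in SHELL_OPERATORS)

def classify(command):
    if has_shell_operator(command):
        return "UNCLASSIFIED"
    return "READ" if first_token(command) in READ_COMMANDS else "UNCLASSIFIED"

def has_bare_expansion(command):
    """Extra guard (an addition in this sketch): detect $NAME without braces."""
    return bool(re.search(r"\$[A-Za-z_][A-Za-z0-9_]*", command))
```

In this model, echo $AWS_SECRET_ACCESS_KEY classifies as READ while echo ${AWS_SECRET_ACCESS_KEY} does not, and VAR=val command yields VAR=val as the command name. A check like has_bare_expansion narrows the gap but is still a heuristic, which is why command allowlists remain the safer control.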
TOCTOU race conditions
Sandbox contracts resolve paths with os.path.realpath() at evaluation time. A symlink created after Edictum evaluates the path but before the tool actually executes could point to a different target. This race window is inherent to application-level enforcement. See sandbox contracts: known limitations for the full list of resolution edge cases.
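The race can be demonstrated in a temporary directory. The sketch below plays both roles (checker and attacker) in one function so the ordering is explicit; demo_toctou and the file names are made up for the example, and os.symlink may need extra privileges on Windows.

```python
import os
import tempfile

def demo_toctou():
    """Return (allowed_at_check_time, content_read_at_use_time)."""
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.realpath(tmp)
        workspace = os.path.join(base, "workspace")
        outside = os.path.join(base, "outside")
        os.makedirs(workspace)
        os.makedirs(outside)

        secret = os.path.join(outside, "secret.txt")
        with open(secret, "w") as f:
            f.write("secret-data")
        target = os.path.join(workspace, "notes.txt")
        with open(target, "w") as f:
            f.write("harmless")

        # Time of check: the path resolves inside the workspace.
        allowed = os.path.realpath(target).startswith(workspace + os.sep)

        # Between check and use: the file is swapped for a symlink.
        os.remove(target)
        os.symlink(secret, target)

        # Time of use: the same path string now reads the outside file.
        with open(target) as f:
            content = f.read()
        return allowed, content
```

On a POSIX system, demo_toctou() returns (True, "secret-data"): the check passed, yet the read landed outside the boundary. Closing that window requires enforcement below the application, such as the OS-level sandboxing described on this page.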
Defense in Depth
Edictum is one layer in a defense-in-depth stack. It covers the tool-call layer -- the enforcement point between agent decisions and real-world actions. Other layers cover other threats.
| Layer | Covers | Examples |
|---|---|---|
| LLM safety | Text generation, harmful content, jailbreaks | Model safety training, output filters, content moderation APIs |
| Edictum | Tool-call enforcement, contract evaluation, audit trail | Preconditions, postconditions, sandbox contracts, session limits |
| OS sandboxing | Process isolation, syscall filtering, filesystem namespaces | gVisor, Firecracker, seccomp, AppArmor |
| Network policies | Traffic filtering, domain restrictions, encryption | Kubernetes NetworkPolicy, security groups, service meshes |
| Authentication | Identity verification, credential management | OAuth, API key rotation, certificate-based auth |
Edictum accepts a Principal but does not authenticate it. Your application provides the principal -- Edictum enforces contracts based on it. The authentication layer is upstream.
The strongest deployments use all of these together. Edictum catches the tool-call-level threats that OS sandboxing and network policies cannot see (because they operate below the application layer). OS sandboxing catches the kernel-level escapes that Edictum cannot enforce (because it is in-process). Neither replaces the other.
Next Steps
- Fail-closed guarantees -- what happens when things go wrong
- Sandbox contracts -- allowlist boundaries for file paths, commands, and domains
- Adversarial testing -- testing contract bypasses
- Pipeline architecture -- the full enforcement pipeline
- Compliance -- OWASP Top 10 for LLMs, OWASP Top 10 for Agentic AI, EU AI Act, and SOC 2 mappings