Security Overview
Edictum's threat model, security controls, and compliance posture. What it defends against, what it does not, and the evidence behind it.
Right page if: you need a unified view of Edictum's security architecture: threat model, controls, test results, and compliance mapping.
Wrong page if: you need the detailed behavior of a specific security feature. See the linked pages below for fail-closed guarantees, adversarial testing, or compliance mappings.
Gotcha: Edictum enforces rulesets on tool calls, not on LLM text generation. It is one layer in a defense-in-depth stack. Model safety, OS sandboxing, and network policies handle the other layers.
Edictum is a deterministic enforcement layer between an AI agent's decision to act and the action itself. Rulesets are evaluated outside the LLM, on structured data (tool name, arguments, principal), not on natural language.
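To make "evaluated on structured data" concrete, here is a minimal sketch of a decision made from a tool name, arguments, and principal rather than from model text. The `ToolCall` class, the `decide` function, and the `delete_record` tool are hypothetical names chosen for illustration. They are not Edictum's API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical structured view of an agent's intended action."""
    tool: str
    args: dict[str, Any] = field(default_factory=dict)
    principal_role: str = "anonymous"


# In this sketch a rule is a plain predicate: True means "block this call".
Rule = Callable[[ToolCall], bool]


def block_non_admin_delete(call: ToolCall) -> bool:
    # Looks only at structured fields, never at natural-language text.
    return call.tool == "delete_record" and call.principal_role != "admin"


def decide(call: ToolCall, rules: list[Rule]) -> str:
    """Return "block" if any rule matches, otherwise "allow"."""
    return "block" if any(rule(call) for rule in rules) else "allow"


rules = [block_non_admin_delete]
print(decide(ToolCall("delete_record", {"id": 7}, "analyst"), rules))  # block
print(decide(ToolCall("delete_record", {"id": 7}, "admin"), rules))    # allow
```

Because the inputs are plain fields, the same call always produces the same decision, regardless of how the model phrased its request.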
Threat Coverage
| Threat | Control | Rule type |
|---|---|---|
| Unauthorized tool execution | Preconditions block before execution | type: pre |
| Data exfiltration via output | Postconditions redact sensitive patterns | type: post |
| Privilege escalation | Principal-based rulesets enforce role permissions | type: pre |
| Rate abuse / runaway agents | Session rulesets enforce per-tool, per-session, and per-attempt limits | type: session |
| Path traversal / file access | Sandbox rulesets with within/not_within boundaries | type: sandbox |
| Secret leakage | Built-in deny_sensitive_reads() for .env, .ssh/, .aws/credentials | type: pre |
| Command injection | BashClassifier detects shell operators; sandbox command allowlists | type: sandbox |
| Rule tampering | SHA-256 version hashing; Ed25519 ruleset signing (control plane, server-side; SDK verification planned) | Infrastructure |
| Unauthorized sub-agent spawning | Rulesets restrict tools that create agents | type: pre |
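The "SHA-256 version hashing" control in the table can be illustrated with Python's standard `hashlib`. This is a sketch under assumptions: the `ruleset_version` helper and the sample ruleset text are hypothetical, and the exact bytes Edictum hashes are not specified on this page.

```python
import hashlib


def ruleset_version(ruleset_text: str) -> str:
    """Hex SHA-256 digest of the ruleset text, used here as a version id."""
    return hashlib.sha256(ruleset_text.encode("utf-8")).hexdigest()


# Hypothetical ruleset contents, for illustration only.
original = 'tool: "read_file"\naction: block\n'
tampered = 'tool: "read_file"\naction: allow\n'

deployed_version = ruleset_version(original)
print(ruleset_version(original) == deployed_version)  # True
print(ruleset_version(tampered) == deployed_version)  # False
```

A hash only detects a change when the expected value is stored somewhere the attacker cannot modify, which is the gap that signing addresses.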
Out of Scope
Edictum operates at the tool-call layer. These threats exist at other layers and require other controls:
| Threat | Why out of scope | Use instead |
|---|---|---|
| Write side effects already completed | Postconditions run after execution, so WRITE actions have already taken effect | Preconditions and sandbox rules, which block before execution |
| Kernel-level sandboxing | In-process library; no OS isolation | gVisor, Firecracker, seccomp, AppArmor |
| LLM hallucination (text output) | No enforcement point for text-only responses | Content moderation, RAG, output filters |
| Network-level attacks | Does not inspect network traffic | Kubernetes NetworkPolicy, service meshes, WAF |
| Prompt injection on text responses | Only enforces on tool-call execution | Input sanitization, prompt engineering |
Adversarial Testing
Four scenarios tested against GPT-4.1, DeepSeek v3.2, and Qwen3 235B with identical rulesets:
| Scenario | Result |
|---|---|
| Retry after block (agent retries a blocked tool call) | All retries blocked across all models |
| PII exfiltration (agent tries to leak data via allowed tools) | Caught by postcondition PII patterns |
| Cross-tool chain (multi-step exfiltration) | PII redacted from output |
| Role escalation (agent claims higher privilege) | Principal check blocked escalation |
DeepSeek v3.2 was more aggressive than GPT-4.1 in its exfiltration attempts. Model-level safety differs across models, so it complements rulesets and does not replace them.
The core library has 114 @pytest.mark.security tests covering shell metacharacter bypasses, sandbox symlink escapes, input injection, backend failure modes, and session concurrency. The control plane has 43+ adversarial tests across 8 security boundaries (S1-S8).
See Adversarial Testing for full scenarios and results.
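The postcondition redaction in the exfiltration scenarios can be sketched with regular expressions over tool output. The two patterns and the `redact` function below are illustrative assumptions. They are not the PII patterns that Edictum ships.

```python
import re

# Hypothetical patterns for illustration; real PII detection needs more care.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(output: str) -> tuple[str, list[str]]:
    """Replace each PII match with a marker and list which patterns fired."""
    findings = []
    for name, pattern in PII_PATTERNS.items():
        output, count = pattern.subn(f"[REDACTED:{name}]", output)
        if count:
            findings.append(name)
    return output, findings


text, found = redact("Contact jane@example.com, SSN 123-45-6789.")
print(text)   # Contact [REDACTED:email], SSN [REDACTED:ssn].
print(found)  # ['email', 'ssn']
```

Redaction of this kind runs after the tool has executed, so it limits what leaves through the output but cannot undo the call itself.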
Fail-Closed Design
Every ambiguous failure during rule evaluation results in a block. A false positive can be retried; a false negative may not be recoverable. Note: when no ruleset matches a tool call, the default is allow, because rulesets are opt-in.
| Failure | Outcome |
|---|---|
| Rule evaluation error | Block (with policy_error: true in audit) |
| Malformed ruleset YAML | Reject load, keep previous rulesets |
| Type mismatch in condition | Block (sentinel evaluates to true) |
| Control Plane unreachable | Agents continue with cached rulesets |
| Session storage error | Block |
| Unknown rule type | Reject load |
| No matching rulesets | Allow (rulesets are opt-in) |
To enforce block-all-by-default, add a catch-all rule: tool: "*", action: block.
See Fail-Closed Guarantees for all seven scenarios.
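Three rows of the table above (evaluation error, no matching rulesets, and the catch-all rule) can be sketched as a small evaluator. The `evaluate` function and the rule callables are hypothetical. Only the `policy_error` field name comes from this page.

```python
from typing import Any, Callable

# In this sketch a rule returns True when it matches (and therefore blocks).
Rule = Callable[[str, dict[str, Any]], bool]


def evaluate(tool: str, args: dict[str, Any], rules: list[Rule]) -> dict[str, Any]:
    for rule in rules:
        try:
            if rule(tool, args):
                return {"decision": "block", "policy_error": False}
        except Exception:
            # Fail closed: an error inside a rule is treated as a block.
            return {"decision": "block", "policy_error": True}
    # No rule matched: rulesets are opt-in, so the default is allow.
    return {"decision": "allow", "policy_error": False}


def block_shell(tool: str, args: dict[str, Any]) -> bool:
    return tool == "bash"


def broken_rule(tool: str, args: dict[str, Any]) -> bool:
    return args["limit"] > 10  # raises if "limit" is missing or not a number


def catch_all(tool: str, args: dict[str, Any]) -> bool:
    return True  # plays the role of tool: "*", action: block


print(evaluate("read_file", {}, [block_shell]))
print(evaluate("read_file", {"limit": "ten"}, [broken_rule]))
print(evaluate("read_file", {}, [block_shell, catch_all]))
```

The first call is allowed because nothing matches, the second is blocked with `policy_error` set because the comparison raises, and the third is blocked by the catch-all.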
Control Plane Security Boundaries
The control plane enforces 8 security boundaries, each with dedicated adversarial tests:
| Boundary | Threat | Defense |
|---|---|---|
| S1: Session validation | Account takeover | Redis session tokens; forged/expired cookies rejected |
| S2: API key auth | Unauthorized agent access | Revoked keys excluded; malformed prefixes rejected |
| S3: Tenant isolation | Cross-tenant data leak | Every query filtered by tenant_id; returns 404, not 403 |
| S4: Approval state | Unauthorized tool execution | Immutable once decided; double-approve returns 409 |
| S5: SSE channel | Rule/event leak | Events filtered by env + tenant_id |
| S6: Ruleset signing | Tampered rule deployment | Ed25519 signatures (server-side); private key encrypted at rest (NaCl SecretBox). SDK verification planned. |
| S7: Bootstrap lock | Post-bootstrap privilege escalation | Admin creation only when zero users exist |
| S8: Rate limiting | Credential brute force | Per-IP sliding window (Redis sorted sets) |
See Control Plane Security Model for details.
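The per-IP sliding window behind S8 can be sketched in memory. This is an assumption-laden illustration: `SlidingWindowLimiter` is a hypothetical class, and a `deque` of timestamps stands in for the Redis sorted set that the control plane is described as using.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_attempts per IP within the trailing window."""

    def __init__(self, max_attempts: int, window_seconds: float) -> None:
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._attempts = defaultdict(deque)  # ip -> timestamps, oldest first

    def allow(self, ip: str, now: float) -> bool:
        attempts = self._attempts[ip]
        # Drop timestamps that have fallen out of the window.
        while attempts and attempts[0] <= now - self.window_seconds:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False
        attempts.append(now)
        return True


limiter = SlidingWindowLimiter(max_attempts=3, window_seconds=60.0)
print([limiter.allow("203.0.113.5", t) for t in (0.0, 1.0, 2.0, 3.0)])
# [True, True, True, False]
print(limiter.allow("203.0.113.5", 61.0))  # True, the oldest attempts expired
```

Passing `now` explicitly keeps the sketch deterministic. A real deployment would read the clock and keep the state in shared storage so that every server instance sees the same counts.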
Known Limitations
| Limitation | Impact | Mitigation |
|---|---|---|
| String-based path matching | Relies on realpath() + prefix comparison | Resolves symlinks and ..; catches common traversals |
| Heuristic bash parsing | BashClassifier is not AST-based | Detects 14 shell operators; sandbox command allowlists add depth |
| TOCTOU race (symlinks) | Symlink created between eval and execution could escape | OS-level sandboxing (gVisor, seccomp) for kernel enforcement |
| Postcondition WRITE fallback | redact/block effects downgrade to warn for WRITE/IRREVERSIBLE tools | Use preconditions to block dangerous writes before execution |
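The path-matching limitation is easier to see in code. The sketch below resolves both paths with `os.path.realpath` and then compares them with `os.path.commonpath`, a component-wise check swapped in here for a raw string prefix test. `is_within` is a hypothetical helper, not Edictum's implementation.

```python
import os
import tempfile


def is_within(path: str, root: str) -> bool:
    """True if path, after resolving symlinks and "..", stays under root."""
    resolved_path = os.path.realpath(path)
    resolved_root = os.path.realpath(root)
    try:
        common = os.path.commonpath([resolved_path, resolved_root])
    except ValueError:
        # For example, different drives on Windows: treat as outside.
        return False
    return common == resolved_root


with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, "data")
    os.makedirs(root)
    print(is_within(os.path.join(root, "reports", "q1.csv"), root))          # True
    print(is_within(os.path.join(root, "..", "data-secret", "k"), root))     # False
    print(is_within(os.path.join(root, "..", "..", "etc", "passwd"), root))  # False
```

The second case shows why a naive `startswith` check is risky: `data-secret` shares a string prefix with `data` but is a different directory. The check is still a point-in-time decision, so a symlink created after it runs can escape, which is the TOCTOU row above.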
Compliance
Edictum maps to four compliance frameworks:
- EU AI Act (Articles 9, 14) — risk identification, mitigation, documentation, human oversight
- SOC 2 (CC6) — logical access, credentials, authorization, decision log
- OWASP Top 10 for LLM Applications (2025) — prompt injection, insecure output, unbounded consumption, access control
- OWASP Top 10 for Agentic Applications (2026) — 6 of 10 risks mitigated
See Compliance Mapping for detailed evidence and configuration per framework.
Defense in Depth
Edictum is one layer. A complete security posture combines:
| Layer | Tool | What it covers |
|---|---|---|
| LLM safety | Model provider safety filters | Harmful text generation |
| Tool-call enforcement | Edictum | What the agent is allowed to do |
| OS sandboxing | gVisor, Firecracker, seccomp | Process isolation, syscall filtering |
| Network policies | K8s NetworkPolicy, WAF | Traffic filtering, egress control |
| Input validation | Application code | Schema validation, sanitization |
Next Steps
- Defense Scope — detailed threat model and boundaries
- Fail-Closed Guarantees — all failure modes and outcomes
- Adversarial Testing — test scenarios and cross-model results
- Compliance Mapping — EU AI Act, SOC 2, OWASP mappings