Dry-Run Evaluation

Test whether a tool call would be allowed, blocked, or warned without executing it.

AI Assistance

Right page if: you need to test a tool call without executing it. `evaluate()` returns a decision and matching rules, but does not mutate runtime state.

Wrong page if: you need the full runtime pipeline with workflow gates, session state, or audit. Use `run()`.

Gotcha: dry-run evaluation skips session rules and workflow gates because they need runtime session state. Check output rules only run when you provide `output`.

evaluate() answers one question: what would Edictum do if this tool call happened right now?

It does not execute the tool, write audit events, advance workflow stages, or touch session counters.

Quick Example

from edictum import Edictum

guard = Edictum.from_yaml("rules.yaml")

result = guard.evaluate("read_file", {"path": ".env"})
print(result.decision)       # "block"
print(result.block_reasons)  # ["Sensitive file '.env' blocked."]

evaluate()

def evaluate(
    self,
    tool_name: str,
    args: dict[str, Any],
    *,
    principal: Principal | None = None,
    output: str | None = None,
    environment: str | None = None,
) -> EvaluationResult

Python evaluate() is synchronous. TypeScript evaluate() is async. Go Evaluate() is synchronous.

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `tool_name` | `str` | required | The tool being called |
| `args` | `dict[str, Any]` | required | Tool call arguments |
| `principal` | `Principal \| None` | `None` | Identity context for the call |
| `output` | `str \| None` | `None` | Simulated tool output. When provided, check_output rules are evaluated |
| `environment` | `str \| None` | `None` | Override the guard's default environment |
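
A call that exercises every keyword parameter at once; the rule file and the "staging" environment name here are illustrative:

from edictum import Edictum, Principal

guard = Edictum.from_yaml("rules.yaml")

result = guard.evaluate(
    "deploy_service",
    {"service": "api"},
    principal=Principal(role="sre", ticket_ref="JIRA-456"),
    output="deploy queued",        # enables check_output rules for this call
    environment="staging",         # overrides the guard's default environment
)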

Behavior

  • Exhaustive evaluation. All matching rules are evaluated. There is no short-circuit on the first block (see the sketch after this list).
  • No tool execution. The tool function is never called.
  • No session state. Session rules are skipped because dry-run evaluation has no runtime session.
  • No workflow gates. Workflow runtime enforcement is skipped for the same reason.
  • Sandbox rules are evaluated. They are stateless, so dry-run includes them.
  • Check output rules require output. Without output, only check rules and sandbox rules run.
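
Because evaluation is exhaustive, a call that violates several rules reports all of them. A sketch, assuming a rule set in which the path below trips both a check rule and a sandbox rule:

result = guard.evaluate("read_file", {"path": "/outside/sandbox/.env"})

# Nothing short-circuits: every matching rule is evaluated and reported.
failed = [r for r in result.rules if not r.passed]
for r in failed:
    print(r.rule_id, r.rule_type)    # e.g. one "check" and one "sandbox" failure
print(result.block_reasons)          # one message per failed check/sandbox rule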

Examples

Test a check rule:

result = guard.evaluate("read_file", {"path": ".env"})
assert result.decision == "block"
assert result.rules[0].rule_id == "block-dotenv"

Test with principal context:

from edictum import Principal

result = guard.evaluate(
    "deploy_service",
    {"service": "api", "env": "production"},
    principal=Principal(role="sre", ticket_ref="JIRA-456"),
)
assert result.decision == "allow"

Test a check_output rule by providing output:

result = guard.evaluate(
    "read_file",
    {"path": "data.txt"},
    output="SSN: 123-45-6789",
)
assert result.decision == "warn"
assert len(result.warn_reasons) > 0

Test a sandbox boundary:

result = guard.evaluate("read_file", {"path": "/etc/shadow"})
assert result.decision == "block"

sandbox_results = [rule for rule in result.rules if rule.rule_type == "sandbox"]
assert len(sandbox_results) == 1
assert sandbox_results[0].passed is False

evaluate_batch()

def evaluate_batch(
    self,
    calls: list[dict[str, Any]],
) -> list[EvaluationResult]

Evaluates multiple tool calls. Each call is evaluated independently via evaluate().

Call Format

Each dict in the calls list accepts these keys:

| Key | Type | Required | Description |
|---|---|---|---|
| `tool` | `str` | yes | Tool name |
| `args` | `dict` | no | Tool arguments (defaults to `{}`) |
| `principal` | `dict` | no | Principal as a dict with keys: `role`, `user_id`, `ticket_ref`, `claims` |
| `output` | `str \| dict` | no | Simulated output. Dicts are JSON-serialized automatically |
| `environment` | `str` | no | Environment override |

Example

results = guard.evaluate_batch([
    {"tool": "read_file", "args": {"path": ".env"}},
    {"tool": "read_file", "args": {"path": "readme.txt"}},
    {"tool": "read_file", "args": {"path": "data.txt"}, "output": "SSN: 123-45-6789"},
    {
        "tool": "deploy_service",
        "args": {"service": "api"},
        "principal": {"role": "sre", "ticket_ref": "JIRA-123"},
    },
])

assert results[0].decision == "block"
assert results[1].decision == "allow"
assert results[2].decision == "warn"
assert results[3].decision == "allow"
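
Because dicts passed as output are JSON-serialized automatically, structured tool results can be tested directly. A sketch; the query_db tool and the SSN-matching check_output rule are hypothetical:

results = guard.evaluate_batch([
    {
        "tool": "query_db",                             # hypothetical tool
        "args": {"table": "users"},
        "output": {"rows": [{"ssn": "123-45-6789"}]},   # serialized to JSON first
    },
])
print(results[0].decision)  # "warn" if a check_output rule matches the serialized text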

EvaluationResult

Returned by evaluate(). It contains the overall decision and per-rule details.

Python fields

| Field | Description |
|---|---|
| `decision` | `"allow"`, `"block"`, or `"warn"` |
| `tool_name` | Tool name that was evaluated |
| `rules` | Per-rule results |
| `block_reasons` | Messages from failed check or sandbox rules |
| `warn_reasons` | Messages from failed check_output rules |
| `rules_evaluated` | Total number of evaluated rules |
| `policy_error` | `True` if any rule raised a policy error |
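
Inspecting a result end to end, using the field names from the table above:

result = guard.evaluate("read_file", {"path": ".env"})

print(result.decision)         # "block"
print(result.tool_name)        # "read_file"
print(result.rules_evaluated)  # total number of rules that ran
print(result.block_reasons)    # from failed check or sandbox rules
print(result.warn_reasons)     # empty here: no output was provided
if result.policy_error:
    print("at least one rule raised a policy error")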

TypeScript fields

| Field | Description |
|---|---|
| `decision` | `"allow"`, `"block"`, or `"warn"` |
| `toolName` | Tool name that was evaluated |
| `rules` | Per-rule results |
| `denyReasons` | Messages from failed check or sandbox rules |
| `warnReasons` | Messages from failed check_output rules |
| `contractsEvaluated` | Total number of evaluated rules. This is a legacy TypeScript field name. |
| `workflowSkipped` | `true` when a workflow runtime is attached |
| `workflowReason` | Why workflow enforcement was skipped |

Go fields

| Field | Description |
|---|---|
| `Decision` | `"allow"`, `"block"`, or `"warn"` |
| `ToolName` | Tool name that was evaluated |
| `Rules` | Per-rule results |
| `BlockReasons` | Messages from failed check or sandbox rules |
| `WarnReasons` | Messages from failed check_output rules |
| `RulesEvaluated` | Total number of evaluated rules |
| `WorkflowSkipped` | `true` when a workflow runtime is attached |
| `WorkflowReason` | Why workflow enforcement was skipped |

Each RuleResult contains the rule ID, rule type, whether it passed, the message, whether it was only observed, and whether a policy error occurred.
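
A sketch of iterating per-rule results. rule_id, rule_type, and passed appear elsewhere on this page; message, observed, and policy_error are assumed attribute names for the remaining fields:

result = guard.evaluate("read_file", {"path": ".env"})

for r in result.rules:
    status = "passed" if r.passed else "failed"
    # message, observed, and policy_error are assumed attribute names
    print(f"{r.rule_id} ({r.rule_type}): {status} - {r.message}")
    if r.observed:
        print("  observe-only: reported but not enforced")
    if r.policy_error:
        print("  this rule raised a policy error")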

evaluate() vs run() vs CLI

| | `evaluate()` | `run()` | `edictum check` / `edictum test` |
|---|---|---|---|
| Executes the tool | No | Yes | No |
| Session tracking | No | Yes | No |
| Workflow gates | No | Yes | No |
| Audit events | No | Yes | No |
| Async required | No | Yes | N/A |
| Check rules | Yes | Yes | Yes |
| Sandbox rules | Yes | Yes | Yes |
| Check output rules | Only with `output` | Always | `--calls` only |
| Short-circuits | No | Yes | No |

Use evaluate() for fast rule debugging. Use run() when you need real runtime behavior. Use the CLI for ad hoc checks and CI.
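
A common pattern is using evaluate() as a pre-flight gate in tests or tooling, then handing the real call to run(). A sketch using only the API documented on this page:

result = guard.evaluate("read_file", {"path": "readme.txt"})

if result.decision == "block":
    raise PermissionError("; ".join(result.block_reasons))
if result.decision == "warn":
    for reason in result.warn_reasons:
        print(f"warning: {reason}")

# Safe to proceed: hand the call to run() for real execution,
# session tracking, workflow gates, and audit events.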
