Observe Mode

Observe mode lets you test rulesets against live traffic without blocking any tool calls. Preconditions that would fire emit CALL_WOULD_DENY audit events instead of blocking. The tool call proceeds normally.

This gives you real data on what your rulesets would do before you enforce them.

Enforce mode:

Tool call: read_file(".env")
        |
Precondition block-dotenv matches (args.path contains .env)
        |
CALL_DENIED audit event emitted
        |
Tool never executes; agent receives denial message

Observe mode:

Tool call: read_file(".env")
        |
Precondition block-dotenv matches (args.path contains .env); would have denied
        |
CALL_WOULD_DENY audit event logged
        |
Tool still executes; postconditions run on output

The Workflow

1. Deploy rulesets in observe mode
        |
2. Review CALL_WOULD_DENY audit events
        |
3. Tune rulesets (fix false positives, tighten loose rulesets)
        |
4. Switch to enforce mode

Step 1: Deploy in observe mode. Set mode: observe in your ruleset and deploy to production. Agents run normally -- no tool calls are blocked.

Step 2: Review audit events. Every precondition that would have blocked a call emits a CALL_WOULD_DENY event. Query your audit sink (stdout, file, OTel) for these events to see which rules fire and how often.

Step 3: Tune. If a rule fires too often (false positives), narrow its when condition. If it never fires, check that the selectors match your tool arguments. Use edictum check to test specific tool calls against your rulesets without running them.

Step 4: Enforce. Change mode: observe to mode: enforce. Rulesets now actively block tool calls.
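To make Step 3 concrete, the sketch below imitates a substring-style contains check in plain Python and shows how a broad condition can produce a false positive that a narrower one avoids. It does not call Edictum: the contains and is_dotenv_file helpers and the sample paths are hypothetical, and treating contains as plain substring matching is an assumption.

```python
# Illustrative only: a plain-Python stand-in for a substring-style `contains`
# condition. This is NOT Edictum's matcher; substring semantics are assumed.

def contains(value: str, needle: str) -> bool:
    """Return True when needle occurs anywhere in value."""
    return needle in value

# Hypothetical paths taken from CALL_WOULD_DENY events during observation.
observed_paths = [
    ".env",
    "config/.env.production",
    "docs/.environment-setup.md",  # probably not a secrets file
]

# The broad condition matches every path, including the documentation file.
broad = [p for p in observed_paths if contains(p, ".env")]

def is_dotenv_file(path: str) -> bool:
    """Hypothetical narrower check: the file name is .env or starts with '.env.'."""
    name = path.rsplit("/", 1)[-1]
    return name == ".env" or name.startswith(".env.")

narrow = [p for p in observed_paths if is_dotenv_file(p)]

print("broad match: ", broad)
print("narrow match:", narrow)
```

Comparing the two lists against real CALL_WOULD_DENY events is one way to decide whether a when condition needs narrowing before switching to enforce mode.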

Enabling Observe Mode

Pipeline-level: all rules observe

Set the default mode in your ruleset:

defaults:
  mode: observe

Every rule in the ruleset runs in observe mode. No tool calls are blocked.

Per-rule: test one rule in observe mode

Leave defaults.mode set to enforce and set mode: observe on specific rules:

defaults:
  mode: enforce

rules:
  - id: block-dotenv
    type: pre
    tool: read_file
    when:
      args.path: { contains: ".env" }
    then:
      action: block
      message: "Blocked: read of sensitive file {args.path}"

  - id: experimental-api-check
    type: pre
    mode: observe
    tool: call_api
    when:
      args.endpoint: { contains: "/v1/expensive" }
    then:
      action: block
      message: "Expensive API call detected (observe mode)."

Here, block-dotenv enforces (blocks matching calls) while experimental-api-check observes (logs what it would block but allows the call).
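The override behavior can be pictured with a small Python sketch. The dictionaries mirror the YAML above; resolve_mode is a hypothetical helper, not part of Edictum's API, and the fallback to "enforce" when no default is set is an assumption made for the example.

```python
# Illustrative only: how a per-rule `mode` could take precedence over
# `defaults.mode`. Not Edictum's implementation.

ruleset = {
    "defaults": {"mode": "enforce"},
    "rules": [
        {"id": "block-dotenv", "type": "pre", "tool": "read_file"},
        {"id": "experimental-api-check", "type": "pre", "mode": "observe", "tool": "call_api"},
    ],
}

def resolve_mode(rule: dict, defaults: dict) -> str:
    """A rule's own mode wins; otherwise fall back to the ruleset default."""
    # Falling back to "enforce" when neither is set is an assumption.
    return rule.get("mode", defaults.get("mode", "enforce"))

effective = {r["id"]: resolve_mode(r, ruleset["defaults"]) for r in ruleset["rules"]}
print(effective)
```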

What Changes in Observe Mode

| Behavior | Enforce Mode | Observe Mode |
|---|---|---|
| Precondition matches | Tool call is blocked | Tool call proceeds |
| Audit event action | `CALL_DENIED` | `CALL_WOULD_DENY` |
| Tool executes | No | Yes |
| Postconditions run | N/A (tool didn't run) | Yes (tool ran) |
| Audit trail records the match | Yes | Yes |
| Session counters | Attempt counted, execution not counted | Attempt counted, execution counted |

The critical difference: in observe mode, the tool always executes. The decision log shows you exactly what enforcement would have done, without any impact on the agent.
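The differences between the two modes can be restated as a toy decision function. This is a simplified model for reasoning about the two modes, not Edictum's pipeline; the Outcome type and precondition_outcome function are hypothetical names, and the model only tracks the denial-related action for a single precondition.

```python
# Illustrative only: a toy model of what happens when one precondition is
# evaluated in each mode.
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Outcome:
    denial_action: Optional[str]  # CALL_DENIED, CALL_WOULD_DENY, or None if no match
    tool_executes: bool
    postconditions_run: bool

def precondition_outcome(mode: str, rule_matches: bool) -> Outcome:
    if not rule_matches:
        # Nothing to deny: the call proceeds in either mode.
        return Outcome(None, True, True)
    if mode == "enforce":
        # The call is blocked, so the tool and its postconditions never run.
        return Outcome("CALL_DENIED", False, False)
    if mode == "observe":
        # The match is recorded, but the call proceeds as normal.
        return Outcome("CALL_WOULD_DENY", True, True)
    raise ValueError(f"unknown mode: {mode!r}")

print(precondition_outcome("enforce", True))
print(precondition_outcome("observe", True))
```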

Postconditions in Observe Mode

Postconditions always produce findings (warnings), never blocks. In observe mode, each postcondition warning string is prefixed with [observe] (e.g., "[observe] PII detected in output"). The audit event is still emitted as CALL_EXECUTED or CALL_FAILED -- there is no separate would_warn action. The on_postcondition_warn callback fires in both modes.
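As a rough picture of the prefixing behavior, the sketch below shows one way a warning string could be tagged. format_postcondition_warning is a hypothetical helper written for this example; it is not Edictum's code.

```python
# Illustrative only: tagging a postcondition warning in observe mode.

def format_postcondition_warning(message: str, mode: str) -> str:
    """Prefix the warning with [observe] when the rule runs in observe mode."""
    return f"[observe] {message}" if mode == "observe" else message

observe_warning = format_postcondition_warning("PII detected in output", "observe")
enforce_warning = format_postcondition_warning("PII detected in output", "enforce")

print(observe_warning)
print(enforce_warning)
```

A handler that receives warnings from both modes could use the prefix to route observe-mode findings to a separate review queue.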

Reviewing Observe-Mode Events

Audit events from observe mode include the same fields as enforce-mode events: tool name, arguments, principal, rule ID, policy version, and session counters. The action field distinguishes them:

CALL_DENIED -- enforce mode, call was blocked
CALL_WOULD_DENY -- observe mode, call would have been blocked

Filter your audit sink for CALL_WOULD_DENY to see every call that would have been blocked. Group by decision_name (the rule ID) to see which rules fire most often.
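A minimal sketch of that filter-and-group step, assuming the audit sink writes one JSON object per line with action and decision_name fields. The exact serialization depends on your sink, so the sample lines here are made up for the example.

```python
# Illustrative only: count CALL_WOULD_DENY events per rule from JSON-lines
# audit output. The line format is an assumption about the audit sink.
import json
from collections import Counter

sample_lines = [
    '{"action": "CALL_WOULD_DENY", "decision_name": "block-dotenv"}',
    '{"action": "CALL_WOULD_DENY", "decision_name": "experimental-api-check"}',
    '{"action": "CALL_EXECUTED", "decision_name": null}',
    '{"action": "CALL_WOULD_DENY", "decision_name": "block-dotenv"}',
]

def would_deny_counts(lines) -> Counter:
    """Count would-deny events, keyed by the rule that matched."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        event = json.loads(line)
        if event.get("action") == "CALL_WOULD_DENY":
            counts[event.get("decision_name") or "<unknown>"] += 1
    return counts

counts = would_deny_counts(sample_lines)
for rule_id, n in counts.most_common():
    print(f"{rule_id}: {n}")
```

To run it against a file sink, pass an open file object instead of sample_lines, since iterating a file yields one line at a time.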

Dual-Mode Evaluation with `observe_alongside`

Observe mode applies to individual rules or to an entire ruleset. But sometimes you need to run two versions of the same rule simultaneously -- the current enforced version and a candidate version that only observes. This is dual-mode evaluation.

The Use Case

You have rulesets running in production. A new version is ready but you want to compare its behavior against the current version before promoting it. You need both versions evaluating the same tool calls, with the current version making real decisions and the candidate only logging.

How It Works

Create a second YAML file with observe_alongside: true at the top level:

# candidate.yaml
apiVersion: edictum/v1
kind: Ruleset
observe_alongside: true

metadata:
  name: candidate-rulesets

defaults:
  mode: enforce

rules:
  - id: block-sensitive-reads
    type: pre
    tool: read_file
    when:
      args.path:
        contains_any: [".env", ".secret", "credentials", ".pem", ".key"]
    then:
      action: block
      message: "Blocked: read of sensitive file {args.path}"

Load both rulesets:

from edictum import Edictum
guard = Edictum.from_yaml("rules/base.yaml", "rules/candidate.yaml")

The pipeline evaluates both versions on every tool call:

Enforced rules from base.yaml make real allow/block decisions
Observed rules from candidate.yaml evaluate in parallel, producing separate audit events with mode: "observe"

Observed rule IDs are suffixed with :candidate (e.g., block-sensitive-reads:candidate). Observed rules never block tool calls -- they only produce audit events.

Observed Audit Events

Observed rules emit the same audit events as regular observe mode:

CALL_WOULD_DENY -- the observed rule would have blocked this call
CALL_ALLOWED -- the observed rule allowed this call

Filter your audit sink for mode: "observe" and decision_name ending in :candidate to see the observed evaluation results.
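The filter described above could look like the following sketch. It assumes audit events have already been parsed into dictionaries with action, mode, and decision_name keys; the sample events, including the mode value "enforce" on the first one, are made up for illustration, and candidate_summary is a hypothetical helper.

```python
# Illustrative only: summarize candidate-rule results from parsed audit events.
from collections import defaultdict

CANDIDATE_SUFFIX = ":candidate"

events = [
    {"action": "CALL_DENIED", "mode": "enforce", "decision_name": "block-sensitive-reads"},
    {"action": "CALL_WOULD_DENY", "mode": "observe", "decision_name": "block-sensitive-reads:candidate"},
    {"action": "CALL_ALLOWED", "mode": "observe", "decision_name": "block-sensitive-reads:candidate"},
    # A per-rule observe event without the suffix is not a candidate copy.
    {"action": "CALL_WOULD_DENY", "mode": "observe", "decision_name": "experimental-api-check"},
]

def candidate_summary(events) -> dict:
    """Tally actions per base rule ID for observe-mode :candidate events."""
    summary = defaultdict(lambda: defaultdict(int))
    for event in events:
        name = event.get("decision_name") or ""
        if event.get("mode") == "observe" and name.endswith(CANDIDATE_SUFFIX):
            base_id = name[: -len(CANDIDATE_SUFFIX)]  # strip the suffix
            summary[base_id][event["action"]] += 1
    return {rule: dict(actions) for rule, actions in summary.items()}

summary = candidate_summary(events)
print(summary)
```

A high share of CALL_WOULD_DENY for a candidate, relative to CALL_DENIED from the enforced version, suggests the candidate is stricter and is worth reviewing for false positives.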

When to Use

Rule update rollouts. Deploy the candidate in observe mode. Compare its decision log with the enforced version. If the candidate would have blocked calls that should be allowed (false positives), tune it before promoting.

A/B testing rulesets. Run a stricter version of a rule in observe mode to measure the impact of tightening it before you enforce the change.

Composition Report

Use return_report=True to see which rules were observed:

guard, report = Edictum.from_yaml(
    "rules/base.yaml",
    "rules/candidate.yaml",
    return_report=True,
)

for s in report.observe_rules:
    print(f"{s.rule_id}: observe copy from {s.observed_source}")

See Ruleset Composition for the full composition reference.

Next Steps

Rulesets -- writing preconditions, postconditions, and session rulesets
How it works -- the full pipeline walkthrough
Quickstart -- try observe mode in the bonus step
YAML reference -- mode field, defaults block, and observe_alongside
