
Benchmarks

Reproduce adapter overhead, end-to-end latency, and prompt-vs-rule experiments.

AI Assistance

Right page if you need to measure Edictum's performance overhead or compare prompt-based control vs deterministic rulesets with hard numbers. Wrong page if you need the adapter API docs; see https://docs.edictum.ai/docs/adapters/overview. For observe mode concepts, see https://docs.edictum.ai/docs/concepts/observe-mode.

Gotchas: the adapter overhead benchmark isolates enforcement latency without LLM calls, and the prompt-vs-rule benchmark requires OPENAI_API_KEY in .env. Run benchmarks before and after rule changes to quantify regression.

The demo includes three benchmark scripts:

1. Adapter overhead benchmark

Measures enforcement overhead without LLM latency.

cd edictum-demo
python benchmark/benchmark_adapters.py

Use this to compare overhead consistency across all 8 adapters.
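The isolation idea behind this benchmark can be sketched in plain Python: time a no-op tool call with and without a rule check in front of it, so the difference is pure enforcement overhead. The `tool_call`, `enforce`, and `measure` names below are hypothetical stand-ins, not Edictum's actual API.

```python
import time

def tool_call(args):
    # Stand-in for an adapter tool call; returns immediately so no LLM
    # or network latency is mixed into the measurement.
    return args

def enforce(args):
    # Hypothetical stand-in for a deterministic rule check.
    return all(isinstance(v, str) for v in args.values())

def measure(fn, args, runs=10_000):
    # Average wall-clock time per call over many iterations.
    start = time.perf_counter()
    for _ in range(runs):
        fn(args)
    return (time.perf_counter() - start) / runs

args = {"path": "/tmp/report.txt"}
baseline = measure(tool_call, args)
guarded = measure(lambda a: tool_call(a) if enforce(a) else None, args)
print(f"per-call enforcement overhead: {(guarded - baseline) * 1e6:.2f} us")
```

The real script reports this per adapter; the point is that subtracting the unguarded baseline leaves only the enforcement cost.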

2. End-to-end latency benchmark

Measures four phases: baseline tool call, enforcement only, LLM only, and full loop.

cd edictum-demo
python benchmark/benchmark_latency.py

Use this to quantify the enforcement share of total runtime in your own environment.
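Once the four phase timings are in hand, the enforcement share is simple arithmetic. The numbers below are made-up placeholders to show the calculation; substitute the values benchmark_latency.py reports for your environment.

```python
# Hypothetical phase timings in milliseconds (placeholders, not results):
baseline_ms = 2.0       # tool call alone
enforcement_ms = 2.3    # tool call + rule enforcement
full_loop_ms = 855.0    # LLM + enforcement + tool call

# Overhead is the guarded-minus-unguarded delta; share is that delta
# relative to the full agent loop, where LLM latency dominates.
overhead_ms = enforcement_ms - baseline_ms
share = overhead_ms / full_loop_ms
print(f"enforcement adds {overhead_ms:.1f} ms ({share:.2%} of the full loop)")
```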

3. Prompt-vs-rule benchmark

Compares three stages:

  • prompt-only control
  • observe mode rollout
  • enforce mode rollout

cd edictum-demo
python benchmark/prompt_vs_contracts.py
python benchmark/prompt_vs_contracts.py --quick
python benchmark/prompt_vs_contracts.py --runs 3

Requires OPENAI_API_KEY in .env.

How to use results in rollout decisions

  1. Validate that adapter overhead remains flat before/after rule changes.
  2. Confirm enforce mode does not create unacceptable end-to-end latency regression.
  3. Use prompt-vs-rule outputs to justify moving from advisory prompt controls to deterministic rulesets.
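Step 1 above can be automated as a simple regression gate: compare mean adapter overhead before and after a rule change and flag growth beyond a tolerance. The function and threshold below are illustrative assumptions, not an Edictum feature.

```python
def overhead_regressed(before_us, after_us, tolerance=0.10):
    """Return True if per-call overhead grew by more than `tolerance`
    (fractional growth, e.g. 0.10 = 10%). Timings in microseconds."""
    return (after_us - before_us) / before_us > tolerance

print(overhead_regressed(42.0, 44.0))  # ~4.8% growth: within tolerance
print(overhead_regressed(42.0, 60.0))  # ~43% growth: regression
```

Wiring this into CI against stored baseline numbers makes "overhead remains flat" a checkable condition rather than a judgment call.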
