Benchmarks
Reproduce adapter overhead, end-to-end latency, and prompt-vs-contract experiments.
Right page if: you need to measure Edictum's performance overhead or compare prompt-based governance against deterministic contracts with hard numbers.
Wrong page if: you need the adapter API docs -- see https://docs.edictum.ai/docs/adapters/overview. For observe-mode concepts, see https://docs.edictum.ai/docs/concepts/observe-mode.
Gotcha: the adapter overhead benchmark isolates governance latency without LLM calls; the prompt-vs-contract benchmark requires OPENAI_API_KEY in .env. Run benchmarks before and after contract changes to quantify regressions.
Benchmark source files:
- benchmark/README.md
- benchmark/benchmark_adapters.py
- benchmark/benchmark_latency.py
- benchmark/prompt_vs_contracts.py
1. Adapter overhead benchmark
Measures governance overhead without LLM latency.
```shell
cd edictum-demo
python benchmark/benchmark_adapters.py
```

Use this to compare overhead consistency across all 8 adapters.
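The idea behind isolating governance overhead can be sketched as a microbenchmark: time a bare tool call, then the same call behind a stubbed contract check, with no LLM in the loop. Everything below (`tool_call`, `contract_check`, the timing loop) is a hypothetical stand-in, not Edictum's real API or the actual benchmark script.

```python
# Hypothetical sketch of isolating governance latency without LLM calls.
# tool_call and contract_check are stubs, NOT Edictum's real adapters.
import time
import statistics

def tool_call(args: dict) -> str:
    """Stand-in for a real tool; does trivial work."""
    return f"ok:{len(args)}"

def contract_check(tool: str, args: dict) -> bool:
    """Stub for a deterministic contract evaluation (assumption)."""
    return tool != "forbidden" and "secret" not in args

def timed(fn, runs: int = 10_000) -> float:
    """Median wall-clock time of fn over `runs` iterations."""
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - t0)
    return statistics.median(samples)

baseline = timed(lambda: tool_call({"q": "hi"}))
governed = timed(lambda: contract_check("search", {"q": "hi"}) and tool_call({"q": "hi"}))
overhead = governed - baseline
print(f"median baseline {baseline*1e6:.2f}us, governed {governed*1e6:.2f}us")
```

Using the median rather than the mean keeps a few slow outlier iterations (GC pauses, scheduler noise) from skewing the overhead number.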
2. End-to-end latency benchmark
Measures four phases: baseline tool call, governance only, LLM only, and full loop.
```shell
cd edictum-demo
python benchmark/benchmark_latency.py
```

Use this to quantify the governance share of total runtime in your own environment.
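The four-phase breakdown can be illustrated with a toy harness: time each phase separately, then the composed loop. The "LLM" below is just a sleep stub and all function names are assumptions for illustration; real numbers come from benchmark_latency.py.

```python
# Hypothetical four-phase latency breakdown: baseline tool call,
# governance only, LLM only, and the full loop. All stubs.
import time

def tool_call():
    return sum(range(100))                  # stand-in tool work

def governance():
    return all(c.isalnum() or c == "_" for c in "read_file")  # stub check

def llm_call():
    time.sleep(0.001)                       # stub for model latency (assumption)

def phase(fn, runs: int = 50) -> float:
    """Average seconds per call of fn over `runs` iterations."""
    t0 = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - t0) / runs

results = {
    "baseline_tool": phase(tool_call),
    "governance_only": phase(governance),
    "llm_only": phase(llm_call),
    "full_loop": phase(lambda: (governance(), tool_call(), llm_call())),
}
for name, secs in results.items():
    print(f"{name}: {secs*1e3:.3f} ms")
```

Even in this toy version the pattern the real benchmark surfaces is visible: model latency dominates, so governance shows up as a small fraction of the full loop.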
3. Prompt-vs-contract benchmark
Compares three stages:
- prompt-only governance
- observe mode rollout
- enforce mode rollout
```shell
cd edictum-demo
python benchmark/prompt_vs_contracts.py
python benchmark/prompt_vs_contracts.py --quick
python benchmark/prompt_vs_contracts.py --runs 3
```

Requires OPENAI_API_KEY in .env.
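Since this benchmark needs OPENAI_API_KEY from .env, here is a minimal, dependency-free sketch of loading it before a run. This is an assumption for illustration; the actual script may use python-dotenv or another loader.

```python
# Minimal .env loader sketch (assumption -- the real benchmark may
# use python-dotenv). Parses KEY=VALUE lines and exports any keys
# not already set in the environment.
import os

def load_dotenv_minimal(path: str = ".env") -> None:
    if not os.path.exists(path):
        return
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, malformed lines
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip().strip('"'))

load_dotenv_minimal()
key = os.environ.get("OPENAI_API_KEY")
print("OPENAI_API_KEY set" if key else "OPENAI_API_KEY missing -- add it to .env")
```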
How to use results in rollout decisions
- Validate that adapter overhead remains flat before/after contract changes.
- Confirm enforce mode does not create unacceptable end-to-end latency regression.
- Use prompt-vs-contract outputs to justify moving from advisory prompt controls to deterministic contracts.
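The "overhead remains flat" check above can be automated as a simple regression gate in CI. The function and tolerance below are hypothetical, not part of Edictum's tooling.

```python
# Hypothetical regression gate: compare adapter-overhead medians
# recorded before and after a contract change against a tolerance.
def overhead_regressed(before_us: float, after_us: float,
                       tolerance: float = 0.10) -> bool:
    """True if new median overhead exceeds the old by more than
    `tolerance` as a fraction (default 10%)."""
    if before_us <= 0:
        raise ValueError("baseline overhead must be positive")
    return (after_us - before_us) / before_us > tolerance

print(overhead_regressed(42.0, 44.0))   # +4.8%  -> False, within tolerance
print(overhead_regressed(42.0, 50.0))   # +19.0% -> True, flag as regression
```

A relative threshold works better than an absolute one here because adapter overhead varies by host; calibrate the tolerance against the run-to-run noise you observe on your own hardware.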