Designing Block Messages

The block message is the agent's only feedback on what went wrong. A good message steers the agent toward a productive alternative. A bad one causes retries, confusion, and wasted compute.

The LLM does not see the rule YAML, the audit event, or the evaluation trace. It sees a single string. That string determines whether the agent recovers gracefully or burns tokens retrying.

then:
  action: block
  message: "Read of '{args.path}' blocked. Use environment variables instead."

This page covers how to write messages that steer agents toward productive alternatives.

Message Anatomy

Every effective block message has three parts:

What happened -- the tool call was blocked, the limit was reached, or the output was modified.
Why -- what condition triggered the block.
What to do instead -- the alternative action the agent should take.

# All three parts in one message:
message: "Write to '{args.path}' blocked. Production config files are read-only. Use the config API to update settings."
#         ^^ what happened              ^^ why                                 ^^ what to do instead

Skipping any part degrades the agent's response. Without "what happened," the agent does not know the call failed. Without "why," the agent cannot reason about the constraint. Without "what to do instead," the agent retries or gives up.
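
The three-part structure can be made explicit when messages are generated or reviewed in tooling. The following is a minimal sketch in Python; `compose_block_message` is a hypothetical helper written for this page, not part of any rule engine API. It only joins the parts into sentences, so the wording of each part is still up to the rule author.

```python
def compose_block_message(what_happened: str, why: str, instead: str = "") -> str:
    """Join the three message parts into one string, one sentence per part."""
    sentences = []
    for part in (what_happened, why, instead):
        text = part.strip()
        if not text:
            continue  # an alternative may not exist; skip empty parts
        if text[-1] not in ".!?":
            text += "."  # make sure each part ends as a sentence
        sentences.append(text)
    return " ".join(sentences)


message = compose_block_message(
    "Write to '{args.path}' blocked",
    "Production config files are read-only",
    "Use the config API to update settings",
)
print(message)
# Write to '{args.path}' blocked. Production config files are read-only. Use the config API to update settings.
```

Keeping the parts as separate arguments makes a missing "why" or "what to do instead" visible at the call site instead of hidden inside one long string.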

Quality Levels

Three levels of message quality, from worst to best:

Bad: No Context

message: "Blocked."

The agent knows the call failed, but nothing else. It will likely retry with the same arguments, or try a minor variation that also fails. This is the most common mistake.

Better: Context Without Direction

message: "Read of sensitive file blocked: {args.path}"

The agent knows what was blocked and which file triggered it. But it does not know what to do instead. Some agents will try a different path to the same file. Others will give up entirely.

Best: Context With Alternative

message: "Read of sensitive file '{args.path}' blocked. Use environment variables instead."

The agent knows the call failed, why, and what to do next. This is the target for every block message.

More examples at this level:

# Precondition: file protection
message: "Analysts cannot read '{args.path}'. Ask an admin for help."

# Precondition: destructive command
message: "Destructive command blocked: '{args.command}'. Use a safer alternative."

# Precondition: production gate
message: "Production deploys require senior role (sre/admin). Your role: {principal.role}."

# Session: tool call cap
message: "50 tool calls reached. Summarize what you accomplished, list remaining tasks, and stop."

# Session: per-tool cap
message: "deploy_service has been called 3 times. No more deploys. If the deployment failed, report the error instead of retrying."

Patterns by Rule Type

Each rule type has a different relationship with the agent, and the message should reflect that.

Precondition Blocks

The tool never executed. The agent needs to know what was blocked and what to do instead.

Pattern: "[Action] on [target] blocked. [Alternative]." or "[Target] blocked for [reason]. [Alternative]."

# File protection -- tell the agent to skip, not retry
message: "Sensitive file '{args.path}' blocked. Skip and continue with the next task."

# Role gate -- tell the agent who can do this
message: "Only admins can call {tool.name} in production. Your role: {principal.role}."

# Ticket requirement -- tell the agent what's missing
message: "Production changes require a ticket reference. Attach a ticket_ref to the principal."

# Blast radius limit -- tell the agent the cap
message: "Batch delete of {args.batch_size} records exceeds the limit of 100. Reduce the batch size."

Approval Messages

The tool is paused, not blocked. The agent needs to know the call is waiting for a human decision.

Pattern: "[Action] requires approval. [Who/what is needed]." or "[Action] pending approval from [approver]."

# Production deploy
message: "Production deploy by {principal.role} requires approval. Waiting for admin review."

# High-risk operation
message: "Deletion of {args.table} requires human approval. An admin has been notified."

Keep approval messages factual. The agent cannot influence the approval decision, so the message should set the expectation that execution is paused, not suggest the agent take alternative action.

Session Limit Messages

The session is over. The agent must stop, not retry. This is the one message type where you should be explicit about stopping.

Pattern: "[Limit] reached. Summarize [progress] and stop."

# Total call cap
message: "50 tool calls reached. Summarize what you accomplished, list remaining tasks, and stop."

# Attempt cap (catches retry loops)
message: "200 attempts reached (including blocked calls). Stop retrying and report what happened."

# Per-tool cap
message: "deploy_service has been called 3 times this session. No more deploys allowed. If the deployment failed, report the error instead of retrying."

Session messages should always include the word "stop" or an equivalent directive. Without it, agents continue attempting tool calls and hit the limit repeatedly.

Postcondition Warnings

The tool already executed. The message explains what was detected in the output and how it was handled.

Pattern: "[What was detected] in output. [What happened to it]."

# Secret redaction
message: "API key pattern detected and redacted from output."

# PII warning
message: "SSN pattern detected in {tool.name} output. Review before sharing."

# Full suppression
message: "Secrets detected in output. Full output suppressed."

Postcondition messages are informational. The agent cannot undo the tool call, but it can avoid repeating the action or adjust its approach.

Variable Interpolation

Messages support {placeholder} expansion from the tool call envelope. This makes messages specific -- the agent sees the actual file path, command, or role that triggered the block.

Available placeholders follow the same selector paths as the rule expression grammar:

Placeholder	Source
`{args.path}`, `{args.command}`	Tool call arguments
`{tool.name}`	The tool being called
`{environment}`	The configured environment
`{principal.user_id}`, `{principal.role}`	Principal identity
`{principal.claims.department}`	Custom claims on the principal
`{env.VAR_NAME}`	OS environment variable

Missing placeholders are kept as-is. If the tool call has no path argument, {args.path} appears literally in the message. This is intentional -- no crash, no empty string, and the literal placeholder in the output signals a misconfiguration.

Each placeholder expansion is capped at 200 characters. Values longer than 200 characters are truncated and end with an ellipsis (...). This prevents a large argument (like a full file body) from blowing up the message.

Values that look like secrets are redacted. If an expanded value matches secret patterns (API keys, tokens), it is replaced with [REDACTED] in the message. This prevents rulesets from accidentally leaking secrets in block messages.
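
The three behaviors described above (missing placeholders kept, long values truncated, secret-like values redacted) can be illustrated with a small Python sketch. This is not the engine's implementation. The names `expand` and `lookup` are hypothetical, the secret patterns are illustrative stand-ins for whatever detection the engine really uses, and the sketch assumes the ellipsis counts toward the 200-character cap.

```python
import re

MAX_LEN = 200  # per-placeholder cap described above
PLACEHOLDER = re.compile(r"\{([A-Za-z_][\w.]*)\}")
# Hypothetical secret patterns; the real detection rules are not shown on this page.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{16,}"),
    re.compile(r"AKIA[0-9A-Z]{16}"),
]


def lookup(envelope: dict, path: str):
    """Walk a dotted selector such as 'args.path'; return None if a key is missing."""
    current = envelope
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def expand(template: str, envelope: dict) -> str:
    """Expand {placeholder} selectors in a message template."""
    def replace(match: re.Match) -> str:
        value = lookup(envelope, match.group(1))
        if value is None:
            return match.group(0)  # missing placeholder is kept as-is
        text = str(value)
        if any(p.search(text) for p in SECRET_PATTERNS):
            return "[REDACTED]"  # check before truncating so no fragment leaks
        if len(text) > MAX_LEN:
            # Assumption: the ellipsis counts toward the 200-character cap.
            text = text[: MAX_LEN - 3] + "..."
        return text

    return PLACEHOLDER.sub(replace, template)


envelope = {"args": {"path": "/etc/passwd"}, "principal": {"role": "analyst"}}
print(expand("Read of '{args.path}' blocked. Your role: {principal.role}.", envelope))
# Read of '/etc/passwd' blocked. Your role: analyst.
print(expand("Command '{args.command}' blocked.", envelope))
# Command '{args.command}' blocked.
```

The second call shows why a literal placeholder in a delivered message is a useful signal: the rule references an argument that this tool call does not have.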

For the full variable interpolation reference, see the YAML Reference.

Tone

Block messages should be:

Imperative. Tell the agent what to do: "Use environment variables instead." Not "You might want to consider using environment variables."
Factual. State what happened and why. No hedging, no apologies.
Helpful. Always include the alternative when one exists.
Concise. The agent processes the message as context. Longer messages consume tokens without adding value.

The agent treats the message as an instruction. Write it like one.

Anti-Patterns

Vague messages

# Bad
message: "Blocked."
message: "Not allowed."
message: "Error."

The agent has no information to act on. It will retry, try variations, or give up. Always include what was blocked and why.

Messages that encourage retrying

# Bad
message: "This action is temporarily unavailable. Try again later."

Rule blocks are deterministic: the same call with the same arguments and the same principal is always blocked. "Try again later" is misleading, because nothing changes between attempts and an agent that follows it keeps retrying a call that cannot succeed. If the block is conditional on something the agent can change, name that thing explicitly:

# Good
message: "Deploy blocked without a ticket reference. Attach a ticket_ref to the principal and retry."

Threatening or anthropomorphizing messages

# Bad
message: "WARNING: You are attempting an unauthorized action. This has been reported."

The agent is not a person. It does not respond to warnings or threats. It responds to instructions. Keep the message factual and actionable.

Messages that leak information

# Bad
message: "Access blocked. The admin password is stored in /etc/secrets/admin.key."

Block messages are returned to the agent and may appear in logs (including decision logs) and user-facing outputs. Do not include sensitive paths, credentials, or internal details that the agent (or a malicious prompt) could exploit.

Overly long messages

# Bad
message: "The tool call you just made has been blocked because it violates the security rule that was put in place to prevent unauthorized access to sensitive files in the production environment. Please use an alternative approach such as environment variables or the configuration API to achieve the same result without accessing the file directly."

This consumes tokens without adding value beyond what a concise message provides. Aim for one to two sentences.

Checklist

Before shipping a rule, check the message against this list:

Does the message say what was blocked?
Does the message say why?
Does the message suggest what to do instead (if an alternative exists)?
Is the message concise (one to two sentences)?
Does the message use variable interpolation to include the specific argument or principal that triggered the block?
Does the message avoid encouraging retries for deterministic blocks?
Does the message avoid leaking sensitive information?
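
Parts of this checklist can be automated as a pre-merge check. The sketch below is a hypothetical Python linter, not a shipped tool. The phrase lists and the 180-character budget are assumptions chosen for illustration, with the budget as a rough stand-in for "one to two sentences". Whether a message really explains why, or leaks sensitive details, still needs human review.

```python
import re

MAX_CHARS = 180  # assumed budget, a rough stand-in for "one to two sentences"
RETRY_HINTS = ("try again later", "temporarily unavailable")
VAGUE = {"blocked.", "not allowed.", "error."}


def lint_message(message: str, session_limit: bool = False) -> list[str]:
    """Return checklist warnings for a message; an empty list means no findings."""
    warnings = []
    text = message.strip()
    lowered = text.lower()
    if lowered in VAGUE:
        warnings.append("vague: say what was blocked and why")
    if len(text) > MAX_CHARS:
        warnings.append("long: aim for one to two sentences")
    # Session limit messages usually name the limit rather than an argument.
    if not session_limit and not re.search(r"\{[\w.]+\}", text):
        warnings.append("generic: no {placeholder} for the triggering argument or principal")
    if any(hint in lowered for hint in RETRY_HINTS):
        warnings.append("retry: do not suggest retrying a deterministic block")
    if session_limit and "stop" not in lowered:
        warnings.append("session: include an explicit stop directive")
    return warnings


print(lint_message("Read of '{args.path}' blocked. Use environment variables instead."))
# []
print(lint_message("50 tool calls reached.", session_limit=True))
# ['session: include an explicit stop directive']
```

A check like this catches the mechanical anti-patterns early, which leaves review time for the judgment calls in the checklist.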

Next Steps

Tutorial: Creating Rulesets -- full rule authoring workflow
Preconditions -- known-bad list rulesets with message examples
Session Rulesets -- session limit messages
YAML Reference -- full rule syntax including message fields
Postcondition Design -- designing postcondition effects and messages
