feat: add composable guardrails system for input/output validation by sergiobayona · Pull Request #59 · chatwoot/ai-agents

sergiobayona · 2026-03-18T14:55:08Z

Introduce a guardrail layer that intercepts content before it reaches an agent (input guards) and before it returns to the caller (output guards). Guards are composable, ordered, and follow the same thread-safe, stateless design as Tools.

A guard's call method returns one of three outcomes:

pass (nil or GuardResult.pass): content proceeds unchanged
rewrite (GuardResult.rewrite): content is replaced before continuing to the next guard or the LLM
tripwire (GuardResult.tripwire): the run is aborted immediately with a dedicated error and metadata on the RunResult
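
The three outcomes can be sketched with stand-in types. This is a minimal illustration, not the PR's code: the `GuardSketch` module, its `GuardResult` struct, and the `RedactEmails` guard are hypothetical names, and the real constructors in `lib/agents/guard_result.rb` may take different arguments.

```ruby
# Stand-in types for illustration only; they are NOT the classes added by the PR.
module GuardSketch
  # Assumed shape: an action plus optional replacement content or message.
  GuardResult = Struct.new(:action, :content, :message) do
    def self.pass
      new(:pass, nil, nil)
    end

    def self.rewrite(content)
      new(:rewrite, content, nil)
    end

    def self.tripwire(message)
      new(:tripwire, nil, message)
    end
  end

  # Hypothetical guard: redacts e-mail addresses and trips on a banned phrase.
  class RedactEmails
    EMAIL = /\b[\w.+-]+@[\w-]+\.[\w.-]+\b/
    BANNED = "ignore previous instructions"

    def call(content)
      # tripwire: abort the run
      return GuardResult.tripwire("banned phrase") if content.include?(BANNED)
      # rewrite: replace the content before the next guard or the LLM sees it
      return GuardResult.rewrite(content.gsub(EMAIL, "[redacted]")) if content.match?(EMAIL)

      nil # pass: nil is treated the same as GuardResult.pass
    end
  end
end
```

The guard holds no instance state, which matches the stateless, thread-safe design the description calls for.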

Key design decisions:

Guards are agent-scoped (input_guards: / output_guards: kwargs), not global, enabling fine-grained per-agent policies
Fail-open by default: a guard that raises an unexpected exception logs and passes. strict: true converts exceptions to tripwires
Input guards run once before the first LLM call; output guards run only on the final response (not intermediate tool-call turns)
Guard chains execute in array order; each guard sees the output of the previous guard's potential rewrite
Structured output (Hash/Array from response_schema) is serialized to JSON before the guard chain and deserialized back after rewrite
GuardRunner.run tracks rewrites across the chain and returns action: :rewrite so callers can detect changes
Dedup check (last_message_matches?) runs after input guards so rewritten input is compared against history
Tripwire rescue uses finalize_run with guardrail_tripwire kwarg; StandardError rescue has a safety-net re-raise for Tripwire
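
Several of these decisions (array-order execution, rewrite propagation, fail-open versus `strict: true`, and the JSON round-trip for structured output) can be sketched together. This is an assumption-laden illustration rather than the PR's `GuardRunner`: the `ChainSketch` module, the array-based guard return values, and `run_structured` are all hypothetical.

```ruby
require "json"

# Hypothetical chain executor; NOT the GuardRunner from the PR.
module ChainSketch
  class Tripwire < StandardError; end

  Result = Struct.new(:action, :content)

  # Guards are callables returning nil (pass), [:rewrite, new_content],
  # or [:tripwire, message]. They run in array order.
  def self.run(guards, content, strict: false)
    rewritten = false

    guards.each do |guard|
      begin
        kind, value = guard.call(content)
      rescue StandardError => e
        # strict mode converts unexpected exceptions into tripwires
        raise Tripwire, "guard raised: #{e.message}" if strict

        # fail-open: log and treat the failing guard as a pass
        warn "guard failed open: #{e.message}"
        next
      end

      case kind
      when :rewrite
        content = value # the next guard sees the rewritten content
        rewritten = true
      when :tripwire
        raise Tripwire, value
      end
    end

    # Report :rewrite if any guard in the chain changed the content.
    Result.new(rewritten ? :rewrite : :pass, content)
  end

  # Structured output is serialized to JSON for the chain and parsed back
  # only after a rewrite. Note that JSON.parse returns string keys.
  def self.run_structured(guards, value, strict: false)
    result = run(guards, JSON.generate(value), strict: strict)
    result.action == :rewrite ? JSON.parse(result.content) : value
  end
end
```

Returning the overall action alongside the final content lets a caller skip deserialization, or any other follow-up work, when no guard changed anything.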

New files:

lib/agents/guard.rb — base class, Tripwire exception, DSL
lib/agents/guard_result.rb — value object (pass/rewrite/tripwire)
lib/agents/guard_runner.rb — ordered chain executor

Integration points:

Agent: accepts input_guards/output_guards, propagated through clone
Runner: input guards before LLM, output guards before finalize_run, Guard::Tripwire rescue with guardrail_tripwire metadata on RunResult
RunResult: new guardrail_tripwire field and tripwired? predicate
CallbackManager: new guard_triggered event type
AgentRunner: on_guard_triggered callback registration
Instrumentation: agents.run.guard.* OTel spans with phase/action attributes, compatible with Langfuse
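
From the caller's side, the Runner and RunResult integration might look roughly like the sketch below. Only the names `guardrail_tripwire`, `tripwired?`, and `on_guard_triggered` come from the description; the `RunSketch` module, the shape of the metadata hash, and the `run` signature are assumptions made for illustration.

```ruby
# Hypothetical stand-ins; NOT the Runner or RunResult from the PR.
module RunSketch
  class Tripwire < StandardError
    attr_reader :guard_name

    def initialize(message, guard_name:)
      super(message)
      @guard_name = guard_name
    end
  end

  RunResult = Struct.new(:output, :guardrail_tripwire, keyword_init: true) do
    def tripwired?
      !guardrail_tripwire.nil?
    end
  end

  # `guard` is a callable that returns checked input or raises Tripwire.
  # `on_guard_triggered` is an optional callable receiving the metadata.
  def self.run(input, guard:, on_guard_triggered: nil)
    checked = guard.call(input)
    RunResult.new(output: "echo: #{checked}") # stands in for the LLM call
  rescue Tripwire => e
    # Assumed metadata shape: which guard fired and why.
    info = { guard_name: e.guard_name, message: e.message }
    on_guard_triggered&.call(info)
    RunResult.new(output: nil, guardrail_tripwire: info)
  end
end
```

Surfacing the tripwire as data on the result, instead of letting the exception escape, means a caller can branch on `tripwired?` without wrapping every run in its own rescue.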

Tests:

12 new examples covering input guard rewrites, output guard rewrites, structured output guards (redact/tripwire/pass-through), dedup regression, and tripwire metadata and callback emission. Existing specs are updated to stub the new guard attributes.

netlify · 2026-03-18T14:55:16Z

✅ Deploy Preview for ruby-ai-agents ready!

Name	Link
🔨 Latest commit	`ad5d962`
🔍 Latest deploy log	https://app.netlify.com/projects/ruby-ai-agents/deploys/69babcd126f4f400083cf29e
😎 Deploy Preview	https://deploy-preview-59--ruby-ai-agents.netlify.app

sergiobayona · 2026-04-03T14:43:07Z

any feedback welcome

aakashb95 · 2026-04-06T11:04:42Z

Hi Sergio, This looks interesting. I will take a look at this in the next few days.

sergiobayona wants to merge 1 commit into chatwoot:main from sergiobayona:feat/guardrails