feat: add observability hooks to guard and redact methods (#1099) #1142

Open
frankentini wants to merge 1 commit into superagent-ai:main from frankentini:feat/observability-hooks-guard
@frankentini

Contributor

Summary

Closes #1099.

Adds on_guard and on_redact observability callback hooks to the Python SDK so callers can plug in logging, metrics, alerting, or any custom observability pipeline without modifying the guard/redact flow.

Changes

New types (types.py)

  • ObservabilityEvent — dataclass with: method, model, input_preview (first 200 chars), classification, violation_types, prompt_tokens, completion_tokens, total_tokens
  • ObservabilityCallback — union type accepting sync or async callables
  • ClientConfig.on_guard / ClientConfig.on_redact — optional callback slots
  • Added from __future__ import annotations for compatibility
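
As a rough illustration of the types listed above, the event and callback type might look like the following. This is a sketch, not the PR's actual `types.py`: the field names come from the description, but the field types, the defaults, and the `make_preview` helper are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Awaitable, Callable, List, Optional, Union

PREVIEW_CHARS = 200  # the description says input_preview is the first 200 chars


@dataclass
class ObservabilityEvent:
    """Sketch of the event; types and defaults are assumed, not confirmed."""
    method: str                              # e.g. "guard" or "redact"
    model: str
    input_preview: str                       # truncated copy of the input
    classification: Optional[str] = None     # None for redact events
    violation_types: List[str] = field(default_factory=list)
    prompt_tokens: int = 0
    completion_tokens: int = 0
    total_tokens: int = 0


# Union of a sync callable and an async callable taking one event.
ObservabilityCallback = Union[
    Callable[[ObservabilityEvent], None],
    Callable[[ObservabilityEvent], Awaitable[None]],
]


def make_preview(text: str) -> str:
    """Hypothetical helper: keep only the first PREVIEW_CHARS characters."""
    return text[:PREVIEW_CHARS]
```

Keeping the preview truncation in one place makes the 200-character limit easy to test and to change.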

Client (client.py)

  • _fire_observability() — internal helper that invokes the callback (sync or async); swallows all exceptions so a bad callback never disrupts a guard/redact call
  • guard() — fires ObservabilityEvent after all four code paths: plain text, chunked text, PDF, and image
  • redact() — fires ObservabilityEvent after every call (classification=None)
  • create_client() — exposes on_guard / on_redact as top-level kwargs for convenience

Public API (__init__.py)

  • Exports ObservabilityEvent and ObservabilityCallback

Tests (tests/test_observability.py, 404 lines, 17 tests)

  • Sync and async callbacks for guard and redact
  • Block classification + violation_types propagation
  • Exception swallowing (bad callback must not raise)
  • input_preview truncated to 200 chars
  • Chunked guard fires exactly one aggregated event
  • ClientConfig wiring through SafetyClient directly
  • classification=None invariant for redact events

Usage

from safety_agent import create_client, ObservabilityEvent

# Sync callback
def log_event(event: ObservabilityEvent) -> None:
    print(f"[{event.method}] {event.classification} - {event.total_tokens} tokens")

client = create_client(api_key="...", on_guard=log_event)
result = await client.guard(user_input, model="openai/gpt-4o")

# Async callback
async def async_log(event: ObservabilityEvent) -> None:
    await metrics_client.record(event)

client = create_client(api_key="...", on_guard=async_log, on_redact=async_log)

Design decisions

  • Fire-and-forget with swallowed exceptions — same philosophy as _post_usage(). A misconfigured callback must never degrade safety checks.
  • One event per public call — chunked guard aggregates token counts before firing, so callers always see a single event per guard() invocation.
  • input_preview not full input — avoids logging PII at scale while still being useful for debugging.
  • classification=None for redact — redact produces sanitized text, not a pass/block verdict; marking it None keeps the type honest.
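
To make the "one event per public call" decision concrete, here is a minimal sketch of how per-chunk results might be folded into the fields of a single event. `ChunkResult` and `aggregate_chunks` are hypothetical names, and the rule that any blocked chunk makes the whole call a block is an assumption; the description only states that token counts are aggregated before the event fires.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChunkResult:
    """Hypothetical per-chunk result; not a type from the SDK."""
    classification: str            # "pass" or "block"
    violation_types: List[str]
    prompt_tokens: int
    completion_tokens: int


def aggregate_chunks(chunks: List[ChunkResult]) -> dict:
    """Fold per-chunk results into the fields of one event."""
    # Assumed rule: a single blocked chunk blocks the whole input.
    blocked = any(c.classification == "block" for c in chunks)
    violations: List[str] = []
    for c in chunks:
        for v in c.violation_types:
            if v not in violations:      # keep first-seen order, drop duplicates
                violations.append(v)
    prompt = sum(c.prompt_tokens for c in chunks)
    completion = sum(c.completion_tokens for c in chunks)
    return {
        "classification": "block" if blocked else "pass",
        "violation_types": violations,
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        "total_tokens": prompt + completion,
    }
```

The callback would then be invoked once with an event built from this dictionary, rather than once per chunk.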

…-ai#1099)

Add on_guard and on_redact callback slots to ClientConfig and
create_client(), resolving issue superagent-ai#1099.

Changes:
- Add ObservabilityEvent dataclass with method, model, input_preview,
  classification, violation_types, and token usage fields
- Add ObservabilityCallback union type (sync or async callable)
- Add on_guard / on_redact fields to ClientConfig
- Expose on_guard / on_redact kwargs in create_client() for convenience
- Fire observability events in guard() for text, chunked, PDF, and
  image code paths, and in redact() after every call
- Callbacks are invoked after _post_usage; exceptions in the callback
  are swallowed so they never break guard/redact flow
- Export ObservabilityEvent and ObservabilityCallback from top-level
  __init__.py
- Add from __future__ import annotations to types.py for compatibility
- Add 404 lines of unit tests in tests/test_observability.py covering:
  - Sync and async guard/redact callbacks
  - Block classification with violation_types propagation
  - Exception swallowing in bad callbacks
  - input_preview truncation to 200 chars
  - Chunked guard fires exactly one aggregated event
  - ClientConfig wiring via SafetyClient directly
  - Redact classification=None invariant
  - ObservabilityEvent field construction

Example usage:
    from safety_agent import create_client, ObservabilityEvent

    def log_event(event: ObservabilityEvent) -> None:
        print(f"[{event.method}] {event.classification} - {event.total_tokens} tokens")

    client = create_client(api_key="...", on_guard=log_event)
    result = await client.guard(user_input, model="openai/gpt-4o")
@vercel

vercel Bot commented Apr 11, 2026

@frankentini is attempting to deploy a commit to the Superagent Team on Vercel.

A member of the Team first needs to authorize it.

@superagent-ai superagent-ai deleted a comment from cursor Bot Apr 11, 2026
@homanp homanp self-assigned this Apr 11, 2026
@homanp

homanp commented Apr 11, 2026

Collaborator

@cursor review

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit dce2496.

completion_tokens=redact_response.usage.completion_tokens,
total_tokens=redact_response.usage.total_tokens,
),
)

Redact observability placed inside wrong try/except scope

Low Severity

The _fire_observability call and ObservabilityEvent construction in redact() are placed inside the try/except Exception block meant for JSON parsing errors. While _fire_observability internally swallows exceptions, the ObservabilityEvent(...) constructor is evaluated before entering that protection. If event construction ever raises (e.g., due to future validation logic), the already-successful redact_response is discarded and a misleading "Failed to parse redact response" error is raised instead. This is inconsistent with guard(), where all observability calls sit outside try/except blocks.
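
One way to address this, sketched under assumptions: keep only the parsing inside the `try` block and build the event afterwards, inside its own guard. The function `parse_then_notify`, the dictionary-shaped event, and the `total_tokens` key in the payload are illustrative stand-ins, not the SDK's actual `redact()` code.

```python
import json
from typing import Any, Callable, Dict, Optional


def parse_then_notify(
    raw: str,
    on_redact: Optional[Callable[[Dict[str, Any]], None]],
) -> Dict[str, Any]:
    # Step 1: only the parsing lives in this try block, so the
    # "Failed to parse" error can only come from a real parsing failure.
    try:
        parsed = json.loads(raw)
    except Exception as exc:
        raise ValueError("Failed to parse redact response") from exc

    # Step 2: event construction and the callback sit outside that block,
    # wrapped in their own guard so neither can discard the parsed result.
    if on_redact is not None:
        try:
            event = {
                "method": "redact",
                "classification": None,
                "total_tokens": parsed.get("total_tokens", 0),
            }
            on_redact(event)
        except Exception:
            pass  # observability failures are swallowed

    return parsed
```

With this layout a failure while building the event can no longer be reported as a parsing error.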

fallback_timeout=fallback_timeout or config.fallback_timeout,
fallback_url=fallback_url or config.fallback_url,
on_guard=on_guard or config.on_guard,
on_redact=on_redact or config.on_redact,

Callbacks silently dropped when config provided without api_key

Medium Severity

When a user calls create_client(config=ClientConfig(api_key="..."), on_guard=my_callback), the on_guard and on_redact kwargs are silently ignored. The if config is None branch is skipped (config exists), and the elif api_key: branch is also skipped (no separate api_key kwarg), so the original config object — without the user's callback — is passed directly to SafetyClient. The user's observability callback is quietly lost with no warning.
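
A possible fix is to always merge explicit keyword arguments into the supplied config instead of branching on `api_key`. The sketch below assumes `ClientConfig` is a dataclass and uses a cut-down stand-in with only three fields; `resolve_config` is a hypothetical helper, not a function from the SDK.

```python
from dataclasses import dataclass, replace
from typing import Any, Callable, Optional


@dataclass
class ClientConfig:
    """Hypothetical stand-in with only the fields this sketch needs."""
    api_key: str
    on_guard: Optional[Callable[[Any], Any]] = None
    on_redact: Optional[Callable[[Any], Any]] = None


def resolve_config(
    config: Optional[ClientConfig] = None,
    api_key: Optional[str] = None,
    on_guard: Optional[Callable[[Any], Any]] = None,
    on_redact: Optional[Callable[[Any], Any]] = None,
) -> ClientConfig:
    if config is None:
        if not api_key:
            raise ValueError("api_key or config is required")
        return ClientConfig(api_key=api_key, on_guard=on_guard, on_redact=on_redact)
    # Explicit kwargs win; otherwise keep what the config already carries.
    # replace() returns a new object, so the caller's config is not mutated.
    return replace(
        config,
        api_key=api_key or config.api_key,
        on_guard=on_guard if on_guard is not None else config.on_guard,
        on_redact=on_redact if on_redact is not None else config.on_redact,
    )
```

An alternative would be to raise an error when both `config` and callback kwargs are given, which is stricter but makes the conflict visible to the caller.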

Development

Successfully merging this pull request may close these issues.

[Feature]: Hook observability into the safety-agent guard.

2 participants