any-guardrail provides a unified interface to AI safety guardrails, letting you detect toxic content, jailbreak attempts, and other risks in LLM inputs and outputs. Switch between guardrail providers, both encoder-based (discriminative) models and decoder-based (generative) models such as Llama Guard and ShieldGemma, without changing your code.
Some guardrails are highly customizable, and any-guardrail exposes those options in full. See our docs for the complete list of supported providers and customization examples.
- Unified API: Switch between an ever-growing list of guardrail providers
- Production-ready: Built for real-world LLM applications
- Flexible: Use encoder-based (fast) or decoder-based (customizable) models
- Python 3.11 or newer
Install with pip:
pip install any-guardrail
AnyGuardrail provides a single interface to the guardrail models. It lets you list all supported guardrails and instantiate any of them. Here is a full example:
from any_guardrail import AnyGuardrail, GuardrailName, GuardrailOutput
# Initialize guardrail
guardrail = AnyGuardrail.create(GuardrailName.DEEPSET)
# Validate input before sending to your LLM
user_input = "How do I hack into a system?"
result: GuardrailOutput = guardrail.validate(user_input)
if not result.valid:
    print(f"Blocked: {result.explanation}")
else:
    # Safe to proceed; your_llm is a placeholder for your own LLM call
    response = your_llm(user_input)
Every guardrail returns the same GuardrailOutput shape, so you can swap models without changing application code:
result.valid # bool verdict — True means the content passed
result.score # risk score in ~[0, 1], higher = more likely violating (when available)
result.categories # per-category results: CategoryResult(name, description, triggered, score, severity)
result.explanation # human-readable rationale (judge reasoning, raw generation)
result.action # provider-recommended action (e.g. "block"), advisory; None if none
result.usage # provenance: model_id, latency_ms, token counts
result.extra # guardrail-specific structured extras; result.raw holds the backend payload
flagged = [c.name for c in result.categories if c.triggered]
A machine-readable JSON Schema for this output, generated from the Pydantic models, is published in the repo. Reference it at the stable raw URL below; to pin a specific version, replace main with a release tag:
https://raw.githubusercontent.com/mozilla-ai/any-guardrail/main/schemas/guardrail_output.schema.json
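Pinning amounts to swapping the ref segment of that URL. The helper below is a minimal sketch of the idea, not part of any-guardrail: `schema_url` and `fetch_schema` are hypothetical names, and the tag `v0.0.0` is a placeholder rather than a real release.

```python
import json
from urllib.request import urlopen

RAW_BASE = "https://raw.githubusercontent.com/mozilla-ai/any-guardrail"
SCHEMA_PATH = "schemas/guardrail_output.schema.json"


def schema_url(ref: str = "main") -> str:
    """Build the raw schema URL for a branch name or release tag."""
    return f"{RAW_BASE}/{ref}/{SCHEMA_PATH}"


def fetch_schema(ref: str = "main") -> dict:
    """Download and parse the schema at the given ref (needs network access)."""
    with urlopen(schema_url(ref), timeout=10) as resp:
        return json.load(resp)


print(schema_url())          # tracks the main branch, as in the URL above
print(schema_url("v0.0.0"))  # placeholder tag; substitute a real release tag
```

Tracking `main` always gives the latest schema, while a tag keeps downstream validators stable across releases.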
Full guides are available in the docs.
Some models on Hugging Face are gated and require extra permissions. To use them, create a Hugging Face account and request access on each model's page. Then install the huggingface_hub package and log in. One way to do this is:
pip install --upgrade huggingface_hub
hf auth login
More information can be found here: Hugging Face Hub
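In non-interactive environments such as CI, where the login prompt is impractical, the same step can be done from Python. This is a sketch under stated assumptions: it expects an access token in the `HF_TOKEN` environment variable and calls `huggingface_hub.login`; the helper names `hf_token_from_env` and `login_from_env` are hypothetical.

```python
import os


def hf_token_from_env(var: str = "HF_TOKEN") -> str:
    """Read a Hugging Face access token from the environment."""
    token = os.environ.get(var, "").strip()
    if not token:
        raise RuntimeError(f"Set {var} to a Hugging Face access token first")
    return token


def login_from_env() -> None:
    """Log in to the Hugging Face Hub using the token from the environment."""
    # Imported lazily so this module loads even if huggingface_hub is missing
    from huggingface_hub import login

    login(token=hf_token_from_env())
```

Call `login_from_env()` once at startup, before instantiating a guardrail that depends on a gated model.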
The guardrail space is ever-growing. If there is a guardrail you'd like us to support, please see our CONTRIBUTING.md for details.