Async support for HTTP-based providers (EncoderfileProvider, LlamafileProvider, AzureContentSafety) #164

@dni138

Tracking @javiermtorres's comment on PR #160: #160 (comment)

Motivation

HTTP-based providers (EncoderfileProvider, LlamafileProvider, AzureContentSafety) would benefit from concurrent validate() calls, with multiple prompts in flight against a single running provider. The current sync API forces callers to wrap calls in asyncio.to_thread(guardrail.validate, ...) to integrate with async web frameworks, and it serializes batch scoring against HTTP backends that could otherwise pipeline requests.

Scope (breaking)

  • Provider.infer / pre_process / generate_chat become async def.
  • Guardrail.validate and the ThreeStageGuardrail pipeline become async. Breaking public API change.
  • HTTP providers move from urllib.request to httpx.AsyncClient.
  • HuggingFaceProvider wraps model.generate in asyncio.to_thread to satisfy the async contract. This is a forced thread offload and only a half-measure for the local-torch path.
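
To make the proposed shape concrete, here is a minimal sketch of what an async provider contract with a thread-offloaded local path might look like. The names AsyncProvider, LocalModelProvider, and _generate_blocking are hypothetical stand-ins, not the library's actual classes, and the sleep is a placeholder for a blocking model.generate call.

```python
import asyncio
import time
from abc import ABC, abstractmethod


class AsyncProvider(ABC):
    """Hypothetical async provider interface (not the library's real base class)."""

    @abstractmethod
    async def infer(self, prompt: str) -> str: ...


class LocalModelProvider(AsyncProvider):
    """Stand-in for a local-torch provider whose generate step is blocking."""

    def _generate_blocking(self, prompt: str) -> str:
        time.sleep(0.05)  # placeholder for a synchronous model.generate call
        return f"score:{len(prompt)}"

    async def infer(self, prompt: str) -> str:
        # The event loop stays free, but the work still occupies a worker thread.
        return await asyncio.to_thread(self._generate_blocking, prompt)


async def main() -> list[str]:
    provider = LocalModelProvider()
    # Two inferences in flight at once; gather returns results in call order.
    return await asyncio.gather(provider.infer("hello"), provider.infer("goodbye"))
```

An HTTP provider under the same contract would presumably await a real async client call inside infer instead of offloading to a thread, which is where the concurrency gain described under Motivation would come from.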

Cost / risk

  • Breaking API for every existing user of the library.
  • All tests, cookbooks, and AnyGuardrail.create callers grow await / asyncio.run boilerplate.
  • HF path becomes pseudo-async: the underlying model.generate call is synchronous, so we'd be hiding the cost rather than removing it.

Workaround today

Users who need async integration can wrap any guardrail call with asyncio.to_thread:

import asyncio
# Inside a coroutine: run the blocking validate() call on a worker thread.
result = await asyncio.to_thread(guardrail.validate, "prompt")
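
The same workaround extends to batches. The sketch below fans several prompts out with asyncio.gather; StubGuardrail and validate_many are hypothetical names used only for illustration, and the stub's sleep simulates a blocking HTTP round trip rather than calling a real provider.

```python
import asyncio
import time


class StubGuardrail:
    """Hypothetical stand-in for a sync guardrail backed by an HTTP provider."""

    def validate(self, prompt: str) -> dict:
        time.sleep(0.05)  # simulates a blocking HTTP round trip
        return {"prompt": prompt, "valid": "attack" not in prompt}


async def validate_many(guardrail, prompts: list[str]) -> list[dict]:
    # Each call runs on a worker thread; gather preserves input order.
    tasks = [asyncio.to_thread(guardrail.validate, p) for p in prompts]
    return await asyncio.gather(*tasks)


async def main() -> list[dict]:
    return await validate_many(StubGuardrail(), ["hello", "attack now"])
```

Concurrency here is capped by the worker count of the event loop's default thread pool, so this is a stopgap for moderate batch sizes rather than a replacement for native async I/O.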

Related

Pairs naturally with the gRPC client follow-up (#165) since gRPC's Python client is async-native; doing both at once avoids a second sync→async pass on the same provider files.
