Tracking @javiermtorres's comment on PR #160: #160 (comment)
Motivation
HTTP-based providers (EncoderfileProvider, LlamafileProvider, AzureContentSafety) would benefit from concurrent validate() calls — multiple prompts in flight against a single running provider. The current sync API forces callers to use asyncio.to_thread(guardrail.validate, ...) to integrate with async web frameworks, and serializes batch scoring against HTTP backends that could otherwise pipeline.
Scope (breaking)
- Provider.infer / pre_process / generate_chat become async def.
- Guardrail.validate and the ThreeStageGuardrail pipeline become async. Breaking public API change.
- HTTP providers move from urllib.request to httpx.AsyncClient.
- HuggingFaceProvider wraps model.generate in asyncio.to_thread to satisfy the async contract: a forced thread offload, and a half-measure for the local-torch path.
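A minimal sketch of what the thread-offload approach for the local-torch path might look like. `FakeModel` and `ThreadOffloadProvider` are hypothetical stand-ins, not the library's classes, and the `infer` signature is an assumption; only `asyncio.to_thread` is a real API here.

```python
import asyncio
import threading
import time


class FakeModel:
    """Hypothetical stand-in for a local model with a blocking generate()."""

    def generate(self, prompt: str) -> str:
        time.sleep(0.05)  # simulate blocking inference
        # Report which thread actually did the work.
        return f"{prompt}:{threading.current_thread().name}"


class ThreadOffloadProvider:
    """Hypothetical provider exposing an async infer() over a sync model."""

    def __init__(self, model: FakeModel) -> None:
        self.model = model

    async def infer(self, prompt: str) -> str:
        # The call still blocks a worker thread; only the event loop stays free.
        return await asyncio.to_thread(self.model.generate, prompt)


async def main() -> str:
    provider = ThreadOffloadProvider(FakeModel())
    return await provider.infer("prompt")


output = asyncio.run(main())
```

The work runs on a worker thread rather than the event-loop thread, which is why this satisfies the async contract without making inference itself any cheaper.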
Cost / risk
- Breaking API for every existing user of the library.
- All tests, cookbooks, and AnyGuardrail.create callers grow await / asyncio.run boilerplate.
- HF path becomes pseudo-async (the underlying model.generate call is sync; we'd be hiding the cost rather than removing it).
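To illustrate the boilerplate cost, here is a rough sketch of what a synchronous caller would need after the change. `AsyncGuardrail` and `check_sync` are hypothetical names used only for illustration; the real validate() signature and return type may differ.

```python
import asyncio


class AsyncGuardrail:
    """Hypothetical guardrail after the proposed change: validate() is a coroutine."""

    async def validate(self, text: str) -> bool:
        await asyncio.sleep(0)  # placeholder for awaiting a provider call
        return "forbidden" not in text


def check_sync(guardrail: AsyncGuardrail, text: str) -> bool:
    # Synchronous callers (scripts, plain test functions) must start an event loop.
    return asyncio.run(guardrail.validate(text))


guardrail = AsyncGuardrail()

# Forgetting `await` yields a coroutine object, not a result.
unawaited = guardrail.validate("hello")
is_coroutine = asyncio.iscoroutine(unawaited)
unawaited.close()  # avoid a "coroutine was never awaited" warning
```

Existing call sites that use the return value of validate() directly would silently receive a coroutine until they are updated, which is part of why the change is breaking.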
Workaround today
Users who need async integration can wrap any guardrail call with asyncio.to_thread from inside a coroutine:
import asyncio
result = await asyncio.to_thread(guardrail.validate, "prompt")
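The same workaround can be extended to batches. The sketch below assumes the guardrail's validate() is safe to call from several threads at once, which the issue does not confirm; `SlowGuardrail` and `validate_many` are hypothetical names.

```python
import asyncio
import time


class SlowGuardrail:
    """Hypothetical sync guardrail whose validate() blocks, e.g. on an HTTP call."""

    def validate(self, text: str) -> str:
        time.sleep(0.1)  # simulate a blocking request
        return text.upper()


async def validate_many(guardrail: SlowGuardrail, prompts: list[str]) -> list[str]:
    # Each call runs in a worker thread; gather returns results in input order.
    calls = [asyncio.to_thread(guardrail.validate, p) for p in prompts]
    return await asyncio.gather(*calls)


results = asyncio.run(validate_many(SlowGuardrail(), ["a", "b", "c"]))
```

This gives overlapping requests today at the cost of one thread per in-flight call, bounded by the default executor's worker limit.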
Related
Pairs naturally with the gRPC client follow-up (#165) since gRPC's Python client is async-native; doing both at once avoids a second sync→async pass on the same provider files.