
feat(llm): add catalog controls and speed billing #249

Open
brynary wants to merge 8 commits into main from phase-7-controls-validation

feat(llm): add catalog controls and speed billing#249
brynary wants to merge 8 commits into
mainfrom
phase-7-controls-validation

Conversation


@brynary brynary commented May 13, 2026

Summary

This PR advances the catalog-driven LLM work from #210 by making the resolved model catalog the source of truth for provider registration, request control validation, and billing identity. Runs now preserve canonical provider/model/speed identity through pricing and API responses instead of collapsing billing around provider API aliases or model IDs alone.

What Changed

  • Register LLM provider adapters from the resolved catalog, including custom OpenAI-compatible providers and their credential resolution paths.
  • Validate effective model request controls, including run-level defaults and node overrides, before dispatching LLM requests.
  • Add catalog-aware billing lookup that prices canonical ModelRef values, uses base model costs for standard speed, applies per-speed cost overrides, and returns an unknown estimate instead of silently billing zero for unsupported combinations.
  • Move Anthropic Opus fast-mode pricing into the built-in catalog for claude-opus-4-6 and claude-opus-4-7.
  • Thread the injected catalog and effective speed controls through workflow billing, including API-mode and CLI-mode handlers.
  • Update billing APIs, server aggregation, generated clients, and the web billing view to expose provider/model/speed billing identity and keep standard and fast usage in separate rows.
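
The catalog-aware billing lookup described above can be sketched roughly as follows. All type and function names here (`ModelRef`, `CatalogEntry`, `estimate_cost`, …) are illustrative stand-ins, not the actual fabro types — the point is the lookup order: per-speed override first, base cost only for standard speed, and an explicit unknown estimate (never a silent zero) for anything else:

```rust
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct ModelRef {
    provider: String,
    model: String,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Speed {
    Standard,
    Fast,
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct Cost {
    input_per_mtok: f64,
    output_per_mtok: f64,
}

struct CatalogEntry {
    base_cost: Option<Cost>,
    speed_overrides: std::collections::HashMap<Speed, Cost>,
}

#[derive(Debug, PartialEq)]
enum CostEstimate {
    Known(f64),
    /// Token counts are still preserved upstream; only the dollar figure is absent.
    Unknown,
}

fn estimate_cost(
    catalog: &std::collections::HashMap<ModelRef, CatalogEntry>,
    model: &ModelRef,
    speed: Speed,
    input_tokens: u64,
    output_tokens: u64,
) -> CostEstimate {
    // Unknown model: report an unknown estimate rather than silently billing zero.
    let Some(entry) = catalog.get(model) else {
        return CostEstimate::Unknown;
    };
    // Per-speed override wins; standard speed falls back to the base model cost.
    let cost = match entry.speed_overrides.get(&speed) {
        Some(c) => *c,
        None if speed == Speed::Standard => match entry.base_cost {
            Some(c) => c,
            None => return CostEstimate::Unknown,
        },
        // e.g. fast mode requested for a model with no fast pricing.
        None => return CostEstimate::Unknown,
    };
    let dollars = (input_tokens as f64 / 1e6) * cost.input_per_mtok
        + (output_tokens as f64 / 1e6) * cost.output_per_mtok;
    CostEstimate::Known(dollars)
}
```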

Notes for Review

Billing lookup intentionally uses canonical catalog model IDs. Provider api_id substitution remains limited to provider request construction, so aliases can be used on the wire without changing billing identity. Event conversion paths that do not have catalog access now preserve token counts with a null dollar estimate rather than falling back to the bootstrap catalog.
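
A minimal illustration of that alias boundary — the provider `api_id` is consulted only when constructing the outbound request, while billing always reads the canonical catalog ID. Field and function names here are hypothetical, not the actual fabro API:

```rust
struct CatalogModel {
    canonical_id: String,
    /// Optional provider-side alias, used only on the wire.
    api_id: Option<String>,
}

/// Model name placed in the provider HTTP request body.
fn wire_model_id(model: &CatalogModel) -> &str {
    model.api_id.as_deref().unwrap_or(model.canonical_id.as_str())
}

/// Identity used for pricing and usage rows: always canonical.
fn billing_model_id(model: &CatalogModel) -> &str {
    model.canonical_id.as_str()
}
```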

Verification

  • cargo build -p fabro-api
  • cd lib/packages/fabro-api-client && bun run generate
  • cargo +nightly-2026-04-14 fmt --check --all
  • cargo +nightly-2026-04-14 clippy --workspace --all-targets -- -D warnings
  • ulimit -n 4096 && cargo nextest run -p fabro-model -p fabro-workflow -p fabro-server -p fabro-api -p fabro-cli --no-fail-fast
  • ulimit -n 4096 && cargo nextest run --workspace --no-fail-fast
  • cd apps/fabro-web && bun run typecheck
  • cd apps/fabro-web && bun test
  • git diff --check

Compound Engineering
🤖 Generated with GPT-5 via Codex

brynary added 3 commits May 12, 2026 19:58
Resolve LLM credentials and adapter registration through the runtime catalog so settings-defined providers can be used for requests. This keeps built-in behavior default-equivalent while supporting provider IDs, aliases, extra headers, header-only auth, base URLs, and provider API model IDs at the adapter boundary.
Propagate run-level model controls into workflow LLM requests, type speed at the request boundary, and reject unsupported speed or reasoning controls before provider dispatch.
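
A rough sketch of that pre-dispatch validation: run-level defaults are merged under node overrides, then the effective controls are rejected if the model's feature set does not support them. All names are illustrative:

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum Speed { Standard, Fast }

#[derive(Clone, Copy, PartialEq, Debug)]
enum Effort { Low, Medium, High }

#[derive(Default, Clone, Copy)]
struct Controls {
    speed: Option<Speed>,
    effort: Option<Effort>,
}

struct ModelFeatures {
    supported_speeds: Vec<Speed>,
    supports_reasoning_effort: bool,
}

/// Node overrides win over run-level defaults, field by field.
fn effective(run_default: Controls, node_override: Controls) -> Controls {
    Controls {
        speed: node_override.speed.or(run_default.speed),
        effort: node_override.effort.or(run_default.effort),
    }
}

/// Reject unsupported controls before any provider dispatch.
fn validate(controls: Controls, features: &ModelFeatures) -> Result<(), String> {
    if let Some(speed) = controls.speed {
        if !features.supported_speeds.contains(&speed) {
            return Err(format!("unsupported speed: {speed:?}"));
        }
    }
    if controls.effort.is_some() && !features.supports_reasoning_effort {
        return Err("model does not support reasoning effort".to_string());
    }
    Ok(())
}
```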

@claude (Bot) left a comment


Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.

Tip: disable this comment in your organization's Code Review settings.

brynary and others added 5 commits May 13, 2026 08:25
Add explicit model feature metadata for reasoning effort levels and prompt caching, while preserving the legacy effort flag as a compatibility alias. Gate Anthropic prompt-cache request encoding on the catalog feature and keep request serialization details in the adapter.
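
The feature gate might look roughly like this sketch — the cache marker is emitted only when the catalog flags prompt caching for the model, keeping the serialization decision inside the adapter. The surrounding types are hypothetical; only the `cache_control` field name mirrors Anthropic's request format:

```rust
struct ModelFeatures {
    prompt_caching: bool,
}

#[derive(Debug, PartialEq)]
struct ContentBlock {
    text: String,
    /// Serialized as Anthropic's `cache_control: {"type": "ephemeral"}`
    /// when present; omitted from the request body otherwise.
    cache_control: Option<&'static str>,
}

fn encode_system_block(text: &str, features: &ModelFeatures) -> ContentBlock {
    ContentBlock {
        text: text.to_string(),
        // Only emit the cache marker when the catalog says the model supports it.
        cache_control: features.prompt_caching.then_some("ephemeral"),
    }
}
```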
Other tests in the workspace (e.g. secret_list_json_returns_metadata_only)
write ANTHROPIC_API_KEY to the shared session daemon's vault. When that
test runs before bulk_skip_exits_zero_and_prints_summary, the shared
server resolves Anthropic as configured via vault env-lookup fallback,
attempts to call the real Anthropic API with the leaked test value, and
fails with a 401 "invalid x-api-key" instead of returning a skip.

Spawn an isolated server with a fresh vault so the bulk test's "no
credentials configured" precondition holds regardless of which other
tests share the nextest session.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Thread the resolved server LLM catalog through workflow validation,
model resolution, credential lookup, request construction, and worker
startup so request-serving paths no longer depend on the builtin catalog.
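
In sketch form, that threading amounts to sharing one resolved catalog handle across the request-serving components instead of each reaching for a builtin static. Names here are illustrative only:

```rust
use std::sync::Arc;

/// Resolved at startup from built-in plus settings-defined providers.
struct Catalog {
    model_ids: Vec<String>,
}

struct Validator {
    catalog: Arc<Catalog>,
}

#[allow(dead_code)]
struct CredentialResolver {
    catalog: Arc<Catalog>,
}

impl Validator {
    fn knows_model(&self, id: &str) -> bool {
        self.catalog.model_ids.iter().any(|m| m == id)
    }
}

/// Construct every consumer from the same injected catalog handle.
fn wire_up(catalog: Catalog) -> (Validator, CredentialResolver) {
    let shared = Arc::new(catalog);
    (
        Validator { catalog: Arc::clone(&shared) },
        CredentialResolver { catalog: shared },
    )
}
```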