Skip to content

fix(fireworks): honor max_retries#36973

Merged
Mason Daugherty (mdrxy) merged 4 commits into
masterfrom
mdrxy/fireworks-retries
Apr 23, 2026
Merged

fix(fireworks): honor max_retries#36973
Mason Daugherty (mdrxy) merged 4 commits into
masterfrom
mdrxy/fireworks-retries

Conversation

@mdrxy
Copy link
Copy Markdown
Member

ChatFireworks.max_retries silently did nothing. The old code assigned the value to a ChatCompletionV2 sub-object rather than the base client, and the pinned Fireworks SDK (0.13.0–0.19.20) never honors its own _max_retries attribute on the base client either. Since the Stainless-generated 1.x SDK that does implement retries is still pre-release (1.0.1a63 at time of writing), retry responsibility is ported to the LangChain side until the pin can be bumped.

@github-actions github-actions Bot added fireworks `langchain-fireworks` package issues & PRs fix For PRs that implement a fix integration PR made that is related to a provider partner package integration internal size: L 500-999 LOC labels Apr 23, 2026
@mdrxy
Copy link
Copy Markdown
Member Author

Mason Daugherty (mdrxy) commented Apr 23, 2026

Migration path when fireworks-ai 1.x goes GA

Finding (after inspecting fireworks-ai==1.0.1a63): the 1.x SDK ships an explicit LangChain-aware compat layer. We almost certainly do not need to minver-bump fireworks-ai when 1.x GAs.

What 1.x already shims for us:

  • fireworks.client.Fireworks / AsyncFireworks are re-exported (with a DeprecationWarning) from the top-level fireworks package, so our existing from fireworks.client import ... keeps working.
  • acreate() is aliased on async resources via fireworks/lib/_legacy_compat.py. The shim wraps the coroutine in an _AsyncCreateProxy that supports both await client.acreate(...) and async for chunk in client.acreate(stream=True, ...) — the comment in that file literally cites "the langchain pattern". Streaming surface survives.
  • Fireworks(...).chat.completions.create(...) is the canonical 1.x call and matches our usage exactly.

What 1.x does not shim:

  • fireworks.client.error.* is gone. Error class names also changed: BadGatewayError / ServiceUnavailableErrorAPIStatusError (checked via .status_code); InvalidRequestErrorBadRequestError / UnprocessableEntityError; PermissionErrorPermissionDeniedError. RateLimitError, APITimeoutError, InternalServerError, FireworksError, AuthenticationError are all preserved by name but moved to top-level fireworks.

Proposed migration plan (no minver bump needed):

  1. Widen upper pin in libs/partners/fireworks/pyproject.toml to fireworks-ai>=0.13.0,<2.0.0. No langchain-fireworks major bump required.
  2. Add a try/except ImportError block for error classes — prefer the 1.x top-level imports, fall back to 0.x fireworks.client.error paths. Set a _FIREWORKS_V1 flag off the result.
  3. Build _RETRYABLE_ERRORS dynamically from whatever imported. On 1.x, add APIStatusError + a predicate for status_code >= 500 (same marker pattern already used here for raw httpx.HTTPStatusError).
  4. On 1.x, pass max_retries into client_params and short-circuit the LC-side retry wrapper — the SDK handles backoff natively (including retry-after, which our tenacity config does not). On 0.x, keep the current wrapper.
  5. Keep the streaming / acreate call sites as-is; the 1.x compat layer absorbs them.

Unaffected: embeddings.py uses the openai SDK against Fireworks' base URL; llms.py uses raw requests / aiohttp. Neither touches fireworks-ai directly.

Net effect: when 1.x GAs, a single small PR widens the pin and adds the import-layer conditional. 0.x users stay on 0.x; 1.x users get native SDK retries. No breaking change for anyone downstream.

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Apr 23, 2026

Merging this PR will not alter performance

✅ 2 untouched benchmarks
⏩ 13 skipped benchmarks1


Comparing mdrxy/fireworks-retries (d47c1f7) with master (3f382a9)2

Open in CodSpeed

Footnotes

  1. 13 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

  2. No successful run was found on master (d30ef8a) during the generation of this report, so 3f382a9 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

…ries

# Conflicts:
#	libs/partners/fireworks/tests/unit_tests/test_chat_models.py
@mdrxy Mason Daugherty (mdrxy) merged commit 7b09eb7 into master Apr 23, 2026
83 of 84 checks passed
@mdrxy Mason Daugherty (mdrxy) deleted the mdrxy/fireworks-retries branch April 23, 2026 20:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

fireworks `langchain-fireworks` package issues & PRs fix For PRs that implement a fix integration PR made that is related to a provider partner package integration internal size: L 500-999 LOC

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant