fix(core): improve JSON get_format_instructions using Opik Agent Optimizer#33718

Merged
ccurme (ccurme) merged 1 commit into
langchain-ai:master from
vincentkoc:patch-1
Oct 29, 2025

Conversation

@vincentkoc Vincent Koc (vincentkoc) commented Oct 29, 2025

Description: It is well known that the JSON-based get_format_instructions() prompt has a high failure rate on some models. We tested the prompt systematically on JSONSchemaBench using Opik Agent Optimizer (an open-source optimizer toolchain), running several optimizers to improve the underlying prompt with an evaluation-based approach.

  • The full analysis, test environment, and results can be found here.
  • We achieved a score of 0.97 (+708%) vs. the baseline of 0.12 (original prompt) on gpt-4.1.
  • This makes schema adherence more robust, and the approach can be re-applied to other formats such as YAML and XML.
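
For readers who want a feel for what an evaluation-based score like the one above measures, here is a minimal sketch using only the standard library. It is not the metric used by JSONSchemaBench or Opik Agent Optimizer: the `adheres` check (valid JSON object containing a set of required keys), the helper names, and the sample outputs are all assumptions made up for illustration. The last helper just reproduces the relative-improvement arithmetic behind the +708% figure.

```python
import json


def adheres(raw_output: str, required_keys: set[str]) -> bool:
    """Return True if raw_output parses as a JSON object with every required key.

    Hypothetical stand-in for a real schema check; it does not validate types.
    """
    try:
        parsed = json.loads(raw_output)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict) and required_keys.issubset(parsed)


def adherence_score(outputs: list[str], required_keys: set[str]) -> float:
    """Fraction of outputs passing the adherence check (0.0 for an empty list)."""
    if not outputs:
        return 0.0
    return sum(adheres(o, required_keys) for o in outputs) / len(outputs)


def relative_improvement(baseline: float, improved: float) -> float:
    """Percentage change from baseline to improved."""
    return (improved - baseline) / baseline * 100.0


# Hypothetical model outputs, invented for this example only.
sample_outputs = [
    '{"name": "Ada", "age": 36}',
    'Here is the JSON: {"name": "Ada", "age": 36}',  # prose wrapper breaks parsing
    '{"name": "Ada"}',  # required key "age" is missing
    '{"name": "Bob", "age": 41}',
]
score = adherence_score(sample_outputs, {"name", "age"})

# Scores reported in this PR: baseline 0.12, optimized prompt 0.97.
gain = relative_improvement(0.12, 0.97)
print(f"adherence score: {score:.2f}, reported gain: {gain:.0f}%")
```

A stricter check would validate each output against the full JSON Schema rather than only looking for top-level keys; the key-presence test is used here only to keep the sketch dependency-free.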

Results from Opik Optimizer: [image: Opik Agent Optimizer results using JSONSchemaBench]

Issue: No issues linked
Dependencies: No dependencies


  • Lint and test: Run make format, make lint and make test from the root of the package(s) you've modified.

@github-actions github-actions Bot added core `langchain-core` package issues & PRs fix For PRs that implement a fix and removed fix For PRs that implement a fix labels Oct 29, 2025

codspeed-hq Bot commented Oct 29, 2025

CodSpeed Performance Report

Merging #33718 will not alter performance

Comparing vincentkoc:patch-1 (e5cbfc9) with master (a2a9a02) [1]

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

Summary

✅ 13 untouched
⏩ 21 skipped [2]

Footnotes

  1. No successful run was found on master (b5e23e5) during the generation of this report, so a2a9a02 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

  2. 21 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@ccurme ccurme (ccurme) merged commit 78a2f86 into langchain-ai:master Oct 29, 2025
150 of 152 checks passed

Labels

core `langchain-core` package issues & PRs fix For PRs that implement a fix

3 participants