cookbook: data_labeling on Gemini + quality review as Workflow #8024

Merged: ashpreetbedi merged 4 commits into main from cookbook/quality-review-workflow on May 20, 2026
ashpreetbedi commented on May 20, 2026

Contributor

Summary

Three commits to cookbook/data_labeling/ and adjacent files, in service of the upcoming "Agents for data labeling" blog post.

1. Migrate all 18 data_labeling cookbooks to Gemini

Across every folder (_01 through _18), switch the default model from OpenAIResponses(gpt-5.5) to Gemini(gemini-3.5-flash). The rationale:

  • One model, one API key for the whole cookbook. Readers can run all 18 workflows with GOOGLE_API_KEY alone (the lone exception is _18_quality_review/, which uses Claude for one of its two labelers).
  • Native multimodality. Folders _06 through _14 work with audio, video, images, and PDFs without rearchitecting the agent, because Gemini 3.5 Flash handles every modality in the same call shape.
  • Flash pricing for frontier-grade reasoning on the routine labeling tier, which is the whole pitch of the cookbook.

READMEs and TEST_LOGs across all 18 folders updated to match (OPENAI_API_KEY → GOOGLE_API_KEY).
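
As a rough illustration of the "one API key" point, a reader could run a pre-flight check like the one below before starting a folder. This is a sketch, not part of the PR: `missing_keys` and `EXTRA_KEYS` are hypothetical names, and `ANTHROPIC_API_KEY` is an assumed variable name for the Claude-backed labeler (the PR itself names only `GOOGLE_API_KEY`).

```python
import os
from typing import List, Mapping

# Assumption: the Claude-backed labeler in _18_quality_review reads
# ANTHROPIC_API_KEY. The PR description only names GOOGLE_API_KEY.
EXTRA_KEYS = {"_18_quality_review": ["ANTHROPIC_API_KEY"]}


def missing_keys(folder: str, env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the API keys a folder needs that are absent or empty in env."""
    required = ["GOOGLE_API_KEY"] + EXTRA_KEYS.get(folder, [])
    return [key for key in required if not env.get(key)]
```

Calling `missing_keys("_18_quality_review")` with only `GOOGLE_API_KEY` exported would report the one extra key that folder needs; every other folder would report nothing missing.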

2. Rewrite _18_quality_review/basic.py as a Workflow

Lifts the labeler → reviewer → adjudicator pipeline off procedural Python and onto the agno Workflow API:

  • Real concurrency between the two labelers via Parallel(label_a, label_b). The old version ran them sequentially, doubling latency.
  • Conditional adjudication via Condition(evaluator=has_disagreement, steps=[adjudicate]) — workflow structure instead of an if statement.
  • Persistent traces via SqliteDb(db_file="tmp/labeling.db") — every run captured for auditability, which matters for a labeling pipeline.
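
The control flow these three bullets describe can be sketched with the standard library alone. This is not the agno Workflow API: `label_a`, `label_b`, `adjudicate`, `run_pipeline`, and the `runs` table are hypothetical stand-ins, with a thread pool playing the role of `Parallel`, an ordinary predicate playing the role of the `Condition` evaluator, and `sqlite3` playing the role of `SqliteDb`.

```python
import json
import sqlite3
from concurrent.futures import ThreadPoolExecutor


# Hypothetical labelers; in the cookbook these would be model-backed agents.
def label_a(text: str) -> str:
    return "positive" if "good" in text else "negative"


def label_b(text: str) -> str:
    return "positive" if "good" in text and "not" not in text else "negative"


def has_disagreement(labels: dict) -> bool:
    # Evaluator: adjudicate only when the two labelers differ.
    return labels["Labeler A"] != labels["Labeler B"]


def adjudicate(text: str, labels: dict) -> str:
    # Placeholder tie-break; a real adjudicator would be another agent.
    return labels["Labeler B"]


def run_pipeline(text: str, conn: sqlite3.Connection) -> dict:
    # Run both labelers concurrently instead of one after the other.
    with ThreadPoolExecutor(max_workers=2) as pool:
        future_a = pool.submit(label_a, text)
        future_b = pool.submit(label_b, text)
        labels = {"Labeler A": future_a.result(), "Labeler B": future_b.result()}

    adjudicated = has_disagreement(labels)
    final = adjudicate(text, labels) if adjudicated else labels["Labeler A"]
    record = {"input": text, "labels": labels, "adjudicated": adjudicated, "final": final}

    # Persist every run so it can be audited later.
    conn.execute("CREATE TABLE IF NOT EXISTS runs (id INTEGER PRIMARY KEY, record TEXT)")
    conn.execute("INSERT INTO runs (record) VALUES (?)", (json.dumps(record),))
    conn.commit()
    return record
```

With `conn = sqlite3.connect(":memory:")`, calling `run_pipeline("not good", conn)` makes the two toy labelers disagree, so the adjudication branch runs and the full record lands in the `runs` table. The PR's version persists to a file (`tmp/labeling.db`) rather than an in-memory database.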

The reviewer and adjudicator use Step(executor=fn) and reach into the Parallel block with step_input.get_step_output("Labeler A"), the documented helper for recursively searching nested steps.
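
To make "recursively searching nested steps" concrete, here is a small plain-Python sketch of that kind of lookup. It illustrates the idea only and is not agno's implementation: the `StepResult` dataclass, the `find_step_output` function, and the `"Labelers"` block name are hypothetical, and the real `get_step_output` operates on agno's own step output objects.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StepResult:
    # Hypothetical stand-in for a step's recorded output.
    name: str
    content: Optional[str] = None
    children: List["StepResult"] = field(default_factory=list)


def find_step_output(results: List[StepResult], name: str) -> Optional[StepResult]:
    # Depth-first search: check each step, then descend into nested ones.
    for result in results:
        if result.name == name:
            return result
        found = find_step_output(result.children, name)
        if found is not None:
            return found
    return None


# Outputs as a later step might see them: one parallel block with two children.
previous = [
    StepResult(
        name="Labelers",  # hypothetical name for the Parallel block
        children=[
            StepResult(name="Labeler A", content="positive"),
            StepResult(name="Labeler B", content="negative"),
        ],
    )
]
```

Here `find_step_output(previous, "Labeler A")` finds the nested result even though it sits one level below the top-level list, which is the property the reviewer and adjudicator rely on when they look up a labeler by name.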

3. Ruff format drift in three unrelated files

./scripts/format.sh surfaced pre-existing format / import-order drift in three files outside the data_labeling cookbook (cookbook/90_models/...antigravity.py, cookbook/91_tools/...drive_all_drives_search.py, libs/agno/tests/unit/models/google/test_gemini.py). Pure formatting fixes, no semantic changes. Included here to keep format.sh clean on this branch.

Type of change

  • Bug fix
  • New feature
  • Breaking change
  • Improvement
  • Model update
  • Other:

Checklist

  • Code complies with style guidelines
  • Ran format/validation scripts (./scripts/format.sh and ./scripts/validate.sh; cookbook ruff + pattern checks pass; mypy crash in agno_infra is a pre-existing pydantic_settings issue unrelated to this PR)
  • Self-review completed
  • Documentation updated (file docstrings + per-folder READMEs)
  • Examples and guides: this IS the cookbook update
  • Tested in clean environment (workflow constructs correctly; quality_review end-to-end run verified by author with live API keys)
  • Tests added/updated (N/A — cookbook examples)

Duplicate and AI-Generated PR Check

  • I have searched existing open pull requests and confirmed that no other PR already addresses this issue
  • If a similar PR exists, I have explained below why this PR is a better approach
  • Check if this PR was entirely AI-generated (by Copilot, Claude Code, Cursor, etc.)

Additional Notes

Companion to the upcoming "Agents for data labeling" blog post — the post uses gemini-3.5-flash throughout and shows the Workflow version of the quality review pipeline as its wedge example for multi-agent quality control. This PR makes the cookbook match.

Stats: 80 files changed, +284/-231 across three commits.

Lifts the labeler -> reviewer -> adjudicator pipeline onto the Workflow
API so the two labelers run concurrently (Parallel), the adjudicator
only runs on disagreement (Condition), and every run is persisted to
SQLite for traceability.

The reviewer and adjudicator use Step(executor=...) to pull both
labelers' outputs out of the Parallel block via step_input.get_step_output(),
which recursively searches nested steps.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ashpreetbedi requested a review from a team as a code owner on May 20, 2026 at 17:26

Across all 18 data_labeling folders, switch the default model from
OpenAIResponses(gpt-5.5) to Gemini(gemini-3.5-flash) -- multimodal,
frontier-grade reasoning at flash prices, single GOOGLE_API_KEY for
the whole cookbook. READMEs and TEST_LOGs updated to match.

Also picks up a couple of ruff format/import-order fixes in
_17_llm_as_judge/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ashpreetbedi changed the title from "cookbook: rewrite _18_quality_review as an agno Workflow" to "cookbook: data_labeling on Gemini + quality review as Workflow" on May 20, 2026

ashpreetbedi and others added 2 commits on May 20, 2026 at 10:48

Pure formatting / import-order fixes surfaced by ./scripts/format.sh
while preparing this PR. No semantic changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ashpreetbedi merged commit a436f67 into main on May 20, 2026 (6 checks passed)
ashpreetbedi deleted the cookbook/quality-review-workflow branch on May 20, 2026 at 18:05
ysolanky mentioned this pull request on May 21, 2026