Conversation

@castrapel (Contributor)

Relevant issues

When using batch operations with Vertex AI, the batch rate limiter attempts to download input files (GCS URIs such as gs://bucket/file.jsonl) to count tokens. For large files (5GB+), this causes 503 timeout errors, memory issues, and unnecessary overhead.

This PR adds a configuration option, litellm.skip_batch_token_counting_providers, that allows users to disable token counting for batch operations on specific providers, e.g. skip_batch_token_counting_providers = ["vertex_ai"].
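
A minimal usage sketch, assuming the module-level setting behaves as described above (the setting name comes from this PR; everything else is illustrative):

```python
import litellm

# Assumption from this PR's description: a module-level list of provider
# names for which batch token counting should be skipped.
litellm.skip_batch_token_counting_providers = ["vertex_ai"]

# Any subsequent batch operation routed to Vertex AI would then skip the
# token-counting step, avoiding the download of large gs://... input files.
```

With this set, batch operations routed to Vertex AI bypass the token-counting download; all other providers keep the existing behavior.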

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added tests in the tests/litellm/ directory (adding at least 1 test is a hard requirement; see details)
  • My PR passes all unit tests via make test-unit
  • My PR's scope is as isolated as possible; it solves only 1 specific problem

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable, but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes

@vercel

vercel bot commented Dec 22, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project    Deployment    Review                      Updated (UTC)
litellm    Ready         Ready (Preview, Comment)    Dec 22, 2025 10:51pm
