feat(core): add multimodal support to count_tokens_approximately #34883

Merged
ccurme (ccurme) merged 2 commits into langchain-ai:master from jiaming2li:feature/multimodal-token-counting
Jan 26, 2026

@jiaming2li
Contributor

Summary

This PR adds multimodal support to count_tokens_approximately so that image content blocks are handled properly. Previously, the base64 payload of an image was counted character by character, so a single image could be estimated at ~25,000 tokens. Each image block now contributes a fixed penalty (85 tokens by default), which gives a much closer approximation.
Fixes #34873

Fixes the issue where trim_messages and other context management tools fail with multimodal content due to massive token overestimation.
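
For a sense of where an estimate like ~25,000 tokens can come from, here is a back-of-the-envelope sketch. It assumes a character-based estimate of 4 characters per token and a hypothetical 75,000-byte JPEG; both numbers are illustrative assumptions, not values taken from this PR, and the helper names are made up for the example.

```python
import base64
import math


def base64_length(num_bytes: int) -> int:
    """Length of the padded base64 encoding of `num_bytes` raw bytes."""
    return 4 * math.ceil(num_bytes / 3)


def char_based_tokens(num_chars: int, chars_per_token: float = 4.0) -> int:
    """Naive estimate: every `chars_per_token` characters count as one token."""
    return math.ceil(num_chars / chars_per_token)


# Hypothetical image size, chosen only for illustration.
image_bytes = 75_000

# base64 expands every 3 raw bytes into 4 ASCII characters.
encoded_chars = base64_length(image_bytes)

# Sanity check of the formula against the standard library encoder.
actual_chars = len(base64.b64encode(b"\x00" * image_bytes))

# Counting the encoded payload as if it were text.
naive_tokens = char_based_tokens(encoded_chars)

print(encoded_chars, actual_chars, naive_tokens)
```

Under these assumptions a modest image already consumes a budget of tens of thousands of tokens, which is enough to make a trimming step drop messages that would in fact fit.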

Changes

  • Added tokens_per_image parameter to count_tokens_approximately (default: 85)
  • Added logic to detect and handle list-based content blocks (text, image, image_url)
  • Applied fixed token penalty for image blocks instead of counting base64 characters
  • Maintained full backward compatibility with string content
  • Added 4 new test cases for multimodal scenarios
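
The changes listed above can be illustrated with a minimal, standalone sketch. This is not the code merged in this PR: `approx_content_tokens` and `IMAGE_BLOCK_TYPES` are hypothetical names, the 4-characters-per-token ratio is an assumption, and per-message overhead (role, extra tokens per message) is deliberately left out. It only shows the idea of charging a fixed penalty per image block instead of counting base64 characters.

```python
import math

# Block types treated as images in this sketch (taken from the PR's list).
IMAGE_BLOCK_TYPES = {"image", "image_url"}


def approx_content_tokens(content, *, chars_per_token=4.0, tokens_per_image=85):
    """Approximate tokens for message content (string or list of blocks).

    Hypothetical helper for illustration only.
    """
    # Plain string content keeps the original character-based estimate.
    if isinstance(content, str):
        return math.ceil(len(content) / chars_per_token)

    text_chars = 0
    image_tokens = 0
    for block in content:
        if isinstance(block, str):
            text_chars += len(block)
        elif isinstance(block, dict):
            block_type = block.get("type")
            if block_type in IMAGE_BLOCK_TYPES:
                # Fixed penalty: the base64 payload is never measured.
                image_tokens += tokens_per_image
            elif block_type == "text":
                text_chars += len(block.get("text", ""))
            else:
                # Unknown block types fall back to their repr length.
                text_chars += len(repr(block))

    return math.ceil(text_chars / chars_per_token) + image_tokens


content = [
    {"type": "text", "text": "What's in this image?"},
    {"type": "image_url",
     "image_url": {"url": "data:image/jpeg;base64," + "A" * 100_000}},
]
print(approx_content_tokens(content))
```

Because the image branch never reads the URL, the result is independent of how large the encoded payload is; raising `tokens_per_image` is the only way an image changes the estimate in this sketch.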

Testing

Ran the following commands from libs/core:

  • make format
  • make test ✅ (1635 passed, 3 skipped)

Note: make lint shows 1 error in scripts/check_version.py (line too long), but this is a pre-existing issue in a file not modified by this PR.

All 145 tests in test_utils.py pass, including 4 new multimodal tests.

Breaking Changes

None. This change is fully backward compatible.

Example

from langchain_core.messages import HumanMessage
from langchain_core.messages.utils import count_tokens_approximately

message = HumanMessage(content=[
    {"type": "text", "text": "What's in this image?"},
    {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}}
])

# Before: ~25,000 tokens (counting base64 chars)
# After: ~92 tokens (85 for image + text)
count = count_tokens_approximately([message])

- Add tokens_per_image parameter for fixed image token penalty
- Handle list-based content blocks (text, image, image_url)
- Prevent massive overestimation from base64-encoded images
- Maintain backward compatibility with string content
- Add comprehensive test coverage for multimodal scenarios

Fixes token counting for multimodal messages where base64 images
were previously counted as 25,000+ tokens instead of ~85 tokens.
@github-actions github-actions Bot added core `langchain-core` package issues & PRs feature For PRs that implement a new feature; NOT A FEATURE REQUEST external labels Jan 26, 2026

codspeed-hq Bot commented Jan 26, 2026

Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

✅ 13 untouched benchmarks
⏩ 21 skipped benchmarks [1]

Comparing jiaming2li:feature/multimodal-token-counting (c0b8b90) with master (aaba1b0) [2]

Footnotes

  1. 21 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

  2. No successful run was found on master (c930062) during the generation of this report, so aaba1b0 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

Collaborator

@ccurme ccurme (ccurme) left a comment

Nice, thank you!

@ccurme ccurme (ccurme) merged commit 585b691 into langchain-ai:master Jan 26, 2026
90 checks passed

Development

Successfully merging this pull request may close these issues.

Support approximate token counting for image blocks in langchain-core