feat: Add Automatic Text Truncation Support for Embedding Functions #2523

Merged
danielaskdd merged 5 commits into HKUDS:main from danielaskdd:embedding-max-token
Dec 22, 2025

Conversation

@danielaskdd
Collaborator

Add Automatic Text Truncation Support for Embedding Functions

Summary

This PR enhances the EmbeddingFunc wrapper to automatically inject the max_token_size parameter into underlying embedding functions that support it. This enables automatic text truncation for embedding operations, preventing API errors caused by texts that exceed model token limits.

Changes

Core Enhancement: lightrag/utils.py

  • Added import inspect for function signature introspection
  • Added automatic max_token_size injection logic in EmbeddingFunc.__call__:
    • Uses inspect.signature() to check if the underlying function supports max_token_size
    • Only injects when the parameter is supported (avoids TypeError for unsupported functions)
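
A minimal sketch of this injection pattern (the wrapper function below is illustrative; the actual EmbeddingFunc internals may differ):

```python
import inspect

async def call_with_optional_truncation(func, texts, max_token_size=None, **kwargs):
    """Call an embedding function, injecting max_token_size only if supported."""
    if max_token_size is not None:
        # Inspect the wrapped function's signature; injecting the keyword
        # blindly would raise TypeError for functions that do not accept it.
        if "max_token_size" in inspect.signature(func).parameters:
            kwargs["max_token_size"] = max_token_size
    return await func(texts, **kwargs)
```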

OpenAI Embedding: lightrag/llm/openai.py

  • Added import tiktoken for tokenization
  • Added _TIKTOKEN_ENCODING_CACHE module-level cache and _get_tiktoken_encoding_for_model() helper
  • Added max_token_size parameter to openai_embed() function
  • Implemented client-side text truncation using tiktoken (OpenAI API may return errors for over-limit texts)
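
A sketch of the client-side truncation path (the helper and cache names mirror the bullets above, but the bodies are illustrative assumptions):

```python
import tiktoken

# Module-level cache so the encoding is resolved once per model.
_TIKTOKEN_ENCODING_CACHE = {}

def _get_tiktoken_encoding_for_model(model):
    if model not in _TIKTOKEN_ENCODING_CACHE:
        try:
            _TIKTOKEN_ENCODING_CACHE[model] = tiktoken.encoding_for_model(model)
        except KeyError:
            # Unknown model names fall back to a default encoding.
            _TIKTOKEN_ENCODING_CACHE[model] = tiktoken.get_encoding("cl100k_base")
    return _TIKTOKEN_ENCODING_CACHE[model]

def _truncate_texts(texts, model, max_token_size):
    """Trim each text to at most max_token_size tokens before calling the API."""
    encoding = _get_tiktoken_encoding_for_model(model)
    truncated = []
    for text in texts:
        tokens = encoding.encode(text)
        if len(tokens) > max_token_size:
            text = encoding.decode(tokens[:max_token_size])
        truncated.append(text)
    return truncated
```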

Gemini Embedding: lightrag/llm/gemini.py

  • Added max_token_size parameter to gemini_embed() function
  • No client-side truncation - the Gemini API handles truncation automatically (autoTruncate=True by default)

Ollama Embedding: lightrag/llm/ollama.py

  • Added max_token_size parameter to ollama_embed() function
  • Added comprehensive docstring
  • No client-side truncation - the Ollama API handles truncation automatically based on the num_ctx setting
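
No truncation code is added for Ollama; as an illustration (an assumption about typical ollama-python usage, not code from this PR), the limit is driven server-side by num_ctx:

```python
import ollama

# The Ollama server truncates input that exceeds the context window set via
# num_ctx; nothing is trimmed on the client side.
response = ollama.embed(
    model="bge-m3",
    input=["a very long document ..."],
    options={"num_ctx": 8192},
)
embeddings = response["embeddings"]
```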

Minor Updates

  • lightrag/api/lightrag_server.py: Updated log message for clarity
  • lightrag/operate.py: Changed the summary token warning threshold from 90% to 100% and improved the warning message

Truncation Strategies by Provider

Provider  Truncation Strategy          Reason
OpenAI    Client-side (tiktoken)       API may return errors for over-limit texts
Gemini    Server-side (autoTruncate)   API automatically truncates
Ollama    Server-side (num_ctx)        API automatically truncates

How It Works

User calls: await embedding_func(texts)
    ↓
EmbeddingFunc.__call__:
    1. inspect.signature() checks if func supports max_token_size
    2. If supported → inject max_token_size from decorator
    ↓
embedding_func(texts, max_token_size=...):
    - openai_embed: Client-side truncation with tiktoken
    - gemini_embed / ollama_embed: Server-side automatic truncation
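
In practice, wiring this up might look roughly as follows (the EmbeddingFunc fields and openai_embed keywords reflect lightrag's existing API, but treat the exact signatures as assumptions):

```python
from lightrag.utils import EmbeddingFunc
from lightrag.llm.openai import openai_embed

# max_token_size set on the wrapper is forwarded to openai_embed because its
# signature accepts the parameter; a custom embedding function without that
# parameter would simply never receive it.
embedding_func = EmbeddingFunc(
    embedding_dim=1536,
    max_token_size=8192,
    func=lambda texts: openai_embed(texts, model="text-embedding-3-small"),
)

# embeddings = await embedding_func(texts)  # over-limit texts are truncated first
```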

Backward Compatibility

  • ✅ Fully backward compatible
  • Embedding functions without a max_token_size parameter continue to work (the signature check prevents injection)
  • No breaking changes to existing API

- Auto-inject max_token_size in wrapper
- Implement OpenAI client-side truncation
- Update Gemini/Ollama embed signatures
- Relax summary token warning threshold
- Update server startup logging
- Set cache env var before import
- Support raw encoding names
- Add cl100k_base to default list
- Improve cache path resolution
- Fix typo in log message
- Add missing closing parenthesis

@danielaskdd danielaskdd merged commit dca23e2 into HKUDS:main Dec 22, 2025
3 checks passed
@danielaskdd danielaskdd deleted the embedding-max-token branch December 22, 2025 17:39
