Fix: Add configurable model support for Jina embedding #2433

Merged
danielaskdd merged 4 commits into HKUDS:main from danielaskdd:fix-jina-embedding
Nov 28, 2025

Conversation

@danielaskdd
Collaborator

@danielaskdd danielaskdd commented Nov 28, 2025

Fix: Add configurable model support for Jina embedding

Summary

This PR adds the ability to configure the Jina embedding model via the EMBEDDING_MODEL environment variable. Previously, the Jina embedding model was hardcoded to jina-embeddings-v4, which prevented users from using other Jina models like jina-embeddings-v3.

Fix: #2431

Changes

lightrag/llm/jina.py

  • Added model parameter to jina_embed function with default value jina-embeddings-v4
  • Updated API request to use the configurable model name instead of hardcoded value
  • Updated docstring to document the new model parameter
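
The jina.py change above can be pictured with a minimal sketch. This is not the code from the PR: `build_jina_payload` is a hypothetical helper, and the payload field names (`model`, `input`, `dimensions`) are assumptions about the request body. Only the default model name comes from the description.

```python
from typing import Any, Dict, List, Optional

# Default model name taken from the PR description.
DEFAULT_JINA_MODEL = "jina-embeddings-v4"


def build_jina_payload(
    texts: List[str],
    model: str = DEFAULT_JINA_MODEL,
    dimensions: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble the JSON body for an embedding request.

    The model name is a parameter instead of a hardcoded string, so callers
    can select another model while the default stays unchanged.
    """
    # Field names here are assumptions, not confirmed by the PR text.
    payload: Dict[str, Any] = {"model": model, "input": texts}
    if dimensions is not None:
        payload["dimensions"] = dimensions
    return payload
```

Calling `build_jina_payload(["hello"])` keeps the v4 default, while `build_jina_payload(["hello"], model="jina-embeddings-v3")` selects v3 without touching the function body.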

lightrag/api/lightrag_server.py

  • Modified create_optimized_embedding_function to pass model parameter when calling Jina embedding

Configuration Example

EMBEDDING_BINDING=jina
EMBEDDING_MODEL=jina-embeddings-v3  # or jina-embeddings-v4 (default)
JINA_API_KEY=your-api-key
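
As a rough illustration of how these three variables might be read, here is a small sketch. `resolve_jina_settings` is a hypothetical name, not a LightRAG function, and the fallback to jina-embeddings-v4 reflects the default stated in this description.

```python
import os
from typing import Dict, Mapping


def resolve_jina_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Hypothetical helper mirroring the three variables shown above."""
    return {
        "binding": env.get("EMBEDDING_BINDING", ""),
        # Fall back to the v4 default when EMBEDDING_MODEL is not set.
        "model": env.get("EMBEDDING_MODEL", "jina-embeddings-v4"),
        "api_key": env.get("JINA_API_KEY", ""),
    }
```

Passing a plain dict instead of `os.environ` makes the lookup easy to exercise in a test without changing the process environment.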

Breaking Changes

None. Default behavior remains unchanged (jina-embeddings-v4).

Testing

  • Manual testing with EMBEDDING_MODEL=jina-embeddings-v3
  • Manual testing without EMBEDDING_MODEL (should use default v4)

- Add model parameter to jina_embed
- Pass model from API server
- Default to jina-embeddings-v4
- Update function documentation
- Make model selection flexible
@danielaskdd
Collaborator Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@danielaskdd
Collaborator Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

- Set EMBEDDING_MODEL default to None
- Pass model param only when provided
- Let providers use their own defaults
- Fix lollms embed function params
- Add ollama embed_model default param
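
The idea in this commit message (forward the model only when the user set one, so each provider keeps its own default) can be sketched as follows. `call_embed` and `fake_provider_embed` are hypothetical names used for illustration and are not functions from LightRAG.

```python
from typing import Any, Callable, List, Optional, Tuple


def call_embed(
    embed_func: Callable[..., Any],
    texts: List[str],
    model: Optional[str] = None,
    **extra: Any,
) -> Any:
    """Hypothetical wrapper: pass `model` through only when it was configured."""
    kwargs = dict(extra)
    if model:  # None or empty string means "not configured"
        kwargs["model"] = model
    return embed_func(texts, **kwargs)


def fake_provider_embed(
    texts: List[str], model: str = "provider-default"
) -> List[Tuple[str, str]]:
    """Stand-in for a provider function that has its own default model."""
    return [(model, text) for text in texts]
```

With no model configured, `call_embed(fake_provider_embed, ["a"])` leaves the provider default in place; with `model="jina-embeddings-v3"` the configured value is forwarded instead.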
@danielaskdd
Collaborator Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

@danielaskdd
Collaborator Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

# For embedding_dim: user config (always has value) takes priority
# Only use provider default if user config is explicitly None (which shouldn't happen)
final_embedding_dim = (
    args.embedding_dim if args.embedding_dim else provider_embedding_dim
)

[P1] Fix Jina model option using v4 dimension by default

When a user sets EMBEDDING_BINDING=jina with EMBEDDING_MODEL=jina-embeddings-v3 (as advertised in the commit message), args.embedding_dim now defaults to None, so final_embedding_dim remains the provider default 2048. The Jina binding forces send_dimensions=True, so the optimized wrapper injects 2048 into jina_embed, but the v3 API returns 1024-length vectors. EmbeddingFunc then raises a dimension-mismatch ValueError, making the new model selector unusable unless users manually override EMBEDDING_DIM. The dimension should be derived from the chosen model or validated so the configuration example works without runtime failure.
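
One way to read the suggestion above is a per-model dimension lookup with a validation step. The sketch below is illustrative only: `KNOWN_DIMS`, `resolve_embedding_dim`, and `check_dim` are hypothetical names, and the two dimensions are the values quoted in this review comment (1024 for v3, 2048 for v4).

```python
from typing import Dict, List, Optional

# Dimensions as quoted in the review comment above; treat as assumptions.
KNOWN_DIMS: Dict[str, int] = {
    "jina-embeddings-v3": 1024,
    "jina-embeddings-v4": 2048,
}


def resolve_embedding_dim(
    model: Optional[str],
    user_dim: Optional[int],
    provider_default_dim: int,
) -> int:
    """Hypothetical resolver: user setting, then model lookup, then provider default."""
    if user_dim:
        return user_dim
    if model in KNOWN_DIMS:
        return KNOWN_DIMS[model]
    return provider_default_dim


def check_dim(vector: List[float], expected_dim: int) -> None:
    """Raise a clear error when a returned vector has an unexpected length."""
    if len(vector) != expected_dim:
        raise ValueError(
            f"expected {expected_dim}-dimensional vector, got {len(vector)}"
        )
```

Under these assumptions, selecting jina-embeddings-v3 without an explicit EMBEDDING_DIM resolves to 1024 rather than the 2048 provider default, which is the mismatch the comment describes.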

@danielaskdd danielaskdd merged commit b670544 into HKUDS:main Nov 28, 2025
4 checks passed
@danielaskdd danielaskdd deleted the fix-jina-embedding branch December 1, 2025 05:21

Development

Successfully merging this pull request may close these issues.

[Bug]: Hardcoded Jina Embeddings Model Name Prevents Configured Model Usage

1 participant