Description
Do you need to file an issue?
- I have searched the existing issues and this bug is not already filed.
- I believe this is a legitimate bug, not just a question or feature request.
Describe the bug
I have been using Ollama with gpt-oss:20b for a while, but now I have to switch from Ollama to the llama.cpp server.
With Ollama the tool calls work as expected: I get the needed response in the LightRAG chat. With llama-server, the tool calls do not seem to work.
Now I am unsure where the issue lies. llama.cpp has a guide (ggml-org/llama.cpp#15396) on how to properly run gpt-oss models so that tools work. Some say the problem is partly in the Jinja/chat template format, which would put it outside LightRAG's scope. Others say that clients sending requests to the LLM also have to follow a specific format:
ggml-org/llama.cpp#15341
In the end, I need to use LightRAG with this model AND llama.cpp; Ollama is no longer an option. Please help: what else can I try?
See also my bug report in llama.cpp: ggml-org/llama.cpp#17410
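One way to narrow down whether llama-server or the LightRAG client is at fault is to send an OpenAI-style tool-call request directly to llama-server, bypassing LightRAG. A minimal sketch; the base URL, model name, and tool schema below are illustrative placeholders, not taken from LightRAG:

# Send a tool-call request straight to llama-server's OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://IP:port/v1", api_key="none")  # llama-server ignores the key

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, for testing only
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-oss:20b",  # use whatever model name llama-server reports
    messages=[{"role": "user", "content": "What is the weather in Berlin?"}],
    tools=tools,
)

msg = resp.choices[0].message
# If the server handles tool calling, msg.tool_calls is populated; if it is None
# and the call shows up as plain text in msg.content, the chat template handling
# on the server side (e.g. missing --jinja) is a more likely culprit than LightRAG.
print(msg.tool_calls)
print(msg.content)

If this direct request already fails, the problem is on the llama.cpp side; if it works, the request LightRAG sends is worth comparing against it.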
Steps to reproduce
- Run llama-server with the gpt-oss:20b model.
- Configure the LightRAG .env file to use the llama-server (roughly as in the sketch after this list):
  LLM type: openai
  base URL: http://IP:port/v1 pointing at the llama-server
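Roughly what the setup looks like; the model reference, port, and .env key names are illustrative, and the exact variable names may differ between LightRAG versions (see env.example in the repo):

# llama-server launch, following the llama.cpp gpt-oss guide linked above;
# --jinja enables the chat-template handling needed for tool calls
llama-server -hf ggml-org/gpt-oss-20b-GGUF --jinja --port 8080

# LightRAG .env excerpt (assumed key names, adjust to your version)
LLM_BINDING=openai
LLM_MODEL=gpt-oss:20b
LLM_BINDING_HOST=http://IP:8080/v1
LLM_BINDING_API_KEY=anything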
Expected Behavior
A chat response that answers using data retrieved from the RAG.
LightRAG Config Used
Paste your config here
Logs and screenshots
No response
Additional Information
- LightRAG Version: v1.4.9.8/0251
- Operating System: Ubuntu 24.04.3
- Python Version: 3.21
- Related Issues: