Releases: HKUDS/LightRAG
v1.4.9.1
What's Changed
- Feature(webui): Add KaTeX chemical formula rendering by supporting mhchem extension by @danielaskdd in #2154
- Feat(webui): Prevent LaTeX Parsing Errors Show-up During Streaming by @danielaskdd in #2155
- Fix dark mode graph labels for system theme and improve colors by @roman-marchuk in #2163
- web_ui: check node source and target by @zl7261 in #2156
Full Changelog: v1.4.9...v1.4.9.1
v1.4.9
Important Notes
v1.4.9 introduces key enhancements focused on refining the reference output format and incorporating structured references into query results. All query API endpoints now include a references field, enabling frontend applications to retrieve cited documents and their corresponding identifiers associated with LightRAG query results.
The context format sent to the LLM has been updated. The streaming response from the LLM now includes a references field (ignored by the frontend by default). If your application relies on context data returned by the LightRAG query API, you may need to update your code accordingly.
By leveraging the user_prompt parameter, users can instruct the LLM to generate responses with footnote annotations. The footnote numbers in the LLM output can be seamlessly mapped to the document IDs returned in the references field. This integration enables tighter alignment between LightRAG and your business system, empowering users to access original source materials directly.
user_prompt acts as an additional output instruction for the LLM. Two examples are provided below:
user_prompt for gpt-4.1-mini or Qwen3:
For inline citations, employ the footnote reference format `[^1]`, where the `^` following the opening square bracket denotes a superscript link. When multiple citations are required at a single location, enclose each reference ID within separate footnote markers (e.g., `[^1][^2][^3]`).
user_prompt for DeepSeek:
Use the Markdown footnote format `[^1]` for inline citation markers. When multiple citations appear at a single location, place each reference ID in its own set of brackets (e.g., `[^1][^2][^3]`). Annotate only the key facts and supporting information in the answer. Every citation ID that appears in a marker should be listed in the references section generated at the end, and no footnote section should be generated after the references section.
This screenshot illustrates the functionality of the user_prompt parameter in the WebUI.
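As a rough sketch of how a client application might combine the user_prompt instruction with the new references field, the snippet below posts a query to a locally running LightRAG server and prints the returned references. The host/port, the exact request fields, and the shape of each reference entry are assumptions based on this release's description rather than an authoritative API contract; check your deployment's API docs for the precise schema.

```python
# Minimal sketch (assumed endpoint, port, and field names): query LightRAG with a
# footnote-style user_prompt and read back the references returned with the answer.
import requests

USER_PROMPT = (
    "For inline citations, use the footnote format [^1]. "
    "When multiple citations are needed at one location, use separate markers, e.g. [^1][^2][^3]."
)

payload = {
    "query": "What are the key findings of the indexed reports?",
    "mode": "hybrid",
    "user_prompt": USER_PROMPT,  # extra output instruction for the LLM
}

resp = requests.post("http://localhost:9621/query", json=payload, timeout=180)
resp.raise_for_status()
data = resp.json()

# Answer text; it should contain [^1]-style footnotes if the LLM followed the prompt.
print(data.get("response", ""))
for ref in data.get("references", []):
    # Each entry is expected to map a citation id to a source document;
    # inspect the actual payload for the exact field names in your deployment.
    print(ref)
```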
What's New
- Refactor: Provide Citation Context to LLM and Improve Reference Section Generation Quality by @danielaskdd in #2140
- Feature: Add Reference List Support for All Query Endpoints by @danielaskdd in #2147
- Refactor(WebUI): Change Client-side Search Logic with Server-driven Entity Name Search by @danielaskdd in #2124
- Feature(webui): Force sending history messages in bypass mode by @danielaskdd in #2132
- Feature(webui): Add footnotes support to markdown rendering in chat messages by @danielaskdd in #2145
- Feature(webui): Add user prompt history dropdown to query settings by @danielaskdd in #2146
What's Fixed
- Add path traversal security validation for file deletion operations by @danielaskdd in #2113
- Fix WebUI: Enhance tooltip readability by fixing tooltip text wrapping of error messages by @danielaskdd in #2114
- Fix Retrieval Page Parameter Options: Enforce Mutual Exclusivity Between "Only Need Context" and "Only Need Prompt" by @Saravanakumar26 in #2118
- Refactor: Optimize Query Prompts and User Prompt Handling by @danielaskdd in #2127
- WebUI Bugfix and Improvement by @danielaskdd in #2129
- Fix: Restore browser autocomplete functionality in message input box by @danielaskdd in #2131
- feat: Implement Comprehensive Document Duplication Prevention System by @danielaskdd in #2135
- Refactor node type legend and color mapping by @danielaskdd in #2137
- Fix typo: "Oputput" -> output by @SeungAhSon in #2139
- Feature: Add Enhanced Markdown Support for WebUI by @danielaskdd in #2143
- Fix: Robust clipboard functionality with fallback strategies by @danielaskdd in #2144
- Optimize Footnote Marker Display in WebUI by @danielaskdd in #2151
- Fix double query problem by adding aquery_llm function for consistent response handling by @danielaskdd in #2152
- Web UI - center the loading icon and adjust GraphSearch width by @zl7261 in #2150
New Contributors
- @Saravanakumar26 made their first contribution in #2118
- @SeungAhSon made their first contribution in #2139
- @zl7261 made their first contribution in #2150
Full Changelog: v1.4.8.2...v1.4.9
v1.4.8.2
What's Changed
- Refactor: Add error handling with chunk ID prefixing in entity extraction by @danielaskdd in #2107
- fix: resolve dark mode text visibility issue in knowledge graph view by @roman-marchuk in #2106
- Fix: Resolve DocumentManager UI freezing and enhance error handling by @danielaskdd in #2109
New Contributors
- @roman-marchuk made their first contribution in #2106
Full Changelog: v1.4.8.1...v1.4.8.2
v1.4.8.1
Important Notes
- Introduced a raw data query API /query/data, enabling developers to retrieve the complete raw data recalled by LightRAG for fine-grained processing (a request sketch follows these notes).
- Optimized the system to efficiently handle hundreds of documents and hundreds of thousands of entities and relationships in one batch job, resolving UI lag and enhancing overall system stability.
- Dropped entities with short numeric names that negatively impact performance and query results: names containing only two digits, numeric names shorter than six characters, and names mixing digits and dots (e.g., 1.1, 12.3, 1.2.3).
- Significantly improved the quantity and quality of entity and relation extraction for smaller-parameter models, leading to a substantial improvement in query performance.
- Optimized the prompt engineering for Qwen3-30B-A3B-Instruct and gpt-oss-120b models, incorporating targeted fault tolerance for model outputs.
- Implemented max tokens and temperature configuration to prevent excessively long or endless output loops during the entity-relationship extraction phase of Large Language Model (LLM) responses.
# Increased temperature values may mitigate infinite inference loops in certain LLMs, such as Qwen3-30B.
OPENAI_LLM_TEMPERATURE=0.9
# For vLLM/SGLang deployed models, or most OpenAI-compatible API providers
OPENAI_LLM_MAX_TOKENS=9000
# For Ollama deployed models
OLLAMA_LLM_NUM_PREDICT=9000
# For OpenAI o1-mini or newer models
OPENAI_LLM_MAX_COMPLETION_TOKENS=9000
The purpose of setting a max tokens parameter is to truncate LLM output before timeouts occur, thereby preventing document extraction failures. This addresses cases where certain text blocks (e.g., tables or citations) containing numerous entities and relationships can lead to overly long or even endless loop outputs from LLMs. This setting is particularly crucial for locally deployed, smaller-parameter models. The max tokens value can be calculated with this formula:
LLM_TIMEOUT * llm_output_tokens_per_second (e.g., 9000 = 180 s * 50 tokens/s)
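For the raw data query API mentioned in the notes above, the following sketch shows one way a client might call /query/data and inspect what LightRAG recalled for a query. The host/port and response layout are assumptions for illustration; the endpoint returns whatever structured retrieval data your LightRAG version exposes (e.g., entities, relationships, chunks), so inspect the actual payload before building on it.

```python
# Minimal sketch (assumed port and response layout): fetch the raw data
# recalled by LightRAG via /query/data for fine-grained post-processing.
import requests

resp = requests.post(
    "http://localhost:9621/query/data",
    json={"query": "Summarize the supply-chain risks", "mode": "hybrid"},
    timeout=180,
)
resp.raise_for_status()
raw = resp.json()

# Print a rough overview of whatever sections the server returned.
for key, value in raw.items():
    count = len(value) if isinstance(value, (list, dict)) else 1
    print(f"{key}: {count} item(s)")
```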
What's New
- refact: Enhance KG Extraction with Improved Prompts and Parser Robustness by @danielaskdd in #2032
- feat: Limit Pipeline Status History Messages to Latest 1000 Entries by @danielaskdd in #2064
- feature: Enhance document status display with metadata tooltips and better icons by @danielaskdd in #2070
- refactor: Optimize Entity Extraction for Small Parameter LLMs with Enhanced Prompt Caching by @danielaskdd in #2076 #2072
- Feature: Add LLM COT Rendering support for WebUI by @danielaskdd in #2077
- feat: Add Deepseek Style CoT Support for OpenAI Compatible LLM Provider by @danielaskdd in #2086
- Add query_data function and /query/data API endpoint to LightRAG for retrieving structured responses by @tongda in #2036 #2100
What's Fixed
- Fix: Eliminate Lambda Closure Bug in Embedding Function Creation by @avchauzov in #2028
- refac: Eliminate Conditional Imports and Simplify Initialization by @danielaskdd in #2029
- Fix: Preserve Leading Spaces in Graph Label Selection by @danielaskdd in #2030
- Fix ENTITY_TYPES Environment Variable Handling by @danielaskdd in #2034
- refac: Enhanced Entity Relation Extraction Text Sanitization and Normalization by @danielaskdd in #2031
- Fix LLM output instability for <|> tuple delimiter by @danielaskdd in #2035
- Enhance KG Extraction for LLM with Small Parameters by @danielaskdd in #2051
- Add VDB error handling with retries for data consistency by @danielaskdd in #2055
- Fix incorrect variable name in NetworkXStorage file path by @danielaskdd in #2060
- refact: Smart Configuration Caching and Conditional Logging by @danielaskdd in #2068
- refactor: Improved Exception Handling with Context-Aware Error Messages by @danielaskdd in #2069
- fix env file example by @k-shlomi in #2075
- Increase default Gunicorn worker timeout from 210 to 300 seconds by @danielaskdd in #2078
- Fix assistant message display with content fallback by @danielaskdd in #2079
- Prompt Optimization: remove angle brackets from entity and relationship output formats by @danielaskdd in #2082
- Refactor PostgreSQL Graph Query by Native SQL and Standardized Parameter Passing by @Matt23-star in #2027
- Update env.example by @rolotumazi in #2091
- refactor: Optimize Prompt and Fault Tolerance for LLM with Smaller Param LLM by @danielaskdd in #2093
New Contributors
- @avchauzov made their first contribution in #2028
- @k-shlomi made their first contribution in #2075
- @rolotumazi made their first contribution in #2091
- @tongda made their first contribution in #2036
Full Changelog: v1.4.7...v1.4.8
v1.4.8rc9
What's Changed
- Prompt Optimization: remove angle brackets from entity and relationship output formats by @danielaskdd in #2082
- Refactor PostgreSQL Graph Query by Native SQL and Standardized Parameter Passing by @Matt23-star in #2027
- feat: Add Deepseek Style CoT Support for OpenAI Compatible LLM Provider by @danielaskdd in #2086
Full Changelog: v1.4.8rc8...v1.4.8rc9
v1.4.8rc8
What's New
- Optimize Entity Extraction for Small Parameter LLMs with Enhanced Prompt Caching by @danielaskdd in #2076 #2072
- Add LLM COT Rendering support for WebUI by @danielaskdd in #2077
- Enhance document status display with metadata tooltips by @danielaskdd in #2070
What's Fixed
- Limit Pipeline Status History Messages to Latest 1000 Entries by @danielaskdd in #2064
- Optimized Function Factories for LLM and Embedding functions by @danielaskdd in #2068
- Improved LLM Exception Handling with Context-Aware Error Messages by @danielaskdd in #2069
- fix env file example by @k-shlomi in #2075
Full Changelog: v1.4.8rc4...v1.4.8rc8
v1.4.8rc4
Important Notes
Refactored the prompt template and enhanced robust handling of malformed output for Knowledge Graph (KG) extraction with small-parameter Large Language Models (LLMs). This will invalidate all cached LLM outputs.
What's Changed
- Fix: Eliminate Lambda Closure Bug in Embedding Function Creation by @avchauzov in #2028
- refac: Eliminate Conditional Imports and Simplify Initialization by @danielaskdd in #2029
- Fix: Preserve Leading Spaces in Graph Label Selection by @danielaskdd in #2030
- Fix ENTITY_TYPES Environment Variable Handling by @danielaskdd in #2034
- refac: Enhanced Entity Relation Extraction Text Sanitization and Normalization by @danielaskdd in #2031
- refact: Enhance KG Extraction with Improved Prompts and Parser Robustness by @danielaskdd in #2032
- Fix LLM output instability for <|> tuple delimiter by @danielaskdd in #2035
- Enhance KG Extraction for LLM with Small Parameters by @danielaskdd in #2051
- Add VDB error handling with retries for data consistency by @danielaskdd in #2055
- Fix incorrect variable name in NetworkXStorage file path by @danielaskdd in #2060
New Contributors
- @avchauzov made their first contribution in #2028
Full Changelog: v1.4.7...v1.4.8rc2
v1.4.7
Important Notes
- The doc-id based chunk filtering feature is removed from PostgreSQL vector storage.
- Prompt template has been updated, invalidating all LLM caches
- The default value of the FORCE_LLM_SUMMARY_ON_MERGE environment variable has been changed from 4 to 8. This adjustment significantly reduces the number of LLM calls during the document indexing phase, thereby shortening overall document processing time.
- Added support for multiple rerank providers (Cohere AI, Jina AI, Aliyun Dashscope). If rerank was previously enabled, a new env var must be set to enable it again:
RERANK_BINDING=cohere
- Introduced a new environment variable, LLM_TIMEOUT, to specifically control the Large Language Model (LLM) timeout. The existing TIMEOUT variable now exclusively manages the Gunicorn worker timeout. The default LLM timeout is set to 180 seconds. If you previously relied on the TIMEOUT variable for LLM timeout configuration, please update your settings to use LLM_TIMEOUT instead:
LLM_TIMEOUT=180
- Added comprehensive environment variable settings for OpenAI and Ollama Large Language Model (LLM) bindings.
The generic TEMPERATURE environment variable for LLM temperature control has been deprecated. Instead, LLM temperature is now configured using binding-specific environment variables:
# Temperature setting for OpenAI binding
OPENAI_LLM_TEMPERATURE=0.8
# Temperature setting for Ollama binding
OLLAMA_LLM_TEMPERATURE=1.0
To mitigate endless output loops and prevent greedy decoding for Qwen3, set the temperature parameter to a value between 0.8 and 1.0. To disable the model's "Thinking" mode, please refer to the following configuration:
### Qwen3 Specific Parameters when deployed by vLLM
# OPENAI_LLM_EXTRA_BODY='{"chat_template_kwargs": {"enable_thinking": false}}'
### OpenRouter Specific Parameters
# OPENAI_LLM_EXTRA_BODY='{"reasoning": {"enabled": false}}'
For a full list of supported options, use the following commands:
lightrag-server --llm-binding openai --help
lightrag-server --llm-binding ollama --help
lightrag-server --embedding-binding ollama --help
- A full list of new env vars and changed default values:
# Timeout for LLM requests (seconds)
LLM_TIMEOUT=180
# Timeout for embedding requests (seconds)
EMBEDDING_TIMEOUT=30
### Number of summary segments or tokens to trigger LLM summary on entity/relation merge (at least 3 is recommended)
FORCE_LLM_SUMMARY_ON_MERGE=8
### Max description token size to trigger LLM summary
SUMMARY_MAX_TOKENS=1200
### Recommended LLM summary output length in tokens
SUMMARY_LENGTH_RECOMMENDED=600
### Maximum context size sent to LLM for description summary
SUMMARY_CONTEXT_SIZE=12000
### RERANK_BINDING type: null, cohere, jina, aliyun
RERANK_BINDING=null
### Enable rerank by default in query params when RERANK_BINDING is not null
# RERANK_BY_DEFAULT=True
### chunk selection strategies
### VECTOR: Pick KG chunks by vector similarity; the chunks delivered to the LLM align more closely with naive retrieval
### WEIGHT: Pick KG chunks by entity and chunk weight; delivers chunks more purely related to the KG to the LLM
### If reranking is enabled, the impact of chunk selection strategies will be diminished.
KG_CHUNK_PICK_METHOD=VECTOR
### Entity types that the LLM will attempt to recognize
ENTITY_TYPES=["person", "organization", "location", "event", "concept"]
What's New
- feat: Add OpenAI LLM Options Support by @danielaskdd in #1910
- feat: add tiktoken cache directory support for offline deployment by @danielaskdd in #1914
- Refact: Enhanced Neo4j Connection Lifecycle Management by @danielaskdd in #1924
- Add Vector Index Support for PostgreSQL KG by @Matt23-star in #1922
- Add AWS Bedrock LLM Binding Support #1733 by @sjjpo2002 in #1948
- Feat: Reprocessing of failed documents without the original file being present by @danielaskdd in #1954
- Feat: add KG related chunks selection by vector similarity by @danielaskdd in #1959
- Feat: Optimize error handling for document processing pipeline by @danielaskdd in #1965
- Add Chinese pinyin sorting support across document operations by @danielaskdd in #1967
- Refactor: Remove file_path and created_at from entity and relation query context sent to LLM by @danielaskdd in #1976
- Refactor Postgres Batch Queries for Scalability, Safety, and Vector Search Optimization by @Matt23-star in #1964
- feat: Support turning off thinking on OpenRouter/vLLM by @danielaskdd in #1987
- feat: Add Multiple Rerank Provider Support by @danielaskdd in #1993
- feat: Add clear error messages for uninitialized storage by @albertgilopez in #1978
- feat: Add diagnostic tool to check initialization status by @albertgilopez in #1979
- feat: Improve Empty Keyword Handling logic by @danielaskdd in #1996
- refac: Refactor LLM Summary Generation Algorithm by @danielaskdd in #2006
- Added entity_types as a user defined variable (via .env) by @thiborose in #1998
- refac: Enhanced Timeout Handling for LLM Priority Queue by @danielaskdd in #2024
- refac: Remove deprecated doc-id based filtering from vector storage queries by @danielaskdd in #2025
What's Fixed
- Fix ollama stop option handling and enhance temperature configuration by @danielaskdd in #1909
- Feat: Change embedding formats from float to base64 for efficiency by @danielaskdd in #1913
- Refact: Optimized LLM Cache Hash Key Generation by Including All Query Parameters by @danielaskdd in #1915
- Fix: Unify document chunks context format in only_need_context query by @danielaskdd in #1923
- Fix: Update OpenAI embedding handling for both list and base64 embeddings by @danielaskdd in #1928
- Fix: Initialize first_stage_tasks and entity_relation_task to prevent empty-task cancel errors by @danielaskdd in #1931
- Fix: Resolve workspace isolation issues across multiple storage implementations by @danielaskdd in #1941
- Fix: remove query params from cache key generation for keyword extraction by @danielaskdd in #1949
- Refac: uniformly protected with the get_data_init_lock for all storage initializations by @danielaskdd in #1951
- Fixes crash when processing files with UTF-8 encoding error by @danielaskdd in #1952
- Fix Document Selection Issues After Pagination Implementation by @danielaskdd in #1966
- Change the status from PROCESSING/FAILED to PENDING at the beginning of document processing pipeline by @danielaskdd in #1971
- Refac: Increase file_path field length to 32768 and add schema migration for Milvus DB by @danielaskdd in #1975
- Optimize keyword extraction prompt, and remove conversation history from keyword extraction by @danielaskdd in #1977
- Fix(UI): Implement XLSX format upload support for web UI by @danielaskdd in #1982
- Fix: resolved UTF-8 encoding error during document processing by @danielaskdd in #1983
- Fix: Preserve Document List Pagination During Pipeline Status Changes by @danielaskdd in #1992
- Update README-zh.md by @OnesoftQwQ in #1989
- Fix: Added import of OpenAILLMOptions when using azure_openai by @thiborose in #1999
- fix(webui): resolve document status grouping issue in DocumentManager by @danielaskdd in #2013
- fix mismatch of 'error' and 'error_msg' in MongoDB by @LinkinPony in #2009
- Fix UTF-8 Encoding Issues Causing Document Processing Failures by @danielaskdd in #2017
- docs(config): fix typo in .env comments by @SandmeyerX in #2021
- fix: adjust the EMBEDDING_BINDING_HOST for openai in the env.example by @pedrofs in #2026
New Contributors
- @Matt23-star made their first contribution in #1922
- @sjjpo2002 made their first contribution in #1948
- @OnesoftQwQ made their first contribution in #1989
- @albertgilopez made their first contribution in #1978
- @thiborose made their first contribution in #1999
- @LinkinPony made their first contribution in #2009
- @SandmeyerX made their first contribution in #202...
v1.4.6
What's New
- feat(performance): Optimize Document Deletion Performance with Entity/Relation Indexing by @danielaskdd in #1904
- refactor: improve JSON parsing reliability with json-repair library by @danielaskdd in #1897
What's Fixed
- Fix: resolved workspace isolation problem for json KV storage by @danielaskdd in #1899
- Fix: move OllamaServerInfos class to base module by @danielaskdd in #1893
- Add graceful shutdown handling for LightRAG server by @danielaskdd in #1895
Full Changelog: v1.4.5...v1.4.6
v1.4.5
What's New
- Feat(webui): add document list pagination for webui by @danielaskdd in #1886
- Feat: add Document Processing Track ID Support for Frontend by @danielaskdd in #1882
- Better prompt for entity description extraction to avoid hallucinations by @AkosLukacs in #1845
- Refine entity continuation prompt to avoid duplicates and reduce document processing time by @danielaskdd in #1868
- feat: Add rerank score filtering with configurable threshold by @danielaskdd in #1871
- Feat(webui): add query param reset buttons to webui by @danielaskdd in #1889
- Feat(webui): enhance status card with new settings from health endpoint by @danielaskdd in #1873
What's Fixed
- fix(webui): Correct edge renderer for sigma.js v3 by @danielaskdd in #1863
- Fix: file_path exceeds max length by @okxuewei in #1862
- refactor: unify file_path handling across merge and rebuild functions by @danielaskdd in #1869
- Fix: Improve keyword extraction prompt for robust JSON output. by @danielaskdd in #1872
- allow options to be passed to ollama client when using ollama binding [improved] by @michele-comitini in #1874
- refactor: Remove deprecated max_token_size from embedding configura… by @danielaskdd in #1875
- Fix: corrected unterminated f-string in config.py by @Ja1aia in #1876
- feat: add processing time tracking to document status with metadata field by @danielaskdd in #1883
- fix timeout issue by @Ja1aia in #1878
- fix: Add safe handling for missing file_path in PostgreSQL by @danielaskdd in #1891
- Remove content in doc status to improve performance by @danielaskdd in #1881
New Contributors
- @michele-comitini made their first contribution in #1874
- @Ja1aia made their first contribution in #1876
Full Changelog: v1.4.4...v1.4.5