Merged
20 commits
1ad0bf8
feat: add RAGAS evaluation framework for RAG quality assessment
anouar-bm Nov 1, 2025
aa916f2
docs: add generic test_dataset.json for evaluation examples
anouar-bm Nov 1, 2025
5cdb4b0
fix: Apply ruff formatting and rename test_dataset to sample_dataset
anouar-bm Nov 2, 2025
b12b693
fixed ruff format of csv path
anouar-bm Nov 2, 2025
026bca0
fix: Use actual retrieved contexts for RAGAS evaluation
anouar-bm Nov 2, 2025
0bbef98
Optimize RAGAS evaluation with parallel execution and chunk content e…
anouar-bm Nov 2, 2025
963ad4c
docs: Add documentation and examples for include_chunk_content parameter
anouar-bm Nov 2, 2025
98f0464
Update lightrag/evaluation/eval_rag_quality.py for language
anouar-bm Nov 2, 2025
0b5e3f9
Use logger in RAG evaluation and optimize reference content joins
anouar-bm Nov 2, 2025
77db080
Merge remote-tracking branch 'lightrag-fork/feat/ragas-evaluation' in…
anouar-bm Nov 2, 2025
363f305
eval using open ai
anouar-bm Nov 2, 2025
9d69e8d
fix(api): Change content field from string to list in query responses
anouar-bm Nov 3, 2025
c9e1c6c
fix(api): change content field to list in query responses
anouar-bm Nov 3, 2025
36694eb
fix(evaluation): Move import-time validation to runtime and improve d…
anouar-bm Nov 3, 2025
5da709b
Merge branch 'main' into feat/ragas-evaluation
anouar-bm Nov 3, 2025
a172cf8
feat(evaluation): Add sample documents for reproducible RAGAS testing
anouar-bm Nov 3, 2025
debfa0e
Merge branch 'feat/ragas-evaluation' of https://github.com/anouar-bm/…
anouar-bm Nov 3, 2025
36bffe2
chore: trigger CI re-run
anouar-bm Nov 3, 2025
2fdb5f5
chore: trigger CI re-run 2
anouar-bm Nov 3, 2025
ad2d3c2
Merge remote-tracking branch 'origin/main' into feat/ragas-evaluation
anouar-bm Nov 3, 2025
3 changes: 3 additions & 0 deletions .gitignore
@@ -50,6 +50,9 @@ output/
rag_storage/
data/

# Evaluation results
lightrag/evaluation/results/

# Miscellaneous
.DS_Store
TODO.md
53 changes: 53 additions & 0 deletions lightrag/api/README.md
@@ -463,6 +463,59 @@ The `/query` and `/query/stream` API endpoints include an `enable_rerank` parame
RERANK_BY_DEFAULT=False
```

### Include Chunk Content in References

By default, the `/query` and `/query/stream` endpoints return references with only `reference_id` and `file_path`. For evaluation, debugging, or citation purposes, you can request the actual retrieved chunk content to be included in references.

The `include_chunk_content` parameter (default: `false`) controls whether the actual text content of retrieved chunks is included in the response references. This is particularly useful for:

- **RAG Evaluation**: Testing systems like RAGAS that need access to retrieved contexts
- **Debugging**: Verifying what content was actually used to generate the answer
- **Citation Display**: Showing users the exact text passages that support the response
- **Transparency**: Providing full visibility into the RAG retrieval process

**Important**: The `content` field is an **array of strings**, where each string represents a chunk from the same file. A single file may correspond to multiple chunks, so the content is returned as a list to preserve chunk boundaries.

**Example API Request:**

```json
{
"query": "What is LightRAG?",
"mode": "mix",
"include_references": true,
"include_chunk_content": true
}
```

**Example Response (with chunk content):**

```json
{
"response": "LightRAG is a graph-based RAG system...",
"references": [
{
"reference_id": "1",
"file_path": "/documents/intro.md",
"content": [
"LightRAG is a retrieval-augmented generation system that combines knowledge graphs with vector similarity search...",
"The system uses a dual-indexing approach with both vector embeddings and graph structures for enhanced retrieval..."
]
},
{
"reference_id": "2",
"file_path": "/documents/features.md",
"content": [
"The system provides multiple query modes including local, global, hybrid, and mix modes..."
]
}
]
}
```

**Notes**:
- This parameter only works when `include_references=true`. Setting `include_chunk_content=true` without including references has no effect.
- **Breaking Change**: Prior versions returned `content` as a single concatenated string. Now it returns an array of strings to preserve individual chunk boundaries. If you need a single string, join the array elements with your preferred separator (e.g., `"\n\n".join(content)`).
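
For clients that consume this field, a minimal sketch is shown below. It assumes the `requests` library, a server listening on `http://localhost:9621` (adjust to your deployment), and no API-key protection; it is an illustration, not part of the LightRAG codebase.

```python
# Minimal client sketch: query with chunk content and flatten each reference's
# content array back into a single string.
# Assumptions: `requests` is installed, the LightRAG API is reachable at
# http://localhost:9621, and no API key is required (add a header if yours is).
import requests

payload = {
    "query": "What is LightRAG?",
    "mode": "mix",
    "include_references": True,
    "include_chunk_content": True,
}

resp = requests.post("http://localhost:9621/query", json=payload, timeout=120)
resp.raise_for_status()
data = resp.json()

print(data["response"])

for ref in data.get("references") or []:
    # `content` is a list of chunk strings (it may be absent); join as needed.
    chunks = ref.get("content") or []
    print(f"[{ref['reference_id']}] {ref['file_path']}")
    print("\n\n".join(chunks))
```

Keeping the chunks as separate list items until the last step preserves chunk boundaries, which evaluation frameworks typically expect as a list of contexts.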

### .env Examples

```bash
107 changes: 104 additions & 3 deletions lightrag/api/routers/query_routes.py
@@ -103,6 +103,11 @@ class QueryRequest(BaseModel):
description="If True, includes reference list in responses. Affects /query and /query/stream endpoints. /query/data always includes references.",
)

include_chunk_content: Optional[bool] = Field(
default=False,
description="If True, includes actual chunk text content in references. Only applies when include_references=True. Useful for evaluation and debugging.",
)

stream: Optional[bool] = Field(
default=True,
description="If True, enables streaming output for real-time responses. Only affects /query/stream endpoint.",
@@ -130,19 +135,33 @@ def conversation_history_role_check(
def to_query_params(self, is_stream: bool) -> "QueryParam":
"""Converts a QueryRequest instance into a QueryParam instance."""
# Use Pydantic's `.model_dump(exclude_none=True)` to remove None values automatically
request_data = self.model_dump(exclude_none=True, exclude={"query"})
# Exclude API-level parameters that don't belong in QueryParam
request_data = self.model_dump(
exclude_none=True, exclude={"query", "include_chunk_content"}
)

# Ensure `mode` and `stream` are set explicitly
param = QueryParam(**request_data)
param.stream = is_stream
return param


class ReferenceItem(BaseModel):
"""A single reference item in query responses."""

reference_id: str = Field(description="Unique reference identifier")
file_path: str = Field(description="Path to the source file")
content: Optional[List[str]] = Field(
default=None,
description="List of chunk contents from this file (only present when include_chunk_content=True)",
)


class QueryResponse(BaseModel):
response: str = Field(
description="The generated response",
)
references: Optional[List[Dict[str, str]]] = Field(
references: Optional[List[ReferenceItem]] = Field(
default=None,
description="Reference list (Disabled when include_references=False, /query/data always includes references.)",
)
@@ -200,6 +219,11 @@ def create_query_routes(rag, api_key: Optional[str] = None, top_k: int = 60):
"properties": {
"reference_id": {"type": "string"},
"file_path": {"type": "string"},
"content": {
"type": "array",
"items": {"type": "string"},
"description": "List of chunk contents from this file (only included when include_chunk_content=True)",
},
},
},
"description": "Reference list (only included when include_references=True)",
@@ -225,6 +249,30 @@ def create_query_routes(rag, api_key: Optional[str] = None, top_k: int = 60):
],
},
},
"with_chunk_content": {
"summary": "Response with chunk content",
"description": "Example response when include_references=True and include_chunk_content=True. Note: content is an array of chunks from the same file.",
"value": {
"response": "Artificial Intelligence (AI) is a branch of computer science that aims to create intelligent machines capable of performing tasks that typically require human intelligence, such as learning, reasoning, and problem-solving.",
"references": [
{
"reference_id": "1",
"file_path": "/documents/ai_overview.pdf",
"content": [
"Artificial Intelligence (AI) represents a transformative field in computer science focused on creating systems that can perform tasks requiring human-like intelligence. These tasks include learning from experience, understanding natural language, recognizing patterns, and making decisions.",
"AI systems can be categorized into narrow AI, which is designed for specific tasks, and general AI, which aims to match human cognitive abilities across a wide range of domains.",
],
},
{
"reference_id": "2",
"file_path": "/documents/machine_learning.txt",
"content": [
"Machine learning is a subset of AI that enables computers to learn and improve from experience without being explicitly programmed. It focuses on the development of algorithms that can access data and use it to learn for themselves."
],
},
],
},
},
"without_references": {
"summary": "Response without references",
"description": "Example response when include_references=False",
@@ -368,13 +416,37 @@ async def query_text(request: QueryRequest):

# Extract LLM response and references from unified result
llm_response = result.get("llm_response", {})
references = result.get("data", {}).get("references", [])
data = result.get("data", {})
references = data.get("references", [])

# Get the non-streaming response content
response_content = llm_response.get("content", "")
if not response_content:
response_content = "No relevant context found for the query."

# Enrich references with chunk content if requested
if request.include_references and request.include_chunk_content:
chunks = data.get("chunks", [])
# Create a mapping from reference_id to chunk content
ref_id_to_content = {}
for chunk in chunks:
ref_id = chunk.get("reference_id", "")
content = chunk.get("content", "")
if ref_id and content:
# Collect chunk content; join later to avoid quadratic string concatenation
ref_id_to_content.setdefault(ref_id, []).append(content)

# Add content to references
enriched_references = []
for ref in references:
ref_copy = ref.copy()
ref_id = ref.get("reference_id", "")
if ref_id in ref_id_to_content:
# Keep content as a list of chunks (one file may have multiple chunks)
ref_copy["content"] = ref_id_to_content[ref_id]
enriched_references.append(ref_copy)
references = enriched_references

# Return response with or without references based on request
if request.include_references:
return QueryResponse(response=response_content, references=references)
@@ -404,6 +476,11 @@ async def query_text(request: QueryRequest):
"description": "Multiple NDJSON lines when stream=True and include_references=True. First line contains references, subsequent lines contain response chunks.",
"value": '{"references": [{"reference_id": "1", "file_path": "/documents/ai_overview.pdf"}, {"reference_id": "2", "file_path": "/documents/ml_basics.txt"}]}\n{"response": "Artificial Intelligence (AI) is a branch of computer science"}\n{"response": " that aims to create intelligent machines capable of performing"}\n{"response": " tasks that typically require human intelligence, such as learning,"}\n{"response": " reasoning, and problem-solving."}',
},
"streaming_with_chunk_content": {
"summary": "Streaming mode with chunk content (stream=true, include_chunk_content=true)",
"description": "Multiple NDJSON lines when stream=True, include_references=True, and include_chunk_content=True. First line contains references with content arrays (one file may have multiple chunks), subsequent lines contain response chunks.",
"value": '{"references": [{"reference_id": "1", "file_path": "/documents/ai_overview.pdf", "content": ["Artificial Intelligence (AI) represents a transformative field...", "AI systems can be categorized into narrow AI and general AI..."]}, {"reference_id": "2", "file_path": "/documents/ml_basics.txt", "content": ["Machine learning is a subset of AI that enables computers to learn..."]}]}\n{"response": "Artificial Intelligence (AI) is a branch of computer science"}\n{"response": " that aims to create intelligent machines capable of performing"}\n{"response": " tasks that typically require human intelligence."}',
},
"streaming_without_references": {
"summary": "Streaming mode without references (stream=true)",
"description": "Multiple NDJSON lines when stream=True and include_references=False. Only response chunks are sent.",
@@ -600,6 +677,30 @@ async def stream_generator():
references = result.get("data", {}).get("references", [])
llm_response = result.get("llm_response", {})

# Enrich references with chunk content if requested
if request.include_references and request.include_chunk_content:
data = result.get("data", {})
chunks = data.get("chunks", [])
# Create a mapping from reference_id to chunk content
ref_id_to_content = {}
for chunk in chunks:
ref_id = chunk.get("reference_id", "")
content = chunk.get("content", "")
if ref_id and content:
# Collect chunk content
ref_id_to_content.setdefault(ref_id, []).append(content)

# Add content to references
enriched_references = []
for ref in references:
ref_copy = ref.copy()
ref_id = ref.get("reference_id", "")
if ref_id in ref_id_to_content:
# Keep content as a list of chunks (one file may have multiple chunks)
ref_copy["content"] = ref_id_to_content[ref_id]
enriched_references.append(ref_copy)
references = enriched_references

if llm_response.get("is_streaming"):
# Streaming mode: send references first, then stream response chunks
if request.include_references: