Fix: Add Comprehensive Retry Mechanism for Neo4j Storage Operations #2417

Merged: danielaskdd merged 3 commits into HKUDS:main from danielaskdd:neo4j-retry on Nov 24, 2025
@danielaskdd
Collaborator

🐛 Fix: Add Comprehensive Retry Mechanism for Neo4j Storage Operations

Problem

Users were experiencing AttributeError: 'NoneType' object has no attribute 'send' errors during document processing with the Neo4j storage backend. The error occurred in the merging stage, when the connection pool hit transient network issues or connection lifecycle events and an operation ended up calling .send() on an object that was already None.

Error Stack Trace:

File "c:\develop\myenvs\lightrag-env\Lib\site-packages\neo4j\_async_compat\network\_bolt.py", line 196, in sendall
  self._write.write(data)
AttributeError: 'NoneType' object has no attribute 'send'

Solution

Implemented comprehensive retry logic using the tenacity library for all critical Neo4j database operations. The retry mechanism handles transient failures gracefully by automatically retrying failed operations with exponential backoff.

Changes Made

Modified Files

  • lightrag/kg/neo4j_impl.py - Added retry decorators to 15 critical database operations

Functions Enhanced with Retry Logic

Read Operations (10 functions) - Retry on connection errors including AttributeError:

  1. has_node - Check node existence
  2. has_edge - Check edge existence
  3. get_node - Get single node
  4. get_nodes_batch - Batch get nodes
  5. node_degree - Get node degree
  6. node_degrees_batch - Batch get node degrees
  7. get_edge - Get single edge
  8. get_edges_batch - Batch get edges
  9. get_node_edges - Get all edges for a node
  10. get_nodes_edges_batch - Batch get node edges

Write Operations (5 functions) - Retry on write-specific errors:

  1. upsert_node - Insert/update node
  2. upsert_edge - Insert/update edge
  3. delete_node - Delete node
  4. remove_nodes - Batch delete nodes
  5. remove_edges - Batch delete edges

Retry Configuration

Read Operations:

  • Max attempts: 3
  • Wait strategy: Exponential backoff (4-10 seconds)
  • Retry on: ServiceUnavailable, TransientError, SessionExpired, ConnectionResetError, OSError, AttributeError

Write Operations:

  • Max attempts: 3
  • Wait strategy: Exponential backoff (4-10 seconds)
  • Retry on: All read exceptions + WriteServiceUnavailable, ClientError
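
The configuration above can be read as a small policy: which exceptions are retried, how many attempts are made, and how long to wait between them. The following is a dependency-free sketch of that policy, not the tenacity-based code in the PR. The exception classes are hypothetical stand-ins for the driver exceptions named above, the helper names are invented for illustration, and the backoff formula (2^(attempt-1) seconds, clamped to the 4-10 second window) is an assumption about how the window is applied.

```python
import time


# Hypothetical stand-ins for the driver exceptions listed above; real code
# would import the actual classes from neo4j.exceptions instead.
class ServiceUnavailable(Exception):
    """Stand-in for the driver's ServiceUnavailable."""


class TransientError(Exception):
    """Stand-in for the driver's TransientError."""


class SessionExpired(Exception):
    """Stand-in for the driver's SessionExpired."""


class WriteServiceUnavailable(Exception):
    """Stand-in for the driver's WriteServiceUnavailable."""


class ClientError(Exception):
    """Stand-in for the driver's ClientError."""


READ_EXCEPTIONS = (
    ServiceUnavailable,
    TransientError,
    SessionExpired,
    ConnectionResetError,
    OSError,
    AttributeError,
)
# Writes retry on everything reads do, plus the write-specific errors.
WRITE_EXCEPTIONS = READ_EXCEPTIONS + (WriteServiceUnavailable, ClientError)


def backoff_seconds(attempt, low=4.0, high=10.0):
    """Delay after failed attempt `attempt` (1-based), clamped to [low, high].

    The 2 ** (attempt - 1) growth is an assumed formula, not taken from the PR.
    """
    return max(low, min(2.0 ** (attempt - 1), high))


def call_with_retry(fn, retry_on, max_attempts=3, sleep=time.sleep):
    """Call fn(); retry only on exceptions in `retry_on`, up to max_attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # attempts exhausted: surface the last error
            sleep(backoff_seconds(attempt))
```

Under this assumed formula, both waits in a three-attempt run sit at the 4-second floor, so a call that succeeds on its third attempt is delayed by roughly 8 seconds. Exceptions outside the configured tuple are not retried and propagate on the first failure.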

Benefits

  • Improved Reliability - Automatically recovers from transient connection pool and network issues
  • Production Ready - Handles connection lifecycle problems gracefully
  • Zero Breaking Changes - Transparent retry logic with no API changes
  • Better User Experience - Document processing continues successfully despite temporary failures
  • Comprehensive Coverage - All critical database operations protected

Testing Recommendations

  • Verify document processing completes successfully with Neo4j backend
  • Test under network instability conditions
  • Confirm retry logs appear in debug mode when transient failures occur
  • Validate no performance degradation under normal operations

… storage

• Define READ_RETRY_EXCEPTIONS constant
• Create reusable READ_RETRY decorator
• Replace 11 duplicate retry decorators
• Improve code maintainability
• Add missing retry to edge_degrees_batch

@danielaskdd
Collaborator Author

@codex review

danielaskdd merged commit 2832a2c into HKUDS:main on Nov 24, 2025 (4 checks passed)
danielaskdd deleted the neo4j-retry branch on November 24, 2025 19:06