Fix: Robust error handling for async database operations in graph storage#2356
Fix: Robust error handling for async database operations in graph storage#2356danielaskdd merged 2 commits intoHKUDS:mainfrom
Conversation
- Initialize result vars to None - Add null checks before consume calls - Prevent crashes in except blocks - Apply fix to both Neo4J and Memgraph
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
lightrag/kg/memgraph_impl.py
Outdated
| workspace_label = self._get_workspace_label() | ||
| async with self._driver.session( | ||
| database=self._DATABASE, default_access_mode="READ" | ||
| ) as session: | ||
| result = None |
There was a problem hiding this comment.
Restore outer try/except in Memgraph popular label lookup
get_popular_labels() previously wrapped _get_workspace_label() and the session acquisition in a try/except that logged failures and returned an empty list. The new implementation performs those steps outside the try block, so a driver connection error or _get_workspace_label() failure now propagates as an unhandled exception and will bubble up to callers instead of returning [] as before. This is a behavioral regression and contradicts the commit’s intent of a purely defensive change.
Useful? React with 👍 / 👎.
lightrag/kg/memgraph_impl.py
Outdated
| workspace_label = self._get_workspace_label() | ||
| async with self._driver.session( | ||
| database=self._DATABASE, default_access_mode="READ" | ||
| ) as session: | ||
| result = None | ||
| try: |
There was a problem hiding this comment.
Maintain graceful failure in Memgraph label search
search_labels() likewise moved _get_workspace_label() and the session context creation outside its try/except. When Memgraph is unreachable or the helper raises, the method now bubbles the exception instead of logging and returning an empty list as it did before, causing callers to receive hard failures rather than a safe fallback.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
lightrag/kg/memgraph_impl.py
Outdated
| f"[{self.workspace}] Retrieved {len(labels)} popular labels (limit: {limit})" | ||
| ) | ||
| return labels | ||
| except Exception as e: | ||
| logger.error(f"[{self.workspace}] Error getting popular labels: {str(e)}") | ||
| return [] | ||
| except Exception as e: | ||
| logger.error( | ||
| f"[{self.workspace}] Error getting popular labels: {str(e)}" | ||
| ) | ||
| if result is not None: | ||
| await result.consume() | ||
| return [] |
There was a problem hiding this comment.
Preserve empty-list fallback when session acquisition fails
The previous version wrapped the entire method in a try/except and returned [] on any error. By moving the try block inside the async with context, failures raised while opening the Memgraph session (e.g. network outage or authentication error) now bypass the handler and bubble up to callers. That changes the public behaviour from “log and return an empty list” to propagating exceptions, which can crash code paths that previously tolerated a down graph store. Consider restoring the broader try/except so that session acquisition errors are still logged and handled consistently.
Useful? React with 👍 / 👎.
lightrag/kg/memgraph_impl.py
Outdated
| f"[{self.workspace}] Search query '{query}' returned {len(labels)} results (limit: {limit})" | ||
| ) | ||
| return labels | ||
| except Exception as e: | ||
| logger.error(f"[{self.workspace}] Error searching labels: {str(e)}") | ||
| return [] | ||
| except Exception as e: | ||
| logger.error(f"[{self.workspace}] Error searching labels: {str(e)}") | ||
| if result is not None: | ||
| await result.consume() |
There was a problem hiding this comment.
Search labels now throws on connection errors instead of returning []
Similar to get_popular_labels, the try/except is now nested inside the session block, so any exception while acquiring the Memgraph session (or resolving the workspace label) is no longer caught and the method raises instead of returning [] as it did before. Callers relying on the graceful fallback will now see unhandled exceptions when the database is unavailable. Wrapping the session creation in the same try/except would keep the previous contract.
Useful? React with 👍 / 👎.
|
@codex review |
|
Codex Review: Didn't find any major issues. Another round soon, please! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
🐛 Fix: Robust error handling for async database operations in Neo4j and Memgraph storage
Summary
This PR enhances error handling in graph storage implementations by adding defensive null checks before consuming async database results in exception handlers. This prevents secondary exceptions from masking the original error and improves overall system stability.
Problem
In several async methods across
neo4j_impl.pyandmemgraph_impl.py, the exception handlers attempted to consume database results without checking if theresultvariable was successfully initialized. If an exception occurred duringsession.run(), theresultvariable would be unbound, leading toUnboundLocalErrororNameErrorin the exception handler, which would mask the actual error.Example of problematic code:
Solution
Applied a consistent defensive programming pattern across all affected methods:
Changes
Modified Files
lightrag/kg/neo4j_impl.py- 4 methods fixedlightrag/kg/memgraph_impl.py- 7 methods fixedFixed Methods
Neo4j Implementation:
has_node()- Check node existencehas_edge()- Check edge existenceget_popular_labels()- Retrieve popular entity labelsget_node_edges()- Retrieve node relationshipsMemgraph Implementation:
has_node()- Check node existencehas_edge()- Check edge existenceget_all_labels()- Retrieve all labelsget_node_edges()- Retrieve node relationshipsget_edge()- Retrieve edge propertiesget_popular_labels()- Retrieve popular entity labelssearch_labels()- Search labels with fuzzy matchingTechnical Details
Pattern Applied:
result = Nonebefore the try blocksession.run()if result is not None:check in exception handler before consumingBenefits:
.clinerulesTesting
Impact