Only check for deadlocks in deadlock busting thread #3977
Signed-off-by: Alessandro Bellina <[email protected]>
c4501b3 to 9485a51
build
build
Greptile Summary
Confidence Score: 4/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant DT as Deadlock Thread
participant TSR as ThreadStateRegistry
participant SRA as SparkResourceAdaptor
participant Native as C++ spark_resource_adaptor
participant WT as Worker Thread
DT->>TSR: blockedThreadIds()
TSR->>TSR: iterate knownThreads
TSR->>TSR: check thread state for each
TSR-->>DT: return blocked thread IDs array
DT->>SRA: checkAndBreakDeadlocks()
SRA->>Native: checkAndBreakDeadlocks(handle, blockedThreadIds)
Native->>Native: check_and_break_deadlocks(java_blocked_thread_ids)
Native->>Native: check_and_update_for_bufn(lock, java_blocked_thread_ids)
Native->>Native: is_in_deadlock() with java thread state
alt Deadlock detected
Native->>Native: transition thread to BUFN_THROW or RUNNING
Native->>Native: notify blocked threads
end
WT->>Native: do_deallocate()
Native->>Native: dealloc_core()
Native->>Native: wake_next_highest_priority_blocked()
Note over Native: Only wakes THREAD_BLOCKED, not BUFN threads
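The flow in the diagram, where the deadlock thread hands the native side a set of Java-blocked thread IDs to consult during the check, can be sketched roughly as below. This is a simplified model under stated assumptions, not the code in this PR: the `thread_state` enum, `is_stuck`, and `all_threads_stuck` are hypothetical names, and the real state machine has more states and takes a lock.

```cpp
#include <optional>
#include <unordered_map>
#include <unordered_set>

// Hypothetical, simplified states; the real state machine has more of them.
enum class thread_state { RUNNING, BLOCKED, BUFN };

// A thread counts as stuck if the native state machine has it blocked, or if
// the caller reported it as blocked on the Java side. When no Java-side set is
// supplied (std::nullopt), only the native state is consulted.
inline bool is_stuck(long thread_id,
                     thread_state state,
                     std::optional<std::unordered_set<long>> const& java_blocked_thread_ids)
{
  if (state == thread_state::BLOCKED || state == thread_state::BUFN) { return true; }
  return java_blocked_thread_ids.has_value() &&
         java_blocked_thread_ids->count(thread_id) > 0;
}

// In this model a deadlock means: at least one thread exists and none of them
// can make progress.
inline bool all_threads_stuck(
  std::unordered_map<long, thread_state> const& threads,
  std::optional<std::unordered_set<long>> const& java_blocked_thread_ids)
{
  if (threads.empty()) { return false; }
  for (auto const& [id, state] : threads) {
    if (!is_stuck(id, state, java_blocked_thread_ids)) { return false; }
  }
  return true;
}
```

In this sketch only a caller that actually gathered Java thread state passes a set; any other caller passes `std::nullopt` and skips the Java-side check, which is one way to read the "only check in the deadlock busting thread" idea.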
4 files reviewed, 1 comment
revans2
left a comment
Also, why is the cudf version changing? Is there something in this change that requires a different version?
full_thread_state const& state,
std::optional<std::unordered_set<long>> const& java_blocked_thread_ids)
{
LOG_INFO("is_thread_bufn_or_above: state: {}, java_blocked_thread_ids: {}",
nit: Our logging has generally followed a CSV/TSV-like structure so that we can parse it and do post-processing/analytics on it. Is this log line needed? If so, could it be moved closer to that format?
Sure, I can change or remove it. It is there "just in case"; I never like to be deadlocked without a way to debug it.
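For illustration, a CSV-style version of that log line might be built as follows. This is only a sketch: `format_bufn_check_csv` is a hypothetical helper, and the column layout (event name, state, `;`-separated thread IDs, `NONE` when no set was supplied) is an assumption, not the project's actual log schema.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <sstream>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: emits one record "event,state,java_blocked_ids".
// IDs are sorted so that records are stable and easy to diff or parse.
inline std::string format_bufn_check_csv(
  std::string const& state_name,
  std::optional<std::unordered_set<long>> const& java_blocked_thread_ids)
{
  std::ostringstream out;
  out << "is_thread_bufn_or_above," << state_name << ",";
  if (!java_blocked_thread_ids.has_value()) {
    out << "NONE";  // assumed marker for "no Java-side state supplied"
    return out.str();
  }
  std::vector<long> sorted(java_blocked_thread_ids->begin(), java_blocked_thread_ids->end());
  std::sort(sorted.begin(), sorted.end());
  for (std::size_t i = 0; i < sorted.size(); ++i) {
    if (i > 0) { out << ';'; }  // ';' keeps the ID list inside a single CSV field
    out << sorted[i];
  }
  return out.str();
}
```

The resulting string could then be passed to the existing logging macro as a single field-delimited message.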
Yeah, let me check the cudf thing; it's not intended.
4 files reviewed, 1 comment
build
4 files reviewed, no comments
build
4 files reviewed, no comments
Aborted the submodule run to allow this one to get merged first.
This is a performance improvement for the state machine and is part of #3905