-
Notifications
You must be signed in to change notification settings - Fork 328
UDF bug fix caused by ThreadStatic BroadcastVariablesRegistry #551
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Is this still WIP? |
Can you also update the description please (make it more general instead of limiting to interactive session)? |
@imback82 Updated the description and opened the PR for review. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test? Any bug fix should come with a test that validates the fix. I think different threads just defining UDFs should suffice.
Can you update the title to say it's related to Broadcast? |
src/csharp/Microsoft.Spark.E2ETest/UdfTests/UdfSimpleTypesTests.cs
Outdated
Show resolved
Hide resolved
src/csharp/Microsoft.Spark.E2ETest/UdfTests/UdfSimpleTypesTests.cs
Outdated
Show resolved
Hide resolved
src/csharp/Microsoft.Spark.E2ETest/UdfTests/UdfSimpleTypesTests.cs
Outdated
Show resolved
Hide resolved
src/csharp/Microsoft.Spark.E2ETest/UdfTests/UdfSimpleTypesTests.cs
Outdated
Show resolved
Hide resolved
src/csharp/Microsoft.Spark.E2ETest/UdfTests/UdfSimpleTypesTests.cs
Outdated
Show resolved
Hide resolved
src/csharp/Microsoft.Spark.E2ETest/UdfTests/UdfSimpleTypesTests.cs
Outdated
Show resolved
Hide resolved
src/csharp/Microsoft.Spark.E2ETest/UdfTests/UdfSimpleTypesTests.cs
Outdated
Show resolved
Hide resolved
src/csharp/Microsoft.Spark.E2ETest/UdfTests/UdfSimpleTypesTests.cs
Outdated
Show resolved
Hide resolved
src/csharp/Microsoft.Spark.E2ETest/UdfTests/UdfSimpleTypesTests.cs
Outdated
Show resolved
Hide resolved
src/csharp/Microsoft.Spark.E2ETest/UdfTests/UdfSimpleTypesTests.cs
Outdated
Show resolved
Hide resolved
src/csharp/Microsoft.Spark.E2ETest/UdfTests/UdfSimpleTypesTests.cs
Outdated
Show resolved
Hide resolved
src/csharp/Microsoft.Spark.E2ETest/UdfTests/UdfSimpleTypesTests.cs
Outdated
Show resolved
Hide resolved
src/csharp/Microsoft.Spark.E2ETest/UdfTests/UdfSimpleTypesTests.cs
Outdated
Show resolved
Hide resolved
Co-authored-by: Steve Suh <[email protected]>
src/csharp/Microsoft.Spark.E2ETest/UdfTests/UdfSimpleTypesTests.cs
Outdated
Show resolved
Hide resolved
LGTM. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @Niharikadutta!
This PR fixes a bug found when using UDFs.
The class
JvmBroadcastRegistry
has a variable defined that stores the list of all active Broadcast variables' jvmObjects that are to be shipped to the workers. This variable was marked asThreadStatic
to maintain a thread-specific copy of the object. The wayThreadStatic
works is it only initializes the object for the very first thread that uses it, and subsequent threads are not initialized. This caused aNull reference
exception to be thrown when using UDFs that used a different thread.Solution: This was fixed by declaring it to be a
ThreadLocal<>
variable instead, that let's you provide an initialization function, which would define a new instance every time before it's accessed in a new thread.