Add support for Spark Connect #2569
Conversation
/hold for review
Force-pushed from 6ec9e84 to ae49a2b.
```go
// mutateServerService mutates the server service for the SparkConnect resource.
func (r *Reconciler) mutateServerService(ctx context.Context, conn *v1alpha1.SparkConnect, svc *corev1.Service) error {
	if svc.CreationTimestamp.IsZero() {
```
Do we want to ensure this is applied every reconciliation loop? Not just the first time it's created.
We should set immutable fields only when creating the server pod. For mutable fields, we can try to update them on every reconciliation loop.
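The pattern described above can be sketched in a language-agnostic way. This is a hypothetical illustration, not the operator's actual Go code: field names like `clusterIP` are examples of Kubernetes Service fields that cannot change after creation, while `ports` and `selector` are examples of fields that can be reconciled on every loop.

```python
from typing import Optional

# Illustrative field sets; Service.spec.clusterIP really is immutable in
# Kubernetes, while ports and selector can be updated in place.
IMMUTABLE_FIELDS = {"clusterIP"}
MUTABLE_FIELDS = {"ports", "selector"}


def mutate_service(existing: Optional[dict], desired: dict) -> dict:
    """Return the service spec to apply for this reconciliation pass."""
    if existing is None:
        # Creation: apply both immutable and mutable fields once.
        allowed = IMMUTABLE_FIELDS | MUTABLE_FIELDS
        return {k: v for k, v in desired.items() if k in allowed}
    # Update: never touch immutable fields; re-apply mutable ones so that
    # manual drift is corrected on every reconciliation loop.
    updated = dict(existing)
    for k in MUTABLE_FIELDS:
        if k in desired:
            updated[k] = desired[k]
    return updated
```

On creation the full desired spec is applied; on later passes only the mutable fields converge back to the desired state, which mirrors the create-vs-update split in the review comment above.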
This is great! Thanks @ChenYi015 for the PR! A quick question: Spark Connect will need a gRPC ingress to expose the driver-side Spark Connect server endpoint, similar to the HTTP ingress that exposes the Spark UI. Does this PR contain code to create such a gRPC ingress?
Signed-off-by: Yi Chen <[email protected]>
Force-pushed from 7db7c03 to c96c667.
Force-pushed from c96c667 to cadc7e5.
Force-pushed from cadc7e5 to 6715caf.
@hiboyang It is not included yet. We will implement this feature in this or a follow-up PR.
LGTM!!
I think it will just be HTTP/2. You can port-forward the pod/svc on 15002 and then connect from your local machine.
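The port-forward-then-connect workflow above can be sketched as follows. This is a hedged illustration: the service name and namespace are placeholders for whatever your SparkConnect resource creates, and `connect_url` is a hypothetical helper (not part of pyspark) that just builds the standard `sc://` remote URL.

```python
# Hypothetical sketch of connecting to a Spark Connect server from a
# local machine. First, port-forward the server service (names are
# placeholders for your actual resource):
#
#   kubectl port-forward -n <namespace> svc/<spark-connect-server-svc> 15002:15002
#
# Spark Connect speaks gRPC over HTTP/2, on port 15002 by default.


def connect_url(host: str = "localhost", port: int = 15002) -> str:
    """Build the sc:// remote URL understood by Spark Connect clients."""
    return f"sc://{host}:{port}"


# With pyspark installed and the port-forward running, a client session
# would then be created like:
#
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.remote(connect_url()).getOrCreate()
```

The `SparkSession.builder.remote("sc://...")` entry point is the documented Spark Connect client API; everything cluster-specific here is an assumption.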
Will merge this PR and improve it in follow-up PRs.
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ChenYi015 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/unhold
I am very happy to see spark-connect being added, thank you! When will spark-connect be released?
+1
@ChenYi015 Thank you for your work on this!
Am I missing something?
@rafagsiqueira Please check whether the namespace of the SparkConnect resource is included in spark.jobNamespaces; otherwise it will not be processed by the operator.
@ChenYi015 I suspected that, but since I used the same namespace as the spark operator, I thought it wouldn't be an issue. Let me try with a different namespace.
@ChenYi015 That was indeed the problem. Thank you very much for clarifying! Looking forward to using my newly deployed Spark Connect.
This is great! The example YAML specifies Spark 4.0.0, but I'm limited to Spark 3.5 in order to use Sedona 1.7.2. Which Spark versions is this CRD limited to?

edit:

```yaml
spec:
  sparkVersion: 3.5.4
  sparkConf:
    spark.jars.packages: org.apache.spark:spark-connect_2.12:3.5.4
    spark.driver.extraJavaOptions: "-Divy.cache.dir=/tmp -Divy.home=/tmp"
    spark.jars.ivy: /tmp/.ivy2
```
@torsol For Spark v4, the Spark Connect jar is included by default. For Spark 3.5, you will need to build an image that contains the Spark Connect jar, or use Ivy to specify it as a dependency.
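As a sketch of the image-build route mentioned above (the base image tag and jar coordinates here are assumptions; match them to your actual Spark and Scala versions):

```dockerfile
# Hypothetical sketch: extend a Spark 3.5 base image with the Spark
# Connect server jar, so spark.jars.packages/Ivy is not needed at runtime.
FROM spark:3.5.4

# Assumed coordinates: spark-connect_2.12 must match the Scala build
# of the base image.
ADD https://repo1.maven.org/maven2/org/apache/spark/spark-connect_2.12/3.5.4/spark-connect_2.12-3.5.4.jar $SPARK_HOME/jars/
```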
Purpose of this PR
Closes #1801.
Proposed changes:
- SparkConnect
- SparkConnect
- manifest

Change Category
Rationale
Checklist
Additional Notes