[SPARK-52880][CORE] Improve toString by JEP-280 instead of ToStringBuilder #51572

Closed
wants to merge 2 commits into from

Member

@dongjoon-hyun dongjoon-hyun commented Jul 19, 2025

What changes were proposed in this pull request?

This PR improves toString implementations by using plain string concatenation, which JEP-280 compiles efficiently, instead of Commons Lang ToStringBuilder. In addition, it adds Scalastyle and Checkstyle rules to prevent future regressions.

Why are the changes needed?

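Since Java 9, javac compiles string concatenation to an invokedynamic call to StringConcatFactory by default (JEP-280), which handles it more efficiently than the older StringBuilder chains.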

ID DESCRIPTION
JEP-280 Indify String Concatenation

For example, this PR improves OpenBlocks like the following. Both Java source code and byte code are simplified a lot by utilizing JEP-280 properly.

CODE CHANGE

- return new ToStringBuilder(this, ToStringStyle.SHORT_PREFIX_STYLE)
-   .append("appId", appId)
-   .append("execId", execId)
-   .append("blockIds", Arrays.toString(blockIds))
-   .toString();
+ return "OpenBlocks[appId=" + appId + ",execId=" + execId + ",blockIds=" +
+     Arrays.toString(blockIds) + "]";

BEFORE

  public java.lang.String toString();
    Code:
       0: new           #39                 // class org/apache/commons/lang3/builder/ToStringBuilder
       3: dup
       4: aload_0
       5: getstatic     #41                 // Field org/apache/commons/lang3/builder/ToStringStyle.SHORT_PREFIX_STYLE:Lorg/apache/commons/lang3/builder/ToStringStyle;
       8: invokespecial #47                 // Method org/apache/commons/lang3/builder/ToStringBuilder."<init>":(Ljava/lang/Object;Lorg/apache/commons/lang3/builder/ToStringStyle;)V
      11: ldc           #50                 // String appId
      13: aload_0
      14: getfield      #7                  // Field appId:Ljava/lang/String;
      17: invokevirtual #51                 // Method org/apache/commons/lang3/builder/ToStringBuilder.append:(Ljava/lang/String;Ljava/lang/Object;)Lorg/apache/commons/lang3/builder/ToStringBuilder;
      20: ldc           #55                 // String execId
      22: aload_0
      23: getfield      #13                 // Field execId:Ljava/lang/String;
      26: invokevirtual #51                 // Method org/apache/commons/lang3/builder/ToStringBuilder.append:(Ljava/lang/String;Ljava/lang/Object;)Lorg/apache/commons/lang3/builder/ToStringBuilder;
      29: ldc           #56                 // String blockIds
      31: aload_0
      32: getfield      #16                 // Field blockIds:[Ljava/lang/String;
      35: invokestatic  #57                 // Method java/util/Arrays.toString:([Ljava/lang/Object;)Ljava/lang/String;
      38: invokevirtual #51                 // Method org/apache/commons/lang3/builder/ToStringBuilder.append:(Ljava/lang/String;Ljava/lang/Object;)Lorg/apache/commons/lang3/builder/ToStringBuilder;
      41: invokevirtual #61                 // Method org/apache/commons/lang3/builder/ToStringBuilder.toString:()Ljava/lang/String;
      44: areturn

AFTER

  public java.lang.String toString();
    Code:
       0: aload_0
       1: getfield      #7                  // Field appId:Ljava/lang/String;
       4: aload_0
       5: getfield      #13                 // Field execId:Ljava/lang/String;
       8: aload_0
       9: getfield      #16                 // Field blockIds:[Ljava/lang/String;
      12: invokestatic  #39                 // Method java/util/Arrays.toString:([Ljava/lang/Object;)Ljava/lang/String;
      15: invokedynamic #43,  0             // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String;
      20: areturn
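
As a minimal sketch of the pattern above, the class below is a hypothetical stand-in for OpenBlocks (OpenBlocksSketch is not the Spark class; only the three fields from the diff are modeled). Compiling it with JDK 9 or later and inspecting it with javap -c should show a single invokedynamic makeConcatWithConstants call in toString, although the constant pool indices will differ from the listing above.

```java
import java.util.Arrays;

// Hypothetical stand-in for Spark's OpenBlocks message; only the fields
// shown in the diff above are modeled here.
final class OpenBlocksSketch {
  final String appId;
  final String execId;
  final String[] blockIds;

  OpenBlocksSketch(String appId, String execId, String[] blockIds) {
    this.appId = appId;
    this.execId = execId;
    this.blockIds = blockIds;
  }

  @Override
  public String toString() {
    // Plain concatenation: javac (JDK 9+) emits one invokedynamic call
    // to StringConcatFactory.makeConcatWithConstants for this expression.
    return "OpenBlocks[appId=" + appId + ",execId=" + execId + ",blockIds=" +
        Arrays.toString(blockIds) + "]";
  }

  public static void main(String[] args) {
    // Prints: OpenBlocks[appId=app-1,execId=exec-1,blockIds=[b0, b1]]
    System.out.println(
        new OpenBlocksSketch("app-1", "exec-1", new String[] {"b0", "b1"}));
  }
}
```

The literal pieces ("OpenBlocks[appId=", ",execId=", and so on) end up in the concatenation recipe, so only the three dynamic values are passed as arguments.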

Does this PR introduce any user-facing change?

No. This is a toString implementation improvement.

How was this patch tested?

Pass the CIs.

Was this patch authored or co-authored using generative AI tooling?

No.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-52880][CORE] Improve toString by JEP-280 instead of ToStringBuilder [SPARK-52880][CORE] Improve toString by JEP-280 instead of ToStringBuilder Jul 19, 2025

@@ -190,6 +190,10 @@
<property name="format" value="new URL\("/>
<property name="message" value="Use URI.toURL or URL.of instead of URL constructors." />
</module>
<module name="RegexpSinglelineJava">
Contributor

We should also ban org.apache.commons.lang.builder.ToStringBuilder in the same change, even though it is not currently used, because commons-lang/2.6//commons-lang-2.6.jar is still a dependency of Spark.

Member Author

Thank you for the review, @LuciferYang. We have already banned the lang2 package completely.

spark/scalastyle-config.xml

Lines 285 to 289 in c717624

<check customId="commonslang2" level="error" class="org.scalastyle.file.RegexChecker" enabled="true">
<parameters><parameter name="regex">org\.apache\.commons\.lang\.</parameter></parameters>
<customMessage>Use Commons Lang 3 classes (package org.apache.commons.lang3.*) instead
of Commons Lang 2 (package org.apache.commons.lang.*)</customMessage>
</check>
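
To illustrate how the regex quoted above behaves, here is a small sketch that applies the same pattern with java.util.regex. The class and method names (Lang2BanSketch, isBanned) are hypothetical and only for illustration; Scalastyle's RegexChecker is not invoked here.

```java
import java.util.regex.Pattern;

// Hypothetical helper that applies the regex from the commonslang2 rule.
final class Lang2BanSketch {
  // Same regex as in the quoted rule: the escaped dot right after "lang"
  // means "lang3." does not match.
  static final Pattern LANG2 = Pattern.compile("org\\.apache\\.commons\\.lang\\.");

  static boolean isBanned(String line) {
    return LANG2.matcher(line).find();
  }

  public static void main(String[] args) {
    // Commons Lang 2 import: matches, so it would be flagged.
    System.out.println(isBanned("import org.apache.commons.lang.builder.ToStringBuilder;"));
    // Commons Lang 3 import: does not match this particular rule.
    System.out.println(isBanned("import org.apache.commons.lang3.builder.ToStringBuilder;"));
  }
}
```

Because the pattern covers the whole lang2 package prefix, it also covers org.apache.commons.lang.builder.ToStringBuilder without a dedicated entry.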

@@ -299,6 +299,11 @@ This file is divided into 3 sections:
<customMessage>Use org.apache.spark.util.Pair instead</customMessage>
</check>

<check customId="commonslang3tuple" level="error" class="org.scalastyle.file.RegexChecker" enabled="true">
Contributor

ditto

Member Author

ditto.

Comment on lines 339 to 340
return "TransportClient[remoteAddress=" + channel.remoteAddress() + "clientId=" + clientId +
"isActive=" + isActive() + "]";
Member

Suggested change
- return "TransportClient[remoteAddress=" + channel.remoteAddress() + "clientId=" + clientId +
-   "isActive=" + isActive() + "]";
+ return "TransportClient[remoteAddress=" + channel.remoteAddress() + ",clientId=" + clientId +
+   ",isActive=" + isActive() + "]";

Member Author

Thanks!

.toString();
return "FetchShuffleBlockChunks[appId=" + appId + ",execId=" + execId +
",shuffleId=" + shuffleId + ",shuffleMergeId=" + shuffleMergeId +
",reduceIds=" + Arrays.toString(reduceIds) + ",chunkIds=" + Arrays.toString(chunkIds) + "]";
Member

Why change chunkIds from Arrays.deepToString to Arrays.toString?

Member Author

Oh, thank you. I'll fix it.
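
The difference matters for nested arrays. A small sketch, assuming chunkIds is a nested array such as int[][] (the field type is not shown in this thread, so that is an assumption; DeepToStringSketch and its methods are hypothetical names):

```java
import java.util.Arrays;

// Hypothetical demo of Arrays.toString vs Arrays.deepToString on a nested array.
final class DeepToStringSketch {
  // Formats only the outer array; inner int[] elements fall back to
  // Object.toString(), for example "[I@1b6d3586".
  static String shallow(int[][] ids) {
    return Arrays.toString(ids);
  }

  // Recurses into the inner arrays and prints their elements.
  static String deep(int[][] ids) {
    return Arrays.deepToString(ids);
  }

  public static void main(String[] args) {
    int[][] chunkIds = {{0, 1}, {2}};
    System.out.println(deep(chunkIds));    // [[0, 1], [2]]
    System.out.println(shallow(chunkIds)); // type-and-hash strings, not the contents
  }
}
```

With Arrays.toString the log line would show identity-style strings instead of the chunk IDs, which is why the original deepToString call needs to be kept.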

Comment on lines +65 to +67
return "FetchShuffleBlocks[appId=" + appId + ",execId=" + execId + ",shuffleId=" + shuffleId +
",mapIds=" + Arrays.toString(mapIds) + ",reduceIds=" + Arrays.deepToString(reduceIds) +
",batchFetchEnabled=" + batchFetchEnabled + "]";
Member

This adds more fields than before?

Member Author

They came from the toStringHelper() method at line 65. Technically, this PR removes the toStringHelper usage.

public ToStringBuilder toStringHelper() {
  return new ToStringBuilder(this, ToStringStyle.SHORT_PREFIX_STYLE)
    .append("appId", appId)
    .append("execId", execId)
    .append("shuffleId", shuffleId);
}

@@ -43,12 +43,14 @@ protected AbstractFetchShuffleBlocks(
this.shuffleId = shuffleId;
}

// checkstyle.off: RegexpSinglelineJava
public ToStringBuilder toStringHelper() {
Contributor

This should probably not be a public API, since deleting it doesn't cause a MiMa check failure.

Member Author

@dongjoon-hyun dongjoon-hyun Jul 20, 2025

Yes, I hope to remove it eventually because this method is no longer used.

However, this PR focuses only on the toString improvement, to avoid any discussion about the toStringHelper method.

/**
 * Base class for fetch shuffle blocks and chunks.
 *
 * @since 3.2.0
 */
public abstract class AbstractFetchShuffleBlocks extends BlockTransferMessage {
  public final String appId;
  public final String execId;
  public final int shuffleId;
  protected AbstractFetchShuffleBlocks(
      String appId,
      String execId,
      int shuffleId) {
    this.appId = appId;
    this.execId = execId;
    this.shuffleId = shuffleId;
  }
  public ToStringBuilder toStringHelper() {

@dongjoon-hyun
Member Author

Thank you, @LuciferYang and @viirya . I addressed your comments and replied.

Contributor

@peter-toth peter-toth left a comment

LGTM, just a nit: maybe we could use .getClass().getSimpleName() instead of duplicating the class names.

@dongjoon-hyun
Member Author

Thank you, @peter-toth and @cloud-fan .

@dongjoon-hyun
Member Author

To @peter-toth, you are right that it's generally a good approach. I intentionally avoided that generalized pattern here because it costs three additional operations: two additional invokevirtual instructions and one additional string argument to makeConcatWithConstants at the end. For this PR, let's keep toString as simple and fast as possible because this is Spark internals. We may want to generalize it later.

       0: aload_0
       1: invokevirtual #39                 // Method java/lang/Object.getClass:()Ljava/lang/Class;
       4: invokevirtual #43                 // Method java/lang/Class.getSimpleName:()Ljava/lang/String;
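
The two variants can be compared side by side in a small sketch. SimpleNameSketch and its method names are hypothetical and not part of Spark; the comments describe what javac on JDK 9+ is expected to emit, which can be checked with javap -c.

```java
// Hypothetical class contrasting a hard-coded prefix with getSimpleName().
final class SimpleNameSketch {
  final String appId;

  SimpleNameSketch(String appId) {
    this.appId = appId;
  }

  // Hard-coded prefix: the class name is part of the constant recipe,
  // so only appId is passed to makeConcatWithConstants.
  String hardCoded() {
    return "SimpleNameSketch[appId=" + appId + "]";
  }

  // Generalized prefix: getClass() and getSimpleName() add two
  // invokevirtual calls, and the name becomes an extra dynamic argument.
  String generalized() {
    return getClass().getSimpleName() + "[appId=" + appId + "]";
  }

  public static void main(String[] args) {
    SimpleNameSketch s = new SimpleNameSketch("app-1");
    System.out.println(s.hardCoded());   // SimpleNameSketch[appId=app-1]
    System.out.println(s.generalized()); // SimpleNameSketch[appId=app-1]
  }
}
```

Both methods return the same string; the trade-off is between avoiding the duplicated class name and avoiding the extra calls on each toString invocation.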

@dongjoon-hyun
Member Author

Merged to master for Apache Spark 4.1.0.

Thank you, @LuciferYang , @viirya , @peter-toth , @cloud-fan , @MaxGekk !

@dongjoon-hyun dongjoon-hyun deleted the SPARK-52880 branch July 21, 2025 15:46