-
Notifications
You must be signed in to change notification settings - Fork 6.1k
Reduce size of j.t.f.DateTimePrintContext::adjust #26633
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Reduce size of j.t.f.DateTimePrintContext::adjust #26633
Conversation
👋 Welcome back swen! A progress list of the required criteria for merging this PR into |
❗ This change is not yet ready to be integrated. |
Would you mind sharing the details about the performance improvement you observed, please? On which platforms? Using which benchmarks? (Any in |
By running DateTimeFormatterBench, we can see that the current proposal can improve performance by about 5% in multiple scenarios. The detailed data is as follows 1. Running performance test scripts
2. Performance results in multiple architectures2.1. MacBook M1 Pro (aarch64)-# before
-Benchmark (pattern) Mode Cnt Score Error Units
-DateTimeFormatterBench.formatInstants HH:mm:ss thrpt 15 15.053 ± 0.281 ops/ms
-DateTimeFormatterBench.formatInstants HH:mm:ss.SSS thrpt 15 10.664 ± 0.119 ops/ms
-DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss thrpt 15 8.994 ± 0.579 ops/ms
-DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 6.943 ± 0.370 ops/ms
-DateTimeFormatterBench.formatZonedDateTime HH:mm:ss thrpt 15 22.938 ± 0.056 ops/ms
-DateTimeFormatterBench.formatZonedDateTime HH:mm:ss.SSS thrpt 15 15.882 ± 0.144 ops/ms
-DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss thrpt 15 13.302 ± 0.518 ops/ms
-DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 10.153 ± 0.075 ops/ms
+# after
+Benchmark (pattern) Mode Cnt Score Error Units
+DateTimeFormatterBench.formatInstants HH:mm:ss thrpt 15 16.083 ± 0.178 ops/ms +6.84%
+DateTimeFormatterBench.formatInstants HH:mm:ss.SSS thrpt 15 11.548 ± 0.171 ops/ms +8.28%
+DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss thrpt 15 9.835 ± 0.064 ops/ms +9.35%
+DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 7.344 ± 0.308 ops/ms +5.77%
+DateTimeFormatterBench.formatZonedDateTime HH:mm:ss thrpt 15 22.500 ± 0.147 ops/ms -1.90%
+DateTimeFormatterBench.formatZonedDateTime HH:mm:ss.SSS thrpt 15 16.592 ± 0.086 ops/ms +4.47%
+DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss thrpt 15 13.182 ± 0.829 ops/ms -0.90%
+DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 10.127 ± 0.140 ops/ms -0.25% 2.2. aliyun_c8y (aarch64 CPU Aliyun_Yitian 710)-# before
-Benchmark (pattern) Mode Cnt Score Error Units
-DateTimeFormatterBench.formatInstants HH:mm:ss thrpt 15 9.916 ± 0.073 ops/ms
-DateTimeFormatterBench.formatInstants HH:mm:ss.SSS thrpt 15 7.416 ± 0.013 ops/ms
-DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss thrpt 15 6.248 ± 0.082 ops/ms
-DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 4.988 ± 0.042 ops/ms
-DateTimeFormatterBench.formatZonedDateTime HH:mm:ss thrpt 15 13.653 ± 0.073 ops/ms
-DateTimeFormatterBench.formatZonedDateTime HH:mm:ss.SSS thrpt 15 9.435 ± 0.033 ops/ms
-DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss thrpt 15 7.550 ± 0.145 ops/ms
-DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 6.096 ± 0.301 ops/ms
+# after
+Benchmark (pattern) Mode Cnt Score Error Units
+DateTimeFormatterBench.formatInstants HH:mm:ss thrpt 15 10.897 ± 0.047 ops/ms +9.89%
+DateTimeFormatterBench.formatInstants HH:mm:ss.SSS thrpt 15 7.449 ± 0.086 ops/ms +0.44%
+DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss thrpt 15 6.979 ± 0.025 ops/ms +11.69%
+DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 4.950 ± 0.024 ops/ms -0.76%
+DateTimeFormatterBench.formatZonedDateTime HH:mm:ss thrpt 15 14.206 ± 0.206 ops/ms +4.05%
+DateTimeFormatterBench.formatZonedDateTime HH:mm:ss.SSS thrpt 15 9.957 ± 0.182 ops/ms +5.53%
+DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss thrpt 15 7.705 ± 0.093 ops/ms +2.05%
+DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 6.222 ± 0.235 ops/ms +2.06% 2.3. OrangePi 5 (aarch64 CPU RK3588)-# before
-Benchmark (pattern) Mode Cnt Score Error Units
-DateTimeFormatterBench.formatInstants HH:mm:ss thrpt 15 6.118 ± 0.417 ops/ms
-DateTimeFormatterBench.formatInstants HH:mm:ss.SSS thrpt 15 4.312 ± 0.202 ops/ms
-DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss thrpt 15 4.047 ± 0.107 ops/ms
-DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 2.773 ± 0.204 ops/ms
-DateTimeFormatterBench.formatZonedDateTime HH:mm:ss thrpt 15 9.228 ± 0.232 ops/ms
-DateTimeFormatterBench.formatZonedDateTime HH:mm:ss.SSS thrpt 15 5.640 ± 0.223 ops/ms
-DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss thrpt 15 5.109 ± 0.076 ops/ms
-DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 3.407 ± 0.168 ops/ms
+# after
+Benchmark (pattern) Mode Cnt Score Error Units
+DateTimeFormatterBench.formatInstants HH:mm:ss thrpt 15 6.080 ± 0.451 ops/ms -0.62%
+DateTimeFormatterBench.formatInstants HH:mm:ss.SSS thrpt 15 4.341 ± 0.219 ops/ms +0.67%
+DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss thrpt 15 4.009 ± 0.207 ops/ms -0.93%
+DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 2.928 ± 0.104 ops/ms +5.58%
+DateTimeFormatterBench.formatZonedDateTime HH:mm:ss thrpt 15 9.670 ± 0.139 ops/ms +4.78%
+DateTimeFormatterBench.formatZonedDateTime HH:mm:ss.SSS thrpt 15 5.835 ± 0.234 ops/ms +3.45%
+DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss thrpt 15 5.149 ± 0.153 ops/ms +0.78%
+DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 3.553 ± 0.131 ops/ms +4.28% 2.4 aliyun_c9i (x64 CPU Intel Xeon Granite Rapids)-# before
-Benchmark (pattern) Mode Cnt Score Error Units
-DateTimeFormatterBench.formatInstants HH:mm:ss thrpt 15 15.718 ± 1.343 ops/ms
-DateTimeFormatterBench.formatInstants HH:mm:ss.SSS thrpt 15 11.758 ± 0.638 ops/ms
-DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss thrpt 15 10.529 ± 0.057 ops/ms
-DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 7.945 ± 0.304 ops/ms
-DateTimeFormatterBench.formatZonedDateTime HH:mm:ss thrpt 15 20.441 ± 0.067 ops/ms
-DateTimeFormatterBench.formatZonedDateTime HH:mm:ss.SSS thrpt 15 13.975 ± 0.062 ops/ms
-DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss thrpt 15 11.721 ± 0.070 ops/ms
-DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 8.604 ± 0.073 ops/ms
+# after
+Benchmark (pattern) Mode Cnt Score Error Units
+DateTimeFormatterBench.formatInstants HH:mm:ss thrpt 15 16.745 ± 0.129 ops/ms +6.53%
+DateTimeFormatterBench.formatInstants HH:mm:ss.SSS thrpt 15 11.571 ± 1.085 ops/ms -1.52%
+DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss thrpt 15 10.591 ± 0.046 ops/ms +0.58%
+DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 7.899 ± 0.323 ops/ms -0.57%
+DateTimeFormatterBench.formatZonedDateTime HH:mm:ss thrpt 15 20.993 ± 0.082 ops/ms +2.70%
+DateTimeFormatterBench.formatZonedDateTime HH:mm:ss.SSS thrpt 15 14.123 ± 0.094 ops/ms +1.05%
+DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss thrpt 15 11.781 ± 0.054 ops/ms +0.51%
+DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 8.723 ± 0.055 ops/ms +1.38% 2.5 MacBook 2019 (x64 intel i9)-# before
-Benchmark (pattern) Mode Cnt Score Error Units
-DateTimeFormatterBench.formatInstants HH:mm:ss thrpt 15 12.393 ± 0.985 ops/ms
-DateTimeFormatterBench.formatInstants HH:mm:ss.SSS thrpt 15 9.658 ± 0.220 ops/ms
-DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss thrpt 15 8.321 ± 0.098 ops/ms
-DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 6.575 ± 0.073 ops/ms
-DateTimeFormatterBench.formatZonedDateTime HH:mm:ss thrpt 15 18.946 ± 0.387 ops/ms
-DateTimeFormatterBench.formatZonedDateTime HH:mm:ss.SSS thrpt 15 12.156 ± 0.316 ops/ms
-DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss thrpt 15 9.980 ± 0.115 ops/ms
-DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 7.729 ± 0.160 ops/ms
+# after
+Benchmark (pattern) Mode Cnt Score Error Units
+DateTimeFormatterBench.formatInstants HH:mm:ss thrpt 15 13.942 ± 0.257 ops/ms +12.49%
+DateTimeFormatterBench.formatInstants HH:mm:ss.SSS thrpt 15 9.882 ± 0.327 ops/ms +2.42%
+DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss thrpt 15 9.234 ± 0.240 ops/ms +10.97%
+DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 6.594 ± 0.074 ops/ms +2.89%
+DateTimeFormatterBench.formatZonedDateTime HH:mm:ss thrpt 15 19.482 ± 0.385 ops/ms +2.82%
+DateTimeFormatterBench.formatZonedDateTime HH:mm:ss.SSS thrpt 15 12.917 ± 0.244 ops/ms +6.26%
+DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss thrpt 15 10.196 ± 0.142 ops/ms +2.16%
+DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 7.793 ± 0.136 ops/ms +8.28% |
@@ -149,6 +149,12 @@ private static TemporalAccessor adjust(final TemporalAccessor temporal, DateTime | |||
Chronology chrono = Objects.requireNonNullElse(effectiveChrono, IsoChronology.INSTANCE); | |||
return chrono.zonedDateTime(Instant.from(temporal), overrideZone); | |||
} | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you tested the split at different locations? I would expect line 143 or line 130 to perform best. There may be a case for splitting multiple times. You will also need a comment to indicate why the method is split.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory, the smaller the method, the greater the opportunity for performance improvement, but in DateTimeFormatterWithPaddingBench and DateTimeFormatterBench, the improvement is the same.
make test TEST="micro:java.time.format.DateTimeFormatterWithPaddingBench"
-# before
-Benchmark Mode Cnt Score Error Units
-DateTimeFormatterWithPaddingBench.formatWithPadding thrpt 15 11823.795 ± 221.488 ops/ms
-DateTimeFormatterWithPaddingBench.formatWithPaddingLengthOne thrpt 15 34634.499 ± 91.144 ops/ms
-DateTimeFormatterWithPaddingBench.formatWithPaddingLengthZero thrpt 15 39070.959 ± 198.192 ops/ms
+# after
+Benchmark Mode Cnt Score Error Units
+DateTimeFormatterWithPaddingBench.formatWithPadding thrpt 15 12257.729 ± 49.449 ops/ms +3.67%
+DateTimeFormatterWithPaddingBench.formatWithPaddingLengthOne thrpt 15 37689.494 ± 117.526 ops/ms +8.82%
+DateTimeFormatterWithPaddingBench.formatWithPaddingLengthZero thrpt 15 43628.181 ± 493.395 ops/ms +11.66% In DateTimeFormatterWithPaddingBench, I found that this proposal improved performance even more. This is because DateTimeFormatter is statically defined as final, and DateTimeFormatter::getChronology and DateTimeFormatter::getZone used in the adjust method are final fields. This eliminates dead code branches during inlining by the C2 optimizer.
public class DateTimeFormatterWithPaddingBench {
private static final DateTimeFormatter FORMATTER_WITH_PADDING = new DateTimeFormatterBuilder()
.appendLiteral("Date:")
.padNext(20, ' ')
.append(DateTimeFormatter.ISO_DATE)
.toFormatter();
private static final DateTimeFormatter FORMATTER_WITH_PADDING_ZERO = new DateTimeFormatterBuilder()
.appendLiteral("Year:")
.padNext(4)
.appendValue(ChronoField.YEAR)
.toFormatter();
private static final DateTimeFormatter FORMATTER_WITH_PADDING_ONE = new DateTimeFormatterBuilder()
.appendLiteral("Year:")
.padNext(5)
.appendValue(ChronoField.YEAR)
.toFormatter();
final class DateTimeFormatter {
private final Chronology chrono;
private final ZoneId zone;
public Chronology getChronology() {
return chrono;
}
public ZoneId getZone() {
return zone;
}
}
final class DateTimePrintContext {
private static TemporalAccessor adjust(final TemporalAccessor temporal, DateTimeFormatter formatter) {
Chronology overrideChrono = formatter.getChronology();
ZoneId overrideZone = formatter.getZone();
if (overrideChrono == null && overrideZone == null) {
return temporal;
}
// ....
}
} |
By adding the JVM startup parameters
-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining
and analyzing the printed log information, and found that the code size of the j.t.f.DateTimePrintContext::adjust method is 382, which is greater than 325, causing inlining failure.By splitting the code into
common/uncommon
, and moving the uncommon code into adjust0, the adjust method is kept small and can be inlined by the C2 optimizer.Progress
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/26633/head:pull/26633
$ git checkout pull/26633
Update a local copy of the PR:
$ git checkout pull/26633
$ git pull https://git.openjdk.org/jdk.git pull/26633/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 26633
View PR using the GUI difftool:
$ git pr show -t 26633
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/26633.diff