Bug Fix: SSE Events lines MUST NOT contain \r #5868
@jansupol This PR fails because of incorrect Copyright check in Jersey's POM.xml. According to Eclipse Foundation's rules, all projects MUST accept the short form having only the initial publication date (see https://www.eclipse.org/projects/handbook/#ip-copyright-headers). Apparently Jersey's POM.xml expects to find the latest date, which is wrong. What is your decision how to proceed? |
@mkarg In the Jersey project, we follow the advice of the Oracle legal department that the copyright year must include the last year of a change. This is enforced by the glassfish copyright plugin created for that purpose. Is it a problem for you to increase the copyright year in the changed files? |
You mean, besides me being an Eclipse Committer Member bound solely to Eclipse Foundation rules, not employed by Oracle, and not bound to Oracle-internal rules? The EF is pretty clear here:
|
@jansupol FYI: Fixed the copyright according to Oracle rules. |
Apparently you did not find the time to review / merge this PR, so I used the time to author a commit on top with a unit test for |
@jansupol Anything more needed to review / merge this bug fix? 🤔 |
This is what I did: I created a brief test as follows:
The OutputStream performance differs between SIZE=100 and SIZE=10000. OutputStream0 is better for short messages and OutputStream1 for large messages (SIZE > 10000), but OutputStream2 is now the slowest. What exactly is the purpose of this change? The original PR mentioned performance, this one mentions the \r data in the message, but to me the real reason was the empty message at the end. Can you provide a use case that justifies the change in SSE? Thanks. |
TL;DR: The purpose of this PR is solely correctness, not performance, with respect to what is stated in the PR's description (this PR is just a bug fix). Performance will be recovered by a subsequent PR. Explanation: The original PR you mention was intended to improve performance, but we both agreed that it fails because it fixes one bug but opens another, plus there has always been a bug with
|
While I agree that the current state does not work exactly as the SSE standard describes for the corner case of sending new lines, and there is an extra unnecessary empty message, I do not see a legitimate reason for making a change that sacrifices the performance. I agree that the change might be beneficial, but only if we had a similar performance.
Sorry, we cannot do a merge that significantly changes the performance with a hope that some future work may fix it, knowing that it may never come. I am sure you understand this. |
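To make the corner case under discussion concrete, here is a minimal sketch of how a client following the parsing rules of the SSE specification would read a stream in which a bare \r leaked into a data line. The class and method names are hypothetical and not part of Jersey, and only the `data` field is handled. The input `data: a\rb\n\n` is what the original per-byte logic would emit for the payload `a\rb`, assuming the surrounding writer terminates the event with a blank line.

```java
import java.util.ArrayList;
import java.util.List;

public class SseClientParseSketch {

    /** Splits a stream into lines at CRLF, LF, or CR, the three line endings the SSE parsing rules accept. */
    static List<String> lines(String stream) {
        List<String> out = new ArrayList<>();
        for (String line : stream.split("\r\n|\r|\n", -1)) {
            out.add(line);
        }
        return out;
    }

    /** Returns the data of the first dispatched event; fields other than "data" are ignored. */
    static String firstEventData(String stream) {
        StringBuilder data = new StringBuilder();
        boolean sawData = false;
        for (String line : lines(stream)) {
            if (line.isEmpty()) {
                if (sawData) {
                    break; // a blank line dispatches the event
                }
                continue;
            }
            if (line.startsWith(":")) {
                continue; // comment line
            }
            int colon = line.indexOf(':');
            String field = colon < 0 ? line : line.substring(0, colon);
            String value = colon < 0 ? "" : line.substring(colon + 1);
            if (value.startsWith(" ")) {
                value = value.substring(1); // one leading space is stripped
            }
            if (field.equals("data")) {
                if (sawData) {
                    data.append('\n'); // multiple data lines are joined with LF
                }
                data.append(value);
                sawData = true;
            }
        }
        return data.toString();
    }

    public static void main(String[] args) {
        // The bare CR ends the line, so "b" becomes an unknown field and is dropped.
        System.out.println(firstEventData("data: a\rb\n\n").equals("a"));
        // With one data line per payload line, both parts survive.
        System.out.println(firstEventData("data: a\ndata: b\n\n").equals("a\nb"));
    }
}
```

In this sketch the part of the payload after the bare \r is silently lost on the client side, which is the correctness problem the PR title refers to.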
I do not agree that bug fixes must only get merged if they do not sacrifice performance, as that is rather often the case, actually. Nevertheless, I will start benchmarking with my already developed performance improvement, so we have comparable numbers. |
TL;DR: Here is the speed-optimized SSE code. 😃 Sorry for the delay. I was tied up in other projects. I have now put some commits on top that provide superior performance, so in no case is the corrected code slower than the incorrect code. Benchmarks have shown that the new code (including the performance tweaks) is faster than the incorrect code in all cases. This mostly stems from the fact that I have implemented
Having said that, here are the benchmark results:
As the numbers show, the new solution is just slightly faster for super-short messages, but gets increasingly faster the longer the message gets, while the original code was always slow independent of message length. Nearly every non-trivial message gets hundreds to thousands of times better performance. While the nearly 11,000x boost at 100k message size looks impressive, most SSE messages in reality are certainly rather tiny. I modified your benchmark in several ways:
FYI, here is the updated source code of the benchmark:

import static java.nio.charset.StandardCharsets.UTF_8;

import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.channels.Channels;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Objects;

import org.openjdk.jmh.Main;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Level;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.TearDown;

public class SsePerf {
    public static void main(String[] args) throws Throwable {
        Main.main(args);
    }

    @State(Scope.Thread)
    public static class MyState {
        @Param({"1", "10", "100", "1000", "10000", "100000"})
        public int SIZE;
        private byte[] data;
        private ServerSocketChannel serverChannel;
        private SocketChannel clientChannel;
        private OutputStream os;
        private volatile boolean running = true;

        @Setup(Level.Trial)
        public void beforeTrial() throws IOException {
            // Payload of SIZE bytes with a CR LF pair in the middle (no line break for SIZE < 2).
            this.data = SIZE < 2 ? "X".repeat(SIZE).getBytes(UTF_8) : ("X".repeat(SIZE / 2 - 1) + "\r\n" + "X".repeat(SIZE / 2 - 1)).getBytes(UTF_8);
            this.serverChannel = ServerSocketChannel.open().bind(new InetSocketAddress(0));
            final var serverAddress = serverChannel.getLocalAddress();
            // Server side: accept one connection and discard everything it receives.
            Thread.ofVirtual().start(() -> {
                try (final var acceptedChannel = serverChannel.accept(); final var is = Channels.newInputStream(acceptedChannel.socket().getChannel())) {
                    while (this.running)
                        is.skip(Integer.MAX_VALUE);
                } catch (final IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            clientChannel = SocketChannel.open(serverAddress);
            os = Channels.newOutputStream(clientChannel);
        }

        @TearDown(Level.Trial)
        public void afterTrial() throws IOException {
            this.running = false;
            this.os.close();
            this.clientChannel.close();
            this.serverChannel.close();
        }
    }

    @Benchmark
    public void optimizedCode(MyState state) throws IOException {
        try (final var dls = new DataLeadStream(state.os)) {
            for (int i = 0, n = 1000000 / state.SIZE; i < n; i++)
                dls.write(state.data);
            dls.finish();
        }
    }

    private static final byte[] DATA_LEAD = "data: ".getBytes(UTF_8);
    private static final byte[] EOL = {'\n'};

    static final class DataLeadStream extends OutputStream {
        private final OutputStream entityStream;
        // Previous byte, written one call late so that a CR LF pair collapses into a single EOL; -1 means nothing written yet.
        private int lastChar = -1;

        DataLeadStream(final OutputStream entityStream) {
            this.entityStream = entityStream;
        }

        @Override
        public void write(final int i) throws IOException {
            if (lastChar == -1) {
                entityStream.write(DATA_LEAD);
            } else if (lastChar != '\n' && lastChar != '\r') {
                entityStream.write(lastChar);
            } else if (lastChar == '\n' || lastChar == '\r' && i != '\n') {
                entityStream.write(EOL);
                entityStream.write(DATA_LEAD);
            }
            lastChar = i;
        }

        // Index of the first CR or LF in b[fromIndex, toIndex), or -1 if there is none.
        private static int indexOfEol(final byte[] b, final int fromIndex, final int toIndex) {
            for (var i = fromIndex; i < toIndex; i++) {
                if (b[i] == '\n' || b[i] == '\r') {
                    return i;
                }
            }
            return -1;
        }

        @Override
        public void write(final byte[] b, final int off, final int len) throws IOException {
            Objects.checkFromIndexSize(off, len, b.length);
            if (len == 0) {
                return;
            }
            // Flush the byte pending from the previous call.
            write(b[off]);
            if (len > 1) {
                final var end = off + len - 1;
                var i = off;
                // Write each run between line breaks in bulk; the final byte stays pending in lastChar.
                for (var j = indexOfEol(b, i, end); j != -1; j = indexOfEol(b, i, end)) {
                    entityStream.write(b, i, j - i);
                    entityStream.write(EOL);
                    entityStream.write(DATA_LEAD);
                    if (b[j] == '\r' && b[j + 1] == '\n') {
                        j++;
                    }
                    i = ++j;
                }
                if (i < end) {
                    entityStream.write(b, i, end - i);
                }
                lastChar = b[end];
            }
        }

        // Flushes the byte that is still pending in lastChar.
        void finish() throws IOException {
            if (lastChar != -1) {
                write(-1);
            }
        }
    }

    @Benchmark
    public void originalCode(MyState state) throws IOException {
        try (final var dls = new DataLeadStream0(state.os)) {
            for (int i = 0, n = 1000000 / state.SIZE; i < n; i++)
                dls.write(state.data);
        }
    }

    // Original implementation: overrides only write(int) and passes \r through unchanged.
    private static final class DataLeadStream0 extends OutputStream {
        private final OutputStream entityStream;
        private boolean start = true;

        private DataLeadStream0(OutputStream entityStream) {
            this.entityStream = entityStream;
        }

        @Override
        public void write(final int i) throws IOException {
            if (start) {
                entityStream.write(DATA_LEAD);
                start = false;
            }
            entityStream.write(i);
            if (i == '\n') {
                entityStream.write(DATA_LEAD);
            }
        }
    }
} |
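One difference visible in the benchmark source is that DataLeadStream0 overrides only write(int), while DataLeadStream also overrides write(byte[], int, int). The default java.io.OutputStream bulk write loops over write(int), so a stream that overrides only the single-byte method hands its target one call per byte. The sketch below merely counts those calls; it is an illustration under that assumption, not the author's stated explanation (part of that sentence is missing from the thread) and not a measurement. All class names are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;

public class WriteCallCountSketch {

    /** Sink that only counts how many write calls reach it. */
    static final class CountingSink extends OutputStream {
        int calls;

        @Override
        public void write(int b) {
            calls++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
        }
    }

    /** Overrides only write(int), so bulk writes fall back to the per-byte loop in OutputStream. */
    static final class PerByte extends OutputStream {
        private final OutputStream sink;

        PerByte(OutputStream sink) {
            this.sink = sink;
        }

        @Override
        public void write(int b) throws IOException {
            sink.write(b);
        }
    }

    /** Additionally overrides the bulk method and forwards the whole range in one call. */
    static final class Bulk extends OutputStream {
        private final OutputStream sink;

        Bulk(OutputStream sink) {
            this.sink = sink;
        }

        @Override
        public void write(int b) throws IOException {
            sink.write(b);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            sink.write(b, off, len);
        }
    }

    /** Writes a buffer of the given size once and returns the number of calls seen by the sink. */
    static int callsFor(boolean bulk, int size) throws IOException {
        CountingSink sink = new CountingSink();
        OutputStream out = bulk ? new Bulk(sink) : new PerByte(sink);
        out.write(new byte[size]);
        return sink.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("per-byte calls: " + callsFor(false, 1000)); // 1000
        System.out.println("bulk calls:     " + callsFor(true, 1000));  // 1
    }
}
```

When the target is a socket channel stream as in the benchmark, each of those calls is a separate write to the channel, which would be consistent with the gap growing with message size.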
@jansupol The build on Java 21 works fine, but fails on Java 11 with OSGi errors outside the scope of this MR. So I assume it is not my fault. Can you please check this? Thanks. |
Kindly asking for review. |
@mkarg Your test results look marvelous, I will check, thanks! |
Looks good, thanks!
According to https://html.spec.whatwg.org/multipage/server-sent-events.html#parsing-an-event-stream any line within an SSE Event MUST NOT contain any of the characters \n, \r nor the combination \r\n.
This PR also contains performance improvements, so the corrected code is in no case slower than the incorrect code, but even outperforms the original code in many cases. It replaces #5832.
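As a rough illustration of the rule quoted from the specification, the sketch below encodes a payload so that no emitted line contains \r or \n: the payload is split at \r\n, \r, or \n, and every resulting line gets its own `data: ` prefix and a single \n terminator. This is a simplified, String-based sketch with hypothetical names, not the stream-based implementation in this PR.

```java
public class SseDataEncodeSketch {

    /** Encodes a payload as one SSE event consisting only of data lines, followed by a blank line. */
    static String encodeData(String payload) {
        StringBuilder sb = new StringBuilder();
        // CRLF is listed first so that it counts as one line break rather than two.
        for (String line : payload.split("\r\n|\r|\n", -1)) {
            sb.append("data: ").append(line).append('\n');
        }
        return sb.append('\n').toString(); // the blank line terminates the event
    }

    public static void main(String[] args) {
        String event = encodeData("a\r\nb\rc\nd");
        System.out.println(event.equals("data: a\ndata: b\ndata: c\ndata: d\n\n"));
        System.out.println(event.contains("\r")); // false: no CR reaches the wire
    }
}
```

A client that joins the data lines with \n would then see the payload with all line breaks normalized to \n.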