Add new streaming mode for LargeText#722
Conversation
- Clients request this new streaming mode via the request header `X-Streaming: true`. The server echoes it back to signal support for it. - A final new line is added to the body with a JSON payload that includes the start/end/completed state plus additional metadata from child classes. E.g. AnnotatedLargeText in Jenkins core will add the ConsoleAnnotator state when incrementally reading annotations. We are effectively creating a multi-part response with the first part being the plain-text/html console log followed by the second part with the metadata. - Bonus: Drop the line end normalization that was required for internet explorer. IE is long dead. - Bonus: Read the log in larger chunks of 64KiB and skip the buffering of each line. 1kiB disk reads are far too small for today's disks, from NVMe disks to spinning rust; with all the other buffering removed, going to 1MiB would probably be fine too. - Bonus: Add support for tailing via a negative `?start` query parameter in LargeText. It will try to find the next new line boundary to start reading from and fallback to the fixed tail if missing. The JSON blob will contain "startFromNewLine":true when starting from a new line. The same heuristic is available when fetching more bytes via `?searchNewLineUntil=<last start>`.
- Clients request this new streaming mode via the standard request
header 'Accept: multipart/form-data'. The server responds with a
standard multipart response, i.e.
'Content-Type: multipart/form-data;boundary=<uuid>'.
- The browser fetch api has support for reading form-data multipart
responses using 'response.formData()'.
- The multipart response contains two parts:
- "name=text", the original LargeText content and encoding as provided
by the source (default UTF-8) via 'setContentType()'.
- "name=meta", the json payload with streaming metadata as described
in the previous commit.
core/src/main/java/org/kohsuke/stapler/framework/io/LargeText.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/kohsuke/stapler/framework/io/LargeText.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/kohsuke/stapler/framework/io/LargeText.java
Outdated
Show resolved
Hide resolved
core/src/test/java/org/kohsuke/stapler/framework/io/LargeTextTest.java
Outdated
Show resolved
Hide resolved
- use descriptive variables - move ?searchNewLineUntil parameter name into constant - use text blocks instead of escaping JSON in simple strings Co-authored-by: Tim Jacomb <21194782+timja@users.noreply.github.com>
|
Thanks again for the review, Tim. I've addressed your points in 3a6f536. |
core/src/main/java/org/kohsuke/stapler/framework/io/LargeText.java
Outdated
Show resolved
Hide resolved
timja
left a comment
There was a problem hiding this comment.
Thanks! I would prefer another review though if possible.
|
The multipart system seems somewhat tricky; could you not just use WebSocket? At any rate, looks OK to me from a quick glance but I do not have time for any sort of thorough review I am afraid. |
Did you see the "simple implementation"?
Something like the following?
Using SSE / |
|
On using SSE/EventSource:
That would imply a limit of 6 concurrent Jenkins browser tabs, which seems quite low. Presumably we want to retain HTTP/1.1 support? We could fall back to polling when detecting no messages on the stream, which could indicate that too many connections are open already. What do we fallback to though? The old streaming that results in excessive memory usage/OOM that wanted to fix here? Apart from the connection limit, SSE is trivial to implement as there is nothing to negotiate, "just" send Leaving the transport aside, I've taken a look at implementing this. An immediate issue that came up is that |
Can we shutdown the SSE stream when the tab visibility is changed?
Yes
This new implementation potentially? |
👍 Good idea!
👍
👍 WDYT about getting this implementation out then and follow up with SSE/WebSocket afterwards? |
I've taken another look at this. It will be quite some work to make this work as all the usages of |
|
To be clear, I never requested WebSocket support. I was just surprised to see new code which logically involves real-time streaming not making use of it. If what you have works well enough, fine. |
This PR adds a new streaming mode for
LargeTextthat allows unbuffered reads (on the stapler layer).There are two implementations in this PR: a "simple" mixed content type response (
<log>\n<metadata>); and a proper RFC compliantmultipart/form-dataresponse body behind standard content negotiation viaAccepts.Support for
multipart/form-datais good in the browser (part of thefetchspec), but rather poor in Java/Jenkins test framework. With that said and after a few iterations on the implementation, I'm quite happy with the multipart approach and actually prefer it over the other mode.Example:
full curl
Build: Stapler -> Jenkins -> Workflow-api -> pipeline-graph-view-plugin
Tail
I'll copy the implementation notes for each commit here:
Simple implementation:
Clients request this new streaming mode via the request header
X-Streaming: true.The server echoes it back to signal support for it.
A final new line is added to the body with a JSON payload that
includes the start/end/completed state plus additional metadata from
child classes. E.g. AnnotatedLargeText in Jenkins core will add the
ConsoleAnnotator state when incrementally reading annotations.
We are effectively creating a multi-part response with the first part
being the plain-text/html console log followed by the second part with
the metadata.
Bonus: Drop the line end normalization that was required for internet
explorer. IE is long dead.
Bonus: Read the log in larger chunks of 64KiB and skip the buffering
of each line.
1kiB disk reads are far too small for today's disks, from NVMe disks
to spinning rust; with all the other buffering removed, going to 1MiB
would probably be fine too.
The default stream buffer size in Node.js is 64KiB: https://nodejs.org/docs/latest-v22.x/api/stream.html#streamgetdefaulthighwatermarkobjectmode
Bonus: Add support for tailing via a negative
?startquery parameterin LargeText. It will try to find the next new line boundary to start
reading from and fallback to the fixed tail if missing. The JSON blob
will contain "startFromNewLine":true when starting from a new line.
The same heuristic is available when fetching more bytes via
?searchNewLineUntil=<last start>.multipart/form-dataimplementationheader 'Accept: multipart/form-data'. The server responds with a
standard multipart response, i.e.
'Content-Type: multipart/form-data;boundary='.
responses using 'response.formData()'.
by the source (default UTF-8) via 'setContentType()'.
in the previous commit.
Testing done
See extensive test suite. Adopt it in progressive text in Jenkins. Adopt it in the pipeline-graph-view-plugin.
Submitter checklist