Replies: 3 comments · 48 replies
-
@plewam can you please share the stream directory that we can use to reproduce this behavior, together with a definition file (edited so that it contains no sensitive info)?
-
@plewam please take a second to look and see how I edited your comment to "hide" the log. Otherwise, it takes up a huge amount of screen space. Another alternative is to upload a file, rather than paste a wall of text.
-
@lukebakken Got it, thanks. I didn't know that.
-
@kjnilsson I just did a VM restart and attached the log for you. I immediately have corrupted stream files. See the attached log: VM restart log
-
From then on the error remains (although CPU load returns to normal), the log file is flooded, and memory usage increases until the high watermark is reached. Once it is reached, the broker stops accepting connections.
-
Alternatively, a set of Stream PerfTest CLI flags that we can use to get a stream into that state would suffice. I assume that the size of an archive with 1 billion messages can be prohibitive in practice.
-
I think I have found the issue. It occurs when the very first chunk in a stream isn't written fully, which causes subsequent chunks to be written out of position. @plewam I don't know why the data was lost, since you say it was an ordered shutdown, but that kind of thing isn't uncommon in a virtualised environment, and RabbitMQ streams are meant to cope with it. It also highlights that running streams without replication gives you fairly weak data safety guarantees. Here is the PR to the streaming subsystem. Once that is merged I will update the RabbitMQ main branch.
-
Can you see whether you can download it here: https://drive.google.com/file/d/1Lv6_r_UDa5Wi5YPny2RJ5nXfkzL48r-c/view?usp=sharing
-
@kjnilsson Should I give RabbitMQ 3.13.0-rc.4 a try?
-
@kjnilsson Maybe one more observation from my side: I use this client library to publish to the stream: https://github.com/rabbitmq/rabbitmq-stream-dotnet-client
-
When you tested the version with the fix, did you upgrade an existing environment, or did you perform a clean install and re-declare all the streams?
-
I upgraded an existing environment, but deleted the stream directory. However, the issue persists.
-
Hi All,
I have been a long-time user of classic queues within RMQ. Recently I started using RMQ streams.
RMQ Version: 3.13-RC2
Erlang Version:
Platform: Windows Server 2019
My streams are configured to hold each dataset for 21 days. As of now this has resulted in a combined stream size of around a billion messages.
As the stream size grows I encounter multiple problems.
1. Problem - Crazy log writing of the below error
2. Problem - Increase in memory (Other ETS Tables)

3. Problem - Long startup times and continuously high CPU usage
Starting RabbitMQ takes up to 5 minutes before the management interface becomes responsive. Until then I also have connectivity problems. Throughout that whole time the CPU usage is at 100%.
All three of these problems might be linked to each other. As a temporary fix, I have reduced the number of days a message is held inside the stream back to 3 days. It seems I am stable again now.
Any help would be highly appreciated, especially about Problem 1.
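For scale, the retention change described above can be put in rough numbers. This is a back-of-the-envelope sketch only: it assumes a steady publish rate, which the post does not state, and the helper name `retained_messages` is made up for illustration. It takes the roughly 1 billion messages held at 21 days and scales that count linearly to a 3-day window.

```python
def retained_messages(total: float, days: int, new_days: int) -> float:
    """Scale a retained-message count linearly with the retention window.

    Assumes a constant publish rate (an assumption, not a figure from the post).
    """
    per_day = total / days  # average messages published per day
    return per_day * new_days


if __name__ == "__main__":
    total_at_21_days = 1_000_000_000  # "around a billion messages" from the post
    per_day = total_at_21_days / 21
    print(f"{per_day:,.0f} messages/day on average")
    print(f"{retained_messages(total_at_21_days, 21, 3):,.0f} messages retained at 3 days")
```

Under that assumption, dropping retention from 21 days to 3 days cuts the retained volume to one seventh, roughly 143 million messages.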