Problem Description
The Decode function in encoding.go (lines 50-63) writes into a shared decodeBuffer field on the encoder, which can corrupt log data when processing messages from Kafka topics with two or more partitions.
This bug can cause decoded log bodies from different Kafka records to be interleaved. For example, if two log messages with bodies AAAA and BBB are processed concurrently from different partitions, the resulting stored body may become something like BBBA, indicating that parts of one message have overwritten the other due to the shared decodeBuffer.
Issue Details
When multiple Kafka partitions are processed concurrently, the Decode function may be called simultaneously from different goroutines. Since e.decodeBuffer is a shared instance variable, this creates a race condition where:
- Multiple goroutines can read/write to the same decodeBuffer concurrently
- The buffer resizing logic (e.decodeBuffer = make([]byte, len(e.decodeBuffer)*2)) can cause data corruption
- Different messages may overwrite each other's decoded data
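The aliasing behind these points can be reproduced deterministically, even without goroutines. The sketch below is not the project's code: sharedDecoder, newSharedDecoder, and interleave are hypothetical names, and a plain copy stands in for the real transform step. It assumes only what is described above: one buffer shared by every call, doubled when too small, with the returned slice pointing into that buffer.

```go
package main

// sharedDecoder is a hypothetical stand-in for a type that keeps a single
// decodeBuffer for all Decode calls. It is not the project's actual code.
type sharedDecoder struct {
	decodeBuffer []byte
}

// newSharedDecoder starts with a small non-empty buffer so that the
// doubling step in Decode always makes progress.
func newSharedDecoder() *sharedDecoder {
	return &sharedDecoder{decodeBuffer: make([]byte, 8)}
}

// Decode copies msg into the shared buffer, doubling the buffer while it is
// too small, and returns a slice that aliases the shared buffer.
func (d *sharedDecoder) Decode(msg []byte) []byte {
	for len(d.decodeBuffer) < len(msg) {
		d.decodeBuffer = make([]byte, len(d.decodeBuffer)*2)
	}
	n := copy(d.decodeBuffer, msg)
	return d.decodeBuffer[:n]
}

// interleave decodes two bodies back to back and returns both results as
// they look after the second call has reused the buffer.
func interleave() (first, second string) {
	d := newSharedDecoder()
	a := d.Decode([]byte("AAAA")) // a aliases decodeBuffer[:4]
	b := d.Decode([]byte("BBB"))  // overwrites the first three bytes of a
	return string(a), string(b)
}
```

Here interleave returns "BBBA" for the first body, because the second call overwrote the bytes the first result still points at. When the calls run on separate goroutines, the writes themselves are also unsynchronized, so the outcome depends on scheduling.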
In practice, this means that log bodies from multiple records can be mixed together (e.g. AAAA and BBB being stored as BBBA), which is a clear violation of log data integrity.
Expected Behavior
Each decode operation should be thread-safe and maintain data integrity regardless of the number of Kafka partitions.
Suggested Solution
- Allocate a new buffer for each decode operation
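A minimal sketch of this suggestion follows, under the same assumptions as the description above. safeDecoder, initialBufferSize, and decodeConcurrently are hypothetical names, and copy again stands in for the real transform step. The only point being illustrated is that the buffer is local to each call, so results never alias one another.

```go
package main

import "sync"

// initialBufferSize is an assumed starting size, not a value from the project.
const initialBufferSize = 16

// safeDecoder is a hypothetical decoder with no shared mutable state.
type safeDecoder struct{}

// Decode allocates a fresh buffer for every call, doubling it while it is
// too small, so the returned slice is owned by the caller alone.
func (safeDecoder) Decode(msg []byte) []byte {
	buf := make([]byte, initialBufferSize)
	for len(buf) < len(msg) {
		buf = make([]byte, len(buf)*2)
	}
	n := copy(buf, msg)
	return buf[:n]
}

// decodeConcurrently decodes each body on its own goroutine, mimicking
// records arriving from several partitions at once.
func decodeConcurrently(bodies []string) []string {
	var d safeDecoder
	out := make([]string, len(bodies))
	var wg sync.WaitGroup
	for i, body := range bodies {
		wg.Add(1)
		go func(i int, body string) {
			defer wg.Done()
			// Each goroutine writes only to its own index of out.
			out[i] = string(d.Decode([]byte(body)))
		}(i, body)
	}
	wg.Wait()
	return out
}
```

Calling decodeConcurrently([]string{"AAAA", "BBB"}) returns the bodies unchanged and in order. The cost is one allocation per message. If that proves too expensive, the alternative is to keep the shared buffer and guard it with a mutex, provided the result is copied out before the lock is released.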