Data consistency issue with Encoding.Decode when using Kafka with multiple partitions #747

@OhJuhun

Description

Problem Description

The Decode function in encoding.go (lines 50-63) uses a shared decodeBuffer field on the decoder instance, which can corrupt log data when messages are consumed from Kafka topics with 2 or more partitions.

https://github.com/SigNoz/signoz-otel-collector/blob/main/internal/coreinternal/textutils/encoding.go#L50-L63

This bug can cause decoded log bodies from different Kafka records to be interleaved. For example, if two log messages with bodies AAAA and BBB are processed concurrently from different partitions, the resulting stored body may become something like BBBA, indicating that parts of one message have overwritten the other due to the shared decodeBuffer.

Issue Details

When multiple Kafka partitions are processed concurrently, the Decode function may be called simultaneously from different goroutines. Since e.decodeBuffer is a shared instance field, this creates a race condition where:

  1. Multiple goroutines can read/write to the same decodeBuffer concurrently
  2. The buffer resizing logic (e.decodeBuffer = make([]byte, len(e.decodeBuffer)*2)) can cause data corruption
  3. Different messages may overwrite each other's decoded data

In practice, this means that log bodies from multiple records can be mixed together (e.g. AAAA and BBB being stored as BBBA), which is a clear violation of log data integrity.

Expected Behavior

Each decode operation should be thread-safe and maintain data integrity regardless of the number of Kafka partitions.

Suggested Solution

  • Allocate a new buffer for each decode operation
