config: multiline: in_tail: filter_multiline: Add configurable buffer limit for multiline interface #10653

Open: wants to merge 19 commits into base: master

Conversation

cosmo0920
Contributor

@cosmo0920 cosmo0920 commented Jul 28, 2025

This PR adds an interface for a configurable buffer limit for multiline processing.
It also makes the processing of multiline concatenations more robust.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • New Features

    • Added a configurable buffer size limit for multiline log message concatenation.
    • Introduced warnings and metrics for truncated multiline messages due to buffer limits.
    • Truncated multiline messages are now marked with metadata for easier identification.
  • Bug Fixes

    • Improved handling and logging to clearly distinguish between append failures and buffer truncations.
  • Tests

    • Added tests to verify behavior when multiline buffer limits are reached and truncation occurs.

cosmo0920 and others added 5 commits July 28, 2025 11:14
Signed-off-by: Hiroshi Hatake <[email protected]>
Co-authored-by: Eduardo Silva <[email protected]>
Signed-off-by: Hiroshi Hatake <[email protected]>
Co-authored-by: Eduardo Silva <[email protected]>
Signed-off-by: Hiroshi Hatake <[email protected]>
Co-authored-by: Eduardo Silva <[email protected]>
Signed-off-by: Hiroshi Hatake <[email protected]>
Co-authored-by: Eduardo Silva <[email protected]>
Signed-off-by: Hiroshi Hatake <[email protected]>
Co-authored-by: Eduardo Silva <[email protected]>
cosmo0920 and others added 6 commits August 1, 2025 20:21
1 is also indicated for FLB_TRUE.

Signed-off-by: Hiroshi Hatake <[email protected]>
This commit updates the expected output for the 'container_mix' unit test.

Previously, the multiline engine could incorrectly merge pending messages
when the log stream switched between different parser types (e.g., from `docker` to `cri`).
The test's original expectations were written to match this buggy behavior.

Recent fixes have made the engine's state handling more robust and precise.
It now correctly flushes a pending message when the parser context changes,
preventing improper merges. This change aligns the test case with the new, correct logic.

Signed-off-by: Hiroshi Hatake <[email protected]>
Co-authored-by: Eduardo Silva <[email protected]>
@cosmo0920 cosmo0920 force-pushed the cosmo0920-add-limit-for-multiline-concatenation branch from 578f46a to 3a8abd9 Compare August 1, 2025 11:28

coderabbitai bot commented Aug 1, 2025

Walkthrough

This change introduces a configurable buffer size limit for multiline message concatenation throughout the Fluent Bit codebase. It extends configuration structures, adds new status codes and logic for truncation, updates multiline processing and logging to handle buffer limits, and includes tests for truncation scenarios.

Changes

| Cohort | File(s) | Change Summary |
| --- | --- | --- |
| Config Struct & Macro Extension | include/fluent-bit/flb_config.h | Added multiline_buffer_limit field to flb_config struct and defined the macro FLB_CONF_STR_MULTILINE_BUFFER_LIMIT. |
| Multiline Buffer Limit & Status Codes | include/fluent-bit/multiline/flb_ml.h | Introduced buffer size limit constant, new status codes, and added fields for truncation tracking in multiline structs. |
| Multiline Group Function Declaration | include/fluent-bit/multiline/flb_ml_group.h | Declared new function flb_ml_group_cat for safe buffer appending with limit enforcement. |
| Multiline Filter Truncation Handling | plugins/filter_multiline/ml.c | Added metric counter and logging for truncated multiline appends, distinguishing truncation from other failures. |
| Tail Plugin Truncation Logging | plugins/in_tail/tail_file.c | Added warning log and metrics increment when multiline message truncation occurs due to buffer limit. |
| Tail Plugin Metric Addition | plugins/in_tail/tail_config.c, plugins/in_tail/tail_config.h | Added new metric counter and constant for tracking multiline truncation events in tail plugin. |
| Config Property Wiring | src/flb_config.c | Registered new config property and initialized multiline_buffer_limit with default value. |
| Multiline Core Logic & Status Propagation | src/multiline/flb_ml.c | Propagated truncation status, adjusted control flow for buffer limits, and marked truncated events in metadata. |
| Multiline Group Buffer Append Implementation | src/multiline/flb_ml_group.c | Implemented flb_ml_group_cat to safely append data with truncation and buffer limit awareness. |
| Multiline Rule Buffer Handling | src/multiline/flb_ml_rule.c | Switched to flb_ml_group_cat for buffer appends, handling truncation and immediate flushing as needed. |
| Multiline Stream Group Initialization | src/multiline/flb_ml_stream.c | Initialized new fields for truncation and stream pointer in stream group creation. |
| Multiline Truncation Test | tests/internal/multiline.c | Added test for buffer limit truncation, updated test data, and improved flush callback robustness. |
| Multiline Filter Plugin Metric Update | plugins/filter_multiline/ml.h | Added new metric ID and counter for truncated multiline events in filter plugin context. |
| Size Parsing Utility Addition | include/fluent-bit/flb_utils.h, src/flb_utils.c | Added flb_utils_size_to_binary_bytes function to parse size strings with binary units (KiB, MiB, GiB). |
| Size Parsing Utility Test | tests/internal/utils.c | Added tests for flb_utils_size_to_binary_bytes validating binary size string parsing. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Config
    participant MultilineEngine
    participant StreamGroup
    participant LogPlugin

    User->>Config: Set multiline_buffer_limit
    Config->>MultilineEngine: Pass buffer_limit on init
    LogPlugin->>MultilineEngine: Append log line
    MultilineEngine->>StreamGroup: Attempt to concatenate line
    alt Buffer within limit
        StreamGroup-->>MultilineEngine: FLB_MULTILINE_OK
        MultilineEngine-->>LogPlugin: Success
    else Buffer exceeds limit
        StreamGroup-->>MultilineEngine: FLB_MULTILINE_TRUNCATED
        MultilineEngine-->>LogPlugin: Warn: message truncated
    end
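
The two branches in the diagram above can be illustrated with a small, self-contained sketch of a limit-aware append. This is not the PR's flb_ml_group_cat: the struct, the function name ml_group_cat, and the ML_* status codes are hypothetical stand-ins, and the sketch assumes that bytes fitting under the limit are kept while the remainder is dropped.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes standing in for FLB_MULTILINE_OK / FLB_MULTILINE_TRUNCATED. */
enum ml_status { ML_ERROR = -1, ML_OK = 0, ML_TRUNCATED = 1 };

/* Hypothetical group state: a growable buffer with a hard byte limit. */
struct ml_group {
    char  *buf;        /* NUL-terminated concatenated content, or NULL */
    size_t len;        /* bytes currently stored (excluding the NUL) */
    size_t limit;      /* maximum bytes allowed in buf */
    int    truncated;  /* set once any append had to drop bytes */
};

/* Append up to `len` bytes of `data`, never letting the buffer exceed `limit`. */
static int ml_group_cat(struct ml_group *g, const char *data, size_t len)
{
    size_t room;
    size_t take;
    char *tmp;

    assert(g != NULL && data != NULL);

    /* Space left under the limit, and how much of the input fits into it. */
    room = (g->len < g->limit) ? g->limit - g->len : 0;
    take = (len <= room) ? len : room;

    if (take > 0) {
        tmp = realloc(g->buf, g->len + take + 1);
        if (tmp == NULL) {
            return ML_ERROR;   /* allocation failure is distinct from truncation */
        }
        g->buf = tmp;
        memcpy(g->buf + g->len, data, take);
        g->len += take;
        g->buf[g->len] = '\0';
    }

    if (take < len) {
        g->truncated = 1;      /* remember it so the flush path can mark the record */
        return ML_TRUNCATED;
    }
    return ML_OK;
}
```

With a 10-byte limit, appending "hello" returns ML_OK, and appending " world!" afterwards returns ML_TRUNCATED and leaves "hello worl" in the buffer. Returning a separate code for allocation failure mirrors the distinction the PR draws between append failures and truncations.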

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Poem

A buffer’s tale, a limit set,
To keep log lines from growing yet.
Truncation now, a warning bright,
Keeps memory safe throughout the night.
Multiline streams, concise and neat—
With tests to prove this feat complete!
🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 20c7695 and 5b808f5.

📒 Files selected for processing (1)
  • plugins/filter_multiline/ml.c (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • plugins/filter_multiline/ml.c
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-centos-7
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
  • GitHub Check: PR - fuzzing test

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
tests/internal/utils.c (1)

849-849: Fix typo in test name.

There's a typo in the test name: "test_size_to_bianry_bytes" should be "test_size_to_binary_bytes".

-    { "test_size_to_bianry_bytes", test_size_to_binary_bytes },
+    { "test_size_to_binary_bytes", test_size_to_binary_bytes },
src/multiline/flb_ml.c (1)

886-892: Consider logging invalid buffer limit configurations

The code silently falls back to the default when the buffer limit configuration is invalid. Consider adding a warning log to help users identify configuration issues.

 limit = flb_utils_size_to_binary_bytes(ml->config->multiline_buffer_limit);
 if (limit > 0) {
     ml->buffer_limit = (size_t)limit;
 }
 else {
+    if (ml->config->multiline_buffer_limit && strlen(ml->config->multiline_buffer_limit) > 0) {
+        flb_warn("[multiline] invalid buffer limit '%s', using default %zu bytes", 
+                 ml->config->multiline_buffer_limit, FLB_ML_BUFFER_LIMIT_DEFAULT);
+    }
     ml->buffer_limit = FLB_ML_BUFFER_LIMIT_DEFAULT;
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2168247 and 31386ce.

📒 Files selected for processing (13)
  • include/fluent-bit/flb_config.h (2 hunks)
  • include/fluent-bit/flb_utils.h (1 hunks)
  • include/fluent-bit/multiline/flb_ml.h (3 hunks)
  • plugins/filter_multiline/ml.c (4 hunks)
  • plugins/filter_multiline/ml.h (2 hunks)
  • plugins/in_tail/tail_config.c (1 hunks)
  • plugins/in_tail/tail_config.h (2 hunks)
  • plugins/in_tail/tail_file.c (2 hunks)
  • src/flb_config.c (2 hunks)
  • src/flb_utils.c (1 hunks)
  • src/multiline/flb_ml.c (13 hunks)
  • tests/internal/multiline.c (4 hunks)
  • tests/internal/utils.c (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
  • plugins/filter_multiline/ml.h
  • src/flb_config.c
  • plugins/filter_multiline/ml.c
  • plugins/in_tail/tail_config.h
  • include/fluent-bit/flb_config.h
  • plugins/in_tail/tail_config.c
  • plugins/in_tail/tail_file.c
  • include/fluent-bit/multiline/flb_ml.h
🧰 Additional context used
🧬 Code Graph Analysis (2)
tests/internal/utils.c (1)
src/flb_utils.c (1)
  • flb_utils_size_to_binary_bytes (610-694)
include/fluent-bit/flb_utils.h (1)
src/flb_utils.c (1)
  • flb_utils_size_to_binary_bytes (610-694)
🔇 Additional comments (14)
include/fluent-bit/flb_utils.h (1)

52-52: LGTM!

The function declaration is correctly placed and matches the implementation signature.

tests/internal/utils.c (1)

801-830: LGTM! Good test coverage for the new function.

The test cases cover various scenarios including edge cases and overflow conditions. The expected values are mathematically correct for binary units.

tests/internal/multiline.c (4)

114-115: LGTM!

The updated expected output correctly reflects the new multiline concatenation behavior where Docker log entries are properly concatenated based on stream context.


395-398: Good defensive programming!

Adding the NULL check prevents potential crashes when the callback is invoked without proper context data.


1463-1531: Well-structured test for buffer truncation!

The test effectively validates the multiline buffer truncation feature by:

  • Setting a realistic 80-byte limit for concatenated content
  • Using appropriate Docker JSON format for testing
  • Correctly expecting FLB_MULTILINE_OK for the first append and FLB_MULTILINE_TRUNCATED when the limit is exceeded

1543-1543: Test properly registered!

The new buffer truncation test is correctly added to the test suite.

src/multiline/flb_ml.c (8)

27-27: LGTM!

Required include for the new flb_utils_size_to_binary_bytes() function.


217-219: Proper truncation state tracking!

The function correctly tracks and propagates the truncation state from rule processing, ensuring callers are aware when buffer limits are exceeded.

Also applies to: 267-270, 347-350


421-422: Correct status propagation!

The function now properly returns the actual processing status including truncation, instead of masking it with a generic success code.

Also applies to: 495-495


607-613: Good clarification on error handling strategy!

The comment properly explains why sub-parser failures don't halt processing - multiline rules should still get a chance to process the raw text.


723-754: Excellent improvement to non-matching line handling!

The logic now correctly:

  1. Flushes all pending multiline data when a non-matching line is encountered
  2. Processes the non-matching line as a standalone message
  3. Properly tracks truncation status throughout

This prevents data loss and ensures proper message boundaries.
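
The flush-then-standalone behaviour described above can be sketched with a toy engine. Everything here is a simplifying assumption rather than the Fluent Bit implementation: the rule set (a line starting with "ERROR" opens a message, an indented line continues it), the fixed-size buffers, and the names engine_append, flush_pending, and emit are all hypothetical. Truncation tracking is left out to keep the sketch short.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_OUT 8
#define MAX_MSG 256

/* Collects emitted messages so the behaviour can be inspected. */
struct sink {
    char msgs[MAX_OUT][MAX_MSG];
    int  count;
};

/* Hypothetical engine state: one pending multiline message at most. */
struct engine {
    char pending[MAX_MSG];   /* empty string means nothing is buffered */
    struct sink *out;
};

static void emit(struct sink *s, const char *msg)
{
    if (s->count < MAX_OUT) {
        snprintf(s->msgs[s->count], MAX_MSG, "%s", msg);
        s->count++;
    }
}

/* Emit the buffered message, if any, and clear the buffer. */
static void flush_pending(struct engine *e)
{
    if (e->pending[0] != '\0') {
        emit(e->out, e->pending);
        e->pending[0] = '\0';
    }
}

/* Toy rules: "ERROR..." starts a message, leading whitespace continues one. */
static int is_start(const char *line) { return strncmp(line, "ERROR", 5) == 0; }
static int is_cont(const char *line)  { return line[0] == ' ' || line[0] == '\t'; }

static void engine_append(struct engine *e, const char *line)
{
    size_t used;

    assert(e != NULL && line != NULL);

    if (is_start(line)) {
        flush_pending(e);                       /* close the previous message */
        snprintf(e->pending, MAX_MSG, "%s", line);
    }
    else if (is_cont(line) && e->pending[0] != '\0') {
        used = strlen(e->pending);              /* snprintf silently clips at MAX_MSG */
        snprintf(e->pending + used, MAX_MSG - used, "\n%s", line);
    }
    else {
        flush_pending(e);                       /* 1. flush pending multiline data */
        emit(e->out, line);                     /* 2. emit the line on its own */
    }
}
```

Feeding "ERROR boom", "  at frame 1", and "plain line" in that order yields two messages: the two-line error, then "plain line" by itself, with nothing left pending.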


764-855: Consistent truncation handling across text and object paths!

The implementation properly mirrors the text append logic, ensuring consistent behavior regardless of input type.


1346-1384: Important fix for empty buffer handling!

The code now correctly handles the case where the multiline buffer is empty by packing the original map, preventing potential data loss.


1439-1444: Excellent observability enhancement!

Adding the multiline_truncated metadata field allows downstream consumers to identify when messages were truncated due to buffer limits. The flag is properly reset after flushing.

Also applies to: 1484-1484

The existing size_to_byte function converts using decimal multipliers:
1000 (K), 1000*K, and 1000*M. The new function instead converts using binary
multipliers: 1024 (KiB), 1024*KiB (MiB), and 1024*MiB (GiB).

Signed-off-by: Hiroshi Hatake <[email protected]>
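
The difference between the two conventions in the commit message above can be shown with a short sketch. The helper parse_size and its base parameter are hypothetical and are not the signature of flb_utils_size_to_binary_bytes. The sketch also assumes that only the first suffix letter matters, so "2K", "2KB", and "2KiB" are treated alike, which may differ from what the real function accepts.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse "<number>[K|M|G]..." into bytes.
 * base = 1000 gives decimal units, base = 1024 gives binary units.
 * Returns -1 on invalid input or overflow.
 */
static int64_t parse_size(const char *s, int64_t base)
{
    char *end;
    long long num;
    int64_t mult = 1;

    assert(base == 1000 || base == 1024);

    if (s == NULL || *s == '\0') {
        return -1;
    }

    errno = 0;
    num = strtoll(s, &end, 10);
    if (end == s || errno == ERANGE || num < 0) {
        return -1;              /* no digits, out of range, or negative */
    }

    /* Only the first suffix character is inspected in this sketch. */
    switch (toupper((unsigned char) *end)) {
    case '\0':
        break;                  /* plain byte count */
    case 'K':
        mult = base;
        break;
    case 'M':
        mult = base * base;
        break;
    case 'G':
        mult = base * base * base;
        break;
    default:
        return -1;              /* unknown unit */
    }

    if (num > INT64_MAX / mult) {
        return -1;              /* multiplication would overflow */
    }
    return (int64_t) num * mult;
}
```

Under these assumptions, parse_size("2K", 1000) gives 2000 while parse_size("2K", 1024) gives 2048, and parse_size("1M", 1024) gives 1048576. A caller that treats a non-positive result as "use the default" gets the fallback behaviour discussed in the review.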