Description
A note for the community
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Problem
Splunk HEC has two endpoint targets: "event" (JSON) and "raw" (text).
Raw events should be framed:
HTTP Event Collector can parse raw text and extract one or more events. HEC expects that the HTTP request contains one or more events with line-breaking rules in effect.
(emphasis mine, source: https://docs.splunk.com/Documentation/SplunkCloud/9.3.2408/Data/FormateventsforHTTPEventCollector#Event_parsing)
Technically, on the Splunk side, you can configure event splitting. The default is:
LINE_BREAKER=([\r\n]+)
See https://docs.splunk.com/Documentation/SplunkCloud/latest/Data/Configureeventlinebreaking.
In practice, this means there can be cases where you'd want a custom character delimiter for framing in Vector, to match the LINE_BREAKER configured in Splunk.
By default, however, newline framing is the expectation.
Currently, the Splunk HEC sink does NO framing and has NO config options to control it, unlike most other sinks.
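To make the effect concrete, here is a minimal Python sketch of the difference between an unframed and a framed request body. It is not Vector or Splunk code: `frame` and `split_like_splunk` are hypothetical helpers, the sample events are shortened stand-ins, and the splitting only approximates Splunk's behavior (break at each LINE_BREAKER match and discard the text of the first capture group), ignoring settings such as line merging.

```python
import re

# Splunk's default LINE_BREAKER, as quoted above.
DEFAULT_LINE_BREAKER = r"([\r\n]+)"


def frame(events, delimiter="\n"):
    """Hypothetical helper: terminate each raw event with the delimiter."""
    return "".join(event + delimiter for event in events)


def split_like_splunk(body, line_breaker=DEFAULT_LINE_BREAKER):
    """Approximate event breaking for a pattern with exactly one capture group.

    re.split returns the captured delimiters at odd indices, so keep the
    even-indexed pieces and drop empty ones.
    """
    pieces = re.split(line_breaker, body)[::2]
    return [piece for piece in pieces if piece]


# Shortened, made-up syslog events for illustration only.
events = [
    "<28>Apr 30 16:51:52 host-a fwd[3303]: first message",
    "<81>Apr 30 16:51:53 host-b scraper[1082]: second message",
]

unframed_body = "".join(events)        # what the sink sends today, per this issue
newline_body = frame(events)           # newline framing
custom_body = frame(events, "|")       # custom delimiter framing

print(len(split_like_splunk(unframed_body)))         # one merged event
print(len(split_like_splunk(newline_body)))          # one event per input
print(len(split_like_splunk(custom_body, r"(\|)")))  # needs a matching LINE_BREAKER
```

The custom-delimiter case only splits correctly when the Splunk side is configured with a matching LINE_BREAKER, which is why a configurable framer on the sink would be useful.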
Configuration
sources:
  syslog:
    type: demo_logs
    format: bsd_syslog
    interval: 0.25
sinks:
  splunk_raw:
    type: splunk_hec_logs
    inputs: [syslog]
    endpoint_target: raw
    acknowledgements:
      enabled: true
    endpoint: https://splunk.example.com:8088/
    default_token: xxx
    encoding:
      codec: raw_message
    sourcetype: syslog

The demo_logs interval of 0.25 ensures ~4 events/sec, so there will be multiple events in each batch sent to Splunk (default batch timeout of 1 sec).
Version
0.45.0
Debug Output
(Debug output does not contain anything useful, but if there's something specific you want, let me know.)
Example Data
Not dependent on event/data type. An easy way to repro is to use a demo_logs source to send some fake syslog to Splunk HEC raw. (See the example config above.)
For example, here are several demo syslog events that did not get split properly, ending up as a single event in Splunk:
<28>Apr 30 16:51:52 names.xn--3pxu8k fwd[3303]: Great Scott! We're never gonna reach 88 mph with the flux capacitor in its current state!<81>Apr 30 16:51:53 for.sakura scraper[1082]: Great Scott! We're never gonna reach 88 mph with the flux capacitor in its current state!<57>Apr 30 16:51:53 we.beauty alerter[9857]: #hugops to everyone who has to deal with this<42>Apr 30 16:51:53 random.audi fwd[7972]: You're not gonna believe what just happened
Additional Context
- I'm still on 0.45.0, but I see no commits in 0.46.x that would change this behavior
- Not sure of the best way to get a Framer into the Splunk HEC sink/config, given that it's internally pretty different from most others
References
No response