Merged

46 commits
e12eced
input plugin that reads files each interval
MrMaxBuilds Jun 21, 2018
08a11d7
change config file
MrMaxBuilds Jun 21, 2018
9c4b522
tweak metric output
MrMaxBuilds Jun 21, 2018
4e24a1b
add grok as a top level parser
MrMaxBuilds Jun 21, 2018
ec7f131
add more test files
MrMaxBuilds Jun 21, 2018
504d978
clean up some test cases
MrMaxBuilds Jun 21, 2018
542c030
knock more errors from test files
MrMaxBuilds Jun 21, 2018
554b960
add setparser to reader
MrMaxBuilds Jun 25, 2018
36a23ea
Merge branch 'master' into plugin/reader
MrMaxBuilds Jun 25, 2018
f40371e
add init function to reader
MrMaxBuilds Jun 25, 2018
9c84595
add grok as a top level parser, still need README
MrMaxBuilds Jun 25, 2018
cc40629
allow for import from plugins/all
MrMaxBuilds Jun 25, 2018
79d9ea4
add docker-image spin up for reader
MrMaxBuilds Jun 26, 2018
bbd68b3
docker will spin up
MrMaxBuilds Jun 26, 2018
bf7220d
add test file to docker spin up
MrMaxBuilds Jun 26, 2018
a931eb1
update DATA_FORMATS_INPUT.MD to include grok
MrMaxBuilds Jun 26, 2018
e450b26
remove comments
MrMaxBuilds Jun 26, 2018
001658a
condense telegraf.conf
MrMaxBuilds Jun 26, 2018
7fa27f4
more condensing
MrMaxBuilds Jun 26, 2018
1be2a8e
Formatting and revert Makefile
glinton Jun 26, 2018
aa750ec
add reader README.md
MrMaxBuilds Jun 27, 2018
892c95a
update readmes
MrMaxBuilds Jun 27, 2018
04f09d6
grok parser func unexported
MrMaxBuilds Jun 28, 2018
8063b38
address some of Daniel's comments
MrMaxBuilds Jul 3, 2018
bfc13a7
incomplete changes to logparser plugin
MrMaxBuilds Jul 3, 2018
67db143
still unfinished logparser changes
MrMaxBuilds Jul 3, 2018
8a9da28
logparser is linked to grok parser
MrMaxBuilds Jul 6, 2018
cafa95e
logparser no longer uses seperate grok
MrMaxBuilds Jul 6, 2018
c6087ab
add more unit tests to grok parser
MrMaxBuilds Jul 6, 2018
e4b6f23
fix unit tests for grok parser
MrMaxBuilds Jul 6, 2018
d224673
change logparser unit tests
MrMaxBuilds Jul 9, 2018
f52ceeb
test files added for logparser
MrMaxBuilds Jul 9, 2018
285cf0b
Merge branch 'master' into plugin/reader
MrMaxBuilds Jul 12, 2018
0c3ac29
addresses daniel's comments
MrMaxBuilds Jul 12, 2018
74900ed
change parser config names
MrMaxBuilds Jul 12, 2018
d0f5389
allow for original config and functionality of logparser
MrMaxBuilds Jul 12, 2018
dd778a9
finish daniel's changes
MrMaxBuilds Jul 13, 2018
5449eb7
small change to config
MrMaxBuilds Jul 13, 2018
1f58dd7
Rename reader to file input
danielnelson Jul 14, 2018
2a18ca2
Attempt linking to another plugin
danielnelson Jul 14, 2018
50f49fe
Adjust link to data format docs
danielnelson Jul 14, 2018
08d1397
Rename Reader struct to File
danielnelson Jul 14, 2018
63da4e6
Move measurement option to logparser grok config
danielnelson Jul 14, 2018
88a85a7
Set default data_format to influx
danielnelson Jul 14, 2018
2638186
Add deprecation messages to logparser input
danielnelson Jul 14, 2018
1fe3adb
Fix dev config for file input
danielnelson Jul 14, 2018
14 changes: 13 additions & 1 deletion Makefile
@@ -92,4 +92,16 @@ docker-image:
plugins/parsers/influx/machine.go: plugins/parsers/influx/machine.go.rl
ragel -Z -G2 $^ -o $@

.PHONY: deps telegraf install test test-windows lint vet test-all package clean docker-image fmtcheck uint64
static:
Contributor:
This may cause a merge conflict when we merge #4324 unless you merged that branch into yours.

Contributor:
What I do when I need to have parts from another commit/branch is either cherry-pick that commit or make the change to the shared file, and just not commit the changes to that file from my new branch.
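The two workflows described above can be sketched in a throwaway repo (all branch, file, and commit names below are invented for illustration):

```shell
# Throwaway demo of the cherry-pick approach; names are made up.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
git config user.email dev@example.com
git config user.name dev

echo "all:" > Makefile
git add Makefile && git commit -qm "initial commit"
base=$(git rev-parse --abbrev-ref HEAD)

# A branch that carries the shared Makefile change.
git checkout -qb shared-change
printf 'static:\n\tgo build ./cmd/telegraf\n' >> Makefile
git add Makefile && git commit -qm "add static target"
shared_sha=$(git rev-parse HEAD)

# Approach 1: cherry-pick that single commit onto the feature branch.
git checkout -q "$base"
git checkout -qb plugin/reader
git cherry-pick "$shared_sha" >/dev/null
grep -c "static:" Makefile   # -> 1

# Approach 2 (alternative): make the change locally but keep it out of
# your commits: `git add --all && git reset -- Makefile` before committing.
```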

@echo "Building static linux binary..."
@CGO_ENABLED=0 \
GOOS=linux \
GOARCH=amd64 \
go build -ldflags "$(LDFLAGS)" ./cmd/telegraf

plugin-%:
@echo "Starting dev environment for $${$(@)} input plugin..."
@docker-compose -f plugins/inputs/$${$(@)}/dev/docker-compose.yml up

.PHONY: deps telegraf install test test-windows lint vet test-all package clean docker-image fmtcheck uint64 static

35 changes: 34 additions & 1 deletion docs/DATA_FORMATS_INPUT.md
@@ -9,6 +9,7 @@ Telegraf is able to parse the following input data formats into metrics:
1. [Nagios](https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md#nagios) (exec input only)
1. [Collectd](https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md#collectd)
1. [Dropwizard](https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md#dropwizard)
1. [Grok](https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md#grok)

Telegraf metrics, like InfluxDB
[points](https://docs.influxdata.com/influxdb/v0.10/write_protocols/line/),
@@ -651,5 +652,37 @@ For more information about the dropwizard json format see
# [inputs.exec.dropwizard_tag_paths]
# tag1 = "tags.tag1"
# tag2 = "tags.tag2"
```

#### Grok
Parse logstash-style "grok" patterns:
```toml
[inputs.reader]
## This is a list of patterns to check the given log file(s) for.
## Note that adding patterns here increases processing time. The most
## efficient configuration is to have one pattern per logparser.
## Other common built-in patterns are:
## %{COMMON_LOG_FORMAT} (plain apache & nginx access logs)
## %{COMBINED_LOG_FORMAT} (access logs + referrer & agent)
patterns = ["%{COMBINED_LOG_FORMAT}"]

## Name of the output measurement.
name_override = "apache_access_log"

## Full path(s) to custom pattern files.
custom_pattern_files = []

## Custom patterns can also be defined here. Put one pattern per line.
custom_patterns = '''
'''

## Timezone allows you to provide an override for timestamps that
## don't already include an offset
## e.g. 04/06/2016 12:41:45 data one two 5.43µs
##
## Default: "" which renders UTC
## Options are as follows:
## 1. Local -- interpret based on machine localtime
## 2. "Canada/Eastern" -- Unix TZ values like those found in https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
## 3. UTC -- or blank/unspecified, will return timestamp in UTC
timezone = "Canada/Eastern"
```
58 changes: 58 additions & 0 deletions internal/config/config.go
@@ -1338,6 +1338,59 @@ func buildParser(name string, tbl *ast.Table) (parsers.Parser, error) {
}
}

//for grok data_format
if node, ok := tbl.Fields["named_patterns"]; ok {
if kv, ok := node.(*ast.KeyValue); ok {
if ary, ok := kv.Value.(*ast.Array); ok {
for _, elem := range ary.Value {
if str, ok := elem.(*ast.String); ok {
c.NamedPatterns = append(c.NamedPatterns, str.Value)
}
}
}
}
}

if node, ok := tbl.Fields["patterns"]; ok {
if kv, ok := node.(*ast.KeyValue); ok {
if ary, ok := kv.Value.(*ast.Array); ok {
for _, elem := range ary.Value {
if str, ok := elem.(*ast.String); ok {
c.Patterns = append(c.Patterns, str.Value)
}
}
}
}
}

if node, ok := tbl.Fields["custom_patterns"]; ok {
if kv, ok := node.(*ast.KeyValue); ok {
if str, ok := kv.Value.(*ast.String); ok {
c.CustomPatterns = str.Value
}
}
}

if node, ok := tbl.Fields["custom_pattern_files"]; ok {
if kv, ok := node.(*ast.KeyValue); ok {
if ary, ok := kv.Value.(*ast.Array); ok {
for _, elem := range ary.Value {
if str, ok := elem.(*ast.String); ok {
c.CustomPatternFiles = append(c.CustomPatternFiles, str.Value)
}
}
}
}
}

if node, ok := tbl.Fields["timezone"]; ok {
if kv, ok := node.(*ast.KeyValue); ok {
if str, ok := kv.Value.(*ast.String); ok {
c.TimeZone = str.Value
}
}
}

c.MetricName = name

delete(tbl.Fields, "data_format")
@@ -1353,6 +1406,11 @@ func buildParser(name string, tbl *ast.Table) (parsers.Parser, error) {
delete(tbl.Fields, "dropwizard_time_format")
delete(tbl.Fields, "dropwizard_tags_path")
delete(tbl.Fields, "dropwizard_tag_paths")
delete(tbl.Fields, "named_patterns")
delete(tbl.Fields, "patterns")
delete(tbl.Fields, "custom_patterns")
delete(tbl.Fields, "custom_pattern_files")
delete(tbl.Fields, "timezone")
Contributor:

Since these are all at the plugin level now, let's prefix them with grok_.


return parsers.NewParser(c)
}
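The option-parsing blocks above repeat the same nested type assertions over the TOML AST. A hedged sketch of a helper that collapses the string-array cases (the types here are simplified stand-ins for github.com/influxdata/toml/ast, and the helper name is ours, not in the PR):

```go
package main

import "fmt"

// Minimal stand-ins for the toml ast node types used in buildParser.
type Value interface{}
type String struct{ Value string }
type Array struct{ Value []Value }
type KeyValue struct{ Value Value }

// stringSlice collapses the nested assertions repeated for
// named_patterns, patterns, and custom_pattern_files into one helper.
func stringSlice(node Value) []string {
	var out []string
	kv, ok := node.(*KeyValue)
	if !ok {
		return out
	}
	ary, ok := kv.Value.(*Array)
	if !ok {
		return out
	}
	for _, elem := range ary.Value {
		if str, ok := elem.(*String); ok {
			out = append(out, str.Value)
		}
	}
	return out
}

func main() {
	node := &KeyValue{Value: &Array{Value: []Value{
		&String{Value: "%{COMBINED_LOG_FORMAT}"},
		&String{Value: "%{COMMON_LOG_FORMAT}"},
	}}}
	fmt.Println(stringSlice(node)) // prints [%{COMBINED_LOG_FORMAT} %{COMMON_LOG_FORMAT}]
}
```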
1 change: 1 addition & 0 deletions plugins/inputs/all/all.go
@@ -85,6 +85,7 @@ import (
_ "github.com/influxdata/telegraf/plugins/inputs/puppetagent"
_ "github.com/influxdata/telegraf/plugins/inputs/rabbitmq"
_ "github.com/influxdata/telegraf/plugins/inputs/raindrops"
_ "github.com/influxdata/telegraf/plugins/inputs/reader"
_ "github.com/influxdata/telegraf/plugins/inputs/redis"
_ "github.com/influxdata/telegraf/plugins/inputs/rethinkdb"
_ "github.com/influxdata/telegraf/plugins/inputs/riak"
13 changes: 13 additions & 0 deletions plugins/inputs/reader/dev/docker-compose.yml
@@ -0,0 +1,13 @@
version: '3'

services:
telegraf:
image: glinton/scratch
volumes:
- ./telegraf.conf:/telegraf.conf
- ../../../../telegraf:/telegraf
- ./json_a.log:/var/log/test.log
entrypoint:
- /telegraf
- --config
- /telegraf.conf
Contributor:

Nitpick: add a newline at the end of this file and json_a.log.

14 changes: 14 additions & 0 deletions plugins/inputs/reader/dev/json_a.log
@@ -0,0 +1,14 @@
{
"parent": {
"child": 3.0,
"ignored_child": "hi"
},
"ignored_null": null,
"integer": 4,
"list": [3, 4],
"ignored_parent": {
"another_ignored_null": null,
"ignored_string": "hello, world!"
},
"another_list": [4]
}
106 changes: 106 additions & 0 deletions plugins/inputs/reader/dev/telegraf.conf
@@ -0,0 +1,106 @@

Contributor:

Slim this file down by removing comments and default values. See other example.

# Global tags can be specified here in key="value" format.
[global_tags]
# dc = "us-east-1" # will tag all metrics with dc=us-east-1
# rack = "1a"
## Environment variables can be used as tags, and throughout the config file
# user = "$USER"


# Configuration for telegraf agent
[agent]
## Default data collection interval for all inputs
interval = "15s"
## Rounds collection interval to 'interval'
## ie, if interval="10s" then always collect on :00, :10, :20, etc.
round_interval = true

## Telegraf will send metrics to outputs in batches of at most
## metric_batch_size metrics.
## This controls the size of writes that Telegraf sends to output plugins.
metric_batch_size = 1000

## For failed writes, telegraf will cache metric_buffer_limit metrics for each
## output, and will flush this buffer on a successful write. Oldest metrics
## are dropped first when this buffer fills.
## This buffer only fills when writes fail to output plugin(s).
metric_buffer_limit = 10000

## Collection jitter is used to jitter the collection by a random amount.
## Each plugin will sleep for a random time within jitter before collecting.
## This can be used to avoid many plugins querying things like sysfs at the
## same time, which can have a measurable effect on the system.
collection_jitter = "0s"

## Default flushing interval for all outputs. You shouldn't set this below
## interval. Maximum flush_interval will be flush_interval + flush_jitter
flush_interval = "10s"
## Jitter the flush interval by a random amount. This is primarily to avoid
## large write spikes for users running a large number of telegraf instances.
## ie, a jitter of 5s and interval 10s means flushes will happen every 10-15s
flush_jitter = "0s"

## By default or when set to "0s", precision will be set to the same
## timestamp order as the collection interval, with the maximum being 1s.
## ie, when interval = "10s", precision will be "1s"
## when interval = "250ms", precision will be "1ms"
## Precision will NOT be used for service inputs. It is up to each individual
## service input to set the timestamp at the appropriate precision.
## Valid time units are "ns", "us" (or "µs"), "ms", "s".
precision = ""

## Logging configuration:
## Run telegraf with debug log messages.
debug = false
## Run telegraf in quiet mode (error log messages only).
quiet = false
## Specify the log file name. The empty string means to log to stderr.
logfile = ""

## Override default hostname, if empty use os.Hostname()
hostname = ""
## If set to true, do not set the "host" tag in the telegraf agent.
omit_hostname = false

# # reload and gather from file[s] on telegraf's interval
[[inputs.reader]]
# ## These accept standard unix glob matching rules, but with the addition of
# ## ** as a "super asterisk". ie:
# ## /var/log/**.log -> recursively find all .log files in /var/log
# ## /var/log/*/*.log -> find all .log files with a parent dir in /var/log
# ## /var/log/apache.log -> only tail the apache log file
files = ["/var/log/test.log"]
#
# ## The dataformat to be read from files
# ## Each data format has its own unique set of configuration options, read
# ## more about them here:
# ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "json"
#

#patterns = ["%{TEST_LOG_B}","%{TEST_LOG_A}"]
#
# ## Name of the output measurement.
#name_override = "grok_reader"
#
# ## Full path(s) to custom pattern files.
#custom_pattern_files = ["/Users/maxu/go/src/github.com/influxdata/telegraf/plugins/inputs/logparser/grok/testdata/test-patterns"]
#
# ## Custom patterns can also be defined here. Put one pattern per line.
# custom_patterns = '''
# '''
#
# ## Timezone allows you to provide an override for timestamps that
# ## don't already include an offset
# ## e.g. 04/06/2016 12:41:45 data one two 5.43µs
# ##
# ## Default: "" which renders UTC
# ## Options are as follows:
# ## 1. Local -- interpret based on machine localtime
# ## 2. "Canada/Eastern" -- Unix TZ values like those found in https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
# ## 3. UTC -- or blank/unspecified, will return timestamp in UTC
# timezone = "Canada/Eastern"


[[outputs.file]]
files = ["stdout"]
102 changes: 102 additions & 0 deletions plugins/inputs/reader/reader.go
@@ -0,0 +1,102 @@
package reader

import (
"io/ioutil"
"log"

"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/internal/globpath"
"github.com/influxdata/telegraf/plugins/inputs"
"github.com/influxdata/telegraf/plugins/parsers"
)

type Reader struct {
Filepaths []string `toml:"files"`
Contributor:

Call this Files; if it's good enough for the config, it's good enough for the struct, and it will be less confusing.

FromBeginning bool
parser parsers.Parser

Filenames []string
Contributor:

Make this an unexported field with a lowercase first letter; otherwise it can be set from the config file, which is undesired.

}

const sampleConfig = `## Files to parse each interval.
## These accept standard unix glob matching rules, but with the addition of
## ** as a "super asterisk". ie:
## /var/log/**.log -> recursively find all .log files in /var/log
## /var/log/*/*.log -> find all .log files with a parent dir in /var/log
## /var/log/apache.log -> only tail the apache log file
files = ["/var/log/apache/access.log"]

## The dataformat to be read from files
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = ""
Contributor:

Use 2-space indentation in the sample config, and make sure to update the README as well. You can generate the config for the README with `telegraf -usage reader`.

`

// SampleConfig returns the default configuration of the Input
func (r *Reader) SampleConfig() string {
return sampleConfig
}

func (r *Reader) Description() string {
return "reload and gather from file[s] on telegraf's interval"
}

func (r *Reader) Gather(acc telegraf.Accumulator) error {
r.refreshFilePaths()
for _, k := range r.Filenames {
metrics, err := r.readMetric(k)
if err != nil {
return err
}

for i, m := range metrics {

//error if m is nil
if m == nil {
Contributor:

It shouldn't be possible for one of the metrics to be nil; it would be a programming mistake, so don't check for it. If it did happen, panicking would be okay.

log.Printf("E! Metric could not be parsed from: %v, on line %v", k, i)
Contributor:

This will go away if you follow my previous comment, but I should also mention that it is not strictly true that each line will produce a metric, so the reported line number would not necessarily line up.

continue
}
acc.AddFields(m.Name(), m.Fields(), m.Tags())
Contributor:

Also add the m.Time().

}
}
return nil
}

func (r *Reader) SetParser(p parsers.Parser) {
r.parser = p
}

func (r *Reader) refreshFilePaths() {
var allFiles []string
for _, filepath := range r.Filepaths {
g, err := globpath.Compile(filepath)
Contributor:

I would compile only once and store the result on the Reader struct. Since we don't have a constructor function yet, I would probably check if the value on the struct is nil and if so compile.

Contributor Author:

My thought was that if people want to specify a directory via some globpath, this will pick up any new files added there at runtime.
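One way to reconcile the two positions: validate each pattern once, but re-match every interval so newly created files are still found. A sketch using the standard library's path/filepath as a stand-in for telegraf's globpath (field names are illustrative, not from the PR):

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Reader sketch: patterns are validated once and cached, but matched
// again on every Gather so new files appear without a restart.
type Reader struct {
	Files    []string
	compiled []string // validated patterns, filled lazily
}

func (r *Reader) refreshFilePaths() ([]string, error) {
	if r.compiled == nil { // validate the patterns only on the first call
		checked := make([]string, 0, len(r.Files))
		for _, p := range r.Files {
			if _, err := filepath.Match(p, ""); err != nil {
				return nil, fmt.Errorf("glob %q failed to compile: %v", p, err)
			}
			checked = append(checked, p)
		}
		r.compiled = checked
	}
	var all []string
	for _, p := range r.compiled {
		// Re-match every interval so files created after startup are found.
		matches, err := filepath.Glob(p)
		if err != nil {
			return nil, err
		}
		all = append(all, matches...)
	}
	return all, nil
}

func main() {
	good := &Reader{Files: []string{"/var/log/*.log"}}
	if _, err := good.refreshFilePaths(); err != nil {
		panic(err)
	}

	bad := &Reader{Files: []string{"["}} // malformed glob
	_, err := bad.refreshFilePaths()
	fmt.Println(err != nil) // prints true: the user error surfaces out of Gather
}
```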

if err != nil {
log.Printf("E! Error Glob %s failed to compile, %s", filepath, err)
Contributor:

This error should probably stop the input since it indicates a user error, if any glob fails return the error up and out of Gather.

continue
}
files := g.Match()

for k := range files {
allFiles = append(allFiles, k)
}
}

r.Filenames = allFiles
}

//requires that Parser has been compiled
func (r *Reader) readMetric(filename string) ([]telegraf.Metric, error) {
fileContents, err := ioutil.ReadFile(filename)
if err != nil {
log.Printf("E! File could not be opened: %v", filename)
Contributor:

Error should be along the lines of "File could not be read" for accuracy, but also this error should be returned instead of logged. In general it is not safe to do anything with the return value if an error occurs.
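A hedged sketch of readMetric per this review comment, with the read error returned to the caller so it propagates out of Gather (the parser here is a trivial stand-in for parsers.Parser, not telegraf's real interface):

```go
package main

import (
	"fmt"
	"io/ioutil"
	"os"
)

// echoParser stands in for parsers.Parser; it just reports how many
// bytes it was handed.
type echoParser struct{}

func (echoParser) Parse(buf []byte) ([]map[string]interface{}, error) {
	return []map[string]interface{}{{"bytes": len(buf)}}, nil
}

type Reader struct{ parser echoParser }

// readMetric returns the read error instead of logging it and then
// parsing nil contents.
func (r *Reader) readMetric(filename string) ([]map[string]interface{}, error) {
	fileContents, err := ioutil.ReadFile(filename)
	if err != nil {
		return nil, fmt.Errorf("could not read %q: %v", filename, err)
	}
	return r.parser.Parse(fileContents)
}

func main() {
	f, err := ioutil.TempFile("", "reader")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	f.WriteString("cpu usage_idle=99\n")
	f.Close()

	r := &Reader{}
	metrics, err := r.readMetric(f.Name())
	fmt.Println(len(metrics), err == nil) // prints 1 true

	_, err = r.readMetric(f.Name() + ".missing")
	fmt.Println(err != nil) // prints true: the caller now sees the failure
}
```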

}
return r.parser.Parse(fileContents)

}

func init() {
inputs.Add("reader", func() telegraf.Input {
return &Reader{}
})
}