Per-directory file counts in the Filecount plugin#4752
Per-directory file counts in the Filecount plugin#4752danielnelson merged 8 commits intoinfluxdata:masterfrom
Conversation
|
@glinton @danielnelson how about use |
|
Would |
|
Regarding the docs: In this context, "only count the /var/log dir" is a bit misleading. By default, using In my opinion we should distinguish between counting files in a directory and producing separate stats (fields) for a directory. These are not the same. |
Wouldn't it be more consistent to drop "directory" in favour of "directories" (array of globs) altogether? (I think @Jaeyo suggested the same thing in #4749, but I might have misunderstood his comment.) If a glob expression is used, the filecount plugin has to deal with multiple base directories and report stats for all matching directories accordingly, therefore "directories" seems to be a better name than "directory" to me. |
|
@Jaeyo Since you removed the |
|
Also, wouldn't it be a good idea to add some updated unit tests to test the new behaviour? I hope my comments are not coming across as too nitpicky. I actually quite like the approach taken in this PR, I just think some minor details could be improved a bit. |
|
|
||
| filtered := []string{} | ||
| for path, file := range g.Match() { | ||
| if file.IsDir() == true { |
There was a problem hiding this comment.
This should read if file.IsDir() { instead of if file.IsDir() == true {.
I agree, however one major drawback with that is breaking compatibility with config files from older versions. |
|
i think my comment lacked a little explanation.
and like @sometimesfood comment, doc also should be modified. thanks for all comment. |
Marking "directory" as deprecated in favour of "directories" sounds like a good approach in order to avoid breaking existing configurations. |
|
|
|
Reading the code, it seems that it reads the same directories several times (in case of recursive ask) : Imho, it would be more efficient to parse once all directories rather than to launch walkFn for each directory found. I may propose a fix, but not before this friday. |
|
@Samuel-BF thanks for your suggestion. i would appreciate any fix at your convenience. |
|
Well, I already recoded the plugin without filepath.Walk (using rather recursive functions) for the #4778 PR. It seemed simpler to modify it to add your work (Directories option), that's what I made in 18373c7 . Detail : in fact, your version parses n+1 times the directories, where n is the level of recursion of directories (once in globpath.Match and one time per recursion level). I made one test comparing our two versions on my system (running it against source tree of telegraf) : Of course, the gain is proportional to your folder depth. |
|
@Samuel-BF: While I appreciate the better efficiency and the resulting speedup, I dislike the |
Let me explain my use case.
An option like recursive_print_size allows me to have a configurable balance between these two extremas. But I agree, it doesn't seem very elegant, both the term and the logic. Any proposal ? I'll check the processors or agreggators plugins to see if it allows me to drop measurements. |
|
I think I understand your use case, but IMHO this should be addressed on a different abstraction level. Using processors or aggregators might be a solution to this type of problem. To be honest I haven't checked yet whether these are viable options, but I think adding this type of feature should be postponed until we have evaluated simpler options to do the same thing. I added some comments to your pull request (#4778), sorry about splitting up the discussion. |
|
I added a few comments to #4778 about the proposed options, but we can add other changes from there on top of this work. |
Required for all PRs:
close #4749