Commit c97adf0

feature: Implement the common-term plot
1 parent df1a4c5 commit c97adf0

12 files changed: +244 -36 lines changed

.github/workflows/test.yml

Lines changed: 3 additions & 1 deletion
@@ -21,5 +21,7 @@ jobs:
         run: cargo test -- --test-threads=1
       - name: Check format
         run: cargo fmt -- --check
+      - name: Get clippy version
+        run: cargo clippy -V
       - name: Run clippy
-        run: cargo clippy -- -D clippy
+        run: cargo clippy -- -D clippy::all

Cargo.lock

Lines changed: 3 additions & 1 deletion
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [package]
 name = "lowcharts"
-version = "0.4.1"
+version = "0.4.2"
 authors = ["JuanLeon Lahoz <[email protected]>"]
 edition = "2018"
 description = "Tool to draw low-resolution graphs in terminal"

README.md

Lines changed: 30 additions & 17 deletions
@@ -23,7 +23,7 @@ terminal.
 Type `lowcharts --help`, or `lowcharts PLOT-TYPE --help` for a complete list of
 options.
 
-Currently five basic types of plots are supported:
+Currently six basic types of plots are supported:
 
 #### Bar chart for matches in the input
 
@@ -33,7 +33,7 @@ This chart is generated using `lowcharts matches database.log SELECT UPDATE DELE
 
 [![Simple bar chart with lowcharts](resources/matches-example.png)](resources/matches-example.png)
 
-#### Histogram
+#### Histogram for numerical inputs
 
 This chart is generated using `python3 -c 'import random; [print(random.normalvariate(5, 5)) for _ in range(100000)]' | lowcharts hist`:
 
@@ -85,19 +85,6 @@ each ∎ represents a count of 228
 [0.044 .. 0.049] [ 183]
 ```
 
-#### X-Y Plot
-
-This chart is generated using `cat ram-usage | lowcharts plot --height 20 --width 50`:
-
-[![Sample plot with lowcharts](resources/plot-example.png)](resources/plot-example.png)
-
-Note that x axis is not labelled. The tool splits the input data by chunks of a
-fixed size and then the chart display the averages of those chunks. In other
-words: grouping data by time is not (yet?) supported; you can see the evolution
-of a metric over time, but not the speed of that evolution.
-
-There is regex support for this type of plots.
-
 #### Time Histogram
 
 This chart is generated using `strace -tt ls -lR * 2>&1 | lowcharts timehist --intervals 10`:
@@ -109,7 +96,7 @@ similar way, and would give you a glimpse of when and how many 404s are being
 triggered in your server.
 
 The idea is to depict the frequency of logs that match a regex (by default any
-log that is read by the tool). The sub-command can autodetect the more common
+log that is read by the tool). The sub-command can autodetect the most common
 (in my personal and biased experience) datetime/timestamp formats: rfc 3339, rfc
 2822, python `%(asctime)s`, golang default log format, nginx, rabbitmq, strace
 -t (or -tt, or -ttt),ltrace,... as long as the timestamp is present in the first
@@ -130,12 +117,38 @@ timezones).
 
 This adds up the time histogram and bar chart in a single visualization.
 
-This chart is generated using `strace -tt ls -lR 2>&1 | lowcharts split-timehist open mmap close read write --intervals 10`:
+This chart is generated using `strace -tt ls -lR 2>&1 | lowcharts split-timehist open mmap close read write --intervals 10`:
 
 [![Sample plot with lowcharts](resources/split-timehist-example.png)](resources/split-timehist-example.png)
 
 This graph depicts the relative frequency of search terms in time.
 
+#### Common terms histogram
+
+Useful for plotting most common terms in input lines.
+
+This sample chart is generated using `strace ls -l 2>&1 | lowcharts common-terms --lines 8 -R '(.*?)\('`:
+
+[![Sample plot with lowcharts](resources/common-terms-example.png)](resources/common-terms-example.png)
+
+The graph depicts the 8 syscalls most used by `ls -l` command, along with its
+number of uses and sorted. In general, using `lowcharts common-terms` is a
+handy substitute to commands of the form `awk ... | sort | uniq -c | sort -rn |
+head`.
+
+#### X-Y Plot
+
+This chart is generated using `cat ram-usage | lowcharts plot --height 20 --width 50`:
+
+[![Sample plot with lowcharts](resources/plot-example.png)](resources/plot-example.png)
+
+Note that x axis is not labelled. The tool splits the input data by chunks of a
+fixed size and then the chart display the averages of those chunks. In other
+words: grouping data by time is not (yet?) supported; you can see the evolution
+of a metric over time, but not the speed of that evolution.
+
+There is regex support for this type of plots.
+
 ### Installing
 
 #### Via release
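
Note on the README change: the new section describes `common-terms` as a substitute for pipelines of the form `awk ... | sort | uniq -c | sort -rn | head`. As a rough illustration of what such a pipeline computes, here is a minimal sketch using only the Rust standard library. The function name `top_terms` and the alphabetical tie-break are assumptions made for this example; neither is taken from the lowcharts source, which is not fully shown in this commit view.

```rust
use std::collections::HashMap;

/// Count how often each input line occurs and return the `n` most frequent
/// ones, most frequent first. Ties are broken alphabetically (an assumption
/// of this sketch) so that the result is deterministic.
fn top_terms<'a>(lines: impl IntoIterator<Item = &'a str>, n: usize) -> Vec<(String, usize)> {
    // Equivalent of `sort | uniq -c`: one counter per distinct line.
    let mut counts: HashMap<&'a str, usize> = HashMap::new();
    for line in lines {
        *counts.entry(line).or_insert(0) += 1;
    }

    let mut sorted: Vec<(String, usize)> = counts
        .into_iter()
        .map(|(term, count)| (term.to_string(), count))
        .collect();

    // Equivalent of `sort -rn`: highest count first, then by term.
    sorted.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));

    // Equivalent of `head -n N`.
    sorted.truncate(n);
    sorted
}

fn main() {
    let input = "openat\nread\nclose\nopenat\nread\nopenat";
    for (term, count) in top_terms(input.lines(), 2) {
        println!("{:>7} {}", count, term);
    }
}
```

The sketch leaves out the regex capture step that `-R` performs before counting; it only covers the count, sort, and truncate stages.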

src/app.rs

Lines changed: 25 additions & 3 deletions
@@ -46,8 +46,7 @@ lines.
 By default this will use a capture group named `value`. If not present, it will
 use first capture group.
 
-If no regex is used, a number per line is expected (something that can be parsed
-as float).
+If no regex is used, the whole input lines will be matched.
 
 Examples of regex are ' 200 \\d+ ([0-9.]+)' (where there is one anonymous capture
 group) and 'a(a)? (?P<value>[0-9.]+)' (where there are two capture groups, and
@@ -68,7 +67,7 @@ fn add_non_capturing_regex(app: App) -> App {
         Arg::new("regex")
             .long("regex")
             .short('R')
-            .about("Filter out lines where regex is notr present")
+            .about("Filter out lines where regex is not present")
             .takes_value(true),
     )
 }
@@ -170,6 +169,19 @@ pub fn get_app() -> App<'static> {
             .multiple(true),
     );
 
+    let mut common_terms = App::new("common-terms")
+        .version(clap::crate_version!())
+        .setting(AppSettings::ColoredHelp)
+        .about("Plot histogram with most common terms in input lines");
+    common_terms = add_input(add_regex(add_width(common_terms))).arg(
+        Arg::new("lines")
+            .long("lines")
+            .short('l')
+            .about("Display that many lines, sorting by most frequent")
+            .default_value("10")
+            .takes_value(true),
+    );
+
     App::new("lowcharts")
         .author(clap::crate_authors!())
         .version(clap::crate_version!())
@@ -198,6 +210,7 @@ pub fn get_app() -> App<'static> {
         .subcommand(matches)
         .subcommand(timehist)
         .subcommand(splittimehist)
+        .subcommand(common_terms)
 }
 
 #[cfg(test)]
@@ -279,4 +292,13 @@ mod tests {
             sub_m.values_of("match").unwrap().collect::<Vec<&str>>()
         );
     }
+
+    #[test]
+    fn terms_subcommand_arg_parsing() {
+        let arg_vec = vec!["lowcharts", "common-terms", "--regex", "foo", "some"];
+        let m = get_app().get_matches_from(arg_vec);
+        let sub_m = m.subcommand_matches("common-terms").unwrap();
+        assert_eq!("some", sub_m.value_of("input").unwrap());
+        assert_eq!("foo", sub_m.value_of("regex").unwrap());
+    }
 }

src/main.rs

Lines changed: 36 additions & 4 deletions
@@ -85,7 +85,7 @@ fn get_float_reader(matches: &ArgMatches) -> Result<read::DataReader, ()> {
         builder.range(min..max);
     }
     if let Some(string) = matches.value_of("regex") {
-        match Regex::new(&string) {
+        match Regex::new(string) {
             Ok(re) => {
                 builder.regex(re);
             }
@@ -100,7 +100,7 @@ fn get_float_reader(matches: &ArgMatches) -> Result<read::DataReader, ()> {
 
 /// Implements the hist cli-subcommand
 fn histogram(matches: &ArgMatches) -> i32 {
-    let reader = match get_float_reader(&matches) {
+    let reader = match get_float_reader(matches) {
         Ok(r) => r,
         _ => return 2,
     };
@@ -122,7 +122,7 @@ fn histogram(matches: &ArgMatches) -> i32 {
 
 /// Implements the plot cli-subcommand
 fn plot(matches: &ArgMatches) -> i32 {
-    let reader = match get_float_reader(&matches) {
+    let reader = match get_float_reader(matches) {
         Ok(r) => r,
         _ => return 2,
     };
@@ -155,11 +155,42 @@ fn matchbar(matches: &ArgMatches) -> i32 {
     0
 }
 
+/// Implements the common-terms cli-subcommand
+fn common_terms(matches: &ArgMatches) -> i32 {
+    let mut builder = read::DataReaderBuilder::default();
+    if let Some(string) = matches.value_of("regex") {
+        match Regex::new(string) {
+            Ok(re) => {
+                builder.regex(re);
+            }
+            _ => {
+                error!("Failed to parse regex {}", string);
+                return 1;
+            }
+        };
+    } else {
+        builder.regex(Regex::new("(.*)").unwrap());
+    };
+    let reader = builder.build().unwrap();
+    let width = matches.value_of_t("width").unwrap();
+    let lines = matches.value_of_t("lines").unwrap();
+    if lines < 1 {
+        error!("You should specify a potitive number of lines");
+        return 2;
+    };
+    print!(
+        "{:width$}",
+        reader.read_terms(matches.value_of("input").unwrap(), lines),
+        width = width
+    );
+    0
+}
+
 /// Implements the timehist cli-subcommand
 fn timehist(matches: &ArgMatches) -> i32 {
     let mut builder = read::TimeReaderBuilder::default();
     if let Some(string) = matches.value_of("regex") {
-        match Regex::new(&string) {
+        match Regex::new(string) {
             Ok(re) => {
                 builder.regex(re);
             }
@@ -236,6 +267,7 @@ fn main() {
         Some(("plot", subcommand_matches)) => plot(subcommand_matches),
         Some(("matches", subcommand_matches)) => matchbar(subcommand_matches),
         Some(("timehist", subcommand_matches)) => timehist(subcommand_matches),
+        Some(("common-terms", subcommand_matches)) => common_terms(subcommand_matches),
         Some(("split-timehist", subcommand_matches)) => splittime(subcommand_matches),
         _ => unreachable!("Invalid subcommand"),
     });
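
Note on the `print!("{:width$}", ..., width = width)` call in `common_terms`: it appears to rely on Rust's formatting machinery passing the requested width to the `Display` implementation, where it can be read back with `Formatter::width()`. The following is a minimal sketch of that mechanism. The `Bar` type is hypothetical and not part of lowcharts, and the guard against a zero width is an addition of this sketch.

```rust
use std::fmt;

/// Hypothetical stand-in for a plot type: a single bar holding a count.
struct Bar(usize);

impl fmt::Display for Bar {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // `f.width()` is `Some(n)` when the caller wrote `{:n}` or
        // `{:width$}`, and `None` for a plain `{}`.
        // The `.max(1)` avoids a division by zero for `{:0}`.
        let width = f.width().unwrap_or(100).max(1);
        // Each glyph stands for `divisor` units so the bar fits in `width`.
        let divisor = (self.0 / width).max(1);
        write!(f, "{}", "∎".repeat(self.0 / divisor))
    }
}

fn main() {
    let width = 10;
    // The width given here reaches `Bar::fmt` through `f.width()`.
    println!("{:width$}", Bar(100), width = width);
    // No width given: the implementation falls back to its default.
    println!("{}", Bar(5));
}
```

Using the format width this way keeps the plot type free of an explicit `width` field, at the cost of giving `{:N}` a meaning other than padding.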

src/plot/histogram.rs

Lines changed: 1 addition & 1 deletion
@@ -77,7 +77,7 @@ impl fmt::Display for Histogram {
         let writer = HistWriter {
             width: f.width().unwrap_or(110),
         };
-        writer.write(f, &self)
+        writer.write(f, self)
     }
 }
 

src/plot/mod.rs

Lines changed: 2 additions & 0 deletions
@@ -1,12 +1,14 @@
 pub use self::histogram::Histogram;
 pub use self::matchbar::{MatchBar, MatchBarRow};
 pub use self::splittimehist::SplitTimeHistogram;
+pub use self::terms::CommonTerms;
 pub use self::timehist::TimeHistogram;
 pub use self::xy::XyPlot;
 
 mod histogram;
 mod matchbar;
 mod splittimehist;
+mod terms;
 mod timehist;
 mod xy;
 

src/plot/terms.rs

Lines changed: 90 additions & 0 deletions
@@ -0,0 +1,90 @@
+use std::collections::HashMap;
+use std::fmt;
+
+use yansi::Color::{Blue, Green, Red};
+
+#[derive(Debug)]
+pub struct CommonTerms {
+    pub terms: HashMap<String, usize>,
+    lines: usize,
+}
+
+impl CommonTerms {
+    pub fn new(lines: usize) -> CommonTerms {
+        CommonTerms {
+            terms: HashMap::new(),
+            lines,
+        }
+    }
+
+    pub fn observe(&mut self, term: String) {
+        *self.terms.entry(term).or_insert(0) += 1
+    }
+}
+
+impl fmt::Display for CommonTerms {
+    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
+        let width = f.width().unwrap_or(100);
+        let mut counts: Vec<(&String, &usize)> = self.terms.iter().collect();
+        if counts.is_empty() {
+            writeln!(f, "No data")?;
+            return Ok(());
+        }
+        counts.sort_by(|a, b| b.1.cmp(a.1));
+        let values = &counts[..self.lines.min(counts.len())];
+        let label_width = values.iter().fold(1, |acc, x| acc.max(x.0.len()));
+        let divisor = 1.max(counts[0].1 / width);
+        let width_count = format!("{}", counts[0].1).len();
+        writeln!(
+            f,
+            "Each {} represents a count of {}",
+            Red.paint("∎"),
+            Blue.paint(divisor.to_string()),
+        )?;
+        for (term, count) in values.iter() {
+            writeln!(
+                f,
+                "[{label}] [{count}] {bar}",
+                label = Blue.paint(format!("{:>width$}", term, width = label_width)),
+                count = Green.paint(format!("{:width$}", count, width = width_count)),
+                bar = Red.paint(format!("{:∎<width$}", "", width = *count / divisor))
+            )?;
+        }
+        Ok(())
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use yansi::Paint;
+
+    #[test]
+    fn test_common_terms_empty() {
+        let terms = CommonTerms::new(10);
+        Paint::disable();
+        let display = format!("{}", terms);
+        assert_eq!(display, "No data\n");
+    }
+
+    #[test]
+    fn test_common_terms() {
+        let mut terms = CommonTerms::new(2);
+        for _ in 0..100 {
+            terms.observe(String::from("foo"));
+        }
+        for _ in 0..10 {
+            terms.observe(String::from("arrrrrrrr"));
+        }
+        for _ in 0..20 {
+            terms.observe(String::from("barbar"));
+        }
+        Paint::disable();
+        let display = format!("{:10}", terms);
+
+        println!("{}", display);
+        assert!(display.contains("[   foo] [100] ∎∎∎∎∎∎∎∎∎∎\n"));
+        assert!(display.contains("[barbar] [ 20] ∎∎\n"));
+        assert!(!display.contains("arr"));
+    }
+}
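
Note on the row layout in `CommonTerms::fmt`: each row combines three format specs, namely a right-aligned label (`{:>width$}`), a padded count (`{:width$}`), and an empty string padded with `∎` as the fill character (`{:∎<width$}`). The helper `render_row` below is a hypothetical stand-alone reduction of that loop body with the colour painting omitted; it is meant only to show how the specs behave, not to mirror the crate's API.

```rust
/// Render one row of the chart without colours.
/// `divisor` is how many occurrences a single `∎` glyph stands for;
/// it is clamped to at least 1 here to avoid a division by zero.
fn render_row(
    term: &str,
    count: usize,
    label_width: usize,
    count_width: usize,
    divisor: usize,
) -> String {
    format!(
        // `{:>lw$}`  right-aligns the label in `lw` columns.
        // `{:cw$}`   pads the count; numbers are right-aligned by default.
        // `{:∎<bw$}` pads an empty string with `∎`, giving a bar of `bw` glyphs.
        "[{:>lw$}] [{:cw$}] {:∎<bw$}",
        term,
        count,
        "",
        lw = label_width,
        cw = count_width,
        bw = count / divisor.max(1)
    )
}

fn main() {
    // Same numbers as the unit test in this file: width 10, top count 100.
    let divisor = std::cmp::max(1, 100 / 10);
    println!("{}", render_row("foo", 100, 6, 3, divisor));
    println!("{}", render_row("barbar", 20, 6, 3, divisor));
}
```

Because the bar length uses integer division, any term whose count is below `divisor` gets an empty bar.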
