Commit eb5a72a

fix(bpf): Fix overhead when sampling (#1685)
Update the register counters one step before the metrics sample is taken, instead of updating the registers every time, which increases overhead.

Signed-off-by: Vimal Kumar <[email protected]>
1 parent 9bffccf commit eb5a72a

1 file changed: +13 -7 lines changed

bpf/kepler.bpf.h

Lines changed: 13 additions & 7 deletions
@@ -317,22 +317,28 @@ static inline int do_kepler_sched_switch_trace(

 	cpu_id = bpf_get_smp_processor_id();

-	// Collect metrics
-	// Regardless of skipping the collection, we need to update the hardware
-	// counter events to keep the metrics map current.
-	collect_metrics_and_reset_counters(&buf, prev_pid, curr_ts, cpu_id);
-
 	// Skip some samples to minimize overhead
-	// Note that we can only skip samples after updating the metric maps to
-	// collect the right values
 	if (SAMPLE_RATE > 0) {
 		if (counter_sched_switch > 0) {
+			// update hardware counters to be used when sample is taken
+			if (counter_sched_switch == 1) {
+				collect_metrics_and_reset_counters(
+					&buf, prev_pid, curr_ts, cpu_id);
+				// Add task on-cpu running start time
+				bpf_map_update_elem(
+					&pid_time_map, &next_pid, &curr_ts,
+					BPF_ANY);
+				// create new process metrics
+				register_new_process_if_not_exist(next_tgid);
+			}
 			counter_sched_switch--;
 			return 0;
 		}
 		counter_sched_switch = SAMPLE_RATE;
 	}

+	collect_metrics_and_reset_counters(&buf, prev_pid, curr_ts, cpu_id);
+
 	// The process_run_time is 0 if we do not have the previous timestamp of
 	// the task or due to a clock issue. In either case, we skip collecting
 	// all metrics to avoid discrepancies between the hardware counter and CPU
