Bug: Data race in PrometheusWriter gauges map #1194
Closed
Description
Describe the bug
I detected a data race in the PrometheusWriter sink involving the gauges map. The DefineMetrics method, which is called during configuration reloads, initializes and writes to the gauges map without holding a lock. Concurrently, the Collect method (invoked by Prometheus scrapes) calls MetricStoreMessageToPromMetrics, which reads from the same gauges map. The unsynchronized read and write are what the race detector reports.
To Reproduce
Run the following test with the race detector enabled (go test -race):
func TestGaugesMap_RaceCondition(_ *testing.T) {
	// 1. Initialize PrometheusWriter
	promw, _ := NewPrometheusWriter(testutil.TestContext, "127.0.0.1:0/pgwatch")

	// 2. Register a metric so Write() actually puts data into the map
	_ = promw.SyncMetric("race_db", "test_metric", AddOp)

	// 3. Pre-fill cache so Collect has something to do
	_ = promw.Write(metrics.MeasurementEnvelope{
		DBName:     "race_db",
		MetricName: "test_metric",
		Data: metrics.Measurements{
			{
				metrics.EpochColumnName: time.Now().UnixNano(),
				"value":                 int64(100),
			},
		},
	})

	var wg sync.WaitGroup
	done := make(chan struct{})

	// --- The Config Reloader (Simulating configuration updates) ---
	wg.Go(func() {
		for {
			select {
			case <-done:
				return
			default:
				// Call the REAL DefineMetrics method (Writes to gauges map)
				_ = promw.DefineMetrics(&metrics.Metrics{
					MetricDefs: metrics.MetricDefs{
						"test_metric": {Gauges: []string{"value"}},
					},
				})
			}
		}
	})

	// --- The Collector (Simulating Prometheus Scrapes) ---
	wg.Go(func() {
		// Prometheus provides a channel to receive metrics
		ch := make(chan prometheus.Metric, 10000)

		// Scrape 50 times (more than enough to trigger a race in a tight loop)
		for range 50 {
			// Call the REAL Collect method (Reads from gauges map)
			promw.Collect(ch)

			// Drain the channel so it doesn't block
		drainLoop:
			for {
				select {
				case <-ch:
				default:
					break drainLoop
				}
			}
		}
		close(done) // Tell the reloader to stop
	})

	wg.Wait()
}
Expected behavior
Concurrent calls to DefineMetrics() and Collect() should be thread-safe. Access to the gauges map should be protected by a mutex.
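For illustration, here is a minimal sketch of the kind of locking I have in mind. It is not the pgwatch code: gaugeRegistry, Define, and IsGauge are hypothetical stand-ins for the part of PrometheusWriter that owns the gauges map, and the map type (metric name to gauge column names) is an assumption. The writer builds the new map outside the lock and swaps it in under a write lock, while readers take a read lock.

```go
package main

import (
	"fmt"
	"sync"
)

// gaugeRegistry is a hypothetical stand-in for the part of the sink that
// owns the gauges map. It is not the actual PrometheusWriter type.
type gaugeRegistry struct {
	mu     sync.RWMutex
	gauges map[string][]string // metric name -> gauge column names (assumed shape)
}

func newGaugeRegistry() *gaugeRegistry {
	return &gaugeRegistry{gauges: make(map[string][]string)}
}

// Define replaces the gauge definitions, as a configuration reload would.
// The new map is built without the lock and swapped in under the write lock.
func (r *gaugeRegistry) Define(defs map[string][]string) {
	next := make(map[string][]string, len(defs))
	for name, cols := range defs {
		next[name] = append([]string(nil), cols...) // copy so callers cannot mutate it later
	}
	r.mu.Lock()
	r.gauges = next
	r.mu.Unlock()
}

// IsGauge reports whether column is declared as a gauge for metric.
// It only reads the map, so a read lock is enough.
func (r *gaugeRegistry) IsGauge(metric, column string) bool {
	r.mu.RLock()
	defer r.mu.RUnlock()
	for _, c := range r.gauges[metric] {
		if c == column {
			return true
		}
	}
	return false
}

func main() {
	r := newGaugeRegistry()

	var wg sync.WaitGroup
	wg.Add(2)

	// Simulated configuration reloads (writer).
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			r.Define(map[string][]string{"test_metric": {"value"}})
		}
	}()

	// Simulated scrapes (reader).
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			_ = r.IsGauge("test_metric", "value")
		}
	}()

	wg.Wait()
	fmt.Println(r.IsGauge("test_metric", "value"))
}
```

Running this sketch with go run -race should not report a race, because every access to the map goes through the same RWMutex. A plain sync.Mutex would also work; RWMutex just lets concurrent scrapes read without blocking each other.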
Actual behavior
WARNING: DATA RACE
Read at 0x00c00016f5c8 by goroutine 31:
  command-line-arguments.(*PrometheusWriter).MetricStoreMessageToPromMetrics()
      internal/sinks/prometheus.go:206

Previous write at 0x00c00016f5c8 by goroutine 30:
  command-line-arguments.(*PrometheusWriter).DefineMetrics()
      internal/sinks/prometheus.go:102

testing.go:1617: race detected during execution of test