sink: Prometheus returns "collected metric ... was collected before with the same name and label values" error #835
Description
Describe the bug
We're using pgwatch to monitor PostgreSQL metrics, with both PostgreSQL and Prometheus configured as sinks. After starting the service, we ran into an issue when accessing the /metrics endpoint exposed for Prometheus. The error message indicates that a metric with the same name and label values was collected more than once, which suggests duplication somewhere in metric collection or exposure.
An error has occurred while serving metrics:
2100 error(s) occurred:
* collected metric "pgwatch_stat_statements_temp_blk_write_time" { ... label:{name:"dir_or_tablespace" value:"pg_wal"} label:{name:"function_full_name" value:"public.pg_qualstats_index_advisor"} label:{name:"function_name" value:"pg_qualstats_index_advisor"} label:{name:"lockmode" value:"ShareUpdateExclusiveLock"} label:{name:"locktype" value:"advisory"} label:{name:"object_name" value:"public"} label:{name:"oid" value:"...."} label:{name:"path" value:"...."} label:{name:"query" value:"..."} label:{name:"queryid" value:"-92..."} label:{name:"reco_topic" value:"default_public_schema_privs"} label:{name:"schema" value:"public"} label:{name:"sys_id" value:"750...."} counter:{value:0} timestamp_ms:17...} was collected before with the same name and label values
* ....
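To make the wording of the error concrete, here is a minimal Python sketch of the kind of check it describes: within one scrape, two samples conflict when they share a metric name and an identical set of label name/value pairs. This is a simplified model for illustration only, not the Prometheus client library's actual code; the metric name `pgwatch_example_metric`, the label values, and both helper functions are hypothetical.

```python
from collections import Counter


def series_identity(name, labels):
    """Build a hashable identity: metric name plus sorted (label, value) pairs."""
    return (name, tuple(sorted(labels.items())))


def find_duplicates(samples):
    """Return the identities that appear more than once in a single scrape."""
    counts = Counter(
        series_identity(name, labels) for name, labels, _value in samples
    )
    return [identity for identity, n in counts.items() if n > 1]


# Hypothetical samples: the first two share a name and label set (label order
# does not matter), the third differs in one label value.
samples = [
    ("pgwatch_example_metric", {"dbname": "db1", "schema": "public"}, 0.0),
    ("pgwatch_example_metric", {"schema": "public", "dbname": "db1"}, 1.0),
    ("pgwatch_example_metric", {"dbname": "db2", "schema": "public"}, 2.0),
]

duplicates = find_duplicates(samples)
print(duplicates)
```

In this model the sample values play no part in the identity, so two rows for the same series conflict even when their values differ.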
To Reproduce
Steps to reproduce the behavior:
- Initially configure pgwatch with only a PostgreSQL sink enabled.
- Start the pgwatch service using this initial configuration (as shown in the pgwatch.service file below, but with only the PostgreSQL sink active).
- Allow pgwatch to run for some time, collecting metrics into the PostgreSQL sink.
- After a period, add the Prometheus sink to the pgwatch configuration (by modifying pgwatch.service to include --sink=prometheus://0.0.0.0:9188).
- Restart the pgwatch service with the updated configuration, now including both PostgreSQL and Prometheus sinks.
- Navigate to the Prometheus /metrics endpoint (e.g., http://<pgwatch_host>:9188/metrics).
- Observe the error message: "An error has occurred while serving metrics: X error(s) occurred: * collected metric '...' was collected before with the same name and label values".
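The last two steps can also be checked from a script rather than a browser. The sketch below is one possible way to do that, using only the Python standard library: it fetches the endpoint and counts duplicate-collection errors per metric name. The function names are hypothetical, and the parsing assumes the error body has the line-per-error shape quoted in this report.

```python
import re
import urllib.error
import urllib.request
from collections import Counter

# Phrase taken from the error message quoted in this report.
DUPLICATE_MARKER = "was collected before with the same name and label values"
METRIC_NAME = re.compile(r'collected metric "([^"]+)"')


def fetch_metrics(url, timeout=10.0):
    """Return (HTTP status, body text) for a /metrics URL, including error replies."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # Error responses still carry the body we want to inspect.
        return err.code, err.read().decode("utf-8", errors="replace")


def summarize_duplicates(body):
    """Count duplicate-collection errors per metric name in a response body."""
    counts = Counter()
    for line in body.splitlines():
        if DUPLICATE_MARKER in line:
            match = METRIC_NAME.search(line)
            counts[match.group(1) if match else "<unknown>"] += 1
    return counts
```

For example, `status, body = fetch_metrics("http://<pgwatch_host>:9188/metrics")` followed by `summarize_duplicates(body).most_common(10)` lists the metric names that account for most of the reported errors; replace `<pgwatch_host>` with the real host first.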
Expected behavior
The Prometheus /metrics page should display all collected PostgreSQL metrics without running into conflicts or duplicate metric definitions. It should expose all the necessary data cleanly so Prometheus can scrape it without issue.
Additional context
Our setup uses pgwatch with two sinks:
- PostgreSQL (writes to 10.0.0.61:5432/pgwatch_metrics)
- Prometheus (serves metrics on 0.0.0.0:9188)
This issue only started happening after we added the Prometheus sink. From what we’ve seen, this kind of error has shown up in other exporters too, like in postgres_exporter issue #998, where similar duplicate metric registration problems have been discussed.
Below are the configuration files used:
/etc/pgwatch/sources.yaml:
# Managed by Ansible
- name: "postgresql-oflu-61"
  conn_str: "postgresql://pgwatch:pgwatch123@localhost:5432/postgres?sslmode=disable"
  kind: "postgres-continuous-discovery"
  preset_metrics: "full"
  # custom_metrics:
  # preset_metrics_standby:
  # custom_metrics_standby:
  include_pattern: ""
  exclude_pattern: "(postgres)"
  is_enabled: true
  group: "default"
  custom_tags:
    _environment: "prod"
  sslrootcert: ''
  sslcert: ''
  sslkey: ''
/etc/systemd/system/pgwatch.service:
# Managed by Ansible
[Unit]
Description=Pgwatch Daemon
After=network-online.target
[Service]
# Sinks
Environment="PW_BATCHING_DELAY=950ms"
Environment="PW_RETENTION=7"
Environment="PW_REAL_DBNAME_FIELD=real_dbname"
Environment="PW_SYSTEM_IDENTIFIER_FIELD=sys_id"
# Sources
Environment="PW_REFRESH=120"
Environment="PW_MIN_DB_SIZE_MB=1"
Environment="PW_MAX_PARALLEL_CONNECTIONS_PER_DB=1"
Environment="PW_TRY_CREATE_LISTED_EXTS_IF_MISSING="
# Metrics
Environment="PW_CREATE_HELPERS=true"
Environment="PW_DIRECT_OS_STATS=false"
Environment="PW_INSTANCE_LEVEL_CACHE_MAX_SECONDS=30"
Environment="PW_EMERGENCY_PAUSE_TRIGGERFILE=/tmp/pgwatch-emergency-pause"
# WebUI
Environment="PW_WEBDISABLE=all"
User=root
Type=exec
ExecStart=/usr/bin/pgwatch -s /etc/pgwatch/sources.yaml --sink=postgresql://pgwatch:pgwatch456@10.0.0.61:5432/pgwatch_metrics --sink=prometheus://0.0.0.0:9188 --log-level=debug
Restart=on-failure
TimeoutStartSec=0
[Install]
WantedBy=multi-user.target