sink: Prometheus returns "collected metric ... was collected before with the same name and label values" error #835

@f9n

Describe the bug

We use pgwatch to monitor PostgreSQL, with both PostgreSQL and Prometheus configured as sinks. After starting the service, requests to the /metrics endpoint exposed by the Prometheus sink fail. The error message says that a metric with the same name and label values was collected more than once, which points to duplication somewhere in metric collection or exposure.

An error has occurred while serving metrics:

2100 error(s) occurred:
* collected metric "pgwatch_stat_statements_temp_blk_write_time" {  ... label:{name:"dir_or_tablespace"  value:"pg_wal"}  label:{name:"function_full_name"  value:"public.pg_qualstats_index_advisor"}  label:{name:"function_name"  value:"pg_qualstats_index_advisor"}  label:{name:"lockmode"  value:"ShareUpdateExclusiveLock"}  label:{name:"locktype"  value:"advisory"}  label:{name:"object_name"  value:"public"}  label:{name:"oid"  value:"...."}  label:{name:"path"  value:"...."}  label:{name:"query"  value:"..."}  label:{name:"queryid"  value:"-92..."}  label:{name:"reco_topic"  value:"default_public_schema_privs"}  label:{name:"schema"  value:"public"}  label:{name:"sys_id"  value:"750...."}  counter:{value:0}  timestamp_ms:17...} was collected before with the same name and label values
* ....
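With 2100 errors in one response, it may help to know which metric names are affected and how often. The following is a minimal sketch, not part of pgwatch, that groups the error lines by metric name. It assumes the response body was saved to a file and that every error line has the `collected metric "NAME" ... was collected before` shape shown above; the function name `countDuplicatesByMetric` and the usage text are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
	"sort"
	"strings"
)

// metricNameRe captures the metric name from an error line of the form
// shown in this report: * collected metric "NAME" { ... } was collected before ...
var metricNameRe = regexp.MustCompile(`collected metric "([^"]+)"`)

// countDuplicatesByMetric returns, per metric name, how many
// duplicate-collection errors appear in the given error body.
func countDuplicatesByMetric(body string) map[string]int {
	counts := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(body))
	// Error lines carry the full label set and can be long, so raise the limit.
	sc.Buffer(make([]byte, 0, 64*1024), 4*1024*1024)
	for sc.Scan() {
		line := sc.Text()
		if !strings.Contains(line, "was collected before") {
			continue
		}
		if m := metricNameRe.FindStringSubmatch(line); m != nil {
			counts[m[1]]++
		}
	}
	return counts
}

func main() {
	// Assumption: the /metrics response was saved to a file beforehand.
	if len(os.Args) < 2 {
		fmt.Println("usage: dupsummary <saved-metrics-response>")
		return
	}
	data, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	counts := countDuplicatesByMetric(string(data))
	names := make([]string, 0, len(counts))
	for name := range counts {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%6d  %s\n", counts[name], name)
	}
}
```

If only a handful of metric names account for most of the errors, that narrows down which metric definitions to look at first.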

To Reproduce

Steps to reproduce the behavior:

  1. Initially configure pgwatch with only a PostgreSQL sink enabled.
  2. Start the pgwatch service using this initial configuration (as shown in the pgwatch.service file below, but with only the PostgreSQL sink active).
  3. Allow pgwatch to run for some time, collecting metrics into the PostgreSQL sink.
  4. After a period, add the Prometheus sink to the pgwatch configuration (by modifying pgwatch.service to include --sink=prometheus://0.0.0.0:9188).
  5. Restart the pgwatch service with the updated configuration, now including both PostgreSQL and Prometheus sinks.
  6. Navigate to the Prometheus /metrics endpoint (e.g., http://<pgwatch_host>:9188/metrics).
  7. Observe the error message: "An error has occurred while serving metrics: X error(s) occurred: * collected metric '...' was collected before with the same name and label values".
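Steps 6 and 7 can also be checked from a small program instead of a browser. This is a rough sketch under stated assumptions: the URL is passed as the first argument, the banner text is the one quoted in this report, and the helper `classifyScrape` and its result strings are hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
	"time"
)

// errorBanner is the first line of the failing response quoted in this report.
const errorBanner = "An error has occurred while serving metrics:"

// classifyScrape labels a scrape result: "collect-error" when the body starts
// with the banner, "http-error" for any other non-200 status, otherwise "ok".
func classifyScrape(status int, body string) string {
	switch {
	case strings.HasPrefix(strings.TrimSpace(body), errorBanner):
		return "collect-error"
	case status != http.StatusOK:
		return "http-error"
	default:
		return "ok"
	}
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: probe http://<pgwatch_host>:9188/metrics")
		return
	}
	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Get(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(classifyScrape(resp.StatusCode, string(body)))
}
```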

Expected behavior

The Prometheus /metrics page should display all collected PostgreSQL metrics without running into conflicts or duplicate metric definitions. It should expose all the necessary data cleanly so Prometheus can scrape it without issue.
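For context, Prometheus expects every sample in one scrape to be unique by metric name plus label values; the sample value and timestamp are not part of that identity. The sketch below is a simplified model of that rule, not the client library's code and not pgwatch's. The `Sample` type, the metric name `pgwatch_example_metric`, and the `dbname` label are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Sample is a simplified stand-in for one exposed metric sample.
type Sample struct {
	Name   string
	Labels map[string]string
	Value  float64
}

// seriesKey builds an identity from the metric name and the sorted
// label pairs only; Value is deliberately left out.
func seriesKey(s Sample) string {
	keys := make([]string, 0, len(s.Labels))
	for k := range s.Labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	b.WriteString(s.Name)
	for _, k := range keys {
		fmt.Fprintf(&b, "\x00%s\x00%s", k, s.Labels[k])
	}
	return b.String()
}

// findDuplicates returns the samples whose identity was already seen
// earlier in the same batch.
func findDuplicates(batch []Sample) []Sample {
	seen := make(map[string]bool)
	var dups []Sample
	for _, s := range batch {
		k := seriesKey(s)
		if seen[k] {
			dups = append(dups, s)
			continue
		}
		seen[k] = true
	}
	return dups
}

func main() {
	batch := []Sample{
		{Name: "pgwatch_example_metric", Labels: map[string]string{"dbname": "db1", "queryid": "1"}, Value: 0},
		// Same name and label values as the first sample; only the value differs.
		{Name: "pgwatch_example_metric", Labels: map[string]string{"queryid": "1", "dbname": "db1"}, Value: 5},
		{Name: "pgwatch_example_metric", Labels: map[string]string{"dbname": "db1", "queryid": "2"}, Value: 7},
	}
	for _, d := range findDuplicates(batch) {
		fmt.Printf("duplicate: %s %v\n", d.Name, d.Labels)
	}
}
```

Under this model, two rows that differ only in their value collide, so the expected behavior amounts to each exposed sample carrying a distinct name and label combination.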

Additional context

Our setup uses pgwatch with two sinks:

  • PostgreSQL (writes to 10.0.0.61:5432/pgwatch_metrics)
  • Prometheus (serves metrics on 0.0.0.0:9188)

This issue only started happening after we added the Prometheus sink. From what we’ve seen, this kind of error has shown up in other exporters too, like in postgres_exporter issue #998, where similar duplicate metric registration problems have been discussed.

Below are the configuration files used:

  • /etc/pgwatch/sources.yaml:
# Managed by Ansible
- name: "postgresql-oflu-61"
  conn_str: "postgresql://pgwatch:pgwatch123@localhost:5432/postgres?sslmode=disable"
  kind: "postgres-continuous-discovery"
  preset_metrics: "full"
  # custom_metrics:
  # preset_metrics_standby:
  # custom_metrics_standby:
  include_pattern: ""
  exclude_pattern: "(postgres)"
  is_enabled: true
  group: "default"
  custom_tags:
    _environment: "prod"
  sslrootcert: ''
  sslcert: ''
  sslkey: ''
  • /etc/systemd/system/pgwatch.service:
# Managed by Ansible

[Unit]
Description=Pgwatch Daemon
After=network-online.target

[Service]
# Sinks
Environment="PW_BATCHING_DELAY=950ms"
Environment="PW_RETENTION=7"
Environment="PW_REAL_DBNAME_FIELD=real_dbname"
Environment="PW_SYSTEM_IDENTIFIER_FIELD=sys_id"

# Sources
Environment="PW_REFRESH=120"
Environment="PW_MIN_DB_SIZE_MB=1"
Environment="PW_MAX_PARALLEL_CONNECTIONS_PER_DB=1"
Environment="PW_TRY_CREATE_LISTED_EXTS_IF_MISSING="

# Metrics
Environment="PW_CREATE_HELPERS=true"
Environment="PW_DIRECT_OS_STATS=false"
Environment="PW_INSTANCE_LEVEL_CACHE_MAX_SECONDS=30"
Environment="PW_EMERGENCY_PAUSE_TRIGGERFILE=/tmp/pgwatch-emergency-pause"

# WebUI
Environment="PW_WEBDISABLE=all"

User=root
Type=exec
ExecStart=/usr/bin/pgwatch -s /etc/pgwatch/sources.yaml --sink=postgresql://pgwatch:pgwatch456@10.0.0.61:5432/pgwatch_metrics --sink=prometheus://0.0.0.0:9188 --log-level=debug
Restart=on-failure
TimeoutStartSec=0

[Install]
WantedBy=multi-user.target

Labels: question (Further information is requested)
