-
Notifications
You must be signed in to change notification settings - Fork 126
Changes in pg scrape behavior in version 3.5 #825
Copy link
Copy link
Closed
Labels
bugSomething isn't workingSomething isn't workingmetricsMetrics related issuesMetrics related issuesquestionFurther information is requestedFurther information is requested
Description
hi, after upgrading from version 3.1.0 to version 3.5.0 I noticed strange behavior in the collected metrics when saving in PG, which made me go to investigate by versions what exactly changed.
- 3.1.0 -> 3.2.0 without changes
- 3.2.0 -> 3.3.0 I see that sys_id and real_dbname were added to the data field (maybe I didn't read the changelog well but I didn't find this change in it)
- 3.3.0 -> 3.4.0 I didn't notice any changes
- 3.4.0 - 3.5.0 here very strange behavior starts, epoch_ns was added to the data field, also pid and a random query appear in the tag_data field in those metrics where it is not declared, and where it is declared it duplicates the query from data (but the queries are completely different), writing additional data to tag_data looks like a bug (+ declared tag_data env/cluster/database/tablename/ndex_full_name disappears). And what is very interesting is that if I turn off the collection of stat_statements in the preset, then in the problematic metrics, pid and query will disappear from tag_data, only epoch_ns remains in data.
invalid_indexes metric example
invalid_indexes:
sqls:
11: |-
select /* pgwatch_generated */
(extract(epoch from now()) * 1e9)::int8 as epoch_ns,
format('%I.%I', n.nspname , ci.relname) as tag_index_full_name,
cr.relname as tag_tablename,
1 as int
from
pg_index i
join pg_class ci on ci.oid = i.indexrelid
join pg_class cr on cr.oid = i.indrelid
join pg_namespace n on n.oid = ci.relnamespace
where not n.nspname like E'pg\\_temp%'
and not indisvalid
and not exists ( /* leave out ones that are being actively rebuilt */
select * from pg_locks l
join pg_stat_activity a using (pid)
where l.relation = i.indexrelid
and a.state = 'active'
and a.query ~* 'concurrently'
)
limit 100
node_status: primary
tag 3.1.0 -> 3.2.0 (with stat_statements)
│ time │ 2025-06-30 07:28:50.625773
│ dbname │ db1-host1
│ data │ {"int": 1}
│ tag_data │ {"host": "host1", "env": "stage", "cluster": "tst-cluster", "database": "db1", "tablename": "mytbl", "index_full_name": "public.mytbl_t_idx_ccnew"}
tag 3.2.0 -> 3.3.0-> 3.4.0 (with stat_statements)
│ time │ 2025-06-30 07:36:28.674967
│ dbname │ db1-host1
│ data │ {"int": 1, "sys_id": "7477487578428529559", "real_dbname": "db1"}
│ tag_data │ {"host": "host1", "env": "stage", "cluster": "tst-cluster", "database": "db1", "tablename": "mytbl", "index_full_name": "public.mytbl_t_idx_ccnew"}
3.4.0 -> 3.5.0 (with stat_statements)
│ time │ 2025-06-30 07:39:05.410566
│ dbname │ db1-host1
│ data │ {"int": 1, "sys_id": "7477487578428529559", "epoch_ns": 1751258345410566000, "real_dbname": "db1"}
│ tag_data │ {"pid": "2894135", "host": "host1", "query": "select /* pgwatch_generated */ (extract($1 from now()) * $2)::int8 as epoch_ns, format($3, n.nspname , ci.relname) as
3.4.0 -> 3.5.0 (disable scrape stat_statements)
time │ 2025-06-30 07:43:57.312991
│ dbname │ db1-host1
│ data │ {"int": 1, "sys_id": "7477487578428529559", "epoch_ns": 1751258637312991000, "real_dbname": "db1"}
│ tag_data │ {"pid": "2894135", "host": "host1", "env": "stage", "cluster": "tst-cluster", "database": "db1", "tablename": "mytbl", "index_full_name": "public.mytbl
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
bugSomething isn't workingSomething isn't workingmetricsMetrics related issuesMetrics related issuesquestionFurther information is requestedFurther information is requested