bug reapper: Empty query_id in pg_stat_activity
hello, everyone!
there is a report: "Empty query_id in pg_stat_activity"
/messages/by-id/CAJ+5Ff4sLuTMpg__fZH67UdaC2bWpLK1U3a8kkj_BvqO8ScYQw@mail.gmail.com
Maybe it was fixed, but i reappear this bug in pg 15.14.
In my case ,just set compute_query_id = on in PostgreSQL.conf, then reproduce it:
1. in term1:
/opt/pg/15/bin/pgbench -p 1514 -i -s 10 pgbench
/opt/pg/15/bin/pgbench -p 1514 -T 1200 -M simple -d pgbench
2.in term2:
file .psqlrc \pset null 'NULL'
/opt/pg/15/bin/psql -p 1514 -t pgbench
Null display is "NULL".
psql (15.14)
Type "help" for help.
pgbench=# select query_id from pg_stat_activity where pid=7008;
3370541688642856312
pgbench=# \watch 0.2
-7505942672311739493
-7505942672311739493
NULL
-7505942672311739493
-7505942672311739493
NULL
NULL
-7810315603562552972
NULL
-7810315603562552972
NULL
4477171674326430251
-8406700390779190035
-7810315603562552972
....
NULL display the query_id
On Thu, Aug 28, 2025 at 3:09 PM duomi.peng <duomi.peng@qq.com> wrote:
hello, everyone!
there is a report: "Empty query_id in pg_stat_activity"
/messages/by-id/CAJ+5Ff4sLuTMpg__fZH67UdaC2bWpLK1U3a8kkj_BvqO8ScYQw@mail.gmail.com
Maybe it was fixed, but i reappear this bug in pg 15.14.
In my case ,just set compute_query_id = on in PostgreSQL.conf, then
reproduce it:1. in term1:
/opt/pg/15/bin/pgbench -p 1514 -i -s 10 pgbench
/opt/pg/15/bin/pgbench -p 1514 -T 1200 -M simple -d pgbench2.in term2:
file .psqlrc \pset null 'NULL'
/opt/pg/15/bin/psql -p 1514 -t pgbench
Null display is "NULL".
psql (15.14)
Type "help" for help.pgbench=# select query_id from pg_stat_activity where pid=7008;
3370541688642856312
pgbench=# \watch 0.2
-7505942672311739493-7505942672311739493
NULL
-7505942672311739493
-7505942672311739493
NULL
NULL
-7810315603562552972
NULL
-7810315603562552972NULL
4477171674326430251
-8406700390779190035
-7810315603562552972
....NULL display the query_id
Thanks for reporting , I was able to reproduce this locally and my
immediate observation is as we "pgstat_report_query_id(0, true);"
in exec_simple_query which resets the st_query_id to 0 and we use the same
in pg_stat_get_activity which is used to return the query_id in
the pg_stat_activity view because of this the query_id is set to NULL
in pg_stat_activity view , i might be totally wrong here ,please correct me
if i am wrong.
--
Thanks,
Srinath Reddy Sadipiralla
EDB: https://www.enterprisedb.com/
If I am reading the code in utils/activity/backend_status.c correctly, I
not only see where it is happening (pgstat_report_activity), but why it is
happening as well. There is definitely a race condition as we know what the
query is before we have a chance to parse it and generate a query
identifier for it. I'm not sure of the solution - we cannot exclude null
query identifiers from the output, or callers will see the backend
disappear briefly! We cannot avoid setting it to 0, as then we could have a
query != query_id. We could store the query string somewhere and only
reveal it once we have the identifier, but that's also tricky because we
risk not setting it (and because we also set other things at the same time
as the name, which would then be further disconnected). But maybe its the
best option? I do agree this is a bug (and like Srinath, I can duplicate
easily).
Anyway, my pre-caffeine two cents.
Cheers,
Greg
--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support
On Thu, Aug 28, 2025 at 5:40 PM Greg Sabino Mullane <htamfids@gmail.com>
wrote:
If I am reading the code in utils/activity/backend_status.c correctly, I
not only see where it is happening (pgstat_report_activity), but why it is
happening as well.
sorry for half cooked analysis ,the reason is pgstat_report_query_id()
zeroes st_query_id before the new one is computed in exec_simple_query(). A
concurrent reader of pg_stat_activity during that narrow timing gap
observes 0 (displayed as NULL). The fix I tried is to retain the previous
query_id (prev_st_query_id) and return it when st_query_id is still 0 but
query_id computation is enabled, thoughts?
--
Thanks,
Srinath Reddy Sadipiralla
EDB: https://www.enterprisedb.com/
Attachments:
v1-0001-Fix-transient-NULL-query_id-in-pg_stat_activity.patchapplication/octet-stream; name=v1-0001-Fix-transient-NULL-query_id-in-pg_stat_activity.patchDownload+10-2
On Thu, Aug 28, 2025 at 5:55 PM Srinath Reddy Sadipiralla <
srinath2133@gmail.com> wrote:
sorry for half cooked analysis ,the reason is pgstat_report_query_id()
zeroes st_query_id before the new one is computed in exec_simple_query(). A
concurrent reader of pg_stat_activity during that narrow timing gap
observes 0 (displayed as NULL). The fix I tried is to retain the previous
query_id (prev_st_query_id) and return it when st_query_id is still 0 but
query_id computation is enabled, thoughts?
I was able to reproduce this on 15.14 and master also , I think we have to
backpatch the fix.
--
Thanks,
Srinath Reddy Sadipiralla
EDB: https://www.enterprisedb.com/
On Thu, Aug 28, 2025 at 8:25 AM Srinath Reddy Sadipiralla <
srinath2133@gmail.com> wrote:
The fix I tried is to retain the previous query_id (prev_st_query_id) and
return it when st_query_id is still 0 but query_id computation is enabled,
thoughts?
This just moves the race condition around a little - you can end up with a
query that does not agree with its query_id.
Cheers,
Greg
--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support
On Thu, Aug 28, 2025 at 08:42:32AM -0400, Greg Sabino Mullane wrote:
This just moves the race condition around a little - you can end up with a
query that does not agree with its query_id.
The query ID assignment in a backend entry has always been kind of
a pessimistic concept: the backend sets it when it knows about it.
Note that there are a few cases with the extended query protocol for
execute messages, where the query ID can be set only when we're in the
executor start path. Always reporting the previous query ID if the
current one is not set is also breaking an existing property of the
current implementation: a NULL query ID means that we are in the
process of running a query. The proposed patch would always show a
query ID, even if the backend has begun running an entire different
query, for which the query ID would be assigned later, just we don't
know about it yet.
In short, I don't think that there's anything to fix here. The
proposed patch is breaking some assumptions that the current
backend implementation relies on.
--
Michael
On Sun, Aug 31, 2025 at 1:02 AM Michael Paquier <michael@paquier.xyz> wrote:
a NULL query ID means that we are in the process of running a query.
Exactly. Maybe we document this somewhere? That would at least have given
us something we could have pointed the OP to.
In short, I don't think that there's anything to fix here.
I agree, and withdraw my "this is a bug" statement.
Cheers,
Greg
--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support
On Mon, Sep 1, 2025 at 10:09 PM Greg Sabino Mullane <htamfids@gmail.com>
wrote:
On Sun, Aug 31, 2025 at 1:02 AM Michael Paquier <michael@paquier.xyz>
wrote:a NULL query ID means that we are in the process of running a query.
Exactly. Maybe we document this somewhere? That would at least have given
us something we could have pointed the OP to.In short, I don't think that there's anything to fix here.
I agree, and withdraw my "this is a bug" statement.
+1
--
Thanks,
Srinath Reddy Sadipiralla
EDB: https://www.enterprisedb.com/