is pg_stat_activity "transactional"? How fast does it update?
Hi. We're getting unit-test failures that look like data-races,
which somehow are getting very frequent recently, tripping our CI.
Basically we have a `services` table, for service discovery, which
records the backend PID of its main DB Connection. Such that we can
account for "stale" / "zombie" services, which are still registered
(e.g. crashed), but who's connection is gone, when querying that
`services` table, like so:
```sql
select s.name, s.backend_pid, ...
from services s
join pg_stat_activity a
on a.pid = s.backend_pid
where a.datname = current_database()
```
The unit-test code is "linear", i.e. single-threaded, with pseudo-code:
1) Open Connection outer
2) Open Connection inner
3) register service using inner (i.e. add row in `services` with
backend PID of inner)
4) Close Connection inner (calls PQfinish)
5) Run query above on outer. Sometimes still sees that "stale" row,
despite closing inner.
There's is no question about the ordering above.
i.e. PQfinish(inner) returned before the query is run.
there's of course the possibility of another (3rd) connection reusing
the same backend PID, but that's remote a chance I believe. And I
don't know of any other reliable ID for a connection, than its backend
PID.
So when and how fast does pg_stat_activity update, in the face of a
PQfinish? What other scheme to detect "table rows" associated to
"stale" connections?
Thanks for any insights, --DD
Dominique Devienne <ddevienne@gmail.com> writes:
So when and how fast does pg_stat_activity update, in the face of a
PQfinish?
The shared state underlying pg_stat_activity is updated
immediately when something changes. However ...
PQfinish is asynchronous: it sends a "goodbye" message and then
closes the connection without waiting for a response. So
depending on system load, the associated backend could still be
alive for some milliseconds. There might be other client
libraries that do that differently.
Another thing that might be biting you is that a backend inspecting
pg_stat_activity() takes a snapshot of the view's contents and then
holds onto that snapshot until end of transaction. You can get
around that with pg_stat_clear_snapshot().
regards, tom lane
On Tue, Jun 10, 2025 at 4:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Dominique Devienne <ddevienne@gmail.com> writes:
So when and how fast does pg_stat_activity update, in the face of a
PQfinish?The shared state underlying pg_stat_activity is updated
immediately when something changes. However ...PQfinish is asynchronous: it sends a "goodbye" message and then
closes the connection without waiting for a response. So
depending on system load, the associated backend could still be
alive for some milliseconds. There might be other client
libraries that do that differently.Another thing that might be biting you is that a backend inspecting
pg_stat_activity() takes a snapshot of the view's contents and then
holds onto that snapshot until end of transaction. You can get
around that with pg_stat_clear_snapshot().
Many thanks Tom. Didn't know either of those facts. --DD
On 6/10/25 08:46, Dominique Devienne wrote:
On Tue, Jun 10, 2025 at 4:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Dominique Devienne <ddevienne@gmail.com> writes:
So when and how fast does pg_stat_activity update, in the face of a
PQfinish?The shared state underlying pg_stat_activity is updated
immediately when something changes. However ...PQfinish is asynchronous: it sends a "goodbye" message and then
closes the connection without waiting for a response. So
depending on system load, the associated backend could still be
alive for some milliseconds. There might be other client
libraries that do that differently.Another thing that might be biting you is that a backend inspecting
pg_stat_activity() takes a snapshot of the view's contents and then
holds onto that snapshot until end of transaction. You can get
around that with pg_stat_clear_snapshot().Many thanks Tom. Didn't know either of those facts. --DD
For more detail see:
https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-STATS-VIEWS
--
Adrian Klaver
adrian.klaver@aklaver.com
On Tue, Jun 10, 2025 at 5:46 PM Dominique Devienne <ddevienne@gmail.com> wrote:
On Tue, Jun 10, 2025 at 4:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
PQfinish is asynchronous: [...]
[...] pg_stat_activity() takes a snapshot of the view's contentsMany thanks Tom. Didn't know either of those facts.
Hi. I ended up polling pg_stat_activity to wait for the disappearance
of the backend, to avoid the race in my unit tests. Initially
pg_stat_clear_snapshot() seemed to help, but then it didn't. Not
ideal, but I don't see any solution (work-around, really). There's a
new LOGIN event trigger, but no symmetrical FINISH one, and those are
restricted to privileged users, so useless in my case. Oh well. --DD