injection_points: Switch wait/wakeup to use atomics rather than latches

Started by Michael Paquier about 2 months ago | 21 messages | hackers

michael@paquier.xyz

about 2 months ago

Hi all,
(Adding Andrey in CC, as I'm sure he is interested in that.)

While looking at the test proposed on the thread about the ProcKill(),
I have been reminded about the fact that relying on latches and a
condition variable for the wait and the wakeups has its limits:
/messages/by-id/aheVjCHmcbXBtiy0@paquier.xyz

In this case, we are trying to synchronize backends once they no longer
have a latch assigned, which defeats the purpose of wait/wakeup
because the condition variable used in injection_points while waiting
expects a Latch to be set for the processes we are waiting on.

Folks have complained about this limitation a couple of times in the
past, and I never got around to doing something about it. While looking
at that I ended up with the attached patch, which was
surprisingly simpler than what I thought was needed. This replaces
the condition variable with a set of atomic counters. The counters
are incremented at wakeup, and the wait checks them on a periodic
basis. The wait loop uses a delay that increases over time, capped at
100ms, so that we get good responsiveness on fast machines without
burning CPU in a tight loop of counter checks in tests that require
more wait time.
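
As a rough illustration of the scheme described above (not the actual patch), the wait and wakeup can be sketched with C11 atomics standing in for pg_atomic_uint32. The WaitSlot type, the function names and the doubling policy are assumptions made for this sketch; only the 100ms cap comes from the message, and the 10us starting delay matches the minimum quoted elsewhere in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <time.h>

#define WAIT_MIN_DELAY_US 10L       /* assumed initial delay */
#define WAIT_MAX_DELAY_US 100000L   /* 100ms cap, as stated above */

/* Hypothetical stand-in for one wait slot in shared memory. */
typedef struct WaitSlot
{
    atomic_uint wakeup_count;
} WaitSlot;

/* Wakeup side: bump the counter, there is nothing else to signal. */
static void
slot_wakeup(WaitSlot *slot)
{
    atomic_fetch_add(&slot->wakeup_count, 1);
}

/* Double the delay, capped at WAIT_MAX_DELAY_US (assumed policy). */
static long
next_delay_us(long cur_us)
{
    long next_us = cur_us * 2;

    return (next_us > WAIT_MAX_DELAY_US) ? WAIT_MAX_DELAY_US : next_us;
}

/*
 * Wait side: poll until the counter differs from old_count, the value read
 * when the wait was registered.  Returns the number of sleeps performed.
 */
static int
slot_wait(WaitSlot *slot, unsigned int old_count)
{
    long delay_us = WAIT_MIN_DELAY_US;
    int  sleeps = 0;

    assert(slot != NULL);
    while (atomic_load(&slot->wakeup_count) == old_count)
    {
        struct timespec ts = {
            .tv_sec = delay_us / 1000000L,
            .tv_nsec = (delay_us % 1000000L) * 1000L
        };

        /* A backend would run CHECK_FOR_INTERRUPTS() at this point. */
        nanosleep(&ts, NULL);
        delay_us = next_delay_us(delay_us);
        sleeps++;
    }
    return sleeps;
}
```

Nothing here depends on a latch or a PGPROC entry: any process that can reach the counter can wait on it or bump it, at the price of noticing a wakeup up to one delay period late.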

One thing worth noticing is the CHECK_FOR_INTERRUPTS() in the wait
loop, which is something we need for the autovacuum test in test_misc
that requires some signaling and interrupt processing.

It may make sense to be conservative and limit ourselves to doing this
change on HEAD, but I'd like to suggest a backpatch down to v17 so that
future tests that rely on such a change can be backpatched. I would
need this change for the other test, but consistency in the facility
is what matters most to me here.

Note: The CI seems happy with the patch.

Thoughts or comments?
--
Michael

Robert Haas

robertmhaas@gmail.com

about 2 months ago

In reply to: Michael Paquier (#1)

Re: injection_points: Switch wait/wakeup to use atomics rather than latches

On Wed, May 27, 2026 at 10:43 PM Michael Paquier <michael@paquier.xyz> wrote:

> While looking at the test proposed on the thread about the ProcKill(),
> I have been reminded about the fact that relying on latches and a
> condition variable for the wait and the wakeups has its limits:
> /messages/by-id/aheVjCHmcbXBtiy0@paquier.xyz

After reading this email, the linked-to email, and the commit message
for the patch, I still don't have a clear understanding of what this
is intended to fix. It seems like it's going to make the
responsiveness worse. In general, we want to replace escalating wait
loops with things that wake up instantly at the right time, and this
is going in the opposite direction.

--
Robert Haas
EDB: http://www.enterprisedb.com

Michael Paquier

michael@paquier.xyz

about 2 months ago

In reply to: Robert Haas (#2)

Re: injection_points: Switch wait/wakeup to use atomics rather than latches

On Thu, May 28, 2026 at 08:40:39AM -0400, Robert Haas wrote:

> After reading this email, the linked-to email, and the commit message
> for the patch, I still don't have a clear understanding of what this
> is intended to fix. It seems like it's going to make the
> responsiveness worse. In general, we want to replace escalating wait
> loops with things that wake up instantly at the right time, and this
> is going in the opposite direction.

This is an exchange between responsiveness of the system and
flexibility. I have had two complaints in the past about the fact
that the waits and wakeups were not doable due to the fact that we
rely on condition variables and latches:
- Postmaster context (lack of dsm access as one). Heikki has
mentioned that to me once as annoying when hacking on tests there at
protocol level, at least.
- Second case as shown on the previous thread, which was a tricky
scenario involving the termination of backends.

One limitation is also related to wait event visibility, which may not
be visible in pg_stat_activity. We could simply add a LOG entry in
injection_wait() once the old count is read, and rely on a server log
lookup in the TAP tests where we cannot use pg_stat_activity.

Compared to redesigning all the facilities that injection_points
relies on, this patch struck me as having a good balance in
terms of responsiveness (min 10us, max 100ms) vs portability. The
minimum threshold does not really matter much in terms of runtime on
fast machines.

Does this explanation make sense?
--
Michael

Robert Haas

robertmhaas@gmail.com

about 2 months ago

In reply to: Michael Paquier (#3)

Re: injection_points: Switch wait/wakeup to use atomics rather than latches

On Thu, May 28, 2026 at 7:19 PM Michael Paquier <michael@paquier.xyz> wrote:

> On Thu, May 28, 2026 at 08:40:39AM -0400, Robert Haas wrote:
>
> > After reading this email, the linked-to email, and the commit message
> > for the patch, I still don't have a clear understanding of what this
> > is intended to fix. It seems like it's going to make the
> > responsiveness worse. In general, we want to replace escalating wait
> > loops with things that wake up instantly at the right time, and this
> > is going in the opposite direction.
>
> This is an exchange between responsiveness of the system and
> flexibility. I have had two complaints in the past about the fact
> that the waits and wakeups were not doable due to the fact that we
> rely on condition variables and latches:

I'm still struggling to understand. Condition variables and latches
are both designed to allow for nice waits and wakeups.

--
Robert Haas
EDB: http://www.enterprisedb.com

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

about 2 months ago

In reply to: Robert Haas (#4)

Re: injection_points: Switch wait/wakeup to use atomics rather than latches

On 29/05/2026 15:48, Robert Haas wrote:

> On Thu, May 28, 2026 at 7:19 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> > On Thu, May 28, 2026 at 08:40:39AM -0400, Robert Haas wrote:
> >
> > > After reading this email, the linked-to email, and the commit message
> > > for the patch, I still don't have a clear understanding of what this
> > > is intended to fix. It seems like it's going to make the
> > > responsiveness worse. In general, we want to replace escalating wait
> > > loops with things that wake up instantly at the right time, and this
> > > is going in the opposite direction.
> >
> > This is an exchange between responsiveness of the system and
> > flexibility. I have had two complaints in the past about the fact
> > that the waits and wakeups were not doable due to the fact that we
> > rely on condition variables and latches:
>
> I'm still struggling to understand. Condition variables and latches
> are both designed to allow for nice waits and wakeups.

They only work after you have a PGPROC slot. If you want to inject code
to authentication, or into postmaster, you cannot use them.

- Heikki

Robert Haas

robertmhaas@gmail.com

about 2 months ago

In reply to: Heikki Linnakangas (#5)

Re: injection_points: Switch wait/wakeup to use atomics rather than latches

On Fri, May 29, 2026 at 9:31 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

> They only work after you have a PGPROC slot. If you want to inject code
> to authentication, or into postmaster, you cannot use them.

OK, got it now.

--
Robert Haas
EDB: http://www.enterprisedb.com

Michael Paquier

michael@paquier.xyz

about 2 months ago

In reply to: Robert Haas (#6)

Re: injection_points: Switch wait/wakeup to use atomics rather than latches

On Fri, May 29, 2026 at 12:00:46PM -0400, Robert Haas wrote:

> On Fri, May 29, 2026 at 9:31 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
> > They only work after you have a PGPROC slot. If you want to inject code
> > to authentication, or into postmaster, you cannot use them.
>
> OK, got it now.

It seems like Heikki's comment was better worded than mine.

Also mentioned upthread, but the lack of PGPROC also means a lack of
monitoring as wait events cannot be tracked. Currently, we rely on
that in the TAP tests. For cases where the procs are not available, I
don't have a better idea than generating a LOG entry after the wait
counts have been generated (with a PID) and couple that with a poll of
the server logs to let a script understand that a process is in
waiting mode.

As a whole, I don't think that we should try to be fancy with the
implementation, which is why I have used primitives that should work
in any context, and I'm not convinced that this is worth its own
facility if that just means more responsiveness (in most cases the
wait should not take more than a couple ms to notice a wakeup). I'm
open to more fancy ideas, of course.
--
Michael

Andrey Borodin

amborodin@acm.org

about 2 months ago

In reply to: Heikki Linnakangas (#5)

Re: injection_points: Switch wait/wakeup to use atomics rather than latches

On 28 May 2026, at 07:43, Michael Paquier <michael@paquier.xyz> wrote:

> Andrey in CC, as I'm sure he is interested in that.

Thanks! That's exactly what I need for my tests.

On 29 May 2026, at 18:31, Heikki Linnakangas <hlinnaka@iki.fi> wrote:

> > I'm still struggling to understand. Condition variables and latches
> > are both designed to allow for nice waits and wakeups.
>
> They only work after you have a PGPROC slot. If you want to inject code
> to authentication, or into postmaster, you cannot use them.

I have another reason: postmaster death behavior. When we wait on a
condition variable and the postmaster is kill-9-ed, we release all LWLocks, which causes
corruption [0], because the checkpointer can flush something that's not in WAL.

So I'm trying to build corruption-seeking tests using a tool that can induce corruption
in tests.

About the patch:
-    inj_state->wait_counts[index]++;
     SpinLockRelease(&inj_state->lock);

-    /* And broadcast the change to the waiters */
-    ConditionVariableBroadcast(&inj_state->wait_point);
+    pg_atomic_fetch_add_u32(&inj_state->wait_counts[index], 1);

Can we move pg_atomic_fetch_add_u32() back under the lock?
We determine the slot index under the lock, then wake up the slot outside the lock.
In a correctly written test this is not a problem.
However, technically, the identity of a slot can change between releasing the lock
and incrementing wait_counts[index].
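
To make the concern concrete, here is a minimal sketch of the ordering being asked for, with the name lookup and the counter bump done in the same critical section. It is not the patch itself: the InjState layout, the sizes, and the atomic_flag used as a stand-in for the spinlock are assumptions for illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

#define MAX_WAITS 8     /* assumed number of wait slots */
#define NAME_LEN  64    /* assumed maximum point name length */

/* Hypothetical, simplified version of the module's shared state. */
typedef struct InjState
{
    atomic_flag lock;                       /* stand-in for a spinlock */
    char        name[MAX_WAITS][NAME_LEN];  /* empty string = free slot */
    atomic_uint wait_counts[MAX_WAITS];
} InjState;

static void
state_init(InjState *s)
{
    atomic_flag_clear(&s->lock);
    memset(s->name, 0, sizeof(s->name));
    for (int i = 0; i < MAX_WAITS; i++)
        atomic_init(&s->wait_counts[i], 0);
}

/*
 * Wake the slot registered under "name".  The counter is incremented
 * before the lock is released, so the slot cannot be reassigned to
 * another point between the lookup and the increment.
 * Returns the slot index, or -1 if no slot has that name.
 */
static int
wakeup_by_name(InjState *s, const char *name)
{
    int index = -1;

    assert(name != NULL && name[0] != '\0');

    while (atomic_flag_test_and_set(&s->lock))
        ;                       /* spin until the lock is ours */

    for (int i = 0; i < MAX_WAITS; i++)
    {
        if (strcmp(s->name[i], name) == 0)
        {
            index = i;
            break;
        }
    }
    if (index >= 0)
        atomic_fetch_add(&s->wait_counts[index], 1);

    atomic_flag_clear(&s->lock);
    return index;
}
```

The extra cost is one atomic increment inside the critical section, which should be negligible for a test facility.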

I'll do another pass tomorrow, maybe something else will catch my eye.

Best regards, Andrey Borodin.

[0]: /messages/by-id/B3C69B86-7F82-4111-B97F-0005497BB745@yandex-team.ru

Andrey Borodin

amborodin@acm.org

about 2 months ago

In reply to: Andrey Borodin (#8)

Re: injection_points: Switch wait/wakeup to use atomics rather than latches

On 30 May 2026, at 13:05, Andrey Borodin <x4mmm@yandex-team.ru> wrote:

> I'll do another pass tomorrow, maybe something else will catch my eye.

I've tried the patch on my old corruption experiments. And it works for me.

I had to switch killing to something like
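    use Time::HiRes qw(usleep);

    # Keep sending SIGKILL until none of the cluster processes is alive.
    foreach my $i (1 .. 100)
    {
        my @alive = grep { kill 0, $_ } @cluster_pids;
        last unless @alive;
        kill 'KILL', @alive;
        usleep(100_000);    # 100ms between rounds
    }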

The shared memory segment must be released before we attempt recovery.
But that's exactly what I wanted anyway. Thank you!

Best regards, Andrey Borodin.

#10

Michael Paquier

michael@paquier.xyz

about 2 months ago

In reply to: Andrey Borodin (#9)

Re: injection_points: Switch wait/wakeup to use atomics rather than latches

On Mon, Jun 01, 2026 at 04:25:40PM +0500, Andrey Borodin wrote:

> The shared memory segment must be released before we attempt recovery.
> But that's exactly what I wanted anyway. Thank you!

How do you guarantee that a wait position is reached in the case where
you don't have access to wait events to make sure that the wait point
is reached?
--
Michael

#11

Andrey Borodin

amborodin@acm.org

about 2 months ago

In reply to: Michael Paquier (#10)

Re: injection_points: Switch wait/wakeup to use atomics rather than latches

On 2 Jun 2026, at 09:15, Michael Paquier <michael@paquier.xyz> wrote:

> On Mon, Jun 01, 2026 at 04:25:40PM +0500, Andrey Borodin wrote:
>
> > The shared memory segment must be released before we attempt recovery.
> > But that's exactly what I wanted anyway. Thank you!
>
> How do you guarantee that a wait position is reached in the case where
> you don't have access to wait events to make sure that the wait point
> is reached?

In a research test? sleep(1)

Best regards, Andrey Borodin.

#12

Michael Paquier

michael@paquier.xyz

about 2 months ago

In reply to: Andrey Borodin (#11)

Re: injection_points: Switch wait/wakeup to use atomics rather than latches

On Tue, Jun 02, 2026 at 10:13:01AM +0500, Andrey Borodin wrote:

> In a research test? sleep(1)

A hardcoded sleep can work on a fast machine but it makes the test
slow. A sleep is not a reliable technique if running the tests on a
slow machine, as an expected wait point may not have been reached. We
have both very slow and very fast animals in the buildfarm.

Rewording my question a bit: did you consider some options regarding
what an equivalent of a wait event lookup should look like when we
don't have a PGPROC? My idea of printing a LOG and do a server log
lookup would work, just asking if others have better ideas than the
only one I got.
--
Michael

#13

Andrey Borodin

amborodin@acm.org

about 2 months ago

In reply to: Michael Paquier (#12)

Re: injection_points: Switch wait/wakeup to use atomics rather than latches

On 2 Jun 2026, at 10:27, Michael Paquier <michael@paquier.xyz> wrote:

> Rewording my question a bit: did you consider some options regarding
> what an equivalent of a wait event lookup should look like when we
> don't have a PGPROC? My idea of printing a LOG and do a server log
> lookup would work, just asking if others have better ideas than the
> only one I got.

For tests without PGPROC we can mmap() inj_state to a fixed file in
PGDATA/injection_points.shm. TAP can poll name[] to detect that a wait
point was reached and bump wait_counts[] to wake.
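
A POSIX-only sketch of that idea follows, assuming a simplified state layout and a hypothetical file name; the real inj_state has more fields and this is not code from any posted patch. It also assumes atomic_uint is lock-free and address-free on the platform, so that two mappings of the same file see each other's updates.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdatomic.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define MAX_WAITS 8     /* assumed number of wait slots */
#define NAME_LEN  64    /* assumed maximum point name length */

/* Hypothetical, simplified layout of the file-backed wait state. */
typedef struct SharedWaitState
{
    char        name[MAX_WAITS][NAME_LEN];  /* point waited on, per slot */
    atomic_uint wait_counts[MAX_WAITS];     /* bumped to wake a slot */
} SharedWaitState;

/* Map the state file, creating and sizing it if needed.  NULL on failure. */
static SharedWaitState *
map_state(const char *path)
{
    int   fd = open(path, O_RDWR | O_CREAT, 0600);
    void *p;

    if (fd < 0)
        return NULL;
    if (ftruncate(fd, sizeof(SharedWaitState)) != 0)
    {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, sizeof(SharedWaitState), PROT_READ | PROT_WRITE,
             MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping stays valid after close */
    return (p == MAP_FAILED) ? NULL : (SharedWaitState *) p;
}

/* Controller side: return the slot waiting at "point", or -1 if none. */
static int
find_waiter(const SharedWaitState *state, const char *point)
{
    for (int i = 0; i < MAX_WAITS; i++)
    {
        if (strcmp(state->name[i], point) == 0)
            return i;
    }
    return -1;
}

/* Controller side: wake the given slot by bumping its counter. */
static void
wake_waiter(SharedWaitState *state, int slot)
{
    atomic_fetch_add(&state->wait_counts[slot], 1);
}
```

The sketch reads name[] without any locking, so a controller polling it while a backend is still writing the name could see a partial string; a real implementation would need a publication protocol for that.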

Best regards, Andrey Borodin.

#14

Michael Paquier

michael@paquier.xyz

about 2 months ago

In reply to: Andrey Borodin (#13)

Re: injection_points: Switch wait/wakeup to use atomics rather than latches

On Tue, Jun 02, 2026 at 11:46:52PM +0500, Andrey Borodin wrote:

> For tests without PGPROC we can mmap() inj_state to a fixed file in
> PGDATA/injection_points.shm. TAP can poll name[] to detect that a wait
> point was reached and bump wait_counts[] to wake.

That's a direction. Only mmap() would not be sufficient, as WIN32 has
its own non-POSIX idea on the matter with CreateFileMapping() &
friends. I am wondering if we should think harder about an interface
that could make such things easier for extensions. Or we could have a
portable layer added to injection_points, as well..
--
Michael

#15

Andrey Borodin

amborodin@acm.org

about 1 month ago

In reply to: Michael Paquier (#14)

Re: injection_points: Switch wait/wakeup to use atomics rather than latches

Hi Michael,

On 3 Jun 2026, at 03:23, Michael Paquier <michael@paquier.xyz> wrote:

> That's a direction. Only mmap() would not be sufficient, as WIN32 has
> its own non-POSIX idea on the matter with CreateFileMapping() &
> friends.

Right, mmap() alone is not enough. I hacked up a prototype (on top of
your atomics commit) that maps the state portably on both sides: POSIX
mmap() in the backend and a file-backed CreateFileMapping() on WIN32,
and the same in a small standalone client. Perl cannot mmap()
portably, so the mapping is done by a tiny C helper that TAP drives,
rather than from the test script itself.

> I am wondering if we should think harder about an interface that could
> make such things easier for extensions. Or we could have a portable
> layer added to injection_points, as well..

While prototyping I hit a constraint that I think answers this. To arm
a point without SQL it is not enough to reach this module's wait state:
you have to reach the registry of active points, i.e.
ActiveInjectionPoints in injection_point.c. INJECTION_POINT() consults
that array, and the module only supplies the callback once the core has
found the name. So the part that has to be reachable from outside lives
in the core, not in the module.

That splits the problem into two layers rather cleanly:

- core: the active-points array is backed by a file
(injection_points.shm in the data directory). This is the generic
piece - any extension could arm or inspect points out of band, and
it is what makes attach-without-SQL possible. The lock-free
generation protocol that reads the array is unchanged, so external
readers can rely on it too.

- module: injection_points keeps its own file
(injection_points_wait.shm) for the wait/wakeup coordination of the
"wait" action. That part is test-specific and stays out of the
core.

A standalone client (injection_points_state) maps both files the way
the backend does and can attach/detach a point and detect+release a
waiter, with no backend connection and no PGPROC. A TAP test runs the
whole flow without SQL for arming or waking: arm a wait point with the
client, trigger it from a background session, let the client poll the
mapped file until the point is reached (so no sleep, and it behaves the
same on slow and fast animals), disarm it, then wake. SQL is used only
to trigger the point and to observe injection_points_list().

This is a rough prototype to make the discussion concrete, not a
proposal of the final shape. To name a few open points:

1. Where the portable mapping helper should live.
   1. Keep the create-or-attach-file helper (POSIX + WIN32) local to
      injection_point.c, as in the prototype. Simplest, no new public
      surface.
   2. Factor a small portable "named file-backed shared region" that any
      extension could use - an "interface for extensions". More
      general, but a larger commitment and more to review.
   I have a slight preference for 1 now, growing into 2 only once a
   second user actually appears.

2. One file or two. The prototype keeps the core registry and the
module's wait coordination in two files. Folding a wait counter into
the core InjectionPointEntry would make it a single file, but it
pushes test-only wait semantics into the core struct, which I would
rather avoid. Slight preference for two files.

3. The client writes the registry locklessly (no InjectionPointLock), so
it assumes nothing else attaches/detaches the same array
concurrently. That holds for arming points out of band around a
controlled session, but it is not a general concurrent-safe writer.
I think that is acceptable for tests, but I am not sure.

4. The path defaults to injection_points.shm under the data directory,
with an env override (PG_INJECTION_POINTS_FILE) so a point could be
armed before the data dir exists or in single-user mode. I have
not added a test for that, so probably it does not work. Yet.

5. The external mirror reads and writes the pg_atomic_uint{32,64} fields
as plain integers. That only holds where those atomics are a bare
value; on a platform without native 64-bit atomics pg_atomic_uint64
falls back to a spinlock-protected struct with a different layout, so
the byte mirror would not match (and the backend's width assertion
would fail to build there in the first place). Even where they are
native, the 64-bit field forces 8-byte alignment that the mirror must
replicate - I got bitten by exactly that on the 32-bit build, where
the entries array otherwise starts 4 bytes too early. So a robust
portable layer, if we want one, is not as trivial as it first looked
to me.

6. Whether the external interface should do more than "wait". The
file-backed registry already lets an outside process attach a point
to any callback (error, notice, ...); the only thing we can observe
back without SQL today is that a wait was reached. A natural
generalization is a per-point hit counter that any process bumps when
it runs the point, readable from outside. That would let a test
assert "initdb really went through this path", or "the postmaster
reached this point before it failed at startup" - cases where there
is no backend to query. I have not built it, and it leans the same
way as 2. (it would put an observation counter into the core entry),
but it seems like the obvious next use-case if we go this route.
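
The alignment trap in point 5 can be shown in a few lines. The struct and field names below are hypothetical mirrors, not the core definitions; _Alignas(8) stands in for the forced 8-byte alignment of a 64-bit atomic, so the layout is the same on 32-bit and 64-bit builds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of one registry entry. */
typedef struct MirrorEntry
{
    _Alignas(8) uint64_t generation;    /* 64-bit atomic, 8-byte aligned */
    char        name[64];
} MirrorEntry;

/* Hypothetical mirror of the registry: a 32-bit header, then entries. */
typedef struct MirrorRegistry
{
    uint32_t    max_inuse;
    MirrorEntry entries[4];
} MirrorRegistry;

/* Round offset up to the next multiple of a power-of-two alignment. */
static size_t
align_up(size_t offset, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

/*
 * Offset that a hand-written byte mirror has to use for entries[]:
 * the header is 4 bytes, but the first entry starts at the next
 * 8-byte boundary, leaving 4 bytes of padding.
 */
static size_t
entries_offset(void)
{
    return align_up(sizeof(uint32_t), 8);
}
```

A mirror that simply places the entries right after the 4-byte header reads every entry 4 bytes too early, which matches the 32-bit failure described in point 5.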

WDYT? I am not at all sure the file-backed registry is the right
long-term shape; if you prefer the elog-and-grep route I am happy to drop
this, it was mostly a way to see how invasive the alternative really is.

Thank you!

Best regards, Andrey Borodin.

P.S. Fun observation - it speeds up tests. I swapped the "point reached"
detection in test_misc/010_index_concurrently_upsert.pl (41 cases, each polling
pg_stat_activity every 100ms via a fresh psql) for a 10ms poll of the mapped
file through the client. Median test wall time dropped from 2.2s to 1.6s
(~27%) on my MacBook Air M5.

#16

Andrey Borodin

amborodin@acm.org

about 1 month ago

In reply to: Andrey Borodin (#15)

Re: injection_points: Switch wait/wakeup to use atomics rather than latches

On 12 Jun 2026, at 12:02, Andrey Borodin <x4mmm@yandex-team.ru> wrote:

> I hacked up a prototype (on top of
> your atomics commit) that maps the state portably on both sides

v2026-06-12 passed locally but tripped on the Windows CI (EXEC_BACKEND),
v2026-06-14 fixes that.

The mistake was handing ActiveInjectionPoints to children as the
postmaster's pointer through BackendParameters. That only works for the
main shared memory segment, which is re-attached at a fixed address; the
file-backed array is mapped wherever each process lands, so the inherited
pointer was garbage and the first point hit in a child crashed.

v2026-06-14 drops that and lets each process map the file itself. Rather
than racing to map it at the right moment in child startup, the registry
is mapped lazily on first use: the first time a process checks a point it
attaches the file if it exists, and treats "no file" as "nothing armed".
So a point is reached even when it fires before the child has attached
shared memory - e.g. "backend-initialize", which 005_negotiate_encryption
exercises - and a point armed out of band, even before the server is up,
is not silently missed. Forked children still just inherit the
postmaster's mapping.

Personally I do not like lazy initialization, it's race-prone. But it
guarantees the file is mapped before the first point is checked, wherever
that point sits.

The same revision also closes an unrelated init race: when two processes
create the backing file at once, the loser could map it before the winner
sized it (SIG-something on first touch). The attach path now waits for the
file to reach full size before mapping.
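
One way to close that kind of race, sketched here with plain POSIX calls and not taken from the prototype: the process that wins the exclusive create sizes the file, and every other process polls fstat() until the file has reached the expected size before it goes on to map it. Function names, the 1ms poll interval and the retry limit are assumptions for this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Poll until the file behind fd is at least "expected" bytes long. */
static bool
wait_for_full_size(int fd, off_t expected, int max_polls)
{
    struct timespec ts = {.tv_sec = 0, .tv_nsec = 1000000L};    /* 1ms */

    for (int i = 0; i < max_polls; i++)
    {
        struct stat st;

        if (fstat(fd, &st) != 0)
            return false;
        if (st.st_size >= expected)
            return true;
        nanosleep(&ts, NULL);
    }
    return false;
}

/*
 * Return a descriptor for a file of at least "size" bytes, or -1.
 * The creator sizes the file; everybody else waits for that to happen,
 * so nobody maps (and touches) pages past the current end of file.
 */
static int
create_or_attach(const char *path, off_t size, int max_polls)
{
    int fd;

    assert(path != NULL && size > 0);

    fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd >= 0)
    {
        /* We won the creation race: give the file its full size. */
        if (ftruncate(fd, size) != 0)
        {
            close(fd);
            return -1;
        }
        return fd;
    }
    if (errno != EEXIST)
        return -1;

    /* Somebody else created it: attach, then wait until it is sized. */
    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;
    if (!wait_for_full_size(fd, size, max_polls))
    {
        close(fd);
        return -1;
    }
    return fd;
}
```

Touching a mapped page that lies beyond the end of the file raises SIGBUS on POSIX systems, which is the usual symptom when this wait is missing.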

CI is green. Same two patches on top of your atomics commit. Let me know
if this prototype goes radically wrong way.

Thanks!

Best regards, Andrey Borodin.

#17

Michael Paquier

michael@paquier.xyz

about 1 month ago

In reply to: Andrey Borodin (#16)

Re: injection_points: Switch wait/wakeup to use atomics rather than latches

On Sun, Jun 14, 2026 at 02:23:54PM +0500, Andrey Borodin wrote:

> The same revision also closes an unrelated init race: when two processes
> create the backing file at once, the loser could map it before the winner
> sized it (SIG-something on first touch). The attach path now waits for the
> file to reach full size before mapping.
>
> CI is green. Same two patches on top of your atomics commit. Let me know
> if this prototype goes radically wrong way.

You may find my reply surprising (or not), but using a client tool to
bypass the SQL protocol is super invasive in terms of the in-core
changes, and I'm -1 on that.

Test tooling should rely on simple facilities, and you are
re-implementing quite a few things here that come with a new class of
bugs and what look like design issues to me due to the invasiveness:
- Reimplementation of the registry protocol for file mapping. That
may be useful for other things, and there may be use for a refactored
in-core API, but I don't see why we should use it here.
- Mirroring of internal structs for shmem manipulation.
- Core data updates with zero locking.

I am wondering if we are not overcomplicating things here.. How about
this idea instead of 0002 and 0003, for paths that cannot rely on some
SQL:
- Let's use a file-based markup, located in an injection_points/
directory in the data folder.
- When attaching a wait point, write a marker, say wait_$POINT.
- On wakeup, write a wakeup_$POINT.
- The wait routine does periodic checks of the wakeup_$POINT file,
using a stat().
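
The marker scheme in the list above can be sketched as follows. This is only an illustration under assumptions: the helper names, the path buffer size and the backoff values are made up here, and the markers are empty files named wait_<point> and wakeup_<point> in one directory.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* Build "<dir>/<kind>_<point>" into buf; false if it does not fit. */
static bool
marker_path(char *buf, size_t len, const char *dir,
            const char *kind, const char *point)
{
    int n = snprintf(buf, len, "%s/%s_%s", dir, kind, point);

    return n > 0 && (size_t) n < len;
}

/* Create an empty marker file, e.g. kind = "wait" or "wakeup". */
static bool
write_marker(const char *dir, const char *kind, const char *point)
{
    char path[1024];
    int  fd;

    if (!marker_path(path, sizeof(path), dir, kind, point))
        return false;
    fd = open(path, O_WRONLY | O_CREAT, 0600);
    if (fd < 0)
        return false;
    close(fd);
    return true;
}

/*
 * Wait side: stat() the wakeup marker periodically, doubling the delay
 * up to 100ms.  Gives up after max_polls checks and returns false.
 */
static bool
wait_for_wakeup(const char *dir, const char *point, int max_polls)
{
    char path[1024];
    long delay_us = 10;

    assert(max_polls > 0);
    if (!marker_path(path, sizeof(path), dir, "wakeup", point))
        return false;

    for (int i = 0; i < max_polls; i++)
    {
        struct stat st;
        struct timespec ts = {
            .tv_sec = delay_us / 1000000L,
            .tv_nsec = (delay_us % 1000000L) * 1000L
        };

        if (stat(path, &st) == 0)
            return true;        /* wakeup marker is there */
        nanosleep(&ts, NULL);
        delay_us = (delay_us * 2 > 100000L) ? 100000L : delay_us * 2;
    }
    return false;
}
```

A test script then needs nothing more than the ability to create a file in the data directory, which any TAP test already has on every platform.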

It depends on how much we want to achieve, but forcing a postmaster to
wait at an early startup sequence would then be something like:
- Add a wait marker.
- At startup, shared_preload_libraries scans the pg_injection_points/
directory, fills in its shmem state by calling InjectionPointAttach().
- postmaster hits the injection point, calls injection_wait()
- Test does what it wants while the postmaster waits, manipulates
states.
- Test script drops a wakeup file.
- Postmaster sees the file, continues its startup sequence.
This means one cannot set an injection point until
shared_preload_libraries is loaded, of course. Cleanup logic feels
kind of nice: the test cleans up its own files, or we just remove
pg_injection_points/ at shutdown with an exit callback. No
platform-specific logic, only FS checks.

This means a bit higher latency, but it eliminates a full class of
bugs because it is simpler. This would need to live
alongside patch 0001 so that a _PG_init() can be called when the
injection_points lib is loaded; we're still going to need it to fill
the shmem info with the wait slot being occupied.
--
Michael

#18

Andrey Borodin

amborodin@acm.org

19 days ago

In reply to: Michael Paquier (#17)

Re: injection_points: Switch wait/wakeup to use atomics rather than latches

Hi Michael,

You suggested driving the module's wait points through the filesystem
instead of mapping shared memory out of process. Here is that, rebased
on your atomics commit. It is a test-module-only change; the core
injection point registry is untouched. This message and the patch are
AI-editorialized for readability.

Design
------

Control dirs and files live under pg_injection_points/ in the data
directory:

    pg_injection_points/<point>/         attaches <point> as a wait point
    pg_injection_points/<point>/<pid>    that backend is waiting at <point>

The module scans this directory once at postmaster startup and attaches
a wait point for each subdirectory, so a test can arm a point before the
server is up (shared_preload_libraries is required for that). When a
process reaches the point it creates its <pid> file and waits, polling
with stat() until the file is removed, in addition to the existing
wakeup counter. So an out-of-process test can:

- arm a point with mkdir, before SQL is available;
- see which backends are blocked by listing the directory;
- wake one specific backend by unlinking its <pid> file.

No SQL connection, no platform-specific code, and no out-of-process
access to shared memory.
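
The three controller operations listed above map onto plain directory calls. The sketch below is an assumption-laden illustration, not the patch: the function names are invented, and mark_waiting() only imitates what a backend would do when it reaches the point.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Arm a point: create <root>/<point>/ (an existing directory is fine). */
static int
arm_point(const char *root, const char *point)
{
    char path[1024];
    int  n = snprintf(path, sizeof(path), "%s/%s", root, point);

    if (n <= 0 || (size_t) n >= sizeof(path))
        return -1;
    if (mkdir(path, 0700) != 0 && errno != EEXIST)
        return -1;
    return 0;
}

/* Backend side (imitated): create <root>/<point>/<pid> when waiting. */
static int
mark_waiting(const char *root, const char *point, long pid)
{
    char path[1024];
    int  n = snprintf(path, sizeof(path), "%s/%s/%ld", root, point, pid);
    int  fd;

    if (n <= 0 || (size_t) n >= sizeof(path))
        return -1;
    fd = open(path, O_WRONLY | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    close(fd);
    return 0;
}

/* List PIDs blocked at a point.  Returns how many were stored, or -1. */
static int
list_waiters(const char *root, const char *point, long *pids, int max)
{
    char           path[1024];
    int            n = snprintf(path, sizeof(path), "%s/%s", root, point);
    int            found = 0;
    DIR           *dir;
    struct dirent *de;

    assert(pids != NULL && max > 0);
    if (n <= 0 || (size_t) n >= sizeof(path))
        return -1;
    dir = opendir(path);
    if (dir == NULL)
        return -1;
    while (found < max && (de = readdir(dir)) != NULL)
    {
        char *end;
        long  pid = strtol(de->d_name, &end, 10);

        /* Skip ".", ".." and anything that is not purely numeric. */
        if (end == de->d_name || *end != '\0')
            continue;
        pids[found++] = pid;
    }
    closedir(dir);
    return found;
}

/* Wake one specific backend by removing its <pid> file. */
static int
wake_waiter(const char *root, const char *point, long pid)
{
    char path[1024];
    int  n = snprintf(path, sizeof(path), "%s/%s/%ld", root, point, pid);

    if (n <= 0 || (size_t) n >= sizeof(path))
        return -1;
    return unlink(path);
}
```

From a TAP script the same steps are a mkdir, a directory listing and an unlink, so no helper binary is needed on the test side.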

The scan is deliberately one-shot at startup. If a consumer ever needs
to (re)load points from the filesystem at runtime, that can be added
later as a dedicated injection point action (say a "reload" type); I
think we grow the facility only as the tests actually require.

The test question
-----------------

But in this project we grow the facility only when a test actually needs
it, so the real question is which test justifies this one.

The first case that seems to call for it is the ProcKill lock-group /
procLatch recycle race [0]. On that thread you concluded that
a wait point inside ProcKill() cannot use the latch-based wait, because
the fix (84b9d6bceab6) now disowns the latch earlier, and that the test
had to lean on statement_timeout to keep the leader parked long enough
- which is an anti-pattern. You (IDK, maybe someone else) suggested
switching the wait to a latch-free shmem flag on HEAD; that is your atomics
commit, and this patch builds on it.

What the filesystem adds on top is the observability that was still
awkward there. Instead of statement_timeout and an unreliable
pg_stat_activity (or query_until banners a la 011_lock_stats.pl), the
controller just waits for pg_injection_points/<point>/<pid> to appear
and then wakes that exact PID. It also sidesteps the
before_shmem_exit(injection_points_cleanup) hook that detaches a
self-attached point before ProcKill runs, since the point is armed by
the startup scan rather than by the victim.

So the question is whether to rewrite that reproducer on top of these
two patches. This design goes far beyond the minimal
plan you sketched (latch-free wait, attach() with an optional PID, then
the test), and that you were not sure the ProcKill case alone justifies
the churn. Before writing it I would confirm it really needs the
filesystem parts and is not better served by something simpler.

Alternatively, we can use new capability for some other tests.

WDYT?

Thank you!

Best regards, Andrey Borodin.

[0]: /messages/by-id/d2983796-2603-41b7-a66e-fc8489ddb954@gmail.com

#19

Michael Paquier

michael@paquier.xyz

18 days ago

In reply to: Andrey Borodin (#18)

Re: injection_points: Switch wait/wakeup to use atomics rather than latches

On Tue, Jul 07, 2026 at 02:51:04PM +0500, Andrey Borodin wrote:

> So the question is whether to rewrite that reproducer on top of these
> two patches. This design goes far beyond the minimal
> plan you sketched (latch-free wait, attach() with an optional PID, then
> the test), and that you were not sure the ProcKill case alone justifies
> the churn. Before writing it I would confirm it really needs the
> filesystem parts and is not better served by something simpler.
>
> Alternatively, we can use new capability for some other tests.
>
> WDYT?

The hard part is that the implementation choices are driven by the
needs, where we may want to be able to do the following things without
having to touch SQL:
- Register a wait() at very early stages.
- Know from a client perspective that a PID is waiting, as we may not
have access to pg_stat_activity.
- Trigger a wakeup.
Your patch is able to achieve all of that, but it may be better to
know better about more use cases folks have seen before taking any
hard decision.

Here, one good case that I could see in the tree is 007_pre_auth.pl,
that uses currently as a workaround a background connection to create
a wait point. We could switch that to register a wait early, but the
impact is limited.

@Heikki, what kind of ideas did you have for some of your toy tests
recently, particularly with the protocol area? The protocol tests now
in the tree use errors (backend-initialize, GSSAPI and SSL startup
points), not waits and/or wakeups.

Applied the patch that removes the condition variable dependency, btw.
That's one thing less to worry about.
--
Michael

#20

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

18 days ago

In reply to: Michael Paquier (#19)

Re: injection_points: Switch wait/wakeup to use atomics rather than latches

On 08/07/2026 09:29, Michael Paquier wrote:

> On Tue, Jul 07, 2026 at 02:51:04PM +0500, Andrey Borodin wrote:
>
> > So the question is whether to rewrite that reproducer on top of these
> > two patches. This design goes far beyond the minimal
> > plan you sketched (latch-free wait, attach() with an optional PID, then
> > the test), and that you were not sure the ProcKill case alone justifies
> > the churn. Before writing it I would confirm it really needs the
> > filesystem parts and is not better served by something simpler.
> >
> > Alternatively, we can use new capability for some other tests.
> >
> > WDYT?
>
> The hard part is that the implementation choices are driven by the
> needs, where we may want to be able to do the following things without
> having to touch SQL:
> - Register a wait() at very early stages.
> - Know from a client perspective that a PID is waiting, as we may not
> have access to pg_stat_activity.
> - Trigger a wakeup.
> Your patch is able to achieve all of that, but it may be better to
> know better about more use cases folks have seen before taking any
> hard decision.
>
> Here, one good case that I could see in the tree is 007_pre_auth.pl,
> that uses currently as a workaround a background connection to create
> a wait point. We could switch that to register a wait early, but the
> impact is limited.
>
> @Heikki, what kind of ideas did you have for some of your toy tests
> recently, particularly with the protocol area? The protocol tests now
> in the tree use errors (backend-initialize, GSSAPI and SSL startup
> points), not waits and/or wakeups.

I've got nothing in mind right now, but I remember when we started to
talk about this, I was working on something where I wanted to inject
waits before authentication. I think it was related to dead-end
backends, max_connections, reserved_connections and all that.

- Heikki

#21

Andrey Borodin

amborodin@acm.org

17 days ago

In reply to: Heikki Linnakangas (#20)

injection_points: Switch wait/wakeup to use atomics rather than latches

Attachments:
