waiting for reload in tests
Hi,
In a couple tests I (IIRC others as well) had the problem that a config reload
isn't actually synchronous. I.e. a sequence like
$node_primary->reload;
$node_primary->safe_psql('postgres',...)
isn't actually guaranteed to observe the reloaded config in the
safe_psql(). It *typically* will see the new config, but if the system is
busy and/or slow, the SIGHUP might not yet have been propagated by postmaster
and/or not yet received by the relevant process.
I don't really see a way to guarantee this with reasonable effort in the
back-branches. In HEAD we could (with some difficulties around postmaster and
UI) use a global barrier to wait for the reload to complete. For the
back-branches I guess we could hack something up using retries and a
pseudo-GUC to check whether the reload has been processed, but that's not
bulletproof at all: some process(es) could take longer to receive the signal.
Anybody got a better idea?
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
In a couple tests I (IIRC others as well) had the problem that a config reload
isn't actually synchronous. I.e. a sequence like
$node_primary->reload;
$node_primary->safe_psql('postgres',...)
isn't actually guaranteed to observe the config as reloaded in the
safe_psql().
Brute force way: s/reload/restart/
Less brute force: wait for "SHOW variable-you-changed" to report the
value you expect.
regards, tom lane
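
For illustration, the "wait for SHOW to report the value you expect" approach
could be sketched roughly as below. This is only a minimal sketch: the helper
name wait_for_value, its timeout and interval arguments, and the work_mem
values are all hypothetical, and the SHOW round trip is replaced by a stand-in
closure so the snippet runs without a server.

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep time);

# Hypothetical helper, not part of PostgreSQL's test framework: call $fetch
# repeatedly until it returns $expected or $timeout seconds have passed.
# Returns 1 on success, 0 on timeout.
sub wait_for_value
{
	my ($fetch, $expected, $timeout, $interval) = @_;
	$timeout  //= 180;    # assumed default, in seconds
	$interval //= 0.1;    # assumed delay between polls, in seconds

	my $deadline = time() + $timeout;
	while (1)
	{
		my $got = $fetch->();
		return 1 if defined $got && $got eq $expected;
		return 0 if time() >= $deadline;
		sleep($interval);
	}
}

# Stand-in for a "SHOW work_mem" round trip: it keeps reporting the old
# value for the first two calls, as if the SIGHUP had not arrived yet.
my $calls = 0;
my $fake_show = sub { return ++$calls < 3 ? '4MB' : '8MB' };

my $ok = wait_for_value($fake_show, '8MB', 5, 0.01);
print $ok ? "reload visible after $calls polls\n" : "timed out\n";
```

In a real TAP test the closure would presumably be something like
sub { $node_primary->safe_psql('postgres', 'SHOW work_mem') }, and the
existing poll_query_until() method on the node object runs a similar retry
loop. Either way, this only shows that the backend running the query has seen
the reload; it says nothing about other processes.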
On Mon, May 09, 2022 at 09:29:32PM -0400, Tom Lane wrote:
Brute force way: s/reload/restart/
That was my first thought, as it can be tricky to make sure that all
the processes got the update because we don't publish such a state.
One thing I was also thinking about would be to update
pg_stat_activity.state_change when a reload is processed on top of its
current updates, then wait for it to be effective in all the processes
reported. The field remains NULL for most non-backend processes,
which would be a compatibility change.
Less brute force: wait for "SHOW variable-you-changed" to report the
value you expect.
This method may still be unreliable in some processes like a logirep
launcher/receiver or just autovacuum, no?
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
On Mon, May 09, 2022 at 09:29:32PM -0400, Tom Lane wrote:
Less brute force: wait for "SHOW variable-you-changed" to report the
value you expect.
This method may still be unreliable in some processes like a logirep
launcher/receiver or just autovacuum, no?
Yeah, if your test case requires knowing that some background process
has gotten the word, it's a *lot* harder. I think we'd have to add a
last-config-update-time column in pg_stat_activity or something like that.
regards, tom lane
Hi,
On 2022-05-09 21:42:20 -0400, Tom Lane wrote:
Michael Paquier <michael@paquier.xyz> writes:
On Mon, May 09, 2022 at 09:29:32PM -0400, Tom Lane wrote:
Less brute force: wait for "SHOW variable-you-changed" to report the
value you expect.
This method may still be unreliable in some processes like a logirep
launcher/receiver or just autovacuum, no?
Yep, that's the problem. In my case it's the startup process...
Yeah, if your test case requires knowing that some background process
has gotten the word, it's a *lot* harder. I think we'd have to add a
last-config-update-time column in pg_stat_activity or something like that.
That's basically what I was referencing with global barriers...
Greetings,
Andres Freund