DataChecksumsStateStruct cost_delay fields and locking
Hi,
The comment in the DataChecksumsStateStruct says:
/*
* These fields indicate the target state that the launcher is currently
* working towards. They can be different from the corresponding launch_*
* fields, if a new pg_enable/disable_data_checksums() call was made while
* the launcher/worker was already running.
*
* The below members are set when the launcher starts, and are only
* accessed read-only by the single worker. Thus, we can access these
* without a lock. If multiple workers, or dynamic cost parameters, are
* supported at some point then this would need to be revisited.
*/
DataChecksumsWorkerOperation operation;
int cost_delay;
int cost_limit;
The "only accessed read-only by the single worker" part isn't true,
because DataChecksumsWorkerMain() does this:
/*
* Check if the cost settings changed during runtime and if so, update
* to reflect the new values and signal that the access strategy needs
* to be refreshed.
*/
LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
if ((DataChecksumState->launch_cost_delay != DataChecksumState->cost_delay)
|| (DataChecksumState->launch_cost_limit != DataChecksumState->cost_limit))
{
costs_updated = true;
VacuumCostDelay = DataChecksumState->launch_cost_delay;
VacuumCostLimit = DataChecksumState->launch_cost_limit;
VacuumUpdateCosts();DataChecksumState->cost_delay = DataChecksumState->launch_cost_delay;
DataChecksumState->cost_limit = DataChecksumState->launch_cost_limit;
}
else
costs_updated = false;
LWLockRelease(DataChecksumsWorkerLock);
So contrary to the comment, the cost_delay fields are updated by the worker.
Some other observations:
- at the beginning of DataChecksumsWorkerMain(), it reads
DataChecksumState->cost_delay and cost_limit without holding the lock,
and also DataChecksumState->process_shared_catalogs. I think that's OK,
but I would just add locking to it, to remove all doubt.
- the comment says "These fields indicate the target state that the
launcher is currently working towards". It took me a moment to
understand what the means. At first, I thought that it means that it's
the target state of the launcher, but not necessarily what it's
currently using. That's wrong. Those are in fact the values that the
*worker* is *currently* using. launch_cost_delay/limit are the "target"
values, which will eventually be copied into the current
cost_delay/limit values. Perhaps change "... launcher is currently
working towards" into "... worker is currently running with" or
something like that.
- Heikki
On 17 Jun 2026, at 09:42, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
Hi,
The comment in the DataChecksumsStateStruct says:
/*
* These fields indicate the target state that the launcher is currently
* working towards. They can be different from the corresponding launch_*
* fields, if a new pg_enable/disable_data_checksums() call was made while
* the launcher/worker was already running.
*
* The below members are set when the launcher starts, and are only
* accessed read-only by the single worker. Thus, we can access these
* without a lock. If multiple workers, or dynamic cost parameters, are
* supported at some point then this would need to be revisited.
*/
DataChecksumsWorkerOperation operation;
int cost_delay;
int cost_limit;The "only accessed read-only by the single worker" part isn't true, because DataChecksumsWorkerMain() does this:
/*
* Check if the cost settings changed during runtime and if so, update
* to reflect the new values and signal that the access strategy needs
* to be refreshed.
*/
LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
if ((DataChecksumState->launch_cost_delay != DataChecksumState->cost_delay)
|| (DataChecksumState->launch_cost_limit != DataChecksumState->cost_limit))
{
costs_updated = true;
VacuumCostDelay = DataChecksumState->launch_cost_delay;
VacuumCostLimit = DataChecksumState->launch_cost_limit;
VacuumUpdateCosts();
DataChecksumState->cost_delay = DataChecksumState->launch_cost_delay;
DataChecksumState->cost_limit = DataChecksumState->launch_cost_limit;
}
else
costs_updated = false;
LWLockRelease(DataChecksumsWorkerLock);So contrary to the comment, the cost_delay fields are updated by the worker.
Ugh, that comment was missed when I added the ability to update the cost params
during runtime.
Some other observations:
- at the beginning of DataChecksumsWorkerMain(), it reads DataChecksumState->cost_delay and cost_limit without holding the lock, and also DataChecksumState->process_shared_catalogs. I think that's OK, but I would just add locking to it, to remove all doubt.
Agreed, it should be wrapped in a lock even if it's (currently) safe.
- the comment says "These fields indicate the target state that the launcher is currently working towards". It took me a moment to understand what the means. At first, I thought that it means that it's the target state of the launcher, but not necessarily what it's currently using. That's wrong. Those are in fact the values that the *worker* is *currently* using. launch_cost_delay/limit are the "target" values, which will eventually be copied into the current cost_delay/limit values. Perhaps change "... launcher is currently working towards" into "... worker is currently running with" or something like that.
Yes, that sentence was clearly a pretty strong wordsoup. Thanks for pointing
it out.
All comments addressed in the attached.
--
Daniel Gustafsson
Attachments:
0001-Fix-comments-on-data-checksum-cost-settings.patchapplication/octet-stream; name=0001-Fix-comments-on-data-checksum-cost-settings.patch; x-unix-mode=0644Download+14-12
On 17/06/2026 16:04, Daniel Gustafsson wrote:
All comments addressed in the attached.
Thanks, LGTM.
@@ -1533,9 +1532,13 @@ DataChecksumsWorkerMain(Datum arg) /* * Get a list of all temp tables present as we start in this database. We * need to wait until they are all gone until we are done, since we cannot - * access these relations and modify them. + * access these relations and modify them. For the list of relations to + * process once the temp relations are gone, check if shared catalogs have + * been processed already. */ InitialTempTableList = BuildRelationList(true, false); + LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE); + process_shared = DataChecksumState->process_shared_catalogs;/*
* Enable vacuum cost delay, if any. While this process isn't doing any
Not new with this patch, but caught my eye now: the double "until" in
the phrase "We need to wait until they are all gone until we are done"
sounds a little awkward. I had to read it a few times to parse it right.
I'd suggest "We need to wait until they are all gone before we exit" or
something like that.
- Heikki
On 17 Jun 2026, at 20:12, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
On 17/06/2026 16:04, Daniel Gustafsson wrote:
All comments addressed in the attached.
Thanks, LGTM.
Thanks for review of the patch and the post-commit review of the file. Pushed
(with the additional commit fix you mentioned).
--
Daniel Gustafsson