DataChecksumsStateStruct cost_delay fields and locking

Started by Heikki Linnakangas2 days ago4 messageshackers
Jump to latest
#1Heikki Linnakangas
heikki.linnakangas@enterprisedb.com

Hi,

The comment in the DataChecksumsStateStruct says:

/*
* These fields indicate the target state that the launcher is currently
* working towards. They can be different from the corresponding launch_*
* fields, if a new pg_enable/disable_data_checksums() call was made while
* the launcher/worker was already running.
*
* The below members are set when the launcher starts, and are only
* accessed read-only by the single worker. Thus, we can access these
* without a lock. If multiple workers, or dynamic cost parameters, are
* supported at some point then this would need to be revisited.
*/
DataChecksumsWorkerOperation operation;
int cost_delay;
int cost_limit;

The "only accessed read-only by the single worker" part isn't true,
because DataChecksumsWorkerMain() does this:

/*
* Check if the cost settings changed during runtime and if so, update
* to reflect the new values and signal that the access strategy needs
* to be refreshed.
*/
LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
if ((DataChecksumState->launch_cost_delay != DataChecksumState->cost_delay)
|| (DataChecksumState->launch_cost_limit != DataChecksumState->cost_limit))
{
costs_updated = true;
VacuumCostDelay = DataChecksumState->launch_cost_delay;
VacuumCostLimit = DataChecksumState->launch_cost_limit;
VacuumUpdateCosts();

DataChecksumState->cost_delay = DataChecksumState->launch_cost_delay;
DataChecksumState->cost_limit = DataChecksumState->launch_cost_limit;
}
else
costs_updated = false;
LWLockRelease(DataChecksumsWorkerLock);

So contrary to the comment, the cost_delay fields are updated by the worker.

Some other observations:

- at the beginning of DataChecksumsWorkerMain(), it reads
DataChecksumState->cost_delay and cost_limit without holding the lock,
and also DataChecksumState->process_shared_catalogs. I think that's OK,
but I would just add locking to it, to remove all doubt.

- the comment says "These fields indicate the target state that the
launcher is currently working towards". It took me a moment to
understand what the means. At first, I thought that it means that it's
the target state of the launcher, but not necessarily what it's
currently using. That's wrong. Those are in fact the values that the
*worker* is *currently* using. launch_cost_delay/limit are the "target"
values, which will eventually be copied into the current
cost_delay/limit values. Perhaps change "... launcher is currently
working towards" into "... worker is currently running with" or
something like that.

- Heikki

#2Daniel Gustafsson
daniel@yesql.se
In reply to: Heikki Linnakangas (#1)
Re: DataChecksumsStateStruct cost_delay fields and locking

On 17 Jun 2026, at 09:42, Heikki Linnakangas <hlinnaka@iki.fi> wrote:

Hi,

The comment in the DataChecksumsStateStruct says:

/*
* These fields indicate the target state that the launcher is currently
* working towards. They can be different from the corresponding launch_*
* fields, if a new pg_enable/disable_data_checksums() call was made while
* the launcher/worker was already running.
*
* The below members are set when the launcher starts, and are only
* accessed read-only by the single worker. Thus, we can access these
* without a lock. If multiple workers, or dynamic cost parameters, are
* supported at some point then this would need to be revisited.
*/
DataChecksumsWorkerOperation operation;
int cost_delay;
int cost_limit;

The "only accessed read-only by the single worker" part isn't true, because DataChecksumsWorkerMain() does this:

/*
* Check if the cost settings changed during runtime and if so, update
* to reflect the new values and signal that the access strategy needs
* to be refreshed.
*/
LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
if ((DataChecksumState->launch_cost_delay != DataChecksumState->cost_delay)
|| (DataChecksumState->launch_cost_limit != DataChecksumState->cost_limit))
{
costs_updated = true;
VacuumCostDelay = DataChecksumState->launch_cost_delay;
VacuumCostLimit = DataChecksumState->launch_cost_limit;
VacuumUpdateCosts();
DataChecksumState->cost_delay = DataChecksumState->launch_cost_delay;
DataChecksumState->cost_limit = DataChecksumState->launch_cost_limit;
}
else
costs_updated = false;
LWLockRelease(DataChecksumsWorkerLock);

So contrary to the comment, the cost_delay fields are updated by the worker.

Ugh, that comment was missed when I added the ability to update the cost params
during runtime.

Some other observations:

- at the beginning of DataChecksumsWorkerMain(), it reads DataChecksumState->cost_delay and cost_limit without holding the lock, and also DataChecksumState->process_shared_catalogs. I think that's OK, but I would just add locking to it, to remove all doubt.

Agreed, it should be wrapped in a lock even if it's (currently) safe.

- the comment says "These fields indicate the target state that the launcher is currently working towards". It took me a moment to understand what the means. At first, I thought that it means that it's the target state of the launcher, but not necessarily what it's currently using. That's wrong. Those are in fact the values that the *worker* is *currently* using. launch_cost_delay/limit are the "target" values, which will eventually be copied into the current cost_delay/limit values. Perhaps change "... launcher is currently working towards" into "... worker is currently running with" or something like that.

Yes, that sentence was clearly a pretty strong wordsoup. Thanks for pointing
it out.

All comments addressed in the attached.

--
Daniel Gustafsson

Attachments:

0001-Fix-comments-on-data-checksum-cost-settings.patchapplication/octet-stream; name=0001-Fix-comments-on-data-checksum-cost-settings.patch; x-unix-mode=0644Download+14-12
#3Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Daniel Gustafsson (#2)
Re: DataChecksumsStateStruct cost_delay fields and locking

On 17/06/2026 16:04, Daniel Gustafsson wrote:

All comments addressed in the attached.

Thanks, LGTM.

@@ -1533,9 +1532,13 @@ DataChecksumsWorkerMain(Datum arg)
/*
* Get a list of all temp tables present as we start in this database. We
* need to wait until they are all gone until we are done, since we cannot
-	 * access these relations and modify them.
+	 * access these relations and modify them.  For the list of relations to
+	 * process once the temp relations are gone, check if shared catalogs have
+	 * been processed already.
*/
InitialTempTableList = BuildRelationList(true, false);
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	process_shared = DataChecksumState->process_shared_catalogs;

/*
* Enable vacuum cost delay, if any. While this process isn't doing any

Not new with this patch, but caught my eye now: the double "until" in
the phrase "We need to wait until they are all gone until we are done"
sounds a little awkward. I had to read it a few times to parse it right.
I'd suggest "We need to wait until they are all gone before we exit" or
something like that.

- Heikki

#4Daniel Gustafsson
daniel@yesql.se
In reply to: Heikki Linnakangas (#3)
Re: DataChecksumsStateStruct cost_delay fields and locking

On 17 Jun 2026, at 20:12, Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 17/06/2026 16:04, Daniel Gustafsson wrote:

All comments addressed in the attached.

Thanks, LGTM.

Thanks for review of the patch and the post-commit review of the file. Pushed
(with the additional commit fix you mentioned).

--
Daniel Gustafsson