Should vacuum process config file reload more often

Started by Melanie Plageman almost 3 years ago, 94 messages
#1Melanie Plageman
melanieplageman@gmail.com
1 attachment(s)

Hi,

Users may wish to speed up long-running vacuum of a large table by
decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
config file is only reloaded between tables (for autovacuum) or after
the statement (for explicit vacuum). This has been brought up for
autovacuum in [1].

Andres suggested that it might be possible to check ConfigReloadPending
in vacuum_delay_point(), so I thought I would draft a rough patch and
start a discussion.

Since vacuum_delay_point() is also called by analyze and we do not want
to reload the configuration file if we are in a user transaction, I
widened the scope of the in_outer_xact variable in vacuum() and allowed
analyze in a user transaction to default to the current configuration
file reload cadence in PostgresMain().

I don't think I can set and leave vac_in_outer_xact the way I am doing
it in this patch, since I use vac_in_outer_xact in vacuum_delay_point(),
which I believe is reachable from codepaths that would not have called
vacuum(). It seems that if a backend sets it, the outer transaction
commits, and then the backend ends up calling vacuum_delay_point() in a
different way later, it wouldn't be quite right.

Apart from this, one higher level question I have is if there are other
gucs whose modification would make reloading the configuration file
during vacuum/analyze unsafe.

- Melanie

[1]: /messages/by-id/22CA91B4-D341-4075-BD3C-4BAB52AF1E80@amazon.com

Attachments:

v1-0001-reload-config-file-vac.patch (text/x-patch; charset=US-ASCII)
From aea6fbfd93ab12e4e27869b755367ab8454e3eef Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Thu, 23 Feb 2023 15:54:55 -0500
Subject: [PATCH v1] reload config file vac

---
 src/backend/commands/vacuum.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index aa79d9de4d..979d19222d 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -48,6 +48,7 @@
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/interrupt.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -75,6 +76,7 @@ int			vacuum_multixact_failsafe_age;
 /* A few variables that don't seem worth passing around as parameters */
 static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
+static bool vac_in_outer_xact = false;
 
 
 /*
@@ -309,8 +311,7 @@ vacuum(List *relations, VacuumParams *params,
 	static bool in_vacuum = false;
 
 	const char *stmttype;
-	volatile bool in_outer_xact,
-				use_own_xacts;
+	volatile bool use_own_xacts;
 
 	Assert(params != NULL);
 
@@ -327,10 +328,10 @@ vacuum(List *relations, VacuumParams *params,
 	if (params->options & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
-		in_outer_xact = false;
+		vac_in_outer_xact = false;
 	}
 	else
-		in_outer_xact = IsInTransactionBlock(isTopLevel);
+		vac_in_outer_xact = IsInTransactionBlock(isTopLevel);
 
 	/*
 	 * Due to static variables vac_context, anl_context and vac_strategy,
@@ -451,7 +452,7 @@ vacuum(List *relations, VacuumParams *params,
 		Assert(params->options & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
-		else if (in_outer_xact)
+		else if (vac_in_outer_xact)
 			use_own_xacts = false;
 		else if (list_length(relations) > 1)
 			use_own_xacts = true;
@@ -469,7 +470,7 @@ vacuum(List *relations, VacuumParams *params,
 	 */
 	if (use_own_xacts)
 	{
-		Assert(!in_outer_xact);
+		Assert(!vac_in_outer_xact);
 
 		/* ActiveSnapshot is not set by autovacuum */
 		if (ActiveSnapshotSet())
@@ -521,7 +522,7 @@ vacuum(List *relations, VacuumParams *params,
 				}
 
 				analyze_rel(vrel->oid, vrel->relation, params,
-							vrel->va_cols, in_outer_xact, vac_strategy);
+							vrel->va_cols, vac_in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
 				{
@@ -2214,6 +2215,12 @@ vacuum_delay_point(void)
 						 WAIT_EVENT_VACUUM_DELAY);
 		ResetLatch(MyLatch);
 
+		if (ConfigReloadPending && !vac_in_outer_xact)
+		{
+			ConfigReloadPending = false;
+			ProcessConfigFile(PGC_SIGHUP);
+		}
+
 		VacuumCostBalance = 0;
 
 		/* update balance values for workers */
-- 
2.37.2

#2Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Melanie Plageman (#1)
Re: Should vacuum process config file reload more often

Hi, Melanie!

On Fri, 24 Feb 2023 at 02:08, Melanie Plageman
<melanieplageman@gmail.com> wrote:

Hi,

Users may wish to speed up long-running vacuum of a large table by
decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
config file is only reloaded between tables (for autovacuum) or after
the statement (for explicit vacuum). This has been brought up for
autovacuum in [1].

Andres suggested that it might be possible to check ConfigReloadPending
in vacuum_delay_point(), so I thought I would draft a rough patch and
start a discussion.

Since vacuum_delay_point() is also called by analyze and we do not want
to reload the configuration file if we are in a user transaction, I
widened the scope of the in_outer_xact variable in vacuum() and allowed
analyze in a user transaction to default to the current configuration
file reload cadence in PostgresMain().

I don't think I can set and leave vac_in_outer_xact the way I am doing
it in this patch, since I use vac_in_outer_xact in vacuum_delay_point(),
which I believe is reachable from codepaths that would not have called
vacuum(). It seems that if a backend sets it, the outer transaction
commits, and then the backend ends up calling vacuum_delay_point() in a
different way later, it wouldn't be quite right.

Apart from this, one higher level question I have is if there are other
gucs whose modification would make reloading the configuration file
during vacuum/analyze unsafe.

I have a couple of small questions:
Can this patch also read the current GUC value if it's modified by the
SET command, without editing the config file?
What happens if we modify the config file and introduce mistakes? (When
we try to start the cluster with an erroneous config file it fails to
start; I'm not sure what happens when the config is re-read.)

Overall the proposal seems legit and useful.

Kind regards,
Pavel Borisov

#3Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Melanie Plageman (#1)
Re: Should vacuum process config file reload more often

Hi,

On Fri, Feb 24, 2023 at 7:08 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Hi,

Users may wish to speed up long-running vacuum of a large table by
decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
config file is only reloaded between tables (for autovacuum) or after
the statement (for explicit vacuum). This has been brought up for
autovacuum in [1].

Andres suggested that it might be possible to check ConfigReloadPending
in vacuum_delay_point(), so I thought I would draft a rough patch and
start a discussion.

In vacuum_delay_point(), we need to update VacuumCostActive too if necessary.
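
To illustrate the VacuumCostActive point, here is a minimal standalone sketch. It assumes cost-based delay counts as active exactly when the delay is positive, which is how vacuum() initializes VacuumCostActive. CostState and apply_reloaded_delay() are hypothetical names used only for illustration, not PostgreSQL symbols.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the backend globals discussed above. */
typedef struct CostState
{
    double cost_delay_ms;   /* mirrors VacuumCostDelay */
    bool   cost_active;     /* mirrors VacuumCostActive */
} CostState;

/*
 * Apply a freshly reloaded delay value and re-derive the "active" flag,
 * so that a change from zero to non-zero (or back) is not missed.
 */
static void
apply_reloaded_delay(CostState *st, double new_delay_ms)
{
    st->cost_delay_ms = new_delay_ms;
    st->cost_active = (new_delay_ms > 0);
}
```

Without the second assignment, a reload that sets the delay to 0 would leave the flag stale and the backend would keep accounting costs.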

Apart from this, one higher level question I have is if there are other
gucs whose modification would make reloading the configuration file
during vacuum/analyze unsafe.

As far as I know there are not such GUC parameters in the core but
there might be in third-party table AM and index AM extensions. Also,
I'm concerned that allowing to change any GUC parameters during
vacuum/analyze could be a foot-gun in the future. When modifying
vacuum/analyze-related codes, we have to consider the case where any
GUC parameters could be changed during vacuum/analyze. I guess it
would be better to apply the parameter changes for only vacuum delay
related parameters. For example, autovacuum launcher advertises the
values of the vacuum delay parameters on the shared memory not only
for autovacuum processes but also for manual vacuum/analyze processes.
Both processes can update them accordingly in vacuum_delay_point().

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#4Andres Freund
andres@anarazel.de
In reply to: Masahiko Sawada (#3)
Re: Should vacuum process config file reload more often

Hi,

On 2023-02-27 23:11:53 +0900, Masahiko Sawada wrote:

As far as I know there are not such GUC parameters in the core but
there might be in third-party table AM and index AM extensions.

We already reload in a pretty broad range of situations, so I'm not sure
there's a lot that could be unsafe that isn't already.

Also, I'm concerned that allowing to change any GUC parameters during
vacuum/analyze could be a foot-gun in the future. When modifying
vacuum/analyze-related codes, we have to consider the case where any GUC
parameters could be changed during vacuum/analyze.

What kind of scenario are you thinking of?

I guess it would be better to apply the parameter changes for only vacuum
delay related parameters. For example, autovacuum launcher advertises the
values of the vacuum delay parameters on the shared memory not only for
autovacuum processes but also for manual vacuum/analyze processes. Both
processes can update them accordingly in vacuum_delay_point().

I don't think this is a good idea. It'd introduce a fair amount of complexity
without, as far as I can tell, a benefit.

Greetings,

Andres Freund

#5Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Andres Freund (#4)
Re: Should vacuum process config file reload more often

On Tue, Feb 28, 2023 at 10:21 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2023-02-27 23:11:53 +0900, Masahiko Sawada wrote:

As far as I know there are not such GUC parameters in the core but
there might be in third-party table AM and index AM extensions.

We already reload in a pretty broad range of situations, so I'm not sure
there's a lot that could be unsafe that isn't already.

Also, I'm concerned that allowing to change any GUC parameters during
vacuum/analyze could be a foot-gun in the future. When modifying
vacuum/analyze-related codes, we have to consider the case where any GUC
parameters could be changed during vacuum/analyze.

What kind of scenario are you thinking of?

For example, I guess we will need to take care of changes of
maintenance_work_mem. Currently we initialize the dead tuple space at
the beginning of lazy vacuum, but perhaps we would need to
enlarge/shrink it based on the new value?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#6Melanie Plageman
melanieplageman@gmail.com
In reply to: Pavel Borisov (#2)
Re: Should vacuum process config file reload more often

Thanks for the feedback and questions, Pavel!

On Fri, Feb 24, 2023 at 3:43 AM Pavel Borisov <pashkin.elfe@gmail.com> wrote:

I have a couple of small questions:
Can this patch also read the current GUC value if it's modified by the
SET command, without editing the config file?

If a user sets a guc like vacuum_cost_limit with SET, this only modifies
the value for that session. That wouldn't affect the in-progress vacuum
you initiated from that session because you would have to wait for the
vacuum to complete before issuing the SET command.

What happens if we modify the config file and introduce mistakes? (When
we try to start the cluster with an erroneous config file it fails to
start; I'm not sure what happens when the config is re-read.)

If you manually add an invalid value to your postgresql.conf, when it is
reloaded, the existing value will remain unchanged and an error will be
logged. If you attempt to change the guc value to an invalid value with
ALTER SYSTEM, the ALTER SYSTEM command will fail and the existing value
will remain unchanged.
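
The "keep the old value and log an error" behavior can be sketched in isolation as follows. This is only an illustration of the pattern, not the GUC machinery: Setting and reload_setting() are hypothetical, and the range is supplied by the caller rather than taken from any real parameter definition.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical setting with an allowed range; not a real GUC definition. */
typedef struct Setting
{
    const char *name;
    double      value;
    double      min;
    double      max;
} Setting;

/*
 * Try to apply a new textual value. On a parse or range error, log a
 * message and keep the old value. Returns true if the value was accepted.
 */
static bool
reload_setting(Setting *s, const char *text)
{
    char   *end;
    double  parsed;

    errno = 0;
    parsed = strtod(text, &end);

    /* The negated range test also rejects NaN. */
    if (errno != 0 || end == text || *end != '\0' ||
        !(parsed >= s->min && parsed <= s->max))
    {
        fprintf(stderr, "invalid value for \"%s\": \"%s\"; keeping %g\n",
                s->name, text, s->value);
        return false;
    }

    s->value = parsed;
    return true;
}
```

The value is only assigned after every check has passed, so a bad entry can never leave the setting half-updated.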

- Melanie

#7Andres Freund
andres@anarazel.de
In reply to: Masahiko Sawada (#5)
Re: Should vacuum process config file reload more often

Hi,

On 2023-02-28 11:16:45 +0900, Masahiko Sawada wrote:

On Tue, Feb 28, 2023 at 10:21 AM Andres Freund <andres@anarazel.de> wrote:

On 2023-02-27 23:11:53 +0900, Masahiko Sawada wrote:

As far as I know there are not such GUC parameters in the core but
there might be in third-party table AM and index AM extensions.

We already reload in a pretty broad range of situations, so I'm not sure
there's a lot that could be unsafe that isn't already.

Also, I'm concerned that allowing to change any GUC parameters during
vacuum/analyze could be a foot-gun in the future. When modifying
vacuum/analyze-related codes, we have to consider the case where any GUC
parameters could be changed during vacuum/analyze.

What kind of scenario are you thinking of?

For example, I guess we will need to take care of changes of
maintenance_work_mem. Currently we initialize the dead tuple space at
the beginning of lazy vacuum, but perhaps we would need to
enlarge/shrink it based on the new value?

I don't think we need to do anything about that initially, just because the
config can be changed in a more granular way, doesn't mean we have to react to
every change for the current operation.

Greetings,

Andres Freund

#8Melanie Plageman
melanieplageman@gmail.com
In reply to: Masahiko Sawada (#3)
Re: Should vacuum process config file reload more often

On Mon, Feb 27, 2023 at 9:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Feb 24, 2023 at 7:08 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Users may wish to speed up long-running vacuum of a large table by
decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
config file is only reloaded between tables (for autovacuum) or after
the statement (for explicit vacuum). This has been brought up for
autovacuum in [1].

Andres suggested that it might be possible to check ConfigReloadPending
in vacuum_delay_point(), so I thought I would draft a rough patch and
start a discussion.

In vacuum_delay_point(), we need to update VacuumCostActive too if necessary.

Yes, good point. Thank you!

On Thu, Feb 23, 2023 at 5:08 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

I don't think I can set and leave vac_in_outer_xact the way I am doing
it in this patch, since I use vac_in_outer_xact in vacuum_delay_point(),
which I believe is reachable from codepaths that would not have called
vacuum(). It seems that if a backend sets it, the outer transaction
commits, and then the backend ends up calling vacuum_delay_point() in a
different way later, it wouldn't be quite right.

Perhaps I could just set vac_in_outer_xact to false in the PG_FINALLY()
section in vacuum() to avoid this problem.

On Wed, Mar 1, 2023 at 7:15 PM Andres Freund <andres@anarazel.de> wrote:

On 2023-02-28 11:16:45 +0900, Masahiko Sawada wrote:

On Tue, Feb 28, 2023 at 10:21 AM Andres Freund <andres@anarazel.de> wrote:

On 2023-02-27 23:11:53 +0900, Masahiko Sawada wrote:

Also, I'm concerned that allowing to change any GUC parameters during
vacuum/analyze could be a foot-gun in the future. When modifying
vacuum/analyze-related codes, we have to consider the case where any GUC
parameters could be changed during vacuum/analyze.

What kind of scenario are you thinking of?

For example, I guess we will need to take care of changes of
maintenance_work_mem. Currently we initialize the dead tuple space at
the beginning of lazy vacuum, but perhaps we would need to
enlarge/shrink it based on the new value?

I don't think we need to do anything about that initially, just because the
config can be changed in a more granular way, doesn't mean we have to react to
every change for the current operation.

Perhaps we can mention in the docs that a change to maintenance_work_mem
will not take effect in the middle of vacuuming a table. But I think it
probably isn't needed.

On another topic, I've just realized that when autovacuuming we only
update tab->at_vacuum_cost_delay/limit from
autovacuum_vacuum_cost_delay/limit for each table (in
table_recheck_autovac()) and then use that to update
MyWorkerInfo->wi_cost_delay/limit. MyWorkerInfo->wi_cost_delay/limit is
what is used to update VacuumCostDelay/Limit in AutoVacuumUpdateDelay().
So, even if we reload the config file in vacuum_delay_point(), if we
don't use the new value of autovacuum_vacuum_cost_delay/limit it will
have no effect for autovacuum.

I started writing a little helper that could be used to update these
workerinfo->wi_cost_delay/limit in vacuum_delay_point(), but I notice
when they are first set, we consider the autovacuum table options. So,
I suppose I would need to consider these when updating
wi_cost_delay/limit later as well? (during vacuum_delay_point() or
in AutoVacuumUpdateDelay())

I wasn't quite sure because I found these chained ternaries rather
difficult to interpret, but I think table_recheck_autovac() is saying
that the autovacuum table options override all other values for
vac_cost_delay?

    vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
        ? avopts->vacuum_cost_delay
        : (autovacuum_vac_cost_delay >= 0)
        ? autovacuum_vac_cost_delay
        : VacuumCostDelay;

i.e. this?

    if (avopts && avopts->vacuum_cost_delay >= 0)
        vac_cost_delay = avopts->vacuum_cost_delay;
    else if (autovacuum_vac_cost_delay >= 0)
        vac_cost_delay = autovacuum_vac_cost_delay;
    else
        vac_cost_delay = VacuumCostDelay;
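
One way to check that the two forms above agree is to write both as small standalone functions and compare them on a few inputs. The sketch below does that; TableOpts is a hypothetical stand-in for the per-table autovacuum options, and the two globals are passed as plain parameters so the code compiles outside the backend.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-table autovacuum reloptions. */
typedef struct TableOpts
{
    double vacuum_cost_delay;   /* negative means "not set" */
} TableOpts;

/* Chained-ternary form; ?: is right-associative, so this nests as a ? b : (c ? d : e). */
static double
delay_ternary(const TableOpts *avopts, double autovac_delay, double vacuum_delay)
{
    return (avopts && avopts->vacuum_cost_delay >= 0)
        ? avopts->vacuum_cost_delay
        : (autovac_delay >= 0)
        ? autovac_delay
        : vacuum_delay;
}

/* Equivalent if/else form: table option first, then autovacuum GUC, then vacuum GUC. */
static double
delay_if_else(const TableOpts *avopts, double autovac_delay, double vacuum_delay)
{
    if (avopts && avopts->vacuum_cost_delay >= 0)
        return avopts->vacuum_cost_delay;
    else if (autovac_delay >= 0)
        return autovac_delay;
    else
        return vacuum_delay;
}
```

Because the conditional operator associates to the right, the table option wins whenever it is set (including a value of 0), and the plain vacuum delay is used only when both of the others are negative.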

- Melanie

#9Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#7)
Re: Should vacuum process config file reload more often

On Thu, Mar 2, 2023 at 5:45 AM Andres Freund <andres@anarazel.de> wrote:

On 2023-02-28 11:16:45 +0900, Masahiko Sawada wrote:

On Tue, Feb 28, 2023 at 10:21 AM Andres Freund <andres@anarazel.de> wrote:

On 2023-02-27 23:11:53 +0900, Masahiko Sawada wrote:

As far as I know there are not such GUC parameters in the core but
there might be in third-party table AM and index AM extensions.

We already reload in a pretty broad range of situations, so I'm not sure
there's a lot that could be unsafe that isn't already.

Also, I'm concerned that allowing to change any GUC parameters during
vacuum/analyze could be a foot-gun in the future. When modifying
vacuum/analyze-related codes, we have to consider the case where any GUC
parameters could be changed during vacuum/analyze.

What kind of scenario are you thinking of?

For example, I guess we will need to take care of changes of
maintenance_work_mem. Currently we initialize the dead tuple space at
the beginning of lazy vacuum, but perhaps we would need to
enlarge/shrink it based on the new value?

I don't think we need to do anything about that initially, just because the
config can be changed in a more granular way, doesn't mean we have to react to
every change for the current operation.

+1. I also don't see the need to do anything for this case.

--
With Regards,
Amit Kapila.

#10Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Melanie Plageman (#8)
Re: Should vacuum process config file reload more often

On Thu, Mar 2, 2023 at 10:41 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Mon, Feb 27, 2023 at 9:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Feb 24, 2023 at 7:08 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Users may wish to speed up long-running vacuum of a large table by
decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
config file is only reloaded between tables (for autovacuum) or after
the statement (for explicit vacuum). This has been brought up for
autovacuum in [1].

Andres suggested that it might be possible to check ConfigReloadPending
in vacuum_delay_point(), so I thought I would draft a rough patch and
start a discussion.

In vacuum_delay_point(), we need to update VacuumCostActive too if necessary.

Yes, good point. Thank you!

On Thu, Feb 23, 2023 at 5:08 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

I don't think I can set and leave vac_in_outer_xact the way I am doing
it in this patch, since I use vac_in_outer_xact in vacuum_delay_point(),
which I believe is reachable from codepaths that would not have called
vacuum(). It seems that if a backend sets it, the outer transaction
commits, and then the backend ends up calling vacuum_delay_point() in a
different way later, it wouldn't be quite right.

Perhaps I could just set vac_in_outer_xact to false in the PG_FINALLY()
section in vacuum() to avoid this problem.

On Wed, Mar 1, 2023 at 7:15 PM Andres Freund <andres@anarazel.de> wrote:

On 2023-02-28 11:16:45 +0900, Masahiko Sawada wrote:

On Tue, Feb 28, 2023 at 10:21 AM Andres Freund <andres@anarazel.de> wrote:

On 2023-02-27 23:11:53 +0900, Masahiko Sawada wrote:

Also, I'm concerned that allowing to change any GUC parameters during
vacuum/analyze could be a foot-gun in the future. When modifying
vacuum/analyze-related codes, we have to consider the case where any GUC
parameters could be changed during vacuum/analyze.

What kind of scenario are you thinking of?

For example, I guess we will need to take care of changes of
maintenance_work_mem. Currently we initialize the dead tuple space at
the beginning of lazy vacuum, but perhaps we would need to
enlarge/shrink it based on the new value?

I don't think we need to do anything about that initially, just because the
config can be changed in a more granular way, doesn't mean we have to react to
every change for the current operation.

Perhaps we can mention in the docs that a change to maintenance_work_mem
will not take effect in the middle of vacuuming a table. But I think it
probably isn't needed.

Agreed.

On another topic, I've just realized that when autovacuuming we only
update tab->at_vacuum_cost_delay/limit from
autovacuum_vacuum_cost_delay/limit for each table (in
table_recheck_autovac()) and then use that to update
MyWorkerInfo->wi_cost_delay/limit. MyWorkerInfo->wi_cost_delay/limit is
what is used to update VacuumCostDelay/Limit in AutoVacuumUpdateDelay().
So, even if we reload the config file in vacuum_delay_point(), if we
don't use the new value of autovacuum_vacuum_cost_delay/limit it will
have no effect for autovacuum.

Right, but IIUC wi_cost_limit (and VacuumCostLimit) might be
updated. After the autovacuum launcher reloads the config file, it
calls autovac_balance_cost(), which updates that value for active
workers. I'm not sure why we don't update workers' wi_cost_delay,
though.

I started writing a little helper that could be used to update these
workerinfo->wi_cost_delay/limit in vacuum_delay_point(),

Since we set vacuum delay parameters for autovacuum workers so that we
ration out I/O equally, I think we should keep the current mechanism
that the autovacuum launcher sets workers' delay parameters and they
update accordingly.

but I notice
when they are first set, we consider the autovacuum table options. So,
I suppose I would need to consider these when updating
wi_cost_delay/limit later as well? (during vacuum_delay_point() or
in AutoVacuumUpdateDelay())

I wasn't quite sure because I found these chained ternaries rather
difficult to interpret, but I think table_recheck_autovac() is saying
that the autovacuum table options override all other values for
vac_cost_delay?

    vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
        ? avopts->vacuum_cost_delay
        : (autovacuum_vac_cost_delay >= 0)
        ? autovacuum_vac_cost_delay
        : VacuumCostDelay;

i.e. this?

    if (avopts && avopts->vacuum_cost_delay >= 0)
        vac_cost_delay = avopts->vacuum_cost_delay;
    else if (autovacuum_vac_cost_delay >= 0)
        vac_cost_delay = autovacuum_vac_cost_delay;
    else
        vac_cost_delay = VacuumCostDelay;

Yes, if the table has autovacuum table options, we use these values
and the table is excluded from the balancing algorithm I mentioned
above. See the code from table_recheck_autovac(),

    /*
     * If any of the cost delay parameters has been set individually for
     * this table, disable the balancing algorithm.
     */
    tab->at_dobalance =
        !(avopts && (avopts->vacuum_cost_limit > 0 ||
                     avopts->vacuum_cost_delay > 0));

So if the table has autovacuum table options, the vacuum delay
parameters probably should be updated by ALTER TABLE, not by reloading
the config file.
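
The balancing test quoted above can be pulled out into a standalone predicate to see which tables it excludes. This is only a sketch: TableOpts and table_is_balanced() are hypothetical names, and the fields simply mirror the two options used in the quoted expression.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-table autovacuum reloptions. */
typedef struct TableOpts
{
    int    vacuum_cost_limit;   /* <= 0 means "not set" for this test */
    double vacuum_cost_delay;   /* <= 0 means "not set" for this test */
} TableOpts;

/*
 * Same expression as the at_dobalance assignment: a table takes part in
 * balancing unless one of its own cost options is set to a positive value.
 */
static bool
table_is_balanced(const TableOpts *avopts)
{
    return !(avopts && (avopts->vacuum_cost_limit > 0 ||
                        avopts->vacuum_cost_delay > 0));
}
```

Note that, taken literally, this predicate leaves balancing enabled for a table whose option is vacuum_cost_delay = 0, while the >= 0 test quoted earlier would still select that 0 as the delay. Whether that difference matters is not settled in this thread.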

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#11Melanie Plageman
melanieplageman@gmail.com
In reply to: Masahiko Sawada (#10)
Re: Should vacuum process config file reload more often

On Thu, Mar 2, 2023 at 2:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Mar 2, 2023 at 10:41 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On another topic, I've just realized that when autovacuuming we only
update tab->at_vacuum_cost_delay/limit from
autovacuum_vacuum_cost_delay/limit for each table (in
table_recheck_autovac()) and then use that to update
MyWorkerInfo->wi_cost_delay/limit. MyWorkerInfo->wi_cost_delay/limit is
what is used to update VacuumCostDelay/Limit in AutoVacuumUpdateDelay().
So, even if we reload the config file in vacuum_delay_point(), if we
don't use the new value of autovacuum_vacuum_cost_delay/limit it will
have no effect for autovacuum.

Right, but IIUC wi_cost_limit (and VacuumCostLimit) might be
updated. After the autovacuum launcher reloads the config file, it
calls autovac_balance_cost(), which updates that value for active
workers. I'm not sure why we don't update workers' wi_cost_delay,
though.

Ah yes, I didn't realize this. Thanks. I went back and did more code
reading/analysis, and I see no reason why we shouldn't update
worker->wi_cost_delay to the new value of autovacuum_vac_cost_delay in
autovac_balance_cost(). Then, as you said, the autovac launcher will
call autovac_balance_cost() when it reloads the configuration file.
Then, the next time the autovac worker calls AutoVacuumUpdateDelay(), it
will update VacuumCostDelay.
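
The flow described in this paragraph (launcher publishes the new delay, worker adopts it at its next update) might look roughly like the sketch below. It is a single-process illustration under several assumptions: WorkerSlot, launcher_publish_delay() and worker_current_delay() are hypothetical, workers whose table has its own cost options are skipped in the same way the balancing code skips them, and the locking that real shared memory access would need is omitted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one worker's shared-memory slot. */
typedef struct WorkerSlot
{
    bool   in_use;          /* a worker is currently attached */
    bool   dobalance;       /* false if table options override the GUCs */
    double wi_cost_delay;   /* delay the worker should use */
} WorkerSlot;

/*
 * Launcher side: after a config reload, push the new delay to every
 * active worker that is not using per-table cost options.
 */
static void
launcher_publish_delay(WorkerSlot *slots, size_t nslots, double new_delay)
{
    for (size_t i = 0; i < nslots; i++)
    {
        if (slots[i].in_use && slots[i].dobalance)
            slots[i].wi_cost_delay = new_delay;
    }
}

/* Worker side: adopt whatever delay its slot currently advertises. */
static double
worker_current_delay(const WorkerSlot *my_slot)
{
    return my_slot->wi_cost_delay;
}
```

In this arrangement the worker never re-reads the config file itself; it only sees a new delay once the launcher has written it into the slot.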

I started writing a little helper that could be used to update these
workerinfo->wi_cost_delay/limit in vacuum_delay_point(),

Since we set vacuum delay parameters for autovacuum workers so that we
ration out I/O equally, I think we should keep the current mechanism
that the autovacuum launcher sets workers' delay parameters and they
update accordingly.

Yes, agreed, it should go in the same place as where we update
wi_cost_limit (autovac_balance_cost()). I think we should potentially
rename autovac_balance_cost() because its name and all its comments
point to its only purpose being to balance the total of the workers
wi_cost_limits to no more than autovacuum_vacuum_cost_limit. And the
autovacuum_vacuum_cost_delay doesn't need to be balanced in this way.

Though, since this change on its own would make autovacuum pick up new
values of autovacuum_vacuum_cost_limit (without having the worker reload
the config file), I wonder if it makes sense to try and have
vacuum_delay_point() only reload the config file if it is an explicit
vacuum or an analyze not being run in an outer transaction (to avoid
overhead of reloading config file)?

The lifecycle of these different vacuum delay-related gucs and how it
differs between autovacuum workers and explicit vacuum is quite tangled
already, though.

but I notice
when they are first set, we consider the autovacuum table options. So,
I suppose I would need to consider these when updating
wi_cost_delay/limit later as well? (during vacuum_delay_point() or
in AutoVacuumUpdateDelay())

I wasn't quite sure because I found these chained ternaries rather
difficult to interpret, but I think table_recheck_autovac() is saying
that the autovacuum table options override all other values for
vac_cost_delay?

    vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
        ? avopts->vacuum_cost_delay
        : (autovacuum_vac_cost_delay >= 0)
        ? autovacuum_vac_cost_delay
        : VacuumCostDelay;

i.e. this?

    if (avopts && avopts->vacuum_cost_delay >= 0)
        vac_cost_delay = avopts->vacuum_cost_delay;
    else if (autovacuum_vac_cost_delay >= 0)
        vac_cost_delay = autovacuum_vac_cost_delay;
    else
        vac_cost_delay = VacuumCostDelay;

Yes, if the table has autovacuum table options, we use these values
and the table is excluded from the balancing algorithm I mentioned
above. See the code from table_recheck_autovac(),

    /*
     * If any of the cost delay parameters has been set individually for
     * this table, disable the balancing algorithm.
     */
    tab->at_dobalance =
        !(avopts && (avopts->vacuum_cost_limit > 0 ||
                     avopts->vacuum_cost_delay > 0));

So if the table has autovacuum table options, the vacuum delay
parameters probably should be updated by ALTER TABLE, not by reloading
the config file.

Yes, if the table has autovacuum table options, I think the user is
out-of-luck until the relation is done being vacuumed because the ALTER
TABLE will need to get a lock.

- Melanie

#12Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#11)
1 attachment(s)
Re: Should vacuum process config file reload more often

On Thu, Mar 2, 2023 at 6:37 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Thu, Mar 2, 2023 at 2:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Mar 2, 2023 at 10:41 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On another topic, I've just realized that when autovacuuming we only
update tab->at_vacuum_cost_delay/limit from
autovacuum_vacuum_cost_delay/limit for each table (in
table_recheck_autovac()) and then use that to update
MyWorkerInfo->wi_cost_delay/limit. MyWorkerInfo->wi_cost_delay/limit is
what is used to update VacuumCostDelay/Limit in AutoVacuumUpdateDelay().
So, even if we reload the config file in vacuum_delay_point(), if we
don't use the new value of autovacuum_vacuum_cost_delay/limit it will
have no effect for autovacuum.

Right, but IIUC wi_cost_limit (and VacuumCostLimit) might be
updated. After the autovacuum launcher reloads the config file, it
calls autovac_balance_cost(), which updates that value for active
workers. I'm not sure why we don't update workers' wi_cost_delay,
though.

Ah yes, I didn't realize this. Thanks. I went back and did more code
reading/analysis, and I see no reason why we shouldn't update
worker->wi_cost_delay to the new value of autovacuum_vac_cost_delay in
autovac_balance_cost(). Then, as you said, the autovac launcher will
call autovac_balance_cost() when it reloads the configuration file.
Then, the next time the autovac worker calls AutoVacuumUpdateDelay(), it
will update VacuumCostDelay.

I started writing a little helper that could be used to update these
workerinfo->wi_cost_delay/limit in vacuum_delay_point(),

Since we set vacuum delay parameters for autovacuum workers so that we
ration out I/O equally, I think we should keep the current mechanism
that the autovacuum launcher sets workers' delay parameters and they
update accordingly.

Yes, agreed, it should go in the same place as where we update
wi_cost_limit (autovac_balance_cost()). I think we should potentially
rename autovac_balance_cost() because its name and all its comments
point to its only purpose being to balance the total of the workers
wi_cost_limits to no more than autovacuum_vacuum_cost_limit. And the
autovacuum_vacuum_cost_delay doesn't need to be balanced in this way.

Though, since this change on its own would make autovacuum pick up new
values of autovacuum_vacuum_cost_limit (without having the worker reload
the config file), I wonder if it makes sense to try and have
vacuum_delay_point() only reload the config file if it is an explicit
vacuum or an analyze not being run in an outer transaction (to avoid
overhead of reloading config file)?

The lifecycle of these different vacuum delay-related gucs and how it
differs between autovacuum workers and explicit vacuum is quite tangled
already, though.

So, I've attached a new version of the patch which is quite different
from the previous versions.

In this version I've removed wi_cost_delay from WorkerInfoData. There is
no synchronization of cost_delay amongst workers, so there is no reason
to keep it in shared memory.

One consequence of not updating VacuumCostDelay from wi_cost_delay is
that we have to have a way to keep track of whether or not autovacuum
table options are in use.

This patch does this in a cringeworthy way. I added two global
variables, one to track whether or not cost delay table options are in
use and the other to store the value of the table option cost delay. I
didn't want to use a single variable with a special value to indicate
that table option cost delay is in use because
autovacuum_vacuum_cost_delay already has special values that mean
certain things. My code needs a better solution.
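
For illustration only (this is not part of the attached patch): one way to avoid both a sentinel value and two loose globals would be to bundle the flag and the value in one small struct. The names TableCostDelay, table_cost_delay_from_reloption() and effective_cost_delay() are hypothetical. The precedence (table option first, then autovacuum_vacuum_cost_delay, then vacuum_cost_delay) is assumed from the existing expression in table_recheck_autovac().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: keeps "is a table option in use?" next to its value. */
typedef struct TableCostDelay
{
    bool        is_set;     /* true if the table's reloptions set a delay */
    double      value;      /* only meaningful when is_set is true */
} TableCostDelay;

/* A reloption value of -1 (or no reloptions at all) means "not set". */
static TableCostDelay
table_cost_delay_from_reloption(bool has_reloptions, double reloption_delay)
{
    TableCostDelay d = {false, 0.0};

    if (has_reloptions && reloption_delay >= 0)
    {
        d.is_set = true;
        d.value = reloption_delay;
    }
    return d;
}

/* Table option wins, then the autovacuum GUC (-1 = unset), then plain vacuum. */
static double
effective_cost_delay(TableCostDelay table, double autovac_guc, double vacuum_guc)
{
    assert(vacuum_guc >= 0);    /* vacuum_cost_delay has no "unset" value */

    if (table.is_set)
        return table.value;
    if (autovac_guc >= 0)
        return autovac_guc;
    return vacuum_guc;
}
```

With this shape, a table option of 0 (delay disabled for that table) stays distinguishable from "no table option", which is the case a single variable with a special value would struggle to express.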

It is worth mentioning that I think that in master,
AutoVacuumUpdateDelay() was incorrectly reading wi_cost_limit and
wi_cost_delay from shared memory without holding a lock.

I've added in a shared lock for reading from wi_cost_limit in this
patch. However, AutoVacuumUpdateLimit() is called unconditionally in
vacuum_delay_point(), which is called quite often (per block-ish), so I
was trying to think if there is a way we could avoid having to check
this shared memory variable on every call to vacuum_delay_point().
Rebalances shouldn't happen very often (done by the launcher when a new
worker is launched and by workers between vacuuming tables). Maybe we
can read from it less frequently?

Also not sure how the patch interacts with failsafe autovac and parallel
vacuum.

- Melanie

Attachments:

v2-0001-Reload-config-file-more-often-while-vacuuming.patch (text/x-patch; charset=US-ASCII)
From 9b5cbbc0c8f892dde3e220f0945b2c1e0d175b84 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sun, 5 Mar 2023 14:39:16 -0500
Subject: [PATCH v2] Reload config file more often while vacuuming

---
 src/backend/commands/vacuum.c       | 38 ++++++++---
 src/backend/postmaster/autovacuum.c | 97 ++++++++++++++++++++++-------
 src/include/postmaster/autovacuum.h |  2 +
 3 files changed, 104 insertions(+), 33 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index aa79d9de4d..f6cea30168 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -48,6 +48,7 @@
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/interrupt.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -75,6 +76,7 @@ int			vacuum_multixact_failsafe_age;
 /* A few variables that don't seem worth passing around as parameters */
 static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
+static bool analyze_in_outer_xact = false;
 
 
 /*
@@ -309,8 +311,7 @@ vacuum(List *relations, VacuumParams *params,
 	static bool in_vacuum = false;
 
 	const char *stmttype;
-	volatile bool in_outer_xact,
-				use_own_xacts;
+	volatile bool use_own_xacts;
 
 	Assert(params != NULL);
 
@@ -327,10 +328,10 @@ vacuum(List *relations, VacuumParams *params,
 	if (params->options & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
-		in_outer_xact = false;
+		analyze_in_outer_xact = false;
 	}
 	else
-		in_outer_xact = IsInTransactionBlock(isTopLevel);
+		analyze_in_outer_xact = IsInTransactionBlock(isTopLevel);
 
 	/*
 	 * Due to static variables vac_context, anl_context and vac_strategy,
@@ -451,7 +452,7 @@ vacuum(List *relations, VacuumParams *params,
 		Assert(params->options & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
-		else if (in_outer_xact)
+		else if (analyze_in_outer_xact)
 			use_own_xacts = false;
 		else if (list_length(relations) > 1)
 			use_own_xacts = true;
@@ -469,7 +470,7 @@ vacuum(List *relations, VacuumParams *params,
 	 */
 	if (use_own_xacts)
 	{
-		Assert(!in_outer_xact);
+		Assert(!analyze_in_outer_xact);
 
 		/* ActiveSnapshot is not set by autovacuum */
 		if (ActiveSnapshotSet())
@@ -521,7 +522,7 @@ vacuum(List *relations, VacuumParams *params,
 				}
 
 				analyze_rel(vrel->oid, vrel->relation, params,
-							vrel->va_cols, in_outer_xact, vac_strategy);
+							vrel->va_cols, analyze_in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
 				{
@@ -544,6 +545,7 @@ vacuum(List *relations, VacuumParams *params,
 	{
 		in_vacuum = false;
 		VacuumCostActive = false;
+		analyze_in_outer_xact = false;
 	}
 	PG_END_TRY();
 
@@ -2214,10 +2216,28 @@ vacuum_delay_point(void)
 						 WAIT_EVENT_VACUUM_DELAY);
 		ResetLatch(MyLatch);
 
+		/*
+		 * Reload the configuration file if requested. This allows changes to
+		 * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay
+		 * to take effect while a table is being vacuumed or analyzed.
+		 */
+		if (ConfigReloadPending && !analyze_in_outer_xact)
+		{
+			ConfigReloadPending = false;
+			ProcessConfigFile(PGC_SIGHUP);
+			AutoVacuumUpdateDelay();
+		}
+
 		VacuumCostBalance = 0;
 
-		/* update balance values for workers */
-		AutoVacuumUpdateDelay();
+		/*
+		 * Update balance values for workers. We must always do this in case
+		 * the autovacuum launcher has done a rebalance (as it does when
+		 * launching a new worker).
+		 */
+		AutoVacuumUpdateLimit();
+
+		VacuumCostActive = (VacuumCostDelay > 0);
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index ff6149a179..78b4233241 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -139,6 +139,9 @@ int			Log_autovacuum_min_duration = 600000;
 static bool am_autovacuum_launcher = false;
 static bool am_autovacuum_worker = false;
 
+static bool av_use_table_option_cost_delay = false;
+static double av_table_option_cost_delay = 0;
+
 /* Flags set by signal handlers */
 static volatile sig_atomic_t got_SIGUSR2 = false;
 
@@ -189,7 +192,6 @@ typedef struct autovac_table
 {
 	Oid			at_relid;
 	VacuumParams at_params;
-	double		at_vacuum_cost_delay;
 	int			at_vacuum_cost_limit;
 	bool		at_dobalance;
 	bool		at_sharedrel;
@@ -225,7 +227,6 @@ typedef struct WorkerInfoData
 	TimestampTz wi_launchtime;
 	bool		wi_dobalance;
 	bool		wi_sharedrel;
-	double		wi_cost_delay;
 	int			wi_cost_limit;
 	int			wi_cost_limit_base;
 } WorkerInfoData;
@@ -1756,7 +1757,6 @@ FreeWorkerInfo(int code, Datum arg)
 		MyWorkerInfo->wi_proc = NULL;
 		MyWorkerInfo->wi_launchtime = 0;
 		MyWorkerInfo->wi_dobalance = false;
-		MyWorkerInfo->wi_cost_delay = 0;
 		MyWorkerInfo->wi_cost_limit = 0;
 		MyWorkerInfo->wi_cost_limit_base = 0;
 		dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
@@ -1780,11 +1780,37 @@ FreeWorkerInfo(int code, Datum arg)
 void
 AutoVacuumUpdateDelay(void)
 {
-	if (MyWorkerInfo)
+	/*
+	 * We are using autovacuum-related GUCs to update VacuumCostDelay, so we
+	 * only want autovacuum workers and autovacuum launcher to do this.
+	 */
+	if (!(am_autovacuum_worker || am_autovacuum_launcher))
+		return;
+
+	if (av_use_table_option_cost_delay)
 	{
-		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
-		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+		VacuumCostDelay = av_table_option_cost_delay;
 	}
+	else
+	{
+		VacuumCostDelay = autovacuum_vac_cost_delay >= 0 ?
+			autovacuum_vac_cost_delay : VacuumCostDelay;
+	}
+}
+
+/*
+ * Helper for vacuum_delay_point() to allow workers to read their
+ * wi_cost_limit.
+ */
+void
+AutoVacuumUpdateLimit(void)
+{
+	if (!MyWorkerInfo)
+		return;
+
+	LWLockAcquire(AutovacuumLock, LW_SHARED);
+	VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+	LWLockRelease(AutovacuumLock);
 }
 
 /*
@@ -1824,9 +1850,9 @@ autovac_balance_cost(void)
 
 		if (worker->wi_proc != NULL &&
 			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
+			worker->wi_cost_limit_base > 0 && vac_cost_delay > 0)
 			cost_total +=
-				(double) worker->wi_cost_limit_base / worker->wi_cost_delay;
+				(double) worker->wi_cost_limit_base / vac_cost_delay;
 	}
 
 	/* there are no cost limits -- nothing to do */
@@ -1844,7 +1870,7 @@ autovac_balance_cost(void)
 
 		if (worker->wi_proc != NULL &&
 			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
+			worker->wi_cost_limit_base > 0 && vac_cost_delay > 0)
 		{
 			int			limit = (int)
 			(cost_avail * worker->wi_cost_limit_base / cost_total);
@@ -1861,11 +1887,10 @@ autovac_balance_cost(void)
 		}
 
 		if (worker->wi_proc != NULL)
-			elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
+			elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d)",
 				 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
 				 worker->wi_dobalance ? "yes" : "no",
-				 worker->wi_cost_limit, worker->wi_cost_limit_base,
-				 worker->wi_cost_delay);
+				 worker->wi_cost_limit, worker->wi_cost_limit_base);
 	}
 }
 
@@ -2326,6 +2351,15 @@ do_autovacuum(void)
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
 
+			/*
+			 * Autovacuum workers should always update VacuumCostDelay and
+			 * VacuumCostLimit in case they were overridden by the reload.
+			 */
+			AutoVacuumUpdateDelay();
+			LWLockAcquire(AutovacuumLock, LW_SHARED);
+			VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+			LWLockRelease(AutovacuumLock);
+
 			/*
 			 * You might be tempted to bail out if we see autovacuum is now
 			 * disabled.  Must resist that temptation -- this might be a
@@ -2424,21 +2458,20 @@ do_autovacuum(void)
 		stdVacuumCostDelay = VacuumCostDelay;
 		stdVacuumCostLimit = VacuumCostLimit;
 
+		AutoVacuumUpdateDelay();
+
 		/* Must hold AutovacuumLock while mucking with cost balance info */
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 
 		/* advertise my cost delay parameters for the balancing algorithm */
 		MyWorkerInfo->wi_dobalance = tab->at_dobalance;
-		MyWorkerInfo->wi_cost_delay = tab->at_vacuum_cost_delay;
 		MyWorkerInfo->wi_cost_limit = tab->at_vacuum_cost_limit;
 		MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
+		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
 
 		/* do a balance */
 		autovac_balance_cost();
 
-		/* set the active cost parameters from the result of that */
-		AutoVacuumUpdateDelay();
-
 		/* done */
 		LWLockRelease(AutovacuumLock);
 
@@ -2569,6 +2602,11 @@ deleted:
 		{
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			AutoVacuumUpdateDelay();
+
+			LWLockAcquire(AutovacuumLock, LW_SHARED);
+			VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+			LWLockRelease(AutovacuumLock);
 		}
 
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
@@ -2771,7 +2809,11 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 	/* fetch the relation's relcache entry */
 	classTup = SearchSysCacheCopy1(RELOID, ObjectIdGetDatum(relid));
 	if (!HeapTupleIsValid(classTup))
+	{
+		av_use_table_option_cost_delay = false;
+		av_table_option_cost_delay = 0;
 		return NULL;
+	}
 	classForm = (Form_pg_class) GETSTRUCT(classTup);
 
 	/*
@@ -2802,7 +2844,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		int			multixact_freeze_min_age;
 		int			multixact_freeze_table_age;
 		int			vac_cost_limit;
-		double		vac_cost_delay;
 		int			log_min_duration;
 
 		/*
@@ -2812,12 +2853,16 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 * defaults, autovacuum's own first and plain vacuum second.
 		 */
 
-		/* -1 in autovac setting means use plain vacuum_cost_delay */
-		vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
-			? avopts->vacuum_cost_delay
-			: (autovacuum_vac_cost_delay >= 0)
-			? autovacuum_vac_cost_delay
-			: VacuumCostDelay;
+		if (avopts && avopts->vacuum_cost_delay >= 0)
+		{
+			av_use_table_option_cost_delay = true;
+			av_table_option_cost_delay = avopts->vacuum_cost_delay;
+		}
+		else
+		{
+			av_use_table_option_cost_delay = false;
+			av_table_option_cost_delay = 0;
+		}
 
 		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
 		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
@@ -2880,7 +2925,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_params.is_wraparound = wraparound;
 		tab->at_params.log_min_duration = log_min_duration;
 		tab->at_vacuum_cost_limit = vac_cost_limit;
-		tab->at_vacuum_cost_delay = vac_cost_delay;
 		tab->at_relname = NULL;
 		tab->at_nspname = NULL;
 		tab->at_datname = NULL;
@@ -2893,6 +2937,11 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			!(avopts && (avopts->vacuum_cost_limit > 0 ||
 						 avopts->vacuum_cost_delay > 0));
 	}
+	else
+	{
+		av_use_table_option_cost_delay = false;
+		av_table_option_cost_delay = 0;
+	}
 
 	heap_freetuple(classTup);
 	return tab;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index c140371b51..558358911c 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -66,6 +66,8 @@ extern void AutoVacWorkerFailed(void);
 /* autovacuum cost-delay balancer */
 extern void AutoVacuumUpdateDelay(void);
 
+extern void AutoVacuumUpdateLimit(void);
+
 #ifdef EXEC_BACKEND
 extern void AutoVacLauncherMain(int argc, char *argv[]) pg_attribute_noreturn();
 extern void AutoVacWorkerMain(int argc, char *argv[]) pg_attribute_noreturn();
-- 
2.37.2

#13Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Melanie Plageman (#12)
Re: Should vacuum process config file reload more often

On Mon, Mar 6, 2023 at 5:26 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Thu, Mar 2, 2023 at 6:37 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Thu, Mar 2, 2023 at 2:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Mar 2, 2023 at 10:41 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On another topic, I've just realized that when autovacuuming we only
update tab->at_vacuum_cost_delay/limit from
autovacuum_vacuum_cost_delay/limit for each table (in
table_recheck_autovac()) and then use that to update
MyWorkerInfo->wi_cost_delay/limit. MyWorkerInfo->wi_cost_delay/limit is
what is used to update VacuumCostDelay/Limit in AutoVacuumUpdateDelay().
So, even if we reload the config file in vacuum_delay_point(), if we
don't use the new value of autovacuum_vacuum_cost_delay/limit it will
have no effect for autovacuum.

Right, but IIUC wi_cost_limit (and VacuumCostLimit) might be
updated. After the autovacuum launcher reloads the config file, it
calls autovac_balance_cost(), which updates that value for active
workers. I'm not sure why we don't update workers' wi_cost_delay,
though.

Ah yes, I didn't realize this. Thanks. I went back and did more code
reading/analysis, and I see no reason why we shouldn't update
worker->wi_cost_delay to the new value of autovacuum_vac_cost_delay in
autovac_balance_cost(). Then, as you said, the autovac launcher will
call autovac_balance_cost() when it reloads the configuration file.
Then, the next time the autovac worker calls AutoVacuumUpdateDelay(), it
will update VacuumCostDelay.

I started writing a little helper that could be used to update these
workerinfo->wi_cost_delay/limit in vacuum_delay_point(),

Since we set vacuum delay parameters for autovacuum workers so that we
ration out I/O equally, I think we should keep the current mechanism
that the autovacuum launcher sets workers' delay parameters and they
update accordingly.

Yes, agreed, it should go in the same place as where we update
wi_cost_limit (autovac_balance_cost()). I think we should potentially
rename autovac_balance_cost() because its name and all its comments
suggest that its only purpose is to balance the total of the workers'
wi_cost_limits to no more than autovacuum_vacuum_cost_limit, and
autovacuum_vacuum_cost_delay doesn't need to be balanced in this way.

Though, since this change on its own would make autovacuum pick up new
values of autovacuum_vacuum_cost_limit (without having the worker reload
the config file), I wonder if it makes sense to try and have
vacuum_delay_point() only reload the config file if it is an explicit
vacuum or an analyze not being run in an outer transaction (to avoid
overhead of reloading config file)?

The lifecycle of these different vacuum delay-related GUCs, and how it
differs between autovacuum workers and explicit vacuum, is already quite
tangled, though.

So, I've attached a new version of the patch which is quite different
from the previous versions.

Thank you for updating the patch!

In this version I've removed wi_cost_delay from WorkerInfoData. There is
no synchronization of cost_delay amongst workers, so there is no reason
to keep it in shared memory.

One consequence of not updating VacuumCostDelay from wi_cost_delay is
that we have to have a way to keep track of whether or not autovacuum
table options are in use.

This patch does this in a cringeworthy way. I added two global
variables, one to track whether or not cost delay table options are in
use and the other to store the value of the table option cost delay. I
didn't want to use a single variable with a special value to indicate
that table option cost delay is in use because
autovacuum_vacuum_cost_delay already has special values that mean
certain things. My code needs a better solution.

While it's true that wi_cost_delay doesn't need to be shared, removing
it seems to make the logic somewhat complex. We need to handle
cost_delay differently from the other vacuum-related parameters, and we
need to make sure av[_use]_table_option_cost_delay are set properly.
Removing wi_cost_delay from WorkerInfoData saves 8 bytes of shared
memory per autovacuum worker, but it might be worth considering keeping
wi_cost_delay for simplicity.

---
 void
 AutoVacuumUpdateDelay(void)
 {
-        if (MyWorkerInfo)
+        /*
+         * We are using autovacuum-related GUCs to update
VacuumCostDelay, so we
+         * only want autovacuum workers and autovacuum launcher to do this.
+         */
+        if (!(am_autovacuum_worker || am_autovacuum_launcher))
+                return;

Is there any case where the autovacuum launcher calls
AutoVacuumUpdateDelay() function?

---
In autovac_balance_cost(), we have:

    int         vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
                                  autovacuum_vac_cost_limit : VacuumCostLimit);
    double      vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
                                  autovacuum_vac_cost_delay : VacuumCostDelay);
    :
    /* not set? nothing to do */
    if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
        return;

IIUC, if autovacuum_vac_cost_delay is changed to 0 while autovacuum
workers are running, their vacuum delay parameters are not changed. It's
not a bug in the patch, but I think we can fix it in this patch.
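
To make the early-return behavior concrete, here is a simplified, standalone sketch of the balancing arithmetic. It is an approximation written for this discussion, not the actual autovac_balance_cost(): WorkerSketch and balance_cost_sketch() are hypothetical names, a single delay is used for all workers (as in the v2 patch), and the clamping of the result to [1, cost_limit_base] is assumed from the existing code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for WorkerInfoData. */
typedef struct WorkerSketch
{
    bool        dobalance;
    int         cost_limit_base;
    int         cost_limit;
} WorkerSketch;

/* Returns true if a rebalance happened, false if it bailed out early. */
static bool
balance_cost_sketch(WorkerSketch *workers, size_t nworkers,
                    int vac_cost_limit, double vac_cost_delay)
{
    double      cost_total = 0.0;
    double      cost_avail;
    size_t      i;

    assert(workers != NULL || nworkers == 0);

    /* not set? nothing to do -- a delay of 0 leaves every worker untouched */
    if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
        return false;

    for (i = 0; i < nworkers; i++)
    {
        if (workers[i].dobalance && workers[i].cost_limit_base > 0)
            cost_total += (double) workers[i].cost_limit_base / vac_cost_delay;
    }

    /* there are no cost limits -- nothing to do */
    if (cost_total <= 0)
        return false;

    cost_avail = (double) vac_cost_limit / vac_cost_delay;
    for (i = 0; i < nworkers; i++)
    {
        int         limit;

        if (!workers[i].dobalance || workers[i].cost_limit_base <= 0)
            continue;

        limit = (int) (cost_avail * workers[i].cost_limit_base / cost_total);
        if (limit > workers[i].cost_limit_base)
            limit = workers[i].cost_limit_base;
        if (limit < 1)
            limit = 1;
        workers[i].cost_limit = limit;
    }
    return true;
}
```

With two balanced workers, a limit of 200 and a delay of 2, each worker ends up with a limit of 100. Calling the function again with a delay of 0 returns immediately, so the workers keep the values from the previous rebalance.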

It is worth mentioning that I think that in master,
AutoVacuumUpdateDelay() was incorrectly reading wi_cost_limit and
wi_cost_delay from shared memory without holding a lock.

Indeed.

I've added in a shared lock for reading from wi_cost_limit in this
patch. However, AutoVacuumUpdateLimit() is called unconditionally in
vacuum_delay_point(), which is called quite often (per block-ish), so I
was trying to think if there is a way we could avoid having to check
this shared memory variable on every call to vacuum_delay_point().
Rebalances shouldn't happen very often (done by the launcher when a new
worker is launched and by workers between vacuuming tables). Maybe we
can read from it less frequently?

Yeah, acquiring the lwlock on every call to vacuum_delay_point()
seems harmful. One idea would be to have one sig_atomic_t
variable in WorkerInfoData and have autovac_balance_cost() set it to
true after rebalancing the worker's cost limit. The worker can check it
without locking and update its delay parameters if the flag is true.
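
A rough, single-process sketch of that flag idea follows. Everything in it is hypothetical (WorkerSlotSketch, the fake_lock_* helpers standing in for AutovacuumLock, and the function names). It only shows the control flow: the worker takes the lock when the flag says something changed, and otherwise does one unlocked read. A real multi-process version would also need to think about memory ordering, which sig_atomic_t alone does not provide.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical per-worker shared slot. */
typedef struct WorkerSlotSketch
{
    volatile sig_atomic_t limit_changed;    /* set by whoever rebalances */
    int         cost_limit;                 /* protected by the lock */
} WorkerSlotSketch;

/* Stand-ins for LWLockAcquire/LWLockRelease; they only count acquisitions. */
static int  lock_acquisitions = 0;

static void
fake_lock_acquire(void)
{
    lock_acquisitions++;
}

static void
fake_lock_release(void)
{
}

/* Rebalancing side: publish the new limit, then raise the flag. */
static void
rebalance_sketch(WorkerSlotSketch *slot, int new_limit)
{
    assert(new_limit > 0);

    fake_lock_acquire();
    slot->cost_limit = new_limit;
    slot->limit_changed = 1;
    fake_lock_release();
}

/*
 * Worker side, called from the delay point: cheap unlocked check first,
 * lock only when there is something new to read.
 */
static bool
worker_maybe_update_limit(WorkerSlotSketch *slot, int *local_limit)
{
    if (!slot->limit_changed)
        return false;

    fake_lock_acquire();
    *local_limit = slot->cost_limit;
    slot->limit_changed = 0;
    fake_lock_release();
    return true;
}
```

In this sketch a worker that calls worker_maybe_update_limit() on every delay point pays for the lock only once per rebalance instead of once per call.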

Also not sure how the patch interacts with failsafe autovac and parallel
vacuum.

Good point.

When entering failsafe mode, we disable the vacuum delays (see
lazy_check_wraparound_failsafe()). We need to keep the vacuum delays
disabled even after reloading the config file. One idea is to have
another global variable indicating that we're in failsafe mode.
vacuum_delay_point() would then not update VacuumCostActive if the flag
is true.
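
As a minimal sketch of that idea (all names here are hypothetical stand-ins, not existing PostgreSQL variables or functions): a reload may still change the delay value, but it cannot switch cost-based delays back on once failsafe has been entered.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the real globals. */
static bool   failsafe_active_sketch = false;  /* the proposed new flag */
static double cost_delay_sketch = 2.0;         /* plays VacuumCostDelay */
static bool   cost_active_sketch = true;       /* plays VacuumCostActive */

/* What entering failsafe mode would do in this sketch. */
static void
enter_failsafe_sketch(void)
{
    failsafe_active_sketch = true;
    cost_active_sketch = false;
}

/* What the delay point would do after a config reload in this sketch. */
static void
apply_reloaded_delay_sketch(double new_delay)
{
    assert(new_delay >= 0);

    cost_delay_sketch = new_delay;

    /* In failsafe mode, never re-enable cost-based delays. */
    if (failsafe_active_sketch)
        return;

    cost_active_sketch = (cost_delay_sketch > 0);
}
```

The flag would have to be reset at the start of each table's vacuum; where exactly that belongs is left open here.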

As far as I can see, we don't need special treatment for the parallel
vacuum case, since parallel vacuum works only in manual vacuum. It
calculates the sleep time based on the shared cost balance and how much
I/O the worker did, but the basic mechanism is the same as in the
non-parallel case.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#14Jim Nasby
nasbyj@amazon.com
In reply to: Masahiko Sawada (#10)
Re: Should vacuum process config file reload more often

On 3/2/23 1:36 AM, Masahiko Sawada wrote:

For example, I guess we will need to take care of changes of
maintenance_work_mem. Currently we initialize the dead tuple space at
the beginning of lazy vacuum, but perhaps we would need to
enlarge/shrink it based on the new value?

Doesn't the dead tuple space grow as needed? Last I looked we don't
allocate up to 1GB right off the bat.

I don't think we need to do anything about that initially. Just because the
config can be changed in a more granular way doesn't mean we have to react to
every change for the current operation.

Perhaps we can mention in the docs that a change to maintenance_work_mem
will not take effect in the middle of vacuuming a table. But I think it probably
isn't needed.

Agreed.

I disagree that there's no need for this. Sure, if
maintenance_work_mem is 10MB then it's no big deal to just abandon
your current vacuum and start a new one, but the index vacuuming phase
with maintenance_work_mem set to say 500MB can take quite a while.
Forcing a user to either suck it up or throw everything in the phase
away isn't terribly good.

Of course, if the patch that eliminates the 1GB vacuum limit gets
committed the situation will be even worse.

While it'd be nice to also honor maintenance_work_mem getting set lower,
I don't see any need to go through heroics to accomplish that. Simply
recording the change and honoring it for future attempts to grow the
memory and on future passes through the heap would be plenty.

All that said, don't let these suggestions get in the way of committing
this. Just having the ability to tweak cost parameters would be a win.

#15Andres Freund
andres@anarazel.de
In reply to: Jim Nasby (#14)
Re: Should vacuum process config file reload more often

Hi,

On 2023-03-08 11:42:31 -0600, Jim Nasby wrote:

On 3/2/23 1:36 AM, Masahiko Sawada wrote:

For example, I guess we will need to take care of changes of
maintenance_work_mem. Currently we initialize the dead tuple space at
the beginning of lazy vacuum, but perhaps we would need to
enlarge/shrink it based on the new value?

Doesn't the dead tuple space grow as needed? Last I looked we don't allocate
up to 1GB right off the bat.

I don't think we need to do anything about that initially. Just because the
config can be changed in a more granular way doesn't mean we have to react to
every change for the current operation.

Perhaps we can mention in the docs that a change to maintenance_work_mem
will not take effect in the middle of vacuuming a table. But I think it probably
isn't needed.

Agreed.

I disagree that there's no need for this. Sure, if maintenance_work_mem
is 10MB then it's no big deal to just abandon your current vacuum and start
a new one, but the index vacuuming phase with maintenance_work_mem set to
say 500MB can take quite a while. Forcing a user to either suck it up or
throw everything in the phase away isn't terribly good.

Of course, if the patch that eliminates the 1GB vacuum limit gets committed
the situation will be even worse.

While it'd be nice to also honor maintenance_work_mem getting set lower, I
don't see any need to go through heroics to accomplish that. Simply
recording the change and honoring it for future attempts to grow the memory
and on future passes through the heap would be plenty.

All that said, don't let these suggestions get in the way of committing
this. Just having the ability to tweak cost parameters would be a win.

Nobody said anything about it not being useful to react to m_w_m changes, just
that it's not required to make some progress. So I really don't understand
what the point of your comment is.

Greetings,

Andres Freund

#16John Naylor
john.naylor@enterprisedb.com
In reply to: Jim Nasby (#14)
Re: Should vacuum process config file reload more often

On Thu, Mar 9, 2023 at 12:42 AM Jim Nasby <nasbyj@amazon.com> wrote:

Doesn't the dead tuple space grow as needed? Last I looked we don't
allocate up to 1GB right off the bat.

Incorrect.

Of course, if the patch that eliminates the 1GB vacuum limit gets
committed the situation will be even worse.

If you're referring to the proposed tid store, I'd be interested in seeing
a reproducible test case with a m_w_m over 1GB where it makes things worse
than the current state of affairs.

--
John Naylor
EDB: http://www.enterprisedb.com

#17Masahiko Sawada
sawada.mshk@gmail.com
In reply to: John Naylor (#16)
Re: Should vacuum process config file reload more often

On Thu, Mar 9, 2023 at 4:47 PM John Naylor <john.naylor@enterprisedb.com> wrote:

On Thu, Mar 9, 2023 at 12:42 AM Jim Nasby <nasbyj@amazon.com> wrote:

Doesn't the dead tuple space grow as needed? Last I looked we don't allocate up to 1GB right off the bat.

Incorrect.

Of course, if the patch that eliminates the 1GB vacuum limit gets committed the situation will be even worse.

If you're referring to the proposed tid store, I'd be interested in seeing a reproducible test case with a m_w_m over 1GB where it makes things worse than the current state of affairs.

And I think that the tidstore makes it easy to react to
maintenance_work_mem changes. We don't need to enlarge it; we can just
update its memory limit at an appropriate time.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#18Melanie Plageman
melanieplageman@gmail.com
In reply to: Masahiko Sawada (#13)
Re: Should vacuum process config file reload more often

On Tue, Mar 7, 2023 at 12:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Mar 6, 2023 at 5:26 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Thu, Mar 2, 2023 at 6:37 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

In this version I've removed wi_cost_delay from WorkerInfoData. There is
no synchronization of cost_delay amongst workers, so there is no reason
to keep it in shared memory.

One consequence of not updating VacuumCostDelay from wi_cost_delay is
that we have to have a way to keep track of whether or not autovacuum
table options are in use.

This patch does this in a cringeworthy way. I added two global
variables, one to track whether or not cost delay table options are in
use and the other to store the value of the table option cost delay. I
didn't want to use a single variable with a special value to indicate
that table option cost delay is in use because
autovacuum_vacuum_cost_delay already has special values that mean
certain things. My code needs a better solution.

While it's true that wi_cost_delay doesn't need to be shared, it seems
to make the logic somewhat complex. We need to handle cost_delay in a
different way from other vacuum-related parameters and we need to make
sure av[_use]_table_option_cost_delay are set properly. Removing
wi_cost_delay from WorkerInfoData saves 8 bytes shared memory per
autovacuum worker, but it might be worth considering keeping
wi_cost_delay for simplicity.

Ah, it turns out we can't really remove wi_cost_delay from WorkerInfo
anyway because the launcher doesn't know anything about table options
and so the workers have to keep an updated wi_cost_delay that the
launcher or other autovac workers who are not vacuuming that table can
read from when calculating the new limit in autovac_balance_cost().

However, wi_cost_delay is a double, so if we start updating it on config
reload in vacuum_delay_point(), we definitely need some protection
against torn reads.

The table options can only change when workers start vacuuming a new
table, so maybe there is some way to use this to solve this problem?
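
Separately from the table-option idea, one generic way to rule out torn reads of a double is to store its bit pattern in a 64-bit atomic. The sketch below uses plain C11 atomics and assumes a platform with lock-free 64-bit atomics; it does not use PostgreSQL's own atomics API, and SharedCostDelaySketch and its helper functions are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(double) == sizeof(uint64_t),
               "sketch assumes a 64-bit double");

/* Hypothetical shared field: the delay is stored as its raw bit pattern. */
typedef struct SharedCostDelaySketch
{
    _Atomic uint64_t bits;
} SharedCostDelaySketch;

/* Writer: copy the double's bytes into an integer and store it atomically. */
static void
shared_cost_delay_write(SharedCostDelaySketch *s, double delay)
{
    uint64_t    bits;

    assert(delay >= 0);
    memcpy(&bits, &delay, sizeof(bits));
    atomic_store(&s->bits, bits);
}

/* Reader: one atomic load, so it sees either the old or the new value. */
static double
shared_cost_delay_read(SharedCostDelaySketch *s)
{
    uint64_t    bits = atomic_load(&s->bits);
    double      delay;

    memcpy(&delay, &bits, sizeof(delay));
    return delay;
}

/* Must be called once before the field is shared. */
static void
shared_cost_delay_init(SharedCostDelaySketch *s, double delay)
{
    uint64_t    bits;

    memcpy(&bits, &delay, sizeof(bits));
    atomic_init(&s->bits, bits);
}
```

This only protects the single value. If the delay and the limit must change together, a lock (or the flag-plus-lock approach discussed in this thread) is still needed.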

It is worth mentioning that I think that in master,
AutoVacuumUpdateDelay() was incorrectly reading wi_cost_limit and
wi_cost_delay from shared memory without holding a lock.

Indeed.

I've added in a shared lock for reading from wi_cost_limit in this
patch. However, AutoVacuumUpdateLimit() is called unconditionally in
vacuum_delay_point(), which is called quite often (per block-ish), so I
was trying to think if there is a way we could avoid having to check
this shared memory variable on every call to vacuum_delay_point().
Rebalances shouldn't happen very often (done by the launcher when a new
worker is launched and by workers between vacuuming tables). Maybe we
can read from it less frequently?

Yeah, acquiring the lwlock for every call to vacuum_delay_point()
seems to be harmful. One idea would be to have one sig_atomic_t
variable in WorkerInfoData and autovac_balance_cost() set it to true
after rebalancing the worker's cost-limit. The worker can check it
without locking and update its delay parameters if the flag is true.

Maybe we can do something like this with the table options values?

- Melanie

#19Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Melanie Plageman (#18)
Re: Should vacuum process config file reload more often

On Fri, Mar 10, 2023 at 11:23 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Tue, Mar 7, 2023 at 12:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Mar 6, 2023 at 5:26 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Thu, Mar 2, 2023 at 6:37 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

In this version I've removed wi_cost_delay from WorkerInfoData. There is
no synchronization of cost_delay amongst workers, so there is no reason
to keep it in shared memory.

One consequence of not updating VacuumCostDelay from wi_cost_delay is
that we have to have a way to keep track of whether or not autovacuum
table options are in use.

This patch does this in a cringeworthy way. I added two global
variables, one to track whether or not cost delay table options are in
use and the other to store the value of the table option cost delay. I
didn't want to use a single variable with a special value to indicate
that table option cost delay is in use because
autovacuum_vacuum_cost_delay already has special values that mean
certain things. My code needs a better solution.

While it's true that wi_cost_delay doesn't need to be shared, it seems
to make the logic somewhat complex. We need to handle cost_delay in a
different way from other vacuum-related parameters and we need to make
sure av[_use]_table_option_cost_delay are set properly. Removing
wi_cost_delay from WorkerInfoData saves 8 bytes shared memory per
autovacuum worker, but it might be worth considering keeping
wi_cost_delay for simplicity.

Ah, it turns out we can't really remove wi_cost_delay from WorkerInfo
anyway because the launcher doesn't know anything about table options
and so the workers have to keep an updated wi_cost_delay that the
launcher or other autovac workers who are not vacuuming that table can
read from when calculating the new limit in autovac_balance_cost().

IIUC if any of the cost delay parameters has been set individually,
the autovacuum worker is excluded from the balance algorithm.

However, wi_cost_delay is a double, so if we start updating it on config
reload in vacuum_delay_point(), we definitely need some protection
against torn reads.

The table options can only change when workers start vacuuming a new
table, so maybe there is some way to use this to solve this problem?

It is worth mentioning that I think that in master,
AutoVacuumUpdateDelay() was incorrectly reading wi_cost_limit and
wi_cost_delay from shared memory without holding a lock.

Indeed.

I've added in a shared lock for reading from wi_cost_limit in this
patch. However, AutoVacuumUpdateLimit() is called unconditionally in
vacuum_delay_point(), which is called quite often (per block-ish), so I
was trying to think if there is a way we could avoid having to check
this shared memory variable on every call to vacuum_delay_point().
Rebalances shouldn't happen very often (done by the launcher when a new
worker is launched and by workers between vacuuming tables). Maybe we
can read from it less frequently?

Yeah, acquiring the lwlock for every call to vacuum_delay_point()
seems to be harmful. One idea would be to have one sig_atomic_t
variable in WorkerInfoData and autovac_balance_cost() set it to true
after rebalancing the worker's cost-limit. The worker can check it
without locking and update its delay parameters if the flag is true.
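
A minimal sketch of the flag idea described above, for illustration only. The struct, the function names, and the starting limit of 200 are hypothetical stand-ins, not PostgreSQL code. The sketch is single-threaded and ignores memory ordering between the flag and the limit, which real shared-memory code would have to address.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical stand-in for one worker slot in shared memory. */
typedef struct SketchWorkerSlot
{
	volatile sig_atomic_t rebalanced;	/* raised by whoever rebalances */
	int			cost_limit;		/* would normally be read under a lock */
} SketchWorkerSlot;

/* Worker-local copy, playing the role of VacuumCostLimit (assumed value). */
int			sketch_local_cost_limit = 200;

/* Rebalancer side: store the new limit, then raise the flag. */
void
sketch_rebalance(SketchWorkerSlot *slot, int new_limit)
{
	slot->cost_limit = new_limit;
	slot->rebalanced = true;
}

/*
 * Worker side: check the flag without locking.  Only when it is set do we
 * pay for reading the shared limit (real code would take the lock here).
 * Returns true if the local limit was refreshed.
 */
bool
sketch_maybe_refresh_limit(SketchWorkerSlot *slot)
{
	if (!slot->rebalanced)
		return false;

	/* Clear first, so a rebalance that lands later is seen next time. */
	slot->rebalanced = false;
	sketch_local_cost_limit = slot->cost_limit;
	return true;
}
```

In this sketch the common path is a single flag read, so the lock would only be taken after an actual rebalance.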

Maybe we can do something like this with the table options values?

Since an autovacuum worker that uses any of the table option cost delay
parameters is excluded from the balancing algorithm, the launcher
doesn't need to notify such workers of changes to the cost limit, no?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#20Melanie Plageman
melanieplageman@gmail.com
In reply to: Masahiko Sawada (#19)
1 attachment(s)
Re: Should vacuum process config file reload more often

Quotes below are combined from two of Sawada-san's emails.

I've also attached a patch with my suggested current version.

On Thu, Mar 9, 2023 at 10:27 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Mar 10, 2023 at 11:23 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Tue, Mar 7, 2023 at 12:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Mar 6, 2023 at 5:26 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Thu, Mar 2, 2023 at 6:37 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

In this version I've removed wi_cost_delay from WorkerInfoData. There is
no synchronization of cost_delay amongst workers, so there is no reason
to keep it in shared memory.

One consequence of not updating VacuumCostDelay from wi_cost_delay is
that we have to have a way to keep track of whether or not autovacuum
table options are in use.

This patch does this in a cringeworthy way. I added two global
variables, one to track whether or not cost delay table options are in
use and the other to store the value of the table option cost delay. I
didn't want to use a single variable with a special value to indicate
that table option cost delay is in use because
autovacuum_vacuum_cost_delay already has special values that mean
certain things. My code needs a better solution.

While it's true that wi_cost_delay doesn't need to be shared, it seems
to make the logic somewhat complex. We need to handle cost_delay in a
different way from other vacuum-related parameters and we need to make
sure av[_use]_table_option_cost_delay are set properly. Removing
wi_cost_delay from WorkerInfoData saves 8 bytes of shared memory per
autovacuum worker, but it might be worth considering keeping
wi_cost_delay for simplicity.

Ah, it turns out we can't really remove wi_cost_delay from WorkerInfo
anyway because the launcher doesn't know anything about table options
and so the workers have to keep an updated wi_cost_delay that the
launcher or other autovac workers who are not vacuuming that table can
read from when calculating the new limit in autovac_balance_cost().

IIUC if any of the cost delay parameters has been set individually,
the autovacuum worker is excluded from the balance algorithm.

Ah, yes! That's right. So it is not a problem. Then I still think
removing wi_cost_delay from the worker info makes sense. wi_cost_delay
is a double and can't easily be accessed atomically the way
wi_cost_limit can be.

Keeping the cost delay local to the backends also makes it clear that
cost delay is not something that should be written to by other backends
or that can differ from worker to worker. Without table options in the
picture, the cost delay should be the same for any worker who has
reloaded the config file.

As for the cost limit safe access issue, maybe we can avoid a LWLock
acquisition for reading wi_cost_limit by using an atomic similar to what
you suggested here for "did_rebalance".

I've added in a shared lock for reading from wi_cost_limit in this
patch. However, AutoVacuumUpdateLimit() is called unconditionally in
vacuum_delay_point(), which is called quite often (per block-ish), so I
was trying to think if there is a way we could avoid having to check
this shared memory variable on every call to vacuum_delay_point().
Rebalances shouldn't happen very often (done by the launcher when a new
worker is launched and by workers between vacuuming tables). Maybe we
can read from it less frequently?

Yeah, acquiring the lwlock for every call to vacuum_delay_point()
seems to be harmful. One idea would be to have one sig_atomic_t
variable in WorkerInfoData and autovac_balance_cost() set it to true
after rebalancing the worker's cost-limit. The worker can check it
without locking and update its delay parameters if the flag is true.

Instead of having the atomic indicate whether or not someone (launcher
or another worker) did a rebalance, it would simply store the current
cost limit. Then the worker can normally access it with a simple read.

My rationale is that if we used an atomic to indicate whether or not we
did a rebalance ("did_rebalance"), we would have the same cache
coherency guarantees as if we just used the atomic for the cost limit.
If we read from the "did_rebalance" variable and missed someone having
written to it on another core, we still wouldn't get around to checking
the wi_cost_limit variable in shared memory, so it doesn't matter that
we bothered to keep it in shared memory and use a lock to access it.

I noticed we don't allow wi_cost_limit to ever be less than 0, so we
could store wi_cost_limit in an atomic uint32.

I'm not sure if it is okay to do pg_atomic_read_u32() and
pg_atomic_unlocked_write_u32() or if we need pg_atomic_write_u32() in
most cases.
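
To make the "store the limit itself in an atomic" idea concrete, here is a small sketch using C11 <stdatomic.h> as a stand-in for PostgreSQL's pg_atomic_uint32. The struct and function names are hypothetical, and relaxed memory ordering is an assumption made for the sketch. It does not settle the question above about which pg_atomic write variant is appropriate.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of WorkerInfoData. */
typedef struct SketchWorkerInfo
{
	_Atomic uint32_t cost_limit;	/* plays the role of wi_cost_limit */
	int			cost_limit_base;	/* plays the role of wi_cost_limit_base */
} SketchWorkerInfo;

/* Initialize the atomic once, before any other process can see the slot. */
void
sketch_init(SketchWorkerInfo *w)
{
	atomic_init(&w->cost_limit, 0);
	w->cost_limit_base = 0;
}

/* Rebalancer side: the limit is never negative, so it fits in a uint32. */
void
sketch_publish_limit(SketchWorkerInfo *w, int limit)
{
	assert(limit >= 0);
	atomic_store_explicit(&w->cost_limit, (uint32_t) limit,
						  memory_order_relaxed);
}

/* Worker side: a plain atomic load, with no lock acquisition. */
int
sketch_read_limit(SketchWorkerInfo *w)
{
	return (int) atomic_load_explicit(&w->cost_limit, memory_order_relaxed);
}
```

The reader can never observe a half-written value here, which is the property a double such as wi_cost_delay does not give as easily.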

I've implemented the atomic cost limit in the attached patch. Though,
I'm pretty unsure about how I initialized the atomics in
AutoVacuumShmemInit()...

If the consensus is that it is simply too confusing to take
wi_cost_delay out of WorkerInfo, we might be able to afford using a
shared lock to access it because we won't call AutoVacuumUpdateDelay()
on every invocation of vacuum_delay_point() -- only when we've reloaded
the config file.

One potential option to avoid taking a shared lock on every call to
AutoVacuumUpdateDelay() is to set a global variable to indicate that we
did update it (since we are the only ones updating it) and then only
take the shared LWLock in AutoVacuumUpdateDelay() if that flag is true.

---
void
AutoVacuumUpdateDelay(void)
{
-        if (MyWorkerInfo)
+        /*
+         * We are using autovacuum-related GUCs to update VacuumCostDelay, so we
+         * only want autovacuum workers and autovacuum launcher to do this.
+         */
+        if (!(am_autovacuum_worker || am_autovacuum_launcher))
+                return;

Is there any case where the autovacuum launcher calls
AutoVacuumUpdateDelay() function?

I had meant to add it to HandleAutoVacLauncherInterrupts() after
reloading the config file (done in attached patch). When using the
global variables for cost delay (instead of wi_cost_delay in worker
info), the autovac launcher also has to do the check in the else branch
of AutoVacuumUpdateDelay()

VacuumCostDelay = autovacuum_vac_cost_delay >= 0 ?
autovacuum_vac_cost_delay : VacuumCostDelay;

to make sure VacuumCostDelay is correct for when it calls
autovac_balance_cost().
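
The precedence being discussed (table option first, then the autovacuum GUC, then the plain vacuum GUC) can be sketched as a pure function. This is an illustration under assumptions: the function name and parameters are hypothetical, and the real code updates the VacuumCostDelay global in place rather than returning a value.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: pick the cost delay an autovacuum process should use.
 *
 * use_table_option / table_option_delay mirror the two globals added by the
 * patch; autovacuum_delay_guc mirrors autovacuum_vac_cost_delay, where a
 * negative value means "fall back to the plain vacuum setting".
 */
double
sketch_effective_cost_delay(bool use_table_option,
							double table_option_delay,
							double autovacuum_delay_guc,
							double vacuum_delay_guc)
{
	if (use_table_option)
		return table_option_delay;
	if (autovacuum_delay_guc >= 0)
		return autovacuum_delay_guc;
	return vacuum_delay_guc;
}
```

Note that an autovacuum setting of 0 is a valid choice (no delay) and wins over the plain vacuum setting. Only a negative value falls through.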

This also made me think about whether or not we still need cost_limit_base.
It is used to ensure that autovac_balance_cost() never ends up setting
workers' wi_cost_limits above the current autovacuum_vacuum_cost_limit
(or VacuumCostLimit). However, the launcher and all the workers should
know what the value is without cost_limit_base, no?
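
For readers following the cost_limit_base question, here is a simplified sketch of the proportional balancing arithmetic as it appears in the patched autovac_balance_cost(). The function and macro names are hypothetical, the worker list is reduced to plain arrays, and the definition of cost_avail is assumed from the surrounding source rather than shown in the diff.

```c
#include <assert.h>

#define SKETCH_MAX(a, b) ((a) > (b) ? (a) : (b))
#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Hypothetical helper: split the global I/O budget across balanced workers.
 * limit_base[i] plays the role of wi_cost_limit_base, limit_out[i] the role
 * of wi_cost_limit.  Workers with a non-positive base are skipped.
 */
void
sketch_balance(const int *limit_base, int *limit_out, int nworkers,
			   int vac_cost_limit, double vac_cost_delay)
{
	double		cost_total = 0.0;
	double		cost_avail;
	int			i;

	/* not set? nothing to do */
	if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
		return;

	/* total I/O rate the workers would consume at their base limits */
	for (i = 0; i < nworkers; i++)
	{
		if (limit_base[i] > 0)
			cost_total += (double) limit_base[i] / vac_cost_delay;
	}

	/* there are no cost limits, nothing to do */
	if (cost_total <= 0)
		return;

	/* assumed definition: the rate the global settings allow in total */
	cost_avail = (double) vac_cost_limit / vac_cost_delay;

	for (i = 0; i < nworkers; i++)
	{
		int			limit;

		if (limit_base[i] <= 0)
			continue;

		limit = (int) (cost_avail * limit_base[i] / cost_total);

		/* never exceed the base value, never drop below 1 */
		limit_out[i] = SKETCH_MAX(SKETCH_MIN(limit, limit_base[i]), 1);
	}
}
```

In this sketch the clamp against limit_base is the only place the base value caps the result, which is the role being questioned above.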

---
In autovac_balance_cost(), we have,

int vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
autovacuum_vac_cost_limit : VacuumCostLimit);
double vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
autovacuum_vac_cost_delay : VacuumCostDelay);
:
/* not set? nothing to do */
if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
return;

IIUC if autovacuum_vac_cost_delay is changed to 0 while autovacuum
workers are running, their vacuum delay parameters are not changed. It's
not a bug in the patch, but I think we can fix it in this patch.

Yes, currently (in master) wi_cost_delay does not get updated anywhere.
In my patch, the global variable we are using for delay is updated but
it is not done in autovac_balance_cost().

Also not sure how the patch interacts with failsafe autovac and parallel
vacuum.

Good point.

When entering the failsafe mode, we disable the vacuum delays (see
lazy_check_wraparound_failsafe()). We need to keep disabling the
vacuum delays even after reloading the config file. One idea is to
have another global variable indicating we're in the failsafe mode.
vacuum_delay_point() doesn't update VacuumCostActive if the flag is
true.

I think we might not need to do this. Other than in
lazy_check_wraparound_failsafe(), VacuumCostActive is only updated in
two places:

1) in vacuum() which autovacuum will call per table. And failsafe is
reset per table as well.

2) in vacuum_delay_point(), but, since VacuumCostActive will already be
false when we enter vacuum_delay_point() the next time after
lazy_check_wraparound_failsafe(), we won't set VacuumCostActive there.
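
The argument in point 2 can be modeled with a toy version of the control flow. This is a sketch under assumptions: the names are hypothetical, the balance and sleep logic is omitted, and it relies on vacuum_delay_point() returning early when VacuumCostActive is false.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for VacuumCostActive and VacuumCostDelay. */
bool		sketch_cost_active = true;
double		sketch_cost_delay = 2.0;

/* Models the failsafe turning cost-based delays off. */
void
sketch_enter_failsafe(void)
{
	sketch_cost_active = false;
}

/*
 * Models the patched delay point.  Returns true if the body ran, false if
 * the early exit was taken.  reloaded_delay is the value a config reload
 * would install when reload_pending is set.
 */
bool
sketch_delay_point(bool reload_pending, double reloaded_delay)
{
	/* early exit: nothing below runs once cost delays are inactive */
	if (!sketch_cost_active)
		return false;

	if (reload_pending)
		sketch_cost_delay = reloaded_delay;

	/* the line the patch adds after a possible reload */
	sketch_cost_active = (sketch_cost_delay > 0);
	return true;
}
```

Once the failsafe has cleared the flag, the early exit means neither the reload nor the reassignment is reached, so in this model delays stay disabled until vacuum() sets the flag again for the next table.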

Thanks again for the detailed feedback!

- Melanie

Attachments:

v3-0001-Reload-config-file-more-often-while-vacuuming.patch (text/x-patch; charset=US-ASCII)
From 97b6ebc3eaa3d8ec53c25822c6cad2d2bbe22837 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sun, 5 Mar 2023 14:39:16 -0500
Subject: [PATCH v3] Reload config file more often while vacuuming

---
 src/backend/commands/vacuum.c       |  38 +++++++---
 src/backend/postmaster/autovacuum.c | 113 ++++++++++++++++++++--------
 src/include/postmaster/autovacuum.h |   2 +
 3 files changed, 114 insertions(+), 39 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 2e12baf8eb..9d5ce846a5 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -48,6 +48,7 @@
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/interrupt.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -75,6 +76,7 @@ int			vacuum_multixact_failsafe_age;
 /* A few variables that don't seem worth passing around as parameters */
 static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
+static bool analyze_in_outer_xact = false;
 
 
 /*
@@ -313,8 +315,7 @@ vacuum(List *relations, VacuumParams *params,
 	static bool in_vacuum = false;
 
 	const char *stmttype;
-	volatile bool in_outer_xact,
-				use_own_xacts;
+	volatile bool use_own_xacts;
 
 	Assert(params != NULL);
 
@@ -331,10 +332,10 @@ vacuum(List *relations, VacuumParams *params,
 	if (params->options & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
-		in_outer_xact = false;
+		analyze_in_outer_xact = false;
 	}
 	else
-		in_outer_xact = IsInTransactionBlock(isTopLevel);
+		analyze_in_outer_xact = IsInTransactionBlock(isTopLevel);
 
 	/*
 	 * Due to static variables vac_context, anl_context and vac_strategy,
@@ -456,7 +457,7 @@ vacuum(List *relations, VacuumParams *params,
 		Assert(params->options & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
-		else if (in_outer_xact)
+		else if (analyze_in_outer_xact)
 			use_own_xacts = false;
 		else if (list_length(relations) > 1)
 			use_own_xacts = true;
@@ -474,7 +475,7 @@ vacuum(List *relations, VacuumParams *params,
 	 */
 	if (use_own_xacts)
 	{
-		Assert(!in_outer_xact);
+		Assert(!analyze_in_outer_xact);
 
 		/* ActiveSnapshot is not set by autovacuum */
 		if (ActiveSnapshotSet())
@@ -526,7 +527,7 @@ vacuum(List *relations, VacuumParams *params,
 				}
 
 				analyze_rel(vrel->oid, vrel->relation, params,
-							vrel->va_cols, in_outer_xact, vac_strategy);
+							vrel->va_cols, analyze_in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
 				{
@@ -549,6 +550,7 @@ vacuum(List *relations, VacuumParams *params,
 	{
 		in_vacuum = false;
 		VacuumCostActive = false;
+		analyze_in_outer_xact = false;
 	}
 	PG_END_TRY();
 
@@ -2238,10 +2240,28 @@ vacuum_delay_point(void)
 						 WAIT_EVENT_VACUUM_DELAY);
 		ResetLatch(MyLatch);
 
+		/*
+		 * Reload the configuration file if requested. This allows changes to
+		 * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay
+		 * to take effect while a table is being vacuumed or analyzed.
+		 */
+		if (ConfigReloadPending && !analyze_in_outer_xact)
+		{
+			ConfigReloadPending = false;
+			ProcessConfigFile(PGC_SIGHUP);
+			AutoVacuumUpdateDelay();
+		}
+
 		VacuumCostBalance = 0;
 
-		/* update balance values for workers */
-		AutoVacuumUpdateDelay();
+		/*
+		 * Update balance values for workers. We must always do this in case
+		 * the autovacuum launcher has done a rebalance (as it does when
+		 * launching a new worker).
+		 */
+		AutoVacuumUpdateLimit();
+
+		VacuumCostActive = (VacuumCostDelay > 0);
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c0e2e00a7e..a9b7217638 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -139,6 +139,9 @@ int			Log_autovacuum_min_duration = 600000;
 static bool am_autovacuum_launcher = false;
 static bool am_autovacuum_worker = false;
 
+static bool av_use_table_option_cost_delay = false;
+static double av_table_option_cost_delay = 0;
+
 /* Flags set by signal handlers */
 static volatile sig_atomic_t got_SIGUSR2 = false;
 
@@ -189,7 +192,6 @@ typedef struct autovac_table
 {
 	Oid			at_relid;
 	VacuumParams at_params;
-	double		at_vacuum_cost_delay;
 	int			at_vacuum_cost_limit;
 	bool		at_dobalance;
 	bool		at_sharedrel;
@@ -225,8 +227,7 @@ typedef struct WorkerInfoData
 	TimestampTz wi_launchtime;
 	bool		wi_dobalance;
 	bool		wi_sharedrel;
-	double		wi_cost_delay;
-	int			wi_cost_limit;
+	pg_atomic_uint32 wi_cost_limit;
 	int			wi_cost_limit_base;
 } WorkerInfoData;
 
@@ -743,6 +744,7 @@ AutoVacLauncherMain(int argc, char *argv[])
 					worker = AutoVacuumShmem->av_startingWorker;
 					worker->wi_dboid = InvalidOid;
 					worker->wi_tableoid = InvalidOid;
+					pg_atomic_write_u32(&worker->wi_cost_limit, 0);
 					worker->wi_sharedrel = false;
 					worker->wi_proc = NULL;
 					worker->wi_launchtime = 0;
@@ -815,6 +817,7 @@ HandleAutoVacLauncherInterrupts(void)
 		ConfigReloadPending = false;
 		ProcessConfigFile(PGC_SIGHUP);
 
+		AutoVacuumUpdateDelay();
 		/* shutdown requested in config file? */
 		if (!AutoVacuumingActive())
 			AutoVacLauncherShutdown();
@@ -1756,8 +1759,7 @@ FreeWorkerInfo(int code, Datum arg)
 		MyWorkerInfo->wi_proc = NULL;
 		MyWorkerInfo->wi_launchtime = 0;
 		MyWorkerInfo->wi_dobalance = false;
-		MyWorkerInfo->wi_cost_delay = 0;
-		MyWorkerInfo->wi_cost_limit = 0;
+		pg_atomic_write_u32(&MyWorkerInfo->wi_cost_limit, 0);
 		MyWorkerInfo->wi_cost_limit_base = 0;
 		dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 						&MyWorkerInfo->wi_links);
@@ -1780,13 +1782,37 @@ FreeWorkerInfo(int code, Datum arg)
 void
 AutoVacuumUpdateDelay(void)
 {
-	if (MyWorkerInfo)
+	/*
+	 * We are using autovacuum-related GUCs to update VacuumCostDelay, so we
+	 * only want autovacuum workers and autovacuum launcher to do this.
+	 */
+	if (!(am_autovacuum_worker || am_autovacuum_launcher))
+		return;
+
+	if (av_use_table_option_cost_delay)
 	{
-		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
-		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+		VacuumCostDelay = av_table_option_cost_delay;
+	}
+	else
+	{
+		VacuumCostDelay = autovacuum_vac_cost_delay >= 0 ?
+			autovacuum_vac_cost_delay : VacuumCostDelay;
 	}
 }
 
+/*
+ * Helper for vacuum_delay_point() to allow workers to read their
+ * wi_cost_limit.
+ */
+void
+AutoVacuumUpdateLimit(void)
+{
+	if (!MyWorkerInfo)
+		return;
+
+	VacuumCostLimit = pg_atomic_read_u32(&MyWorkerInfo->wi_cost_limit);
+}
+
 /*
  * autovac_balance_cost
  *		Recalculate the cost limit setting for each active worker.
@@ -1824,9 +1850,9 @@ autovac_balance_cost(void)
 
 		if (worker->wi_proc != NULL &&
 			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
+			worker->wi_cost_limit_base > 0 && vac_cost_delay > 0)
 			cost_total +=
-				(double) worker->wi_cost_limit_base / worker->wi_cost_delay;
+				(double) worker->wi_cost_limit_base / vac_cost_delay;
 	}
 
 	/* there are no cost limits -- nothing to do */
@@ -1844,7 +1870,7 @@ autovac_balance_cost(void)
 
 		if (worker->wi_proc != NULL &&
 			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
+			worker->wi_cost_limit_base > 0 && vac_cost_delay > 0)
 		{
 			int			limit = (int)
 			(cost_avail * worker->wi_cost_limit_base / cost_total);
@@ -1855,17 +1881,15 @@ autovac_balance_cost(void)
 			 * in these calculations, let's be sure we don't ever set
 			 * cost_limit to more than the base value.
 			 */
-			worker->wi_cost_limit = Max(Min(limit,
-											worker->wi_cost_limit_base),
-										1);
+			pg_atomic_unlocked_write_u32(&worker->wi_cost_limit,
+										 Max(Min(limit, worker->wi_cost_limit_base), 1));
 		}
 
 		if (worker->wi_proc != NULL)
-			elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
+			elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d)",
 				 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
 				 worker->wi_dobalance ? "yes" : "no",
-				 worker->wi_cost_limit, worker->wi_cost_limit_base,
-				 worker->wi_cost_delay);
+				 pg_atomic_read_u32(&worker->wi_cost_limit), worker->wi_cost_limit_base);
 	}
 }
 
@@ -2326,6 +2350,13 @@ do_autovacuum(void)
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
 
+			/*
+			 * Autovacuum workers should always update VacuumCostDelay and
+			 * VacuumCostLimit in case they were overridden by the reload.
+			 */
+			AutoVacuumUpdateDelay();
+			AutoVacuumUpdateLimit();
+
 			/*
 			 * You might be tempted to bail out if we see autovacuum is now
 			 * disabled.  Must resist that temptation -- this might be a
@@ -2424,21 +2455,22 @@ do_autovacuum(void)
 		stdVacuumCostDelay = VacuumCostDelay;
 		stdVacuumCostLimit = VacuumCostLimit;
 
+		AutoVacuumUpdateDelay();
+
 		/* Must hold AutovacuumLock while mucking with cost balance info */
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 
 		/* advertise my cost delay parameters for the balancing algorithm */
 		MyWorkerInfo->wi_dobalance = tab->at_dobalance;
-		MyWorkerInfo->wi_cost_delay = tab->at_vacuum_cost_delay;
-		MyWorkerInfo->wi_cost_limit = tab->at_vacuum_cost_limit;
+		/* this cannot exceed 10000 anyway */
+		pg_atomic_unlocked_write_u32(&MyWorkerInfo->wi_cost_limit,
+									 tab->at_vacuum_cost_limit);
 		MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
+		AutoVacuumUpdateLimit();
 
 		/* do a balance */
 		autovac_balance_cost();
 
-		/* set the active cost parameters from the result of that */
-		AutoVacuumUpdateDelay();
-
 		/* done */
 		LWLockRelease(AutovacuumLock);
 
@@ -2569,6 +2601,8 @@ deleted:
 		{
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			AutoVacuumUpdateDelay();
+			AutoVacuumUpdateLimit();
 		}
 
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
@@ -2771,7 +2805,11 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 	/* fetch the relation's relcache entry */
 	classTup = SearchSysCacheCopy1(RELOID, ObjectIdGetDatum(relid));
 	if (!HeapTupleIsValid(classTup))
+	{
+		av_use_table_option_cost_delay = false;
+		av_table_option_cost_delay = 0;
 		return NULL;
+	}
 	classForm = (Form_pg_class) GETSTRUCT(classTup);
 
 	/*
@@ -2802,7 +2840,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		int			multixact_freeze_min_age;
 		int			multixact_freeze_table_age;
 		int			vac_cost_limit;
-		double		vac_cost_delay;
 		int			log_min_duration;
 
 		/*
@@ -2812,12 +2849,16 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 * defaults, autovacuum's own first and plain vacuum second.
 		 */
 
-		/* -1 in autovac setting means use plain vacuum_cost_delay */
-		vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
-			? avopts->vacuum_cost_delay
-			: (autovacuum_vac_cost_delay >= 0)
-			? autovacuum_vac_cost_delay
-			: VacuumCostDelay;
+		if (avopts && avopts->vacuum_cost_delay >= 0)
+		{
+			av_use_table_option_cost_delay = true;
+			av_table_option_cost_delay = avopts->vacuum_cost_delay;
+		}
+		else
+		{
+			av_use_table_option_cost_delay = false;
+			av_table_option_cost_delay = 0;
+		}
 
 		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
 		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
@@ -2882,7 +2923,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_params.is_wraparound = wraparound;
 		tab->at_params.log_min_duration = log_min_duration;
 		tab->at_vacuum_cost_limit = vac_cost_limit;
-		tab->at_vacuum_cost_delay = vac_cost_delay;
 		tab->at_relname = NULL;
 		tab->at_nspname = NULL;
 		tab->at_datname = NULL;
@@ -2895,6 +2935,11 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			!(avopts && (avopts->vacuum_cost_limit > 0 ||
 						 avopts->vacuum_cost_delay > 0));
 	}
+	else
+	{
+		av_use_table_option_cost_delay = false;
+		av_table_option_cost_delay = 0;
+	}
 
 	heap_freetuple(classTup);
 	return tab;
@@ -3361,6 +3406,7 @@ AutoVacuumShmemInit(void)
 	{
 		WorkerInfo	worker;
 		int			i;
+		dlist_iter	iter;
 
 		Assert(!found);
 
@@ -3374,10 +3420,17 @@ AutoVacuumShmemInit(void)
 		worker = (WorkerInfo) ((char *) AutoVacuumShmem +
 							   MAXALIGN(sizeof(AutoVacuumShmemStruct)));
 
+
 		/* initialize the WorkerInfo free list */
 		for (i = 0; i < autovacuum_max_workers; i++)
 			dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 							&worker[i].wi_links);
+
+		dlist_foreach(iter, &AutoVacuumShmem->av_freeWorkers)
+			pg_atomic_init_u32(
+							   &(dlist_container(WorkerInfoData, wi_links, iter.cur))->wi_cost_limit,
+							   0);
+
 	}
 	else
 		Assert(found);
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index c140371b51..558358911c 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -66,6 +66,8 @@ extern void AutoVacWorkerFailed(void);
 /* autovacuum cost-delay balancer */
 extern void AutoVacuumUpdateDelay(void);
 
+extern void AutoVacuumUpdateLimit(void);
+
 #ifdef EXEC_BACKEND
 extern void AutoVacLauncherMain(int argc, char *argv[]) pg_attribute_noreturn();
 extern void AutoVacWorkerMain(int argc, char *argv[]) pg_attribute_noreturn();
-- 
2.37.2

#21Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#20)
1 attachment(s)
Re: Should vacuum process config file reload more often

On Fri, Mar 10, 2023 at 6:11 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Quotes below are combined from two of Sawada-san's emails.

I've also attached a patch with my suggested current version.

On Thu, Mar 9, 2023 at 10:27 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Mar 10, 2023 at 11:23 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Tue, Mar 7, 2023 at 12:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Mar 6, 2023 at 5:26 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Thu, Mar 2, 2023 at 6:37 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

In this version I've removed wi_cost_delay from WorkerInfoData. There is
no synchronization of cost_delay amongst workers, so there is no reason
to keep it in shared memory.

One consequence of not updating VacuumCostDelay from wi_cost_delay is
that we have to have a way to keep track of whether or not autovacuum
table options are in use.

This patch does this in a cringeworthy way. I added two global
variables, one to track whether or not cost delay table options are in
use and the other to store the value of the table option cost delay. I
didn't want to use a single variable with a special value to indicate
that table option cost delay is in use because
autovacuum_vacuum_cost_delay already has special values that mean
certain things. My code needs a better solution.

While it's true that wi_cost_delay doesn't need to be shared, it seems
to make the logic somewhat complex. We need to handle cost_delay in a
different way from other vacuum-related parameters and we need to make
sure av[_use]_table_option_cost_delay are set properly. Removing
wi_cost_delay from WorkerInfoData saves 8 bytes of shared memory per
autovacuum worker, but it might be worth considering keeping
wi_cost_delay for simplicity.

Ah, it turns out we can't really remove wi_cost_delay from WorkerInfo
anyway because the launcher doesn't know anything about table options
and so the workers have to keep an updated wi_cost_delay that the
launcher or other autovac workers who are not vacuuming that table can
read from when calculating the new limit in autovac_balance_cost().

IIUC if any of the cost delay parameters has been set individually,
the autovacuum worker is excluded from the balance algorithm.

Ah, yes! That's right. So it is not a problem. Then I still think
removing wi_cost_delay from the worker info makes sense. wi_cost_delay
is a double and can't easily be accessed atomically the way
wi_cost_limit can be.

Keeping the cost delay local to the backends also makes it clear that
cost delay is not something that should be written to by other backends
or that can differ from worker to worker. Without table options in the
picture, the cost delay should be the same for any worker who has
reloaded the config file.

As for the cost limit safe access issue, maybe we can avoid a LWLock
acquisition for reading wi_cost_limit by using an atomic similar to what
you suggested here for "did_rebalance".

I've added in a shared lock for reading from wi_cost_limit in this
patch. However, AutoVacuumUpdateLimit() is called unconditionally in
vacuum_delay_point(), which is called quite often (per block-ish), so I
was trying to think if there is a way we could avoid having to check
this shared memory variable on every call to vacuum_delay_point().
Rebalances shouldn't happen very often (done by the launcher when a new
worker is launched and by workers between vacuuming tables). Maybe we
can read from it less frequently?

Yeah, acquiring the lwlock for every call to vacuum_delay_point()
seems to be harmful. One idea would be to have one sig_atomic_t
variable in WorkerInfoData and autovac_balance_cost() set it to true
after rebalancing the worker's cost-limit. The worker can check it
without locking and update its delay parameters if the flag is true.

Instead of having the atomic indicate whether or not someone (launcher
or another worker) did a rebalance, it would simply store the current
cost limit. Then the worker can normally access it with a simple read.

My rationale is that if we used an atomic to indicate whether or not we
did a rebalance ("did_rebalance"), we would have the same cache
coherency guarantees as if we just used the atomic for the cost limit.
If we read from the "did_rebalance" variable and missed someone having
written to it on another core, we still wouldn't get around to checking
the wi_cost_limit variable in shared memory, so it doesn't matter that
we bothered to keep it in shared memory and use a lock to access it.

I noticed we don't allow wi_cost_limit to ever be less than 0, so we
could store wi_cost_limit in an atomic uint32.

I'm not sure if it is okay to do pg_atomic_read_u32() and
pg_atomic_unlocked_write_u32() or if we need pg_atomic_write_u32() in
most cases.

I've implemented the atomic cost limit in the attached patch. Though,
I'm pretty unsure about how I initialized the atomics in
AutoVacuumShmemInit()...

If the consensus is that it is simply too confusing to take
wi_cost_delay out of WorkerInfo, we might be able to afford using a
shared lock to access it because we won't call AutoVacuumUpdateDelay()
on every invocation of vacuum_delay_point() -- only when we've reloaded
the config file.

One such implementation is attached.

- Melanie

Attachments:

v4-0001-vacuum-reloads-config-file-more-often.patch (text/x-patch; charset=US-ASCII)
From f0029f1e410852b41b5562939051ede86235508b Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Fri, 10 Mar 2023 18:33:47 -0500
Subject: [PATCH v4] vacuum reloads config file more often

---
 src/backend/commands/vacuum.c       | 38 ++++++++++++----
 src/backend/postmaster/autovacuum.c | 69 +++++++++++++++++++++++------
 src/include/postmaster/autovacuum.h |  2 +
 3 files changed, 87 insertions(+), 22 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 2e12baf8eb..9d5ce846a5 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -48,6 +48,7 @@
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/interrupt.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/proc.h"
@@ -75,6 +76,7 @@ int			vacuum_multixact_failsafe_age;
 /* A few variables that don't seem worth passing around as parameters */
 static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
+static bool analyze_in_outer_xact = false;
 
 
 /*
@@ -313,8 +315,7 @@ vacuum(List *relations, VacuumParams *params,
 	static bool in_vacuum = false;
 
 	const char *stmttype;
-	volatile bool in_outer_xact,
-				use_own_xacts;
+	volatile bool use_own_xacts;
 
 	Assert(params != NULL);
 
@@ -331,10 +332,10 @@ vacuum(List *relations, VacuumParams *params,
 	if (params->options & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
-		in_outer_xact = false;
+		analyze_in_outer_xact = false;
 	}
 	else
-		in_outer_xact = IsInTransactionBlock(isTopLevel);
+		analyze_in_outer_xact = IsInTransactionBlock(isTopLevel);
 
 	/*
 	 * Due to static variables vac_context, anl_context and vac_strategy,
@@ -456,7 +457,7 @@ vacuum(List *relations, VacuumParams *params,
 		Assert(params->options & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
-		else if (in_outer_xact)
+		else if (analyze_in_outer_xact)
 			use_own_xacts = false;
 		else if (list_length(relations) > 1)
 			use_own_xacts = true;
@@ -474,7 +475,7 @@ vacuum(List *relations, VacuumParams *params,
 	 */
 	if (use_own_xacts)
 	{
-		Assert(!in_outer_xact);
+		Assert(!analyze_in_outer_xact);
 
 		/* ActiveSnapshot is not set by autovacuum */
 		if (ActiveSnapshotSet())
@@ -526,7 +527,7 @@ vacuum(List *relations, VacuumParams *params,
 				}
 
 				analyze_rel(vrel->oid, vrel->relation, params,
-							vrel->va_cols, in_outer_xact, vac_strategy);
+							vrel->va_cols, analyze_in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
 				{
@@ -549,6 +550,7 @@ vacuum(List *relations, VacuumParams *params,
 	{
 		in_vacuum = false;
 		VacuumCostActive = false;
+		analyze_in_outer_xact = false;
 	}
 	PG_END_TRY();
 
@@ -2238,10 +2240,28 @@ vacuum_delay_point(void)
 						 WAIT_EVENT_VACUUM_DELAY);
 		ResetLatch(MyLatch);
 
+		/*
+		 * Reload the configuration file if requested. This allows changes to
+		 * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay
+		 * to take effect while a table is being vacuumed or analyzed.
+		 */
+		if (ConfigReloadPending && !analyze_in_outer_xact)
+		{
+			ConfigReloadPending = false;
+			ProcessConfigFile(PGC_SIGHUP);
+			AutoVacuumUpdateDelay();
+		}
+
 		VacuumCostBalance = 0;
 
-		/* update balance values for workers */
-		AutoVacuumUpdateDelay();
+		/*
+		 * Update balance values for workers. We must always do this in case
+		 * the autovacuum launcher has done a rebalance (as it does when
+		 * launching a new worker).
+		 */
+		AutoVacuumUpdateLimit();
+
+		VacuumCostActive = (VacuumCostDelay > 0);
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c0e2e00a7e..7a202b2bd9 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -226,7 +226,7 @@ typedef struct WorkerInfoData
 	bool		wi_dobalance;
 	bool		wi_sharedrel;
 	double		wi_cost_delay;
-	int			wi_cost_limit;
+	pg_atomic_uint32 wi_cost_limit;
 	int			wi_cost_limit_base;
 } WorkerInfoData;
 
@@ -743,6 +743,7 @@ AutoVacLauncherMain(int argc, char *argv[])
 					worker = AutoVacuumShmem->av_startingWorker;
 					worker->wi_dboid = InvalidOid;
 					worker->wi_tableoid = InvalidOid;
+					pg_atomic_write_u32(&worker->wi_cost_limit, 0);
 					worker->wi_sharedrel = false;
 					worker->wi_proc = NULL;
 					worker->wi_launchtime = 0;
@@ -1757,7 +1758,7 @@ FreeWorkerInfo(int code, Datum arg)
 		MyWorkerInfo->wi_launchtime = 0;
 		MyWorkerInfo->wi_dobalance = false;
 		MyWorkerInfo->wi_cost_delay = 0;
-		MyWorkerInfo->wi_cost_limit = 0;
+		pg_atomic_write_u32(&MyWorkerInfo->wi_cost_limit, 0);
 		MyWorkerInfo->wi_cost_limit_base = 0;
 		dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 						&MyWorkerInfo->wi_links);
@@ -1780,11 +1781,36 @@ FreeWorkerInfo(int code, Datum arg)
 void
 AutoVacuumUpdateDelay(void)
 {
-	if (MyWorkerInfo)
+	if (!MyWorkerInfo)
+		return;
+
+	LWLockAcquire(AutovacuumLock, LW_SHARED);
+
+	if (MyWorkerInfo->wi_dobalance)
 	{
-		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
-		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+		VacuumCostDelay = autovacuum_vac_cost_delay >= 0 ?
+			autovacuum_vac_cost_delay : VacuumCostDelay;
+
+		MyWorkerInfo->wi_cost_delay = VacuumCostDelay;
 	}
+	else
+		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
+
+	LWLockRelease(AutovacuumLock);
+
+}
+
+/*
+ * Helper for vacuum_delay_point() to allow workers to read their
+ * wi_cost_limit.
+ */
+void
+AutoVacuumUpdateLimit(void)
+{
+	if (!MyWorkerInfo)
+		return;
+
+	VacuumCostLimit = pg_atomic_read_u32(&MyWorkerInfo->wi_cost_limit);
 }
 
 /*
@@ -1855,16 +1881,15 @@ autovac_balance_cost(void)
 			 * in these calculations, let's be sure we don't ever set
 			 * cost_limit to more than the base value.
 			 */
-			worker->wi_cost_limit = Max(Min(limit,
-											worker->wi_cost_limit_base),
-										1);
+			pg_atomic_unlocked_write_u32(&worker->wi_cost_limit,
+										 Max(Min(limit, worker->wi_cost_limit_base), 1));
 		}
 
 		if (worker->wi_proc != NULL)
 			elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
 				 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
 				 worker->wi_dobalance ? "yes" : "no",
-				 worker->wi_cost_limit, worker->wi_cost_limit_base,
+				 pg_atomic_read_u32(&worker->wi_cost_limit), worker->wi_cost_limit_base,
 				 worker->wi_cost_delay);
 	}
 }
@@ -2326,6 +2351,13 @@ do_autovacuum(void)
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
 
+			/*
+			 * Autovacuum workers should always update VacuumCostDelay and
+			 * VacuumCostLimit in case they were overridden by the reload.
+			 */
+			AutoVacuumUpdateDelay();
+			AutoVacuumUpdateLimit();
+
 			/*
 			 * You might be tempted to bail out if we see autovacuum is now
 			 * disabled.  Must resist that temptation -- this might be a
@@ -2424,21 +2456,22 @@ do_autovacuum(void)
 		stdVacuumCostDelay = VacuumCostDelay;
 		stdVacuumCostLimit = VacuumCostLimit;
 
+
 		/* Must hold AutovacuumLock while mucking with cost balance info */
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 
 		/* advertise my cost delay parameters for the balancing algorithm */
 		MyWorkerInfo->wi_dobalance = tab->at_dobalance;
 		MyWorkerInfo->wi_cost_delay = tab->at_vacuum_cost_delay;
-		MyWorkerInfo->wi_cost_limit = tab->at_vacuum_cost_limit;
+		/* this cannot exceed 10000 anyway */
+		pg_atomic_unlocked_write_u32(&MyWorkerInfo->wi_cost_limit,
+									 tab->at_vacuum_cost_limit);
 		MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
+		AutoVacuumUpdateLimit();
 
 		/* do a balance */
 		autovac_balance_cost();
 
-		/* set the active cost parameters from the result of that */
-		AutoVacuumUpdateDelay();
-
 		/* done */
 		LWLockRelease(AutovacuumLock);
 
@@ -2569,6 +2602,8 @@ deleted:
 		{
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			AutoVacuumUpdateDelay();
+			AutoVacuumUpdateLimit();
 		}
 
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
@@ -3361,6 +3396,7 @@ AutoVacuumShmemInit(void)
 	{
 		WorkerInfo	worker;
 		int			i;
+		dlist_iter	iter;
 
 		Assert(!found);
 
@@ -3374,10 +3410,17 @@ AutoVacuumShmemInit(void)
 		worker = (WorkerInfo) ((char *) AutoVacuumShmem +
 							   MAXALIGN(sizeof(AutoVacuumShmemStruct)));
 
+
 		/* initialize the WorkerInfo free list */
 		for (i = 0; i < autovacuum_max_workers; i++)
 			dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 							&worker[i].wi_links);
+
+		dlist_foreach(iter, &AutoVacuumShmem->av_freeWorkers)
+			pg_atomic_init_u32(
+							   &(dlist_container(WorkerInfoData, wi_links, iter.cur))->wi_cost_limit,
+							   0);
+
 	}
 	else
 		Assert(found);
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index c140371b51..558358911c 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -66,6 +66,8 @@ extern void AutoVacWorkerFailed(void);
 /* autovacuum cost-delay balancer */
 extern void AutoVacuumUpdateDelay(void);
 
+extern void AutoVacuumUpdateLimit(void);
+
 #ifdef EXEC_BACKEND
 extern void AutoVacLauncherMain(int argc, char *argv[]) pg_attribute_noreturn();
 extern void AutoVacWorkerMain(int argc, char *argv[]) pg_attribute_noreturn();
-- 
2.37.2

#22Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Melanie Plageman (#20)
Re: Should vacuum process config file reload more often

On Sat, Mar 11, 2023 at 8:11 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Quotes below are combined from two of Sawada-san's emails.

I've also attached a patch with my suggested current version.

On Thu, Mar 9, 2023 at 10:27 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Mar 10, 2023 at 11:23 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Tue, Mar 7, 2023 at 12:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Mar 6, 2023 at 5:26 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Thu, Mar 2, 2023 at 6:37 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

In this version I've removed wi_cost_delay from WorkerInfoData. There is
no synchronization of cost_delay amongst workers, so there is no reason
to keep it in shared memory.

One consequence of not updating VacuumCostDelay from wi_cost_delay is
that we have to have a way to keep track of whether or not autovacuum
table options are in use.

This patch does this in a cringeworthy way. I added two global
variables, one to track whether or not cost delay table options are in
use and the other to store the value of the table option cost delay. I
didn't want to use a single variable with a special value to indicate
that table option cost delay is in use because
autovacuum_vacuum_cost_delay already has special values that mean
certain things. My code needs a better solution.

While it's true that wi_cost_delay doesn't need to be shared, it seems
to make the logic somewhat complex. We need to handle cost_delay in a
different way from other vacuum-related parameters and we need to make
sure av[_use]_table_option_cost_delay are set properly. Removing
wi_cost_delay from WorkerInfoData saves 8 bytes shared memory per
autovacuum worker, but it might be worth keeping wi_cost_delay for
simplicity.

Ah, it turns out we can't really remove wi_cost_delay from WorkerInfo
anyway because the launcher doesn't know anything about table options
and so the workers have to keep an updated wi_cost_delay that the
launcher or other autovac workers who are not vacuuming that table can
read from when calculating the new limit in autovac_balance_cost().

IIUC if any of the cost delay parameters has been set individually,
the autovacuum worker is excluded from the balance algorithm.

Ah, yes! That's right. So it is not a problem. Then I still think
removing wi_cost_delay from the worker info makes sense. wi_cost_delay
is a double and can't easily be accessed atomically the way
wi_cost_limit can be.

Keeping the cost delay local to the backends also makes it clear that
cost delay is not something that should be written to by other backends
or that can differ from worker to worker. Without table options in the
picture, the cost delay should be the same for any worker who has
reloaded the config file.

Agreed.

As for the cost limit safe access issue, maybe we can avoid a LWLock
acquisition for reading wi_cost_limit by using an atomic similar to what
you suggested here for "did_rebalance".

I've added in a shared lock for reading from wi_cost_limit in this
patch. However, AutoVacuumUpdateLimit() is called unconditionally in
vacuum_delay_point(), which is called quite often (per block-ish), so I
was trying to think if there is a way we could avoid having to check
this shared memory variable on every call to vacuum_delay_point().
Rebalances shouldn't happen very often (done by the launcher when a new
worker is launched and by workers between vacuuming tables). Maybe we
can read from it less frequently?

Yeah, acquiring the lwlock for every call to vacuum_delay_point()
seems to be harmful. One idea would be to have one sig_atomic_t
variable in WorkerInfoData and autovac_balance_cost() set it to true
after rebalancing the worker's cost-limit. The worker can check it
without locking and update its delay parameters if the flag is true.

Instead of having the atomic indicate whether or not someone (launcher
or another worker) did a rebalance, it would simply store the current
cost limit. Then the worker can normally access it with a simple read.

My rationale is that if we used an atomic to indicate whether or not we
did a rebalance ("did_rebalance"), we would have the same cache
coherency guarantees as if we just used the atomic for the cost limit.
If we read from the "did_rebalance" variable and missed someone having
written to it on another core, we still wouldn't get around to checking
the wi_cost_limit variable in shared memory, so it doesn't matter that
we bothered to keep it in shared memory and use a lock to access it.

I noticed we don't allow wi_cost_limit to ever be less than 0, so we
could store wi_cost_limit in an atomic uint32.

I'm not sure if it is okay to do pg_atomic_read_u32() and
pg_atomic_unlocked_write_u32() or if we need pg_atomic_write_u32() in
most cases.

I agree with using pg_atomic_uint32. Given that the comment of
pg_atomic_unlocked_write_u32() says:

* pg_atomic_compare_exchange_u32. This should only be used in cases where
* minor performance regressions due to atomics emulation are unacceptable.

I think pg_atomic_write_u32() is enough for our use case.
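
To make the pattern under discussion concrete, here is a minimal standalone sketch of a balancer publishing a clamped cost limit and a worker reading it without a lock. It is an illustration only: it uses C11 <stdatomic.h> as a stand-in for PostgreSQL's pg_atomic_* API, and DemoWorkerInfo, demo_publish_cost_limit() and demo_read_cost_limit() are hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the wi_cost_limit member of WorkerInfoData. */
typedef struct DemoWorkerInfo
{
	_Atomic uint32_t cost_limit;
} DemoWorkerInfo;

/*
 * Balancer side: publish a new limit, clamped to [1, base], which mirrors
 * the Max(Min(limit, base), 1) expression in the patch.
 */
static void
demo_publish_cost_limit(DemoWorkerInfo *worker, int limit, int base)
{
	int			clamped;

	assert(base >= 1);
	clamped = (limit < base) ? limit : base;
	if (clamped < 1)
		clamped = 1;
	atomic_store(&worker->cost_limit, (uint32_t) clamped);
}

/* Worker side: plain atomic read, no lock acquisition needed. */
static int
demo_read_cost_limit(DemoWorkerInfo *worker)
{
	return (int) atomic_load(&worker->cost_limit);
}
```

Because the limit is never negative, an unsigned 32-bit atomic is wide enough; the worker simply sees either the old or the new value on each read.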

I've implemented the atomic cost limit in the attached patch. Though,
I'm pretty unsure about how I initialized the atomics in
AutoVacuumShmemInit()...

+
                 /* initialize the WorkerInfo free list */
                 for (i = 0; i < autovacuum_max_workers; i++)
                         dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
                                                         &worker[i].wi_links);
+
+                dlist_foreach(iter, &AutoVacuumShmem->av_freeWorkers)
+                        pg_atomic_init_u32(
+                                &(dlist_container(WorkerInfoData, wi_links, iter.cur))->wi_cost_limit,
+                                0);
+

I think we can do something like:

/* initialize the WorkerInfo free list */
for (i = 0; i < autovacuum_max_workers; i++)
{
    dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
                    &worker[i].wi_links);
    pg_atomic_init_u32(&(worker[i].wi_cost_limit), 0);
}

If the consensus is that it is simply too confusing to take
wi_cost_delay out of WorkerInfo, we might be able to afford using a
shared lock to access it because we won't call AutoVacuumUpdateDelay()
on every invocation of vacuum_delay_point() -- only when we've reloaded
the config file.

One potential option to avoid taking a shared lock on every call to
AutoVacuumUpdateDelay() is to set a global variable to indicate that we
did update it (since we are the only ones updating it) and then only
take the shared LWLock in AutoVacuumUpdateDelay() if that flag is true.

If we remove wi_cost_delay from WorkerInfo, probably we don't need to
acquire the lwlock in AutoVacuumUpdateDelay()? The shared field we
access in that function will be only wi_dobalance, but this field is
updated only by its owner autovacuum worker.

---
void
AutoVacuumUpdateDelay(void)
{
-        if (MyWorkerInfo)
+        /*
+         * We are using autovacuum-related GUCs to update VacuumCostDelay, so we
+         * only want autovacuum workers and autovacuum launcher to do this.
+         */
+        if (!(am_autovacuum_worker || am_autovacuum_launcher))
+                return;

Is there any case where the autovacuum launcher calls
AutoVacuumUpdateDelay() function?

I had meant to add it to HandleAutoVacLauncherInterrupts() after
reloading the config file (done in attached patch). When using the
global variables for cost delay (instead of wi_cost_delay in worker
info), the autovac launcher also has to do the check in the else branch
of AutoVacuumUpdateDelay()

VacuumCostDelay = autovacuum_vac_cost_delay >= 0 ?
autovacuum_vac_cost_delay : VacuumCostDelay;

to make sure VacuumCostDelay is correct for when it calls
autovac_balance_cost().

But doesn't the launcher do a similar thing at the beginning of
autovac_balance_cost()?

double vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
autovacuum_vac_cost_delay : VacuumCostDelay);

Related to this point, I think autovac_balance_cost() should use
globally-set cost_limit and cost_delay values to calculate worker's
vacuum-delay parameters. IOW, vac_cost_limit and vac_cost_delay should
come from the config file setting, not table option etc:

int vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
autovacuum_vac_cost_limit : VacuumCostLimit);
double vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
autovacuum_vac_cost_delay : VacuumCostDelay);

If my understanding is right, the following change is not right;
AutoVacuumUpdateLimit() updates VacuumCostLimit based on the value in
MyWorkerInfo:

MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
+ AutoVacuumUpdateLimit();

/* do a balance */
autovac_balance_cost();

- /* set the active cost parameters from the result of that */
- AutoVacuumUpdateDelay();

Also, even when using the global variables for cost delay, the
launcher doesn't need to check the global variable. It should always
be able to use either autovacuum_vac_cost_delay/limit or
VacuumCostDelay/Limit.

This also made me think about whether or not we still need cost_limit_base.
It is used to ensure that autovac_balance_cost() never ends up setting
workers' wi_cost_limits above the current autovacuum_vacuum_cost_limit
(or VacuumCostLimit). However, the launcher and all the workers should
know what the value is without cost_limit_base, no?

Yeah, the current balancing algorithm looks to respect the cost_limit
value set when starting to vacuum the table. The proportion of the
amount of I/O that a worker can consume is calculated based on the
base value and the new worker's cost_limit value cannot exceed the
base value. Given that we're trying to dynamically tune worker's cost
parameters (delay and limit), this concept seems to need to be
updated.

Also not sure how the patch interacts with failsafe autovac and parallel
vacuum.

Good point.

When entering the failsafe mode, we disable the vacuum delays (see
lazy_check_wraparound_failsafe()). We need to keep disabling the
vacuum delays even after reloading the config file. One idea is to
have another global variable indicating we're in the failsafe mode.
vacuum_delay_point() doesn't update VacuumCostActive if the flag is
true.

I think we might not need to do this. Other than in
lazy_check_wraparound_failsafe(), VacuumCostActive is only updated in
two places:

1) in vacuum() which autovacuum will call per table. And failsafe is
reset per table as well.

2) in vacuum_delay_point(), but, since VacuumCostActive will already be
false when we enter vacuum_delay_point() the next time after
lazy_check_wraparound_failsafe(), we won't set VacuumCostActive there.

Indeed. But does it mean that there is no code path to turn
vacuum-delay on, even when vacuum_cost_delay is updated from 0 to
non-0?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#23Melanie Plageman
melanieplageman@gmail.com
In reply to: Masahiko Sawada (#22)
1 attachment(s)
Re: Should vacuum process config file reload more often

On Wed, Mar 15, 2023 at 1:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sat, Mar 11, 2023 at 8:11 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

I've implemented the atomic cost limit in the attached patch. Though,
I'm pretty unsure about how I initialized the atomics in
AutoVacuumShmemInit()...

+
/* initialize the WorkerInfo free list */
for (i = 0; i < autovacuum_max_workers; i++)
dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
&worker[i].wi_links);
+
+                dlist_foreach(iter, &AutoVacuumShmem->av_freeWorkers)
+                        pg_atomic_init_u32(
+                                &(dlist_container(WorkerInfoData, wi_links, iter.cur))->wi_cost_limit,
+                                0);
+

I think we can do something like:

/* initialize the WorkerInfo free list */
for (i = 0; i < autovacuum_max_workers; i++)
{
    dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
                    &worker[i].wi_links);
    pg_atomic_init_u32(&(worker[i].wi_cost_limit), 0);
}

Ah, yes, I was distracted by the variable name "worker" (as opposed to
"workers").

If the consensus is that it is simply too confusing to take
wi_cost_delay out of WorkerInfo, we might be able to afford using a
shared lock to access it because we won't call AutoVacuumUpdateDelay()
on every invocation of vacuum_delay_point() -- only when we've reloaded
the config file.

One potential option to avoid taking a shared lock on every call to
AutoVacuumUpdateDelay() is to set a global variable to indicate that we
did update it (since we are the only ones updating it) and then only
take the shared LWLock in AutoVacuumUpdateDelay() if that flag is true.

If we remove wi_cost_delay from WorkerInfo, probably we don't need to
acquire the lwlock in AutoVacuumUpdateDelay()? The shared field we
access in that function will be only wi_dobalance, but this field is
updated only by its owner autovacuum worker.

I realized that we cannot use dobalance to decide whether or not to
update wi_cost_delay because dobalance could be false because of table
option cost limit being set (with no table option cost delay) and we
would still need to update VacuumCostDelay and wi_cost_delay with the
new value of autovacuum_vacuum_cost_delay.

But v5 skirts around this issue altogether.

---
void
AutoVacuumUpdateDelay(void)
{
-        if (MyWorkerInfo)
+        /*
+         * We are using autovacuum-related GUCs to update VacuumCostDelay, so we
+         * only want autovacuum workers and autovacuum launcher to do this.
+         */
+        if (!(am_autovacuum_worker || am_autovacuum_launcher))
+                return;

Is there any case where the autovacuum launcher calls
AutoVacuumUpdateDelay() function?

I had meant to add it to HandleAutoVacLauncherInterrupts() after
reloading the config file (done in attached patch). When using the
global variables for cost delay (instead of wi_cost_delay in worker
info), the autovac launcher also has to do the check in the else branch
of AutoVacuumUpdateDelay()

VacuumCostDelay = autovacuum_vac_cost_delay >= 0 ?
autovacuum_vac_cost_delay : VacuumCostDelay;

to make sure VacuumCostDelay is correct for when it calls
autovac_balance_cost().

But doesn't the launcher do a similar thing at the beginning of
autovac_balance_cost()?

double vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
autovacuum_vac_cost_delay : VacuumCostDelay);

Ah, yes. You are right.

Related to this point, I think autovac_balance_cost() should use
globally-set cost_limit and cost_delay values to calculate worker's
vacuum-delay parameters. IOW, vac_cost_limit and vac_cost_delay should
come from the config file setting, not table option etc:

int vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
autovacuum_vac_cost_limit : VacuumCostLimit);
double vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
autovacuum_vac_cost_delay : VacuumCostDelay);

If my understanding is right, the following change is not right;
AutoVacuumUpdateLimit() updates VacuumCostLimit based on the value in
MyWorkerInfo:

MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
+ AutoVacuumUpdateLimit();

/* do a balance */
autovac_balance_cost();

- /* set the active cost parameters from the result of that */
- AutoVacuumUpdateDelay();

Also, even when using the global variables for cost delay, the
launcher doesn't need to check the global variable. It should always
be able to use either autovacuum_vac_cost_delay/limit or
VacuumCostDelay/Limit.

Yes, that is true. But, I actually think we can do something more
radical, which relates to this point as well as the issue with
cost_limit_base below.

This also made me think about whether or not we still need cost_limit_base.
It is used to ensure that autovac_balance_cost() never ends up setting
workers' wi_cost_limits above the current autovacuum_vacuum_cost_limit
(or VacuumCostLimit). However, the launcher and all the workers should
know what the value is without cost_limit_base, no?

Yeah, the current balancing algorithm looks to respect the cost_limit
value set when starting to vacuum the table. The proportion of the
amount of I/O that a worker can consume is calculated based on the
base value and the new worker's cost_limit value cannot exceed the
base value. Given that we're trying to dynamically tune worker's cost
parameters (delay and limit), this concept seems to need to be
updated.

In master, autovacuum workers reload the config file at most once per
table vacuumed. And that is the same time that they update their
wi_cost_limit_base and wi_cost_delay. Thus, when autovac_balance_cost()
is called, there is a good chance that different workers will have
different values for wi_cost_limit_base and wi_cost_delay (and we are
only talking about workers not vacuuming a table with table option
cost-related gucs). So, it made sense that the balancing algorithm tried
to use a ratio to determine what to set the cost limit of each worker
to. It is clamped to the base value, as you say, but it also gives
workers a proportion of the new limit equal to what proportion their base
cost represents of the total cost.
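
For readers following along, the ratio-based balancing described above can be sketched roughly as follows. This is a simplified standalone model written from the description in this thread, not a copy of autovac_balance_cost(); DemoWorker and demo_balance_cost() are hypothetical names, and the array of workers stands in for the shared-memory worker list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the per-worker fields used by the balancer. */
typedef struct DemoWorker
{
	int			cost_limit_base;	/* limit in effect when the table started */
	double		cost_delay;			/* delay in effect when the table started */
	int			cost_limit;			/* output: balanced limit */
} DemoWorker;

/*
 * Give each worker a share of the available I/O budget proportional to
 * base/delay, then clamp the result to [1, base].
 */
static void
demo_balance_cost(DemoWorker *workers, int nworkers,
				  int vac_cost_limit, double vac_cost_delay)
{
	double		cost_total = 0.0;
	double		cost_avail;
	int			i;

	assert(workers != NULL || nworkers == 0);

	if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
		return;					/* balancing is not needed */

	for (i = 0; i < nworkers; i++)
	{
		if (workers[i].cost_limit_base > 0 && workers[i].cost_delay > 0)
			cost_total += (double) workers[i].cost_limit_base /
				workers[i].cost_delay;
	}
	if (cost_total <= 0)
		return;					/* no worker takes part in balancing */

	cost_avail = (double) vac_cost_limit / vac_cost_delay;
	for (i = 0; i < nworkers; i++)
	{
		int			limit;

		if (workers[i].cost_limit_base <= 0 || workers[i].cost_delay <= 0)
			continue;

		limit = (int) (cost_avail * workers[i].cost_limit_base / cost_total);
		if (limit > workers[i].cost_limit_base)
			limit = workers[i].cost_limit_base;
		if (limit < 1)
			limit = 1;
		workers[i].cost_limit = limit;
	}
}
```

In this model a worker that started its table under a larger base limit receives a larger share, which is exactly the asymmetry that arises when workers reload the config file at different times.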

I don't think any of this matters anymore now that everyone can reload
the config file often and dynamically change these values.

Thus, in the attached v5, I have removed both wi_cost_limit and wi_cost_delay
from WorkerInfo. I've added a new variable to AutoVacuumShmem called
nworkers_for_balance. Now, autovac_balance_cost() only recalculates this
number and updates it if it has changed. Then, in
AutoVacuumUpdateLimit() workers read from this atomic value and divide
the value of the cost limit gucs by that number to get their own cost limit.

I keep the table option value of cost limit and cost delay in
backend-local memory to reference when updating the worker cost limit.
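
As a rough illustration of the idea (not the code in the attached patch), a worker's effective limit could be derived as below. The sketch uses C11 <stdatomic.h> in place of pg_atomic_*; demo_nworkers_for_balance and demo_effective_cost_limit() are hypothetical names, and the guard against a zero worker count and the clamp to a minimum of 1 are assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for AutoVacuumShmem->nworkers_for_balance. */
static _Atomic uint32_t demo_nworkers_for_balance;

/*
 * Compute the cost limit a worker should use. A table-option limit (> 0)
 * wins outright, since such workers are excluded from balancing. Otherwise
 * the GUC value is split evenly across the balanced workers.
 */
static int
demo_effective_cost_limit(int autovacuum_vac_cost_limit,
						  int vacuum_cost_limit,
						  int table_option_cost_limit)
{
	int			base;
	int			limit;
	uint32_t	nworkers;

	if (table_option_cost_limit > 0)
		return table_option_cost_limit;

	/* Same fallback as quoted earlier in the thread. */
	base = (autovacuum_vac_cost_limit > 0 ?
			autovacuum_vac_cost_limit : vacuum_cost_limit);
	assert(base > 0);

	nworkers = atomic_load(&demo_nworkers_for_balance);
	if (nworkers == 0)
		nworkers = 1;			/* assumption: avoid dividing by zero */

	limit = base / (int) nworkers;
	return (limit < 1) ? 1 : limit;	/* assumption: never drop below 1 */
}
```

With four balanced workers and vacuum_cost_limit = 200, each worker would end up with a limit of 50, and any of them picks up a change in the worker count on its next read of the atomic.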

One nice thing is that autovac_balance_cost() now requires only a shared
lock (though most callers update other members before calling it and so
still take an exclusive lock).

What do you think?

Also not sure how the patch interacts with failsafe autovac and parallel
vacuum.

Good point.

When entering the failsafe mode, we disable the vacuum delays (see
lazy_check_wraparound_failsafe()). We need to keep disabling the
vacuum delays even after reloading the config file. One idea is to
have another global variable indicating we're in the failsafe mode.
vacuum_delay_point() doesn't update VacuumCostActive if the flag is
true.

I think we might not need to do this. Other than in
lazy_check_wraparound_failsafe(), VacuumCostActive is only updated in
two places:

1) in vacuum() which autovacuum will call per table. And failsafe is
reset per table as well.

2) in vacuum_delay_point(), but, since VacuumCostActive will already be
false when we enter vacuum_delay_point() the next time after
lazy_check_wraparound_failsafe(), we won't set VacuumCostActive there.

Indeed. But does it mean that there is no code path to turn
vacuum-delay on, even when vacuum_cost_delay is updated from 0 to
non-0?

Ah yes! Good point. This is true.
I'm not sure how to cheaply allow for re-enabling delays after disabling
them in the middle of a table vacuum.

I don't see a way around checking if we need to reload the config file
on every call to vacuum_delay_point() (currently, we are only doing this
when we have to wait anyway). It seems expensive to do this check every
time. If we do do this, we would update VacuumCostActive when updating
VacuumCostDelay, and we would need a global variable keeping the
failsafe status, as you mentioned.

It could be okay to say that you can only disable cost-based delays in
the middle of vacuuming a table (i.e. you cannot enable them if they are
already disabled until you start vacuuming the next table). Though maybe
it is weird that you can increase the delay but not re-enable it...
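
To sketch what the "failsafe flag" option would look like if we did go that way, here is a small standalone model. It is only an illustration under assumptions: the demo_* variables and functions are hypothetical stand-ins for VacuumCostActive, VacuumCostDelay, the start of a table vacuum, lazy_check_wraparound_failsafe() and the post-reload update, and none of this is in the attached patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical backend-local state modelling the real globals. */
static bool demo_cost_active = false;
static double demo_cost_delay = 0.0;
static bool demo_failsafe_active = false;

/* Start of each table: failsafe is reset and delays follow the setting. */
static void
demo_start_table(double cost_delay)
{
	assert(cost_delay >= 0);
	demo_failsafe_active = false;
	demo_cost_delay = cost_delay;
	demo_cost_active = (cost_delay > 0);
}

/* Failsafe trigger: stop applying cost-based delays for this table. */
static void
demo_enter_failsafe(void)
{
	demo_failsafe_active = true;
	demo_cost_active = false;
}

/*
 * After a config reload: delays may be switched on or off by the new
 * setting, but never re-enabled while the failsafe is in effect.
 */
static void
demo_after_reload(double new_cost_delay)
{
	demo_cost_delay = new_cost_delay;
	demo_cost_active = (!demo_failsafe_active && new_cost_delay > 0);
}
```

In this model a reload can turn delays on mid-table (0 to non-0), while a reload that arrives after the failsafe has triggered leaves them off until the next table starts.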

On an unrelated note, I was wondering if there were any docs anywhere
that should be updated to go along with this.

And, I was wondering if it was worth trying to split up the part that
reloads the config file and all of the autovacuum stuff. The reloading
of the config file by itself won't actually result in autovacuum workers
having updated cost delays because of them overwriting it with
wi_cost_delay, but it will allow VACUUM to have those updated values.

- Melanie

Attachments:

v5-0001-auto-vacuum-reloads-config-file-more-often.patchtext/x-patch; charset=US-ASCII; name=v5-0001-auto-vacuum-reloads-config-file-more-often.patchDownload
From f77dc11eb38c96efe8a0defd8cb89ac44481d8f5 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 18 Mar 2023 15:42:39 -0400
Subject: [PATCH v5] [auto]vacuum reloads config file more often

Previously, VACUUM and autovacuum workers would reload the configuration
file only between vacuuming tables. This precluded user updates to
cost-based delay parameters from taking effect while vacuuming a table.

Check if a reload is pending roughly once per block now, when checking
if we need to delay.

In order for this change to have the intended effect on autovacuum,
autovacuum workers must start updating their cost delay more frequently
as well.

With this new paradigm, balancing the cost limit amongst workers also
must work differently. Previously, a worker's wi_cost_limit was set only
at the beginning of vacuuming a table, after reloading the config file.
Therefore, at the time that autovac_balance_cost() is called, workers
vacuuming tables with no table options could still have different values
for their wi_cost_limit_base and wi_cost_delay. With this change,
workers will (within some margin of error) have no reason to have
different values for cost limit and cost delay (in the absence of table
options). Thus, remove cost limit and cost delay from shared memory and
keep track only of the number of workers actively vacuuming tables with
no cost-related table options. Then, use this value to determine what
each worker's effective cost limit should be.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_buP5wzsho3qNw5o9_R0pF69FRM5hgCmr-mvXmGXwdA7A%40mail.gmail.com#5e6771d4cdca4db6efc2acec2dce0bc7
---
 src/backend/commands/vacuum.c       |  40 ++++--
 src/backend/postmaster/autovacuum.c | 206 ++++++++++++----------------
 src/include/postmaster/autovacuum.h |   2 +
 3 files changed, 121 insertions(+), 127 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index c54360a6a0..6534fd748d 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -48,6 +48,7 @@
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/interrupt.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/pmsignal.h"
@@ -76,6 +77,7 @@ int			vacuum_multixact_failsafe_age;
 /* A few variables that don't seem worth passing around as parameters */
 static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
+static bool analyze_in_outer_xact = false;
 
 
 /*
@@ -314,8 +316,7 @@ vacuum(List *relations, VacuumParams *params,
 	static bool in_vacuum = false;
 
 	const char *stmttype;
-	volatile bool in_outer_xact,
-				use_own_xacts;
+	volatile bool use_own_xacts;
 
 	Assert(params != NULL);
 
@@ -332,10 +333,10 @@ vacuum(List *relations, VacuumParams *params,
 	if (params->options & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
-		in_outer_xact = false;
+		analyze_in_outer_xact = false;
 	}
 	else
-		in_outer_xact = IsInTransactionBlock(isTopLevel);
+		analyze_in_outer_xact = IsInTransactionBlock(isTopLevel);
 
 	/*
 	 * Due to static variables vac_context, anl_context and vac_strategy,
@@ -457,7 +458,7 @@ vacuum(List *relations, VacuumParams *params,
 		Assert(params->options & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
-		else if (in_outer_xact)
+		else if (analyze_in_outer_xact)
 			use_own_xacts = false;
 		else if (list_length(relations) > 1)
 			use_own_xacts = true;
@@ -475,7 +476,7 @@ vacuum(List *relations, VacuumParams *params,
 	 */
 	if (use_own_xacts)
 	{
-		Assert(!in_outer_xact);
+		Assert(!analyze_in_outer_xact);
 
 		/* ActiveSnapshot is not set by autovacuum */
 		if (ActiveSnapshotSet())
@@ -527,7 +528,7 @@ vacuum(List *relations, VacuumParams *params,
 				}
 
 				analyze_rel(vrel->oid, vrel->relation, params,
-							vrel->va_cols, in_outer_xact, vac_strategy);
+							vrel->va_cols, analyze_in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
 				{
@@ -550,6 +551,7 @@ vacuum(List *relations, VacuumParams *params,
 	{
 		in_vacuum = false;
 		VacuumCostActive = false;
+		analyze_in_outer_xact = false;
 	}
 	PG_END_TRY();
 
@@ -2246,10 +2248,30 @@ vacuum_delay_point(void)
 		if (IsUnderPostmaster && !PostmasterIsAlive())
 			exit(1);
 
+		/*
+		 * Reload the configuration file if requested. This allows changes to
+		 * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay
+		 * to take effect while a table is being vacuumed or analyzed.
+		 */
+		if (ConfigReloadPending && !analyze_in_outer_xact)
+		{
+			ConfigReloadPending = false;
+			ProcessConfigFile(PGC_SIGHUP);
+			AutoVacuumUpdateDelay();
+		}
+
 		VacuumCostBalance = 0;
 
-		/* update balance values for workers */
-		AutoVacuumUpdateDelay();
+		/*
+		 * Update balance values for workers. We must always do this in case
+		 * the autovacuum launcher or another autovacuum worker has
+		 * recalculated the number of workers across which we must balance the
+		 * limit. This is done by the launcher when launching a new worker and
+		 * by workers before vacuuming each table.
+		 */
+		AutoVacuumUpdateLimit();
+
+		VacuumCostActive = (VacuumCostDelay > 0);
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c0e2e00a7e..f69f011589 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -139,6 +139,9 @@ int			Log_autovacuum_min_duration = 600000;
 static bool am_autovacuum_launcher = false;
 static bool am_autovacuum_worker = false;
 
+static double av_table_option_cost_delay = -1;
+static int	av_table_option_cost_limit = 0;
+
 /* Flags set by signal handlers */
 static volatile sig_atomic_t got_SIGUSR2 = false;
 
@@ -189,8 +192,8 @@ typedef struct autovac_table
 {
 	Oid			at_relid;
 	VacuumParams at_params;
-	double		at_vacuum_cost_delay;
-	int			at_vacuum_cost_limit;
+	double		at_table_option_vac_cost_delay;
+	int			at_table_option_vac_cost_limit;
 	bool		at_dobalance;
 	bool		at_sharedrel;
 	char	   *at_relname;
@@ -209,7 +212,7 @@ typedef struct autovac_table
  * wi_sharedrel flag indicating whether table is marked relisshared
  * wi_proc		pointer to PGPROC of the running worker, NULL if not started
  * wi_launchtime Time at which this worker was launched
- * wi_cost_*	Vacuum cost-based delay parameters current in this worker
+ * wi_dobalance Whether this worker should be included in balance calculations
  *
  * All fields are protected by AutovacuumLock, except for wi_tableoid and
  * wi_sharedrel which are protected by AutovacuumScheduleLock (note these
@@ -225,9 +228,6 @@ typedef struct WorkerInfoData
 	TimestampTz wi_launchtime;
 	bool		wi_dobalance;
 	bool		wi_sharedrel;
-	double		wi_cost_delay;
-	int			wi_cost_limit;
-	int			wi_cost_limit_base;
 } WorkerInfoData;
 
 typedef struct WorkerInfoData *WorkerInfo;
@@ -286,6 +286,7 @@ typedef struct
 	dlist_head	av_runningWorkers;
 	WorkerInfo	av_startingWorker;
 	AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+	pg_atomic_uint32 nworkers_for_balance;
 } AutoVacuumShmemStruct;
 
 static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -820,7 +821,7 @@ HandleAutoVacLauncherInterrupts(void)
 			AutoVacLauncherShutdown();
 
 		/* rebalance in case the default cost parameters changed */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
 		autovac_balance_cost();
 		LWLockRelease(AutovacuumLock);
 
@@ -1756,9 +1757,6 @@ FreeWorkerInfo(int code, Datum arg)
 		MyWorkerInfo->wi_proc = NULL;
 		MyWorkerInfo->wi_launchtime = 0;
 		MyWorkerInfo->wi_dobalance = false;
-		MyWorkerInfo->wi_cost_delay = 0;
-		MyWorkerInfo->wi_cost_limit = 0;
-		MyWorkerInfo->wi_cost_limit_base = 0;
 		dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 						&MyWorkerInfo->wi_links);
 		/* not mine anymore */
@@ -1774,99 +1772,95 @@ FreeWorkerInfo(int code, Datum arg)
 }
 
 /*
- * Update the cost-based delay parameters, so that multiple workers consume
- * each a fraction of the total available I/O.
+ * Update VacuumCostDelay with the correct value for an autovacuum worker,
+ * given the value of other relevant cost-based delay parameters. Autovacuum
+ * workers should call this after every config reload, in case VacuumCostDelay
+ * was overwritten.
  */
 void
 AutoVacuumUpdateDelay(void)
 {
-	if (MyWorkerInfo)
+	if (!am_autovacuum_worker)
+		return;
+
+	if (av_table_option_cost_delay >= 0)
+		VacuumCostDelay = av_table_option_cost_delay;
+	else
+		VacuumCostDelay = autovacuum_vac_cost_delay >= 0 ?
+			autovacuum_vac_cost_delay : VacuumCostDelay;
+}
+
+/*
+ * Update VacuumCostLimit with the correct value for an autovacuum worker,
+ * given the value of other relevant cost limit parameters and the number of
+ * workers across which the limit must be balanced. Autovacuum workers must
+ * call this regularly in case nworkers_for_balance has been updated by another
+ * worker or by the autovacuum launcher. They also must call this after every
+ * config reload, in case VacuumCostLimit was overwritten.
+ */
+void
+AutoVacuumUpdateLimit(void)
+{
+	if (!am_autovacuum_worker)
+		return;
+
+	/*
+	 * note: in cost_limit, zero also means use value from elsewhere, because
+	 * zero is not a valid value.
+	 */
+	if (av_table_option_cost_limit > 0)
+		VacuumCostLimit = av_table_option_cost_limit;
+	else
 	{
-		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
-		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+		int			vac_cost_limit = autovacuum_vac_cost_limit > 0 ?
+		autovacuum_vac_cost_limit : VacuumCostLimit;
+
+		int			balanced_cost_limit = vac_cost_limit /
+		pg_atomic_read_u32(&AutoVacuumShmem->nworkers_for_balance);
+
+		VacuumCostLimit = Max(Min(balanced_cost_limit, vac_cost_limit), 1);
 	}
 }
 
 /*
  * autovac_balance_cost
- *		Recalculate the cost limit setting for each active worker.
+ *		Recalculate the number of workers to consider, given table options and
+ *		the current number of active workers.
  *
- * Caller must hold the AutovacuumLock in exclusive mode.
+ * Caller must hold the AutovacuumLock in at least shared mode.
  */
 static void
 autovac_balance_cost(void)
 {
-	/*
-	 * The idea here is that we ration out I/O equally.  The amount of I/O
-	 * that a worker can consume is determined by cost_limit/cost_delay, so we
-	 * try to equalize those ratios rather than the raw limit settings.
-	 *
-	 * note: in cost_limit, zero also means use value from elsewhere, because
-	 * zero is not a valid value.
-	 */
-	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : VacuumCostLimit);
-	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : VacuumCostDelay);
-	double		cost_total;
-	double		cost_avail;
 	dlist_iter	iter;
+	int			orig_nworkers_for_balance;
+	int			nworkers_for_balance = 0;
 
-	/* not set? nothing to do */
-	if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
+	if (autovacuum_vac_cost_delay == 0 ||
+		(autovacuum_vac_cost_delay == -1 && VacuumCostDelay == 0))
 		return;
 
-	/* calculate the total base cost limit of participating active workers */
-	cost_total = 0.0;
-	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
-	{
-		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
-
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-			cost_total +=
-				(double) worker->wi_cost_limit_base / worker->wi_cost_delay;
-	}
-
-	/* there are no cost limits -- nothing to do */
-	if (cost_total <= 0)
+	if (autovacuum_vac_cost_limit <= 0 && VacuumCostLimit <= 0)
 		return;
 
-	/*
-	 * Adjust cost limit of each active worker to balance the total of cost
-	 * limit to autovacuum_vacuum_cost_limit.
-	 */
-	cost_avail = (double) vac_cost_limit / vac_cost_delay;
+	orig_nworkers_for_balance =
+		pg_atomic_read_u32(&AutoVacuumShmem->nworkers_for_balance);
+
 	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
 	{
 		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-		{
-			int			limit = (int)
-			(cost_avail * worker->wi_cost_limit_base / cost_total);
-
-			/*
-			 * We put a lower bound of 1 on the cost_limit, to avoid division-
-			 * by-zero in the vacuum code.  Also, in case of roundoff trouble
-			 * in these calculations, let's be sure we don't ever set
-			 * cost_limit to more than the base value.
-			 */
-			worker->wi_cost_limit = Max(Min(limit,
-											worker->wi_cost_limit_base),
-										1);
-		}
+		if (worker->wi_proc == NULL || !worker->wi_dobalance)
+			continue;
 
-		if (worker->wi_proc != NULL)
-			elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
-				 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
-				 worker->wi_dobalance ? "yes" : "no",
-				 worker->wi_cost_limit, worker->wi_cost_limit_base,
-				 worker->wi_cost_delay);
+		nworkers_for_balance++;
 	}
+
+	nworkers_for_balance = Max(nworkers_for_balance, 1);
+
+	if (nworkers_for_balance != orig_nworkers_for_balance)
+		pg_atomic_write_u32(&AutoVacuumShmem->nworkers_for_balance,
+							nworkers_for_balance);
 }
 
 /*
@@ -2312,14 +2306,15 @@ do_autovacuum(void)
 		autovac_table *tab;
 		bool		isshared;
 		bool		skipit;
-		double		stdVacuumCostDelay;
-		int			stdVacuumCostLimit;
 		dlist_iter	iter;
 
 		CHECK_FOR_INTERRUPTS();
 
 		/*
 		 * Check for config changes before processing each collected table.
+		 * Autovacuum workers must update VacuumCostDelay and VacuumCostLimit
+		 * in case they were overridden by the reload. However, we will do
+		 * this as soon as we check table options a bit later.
 		 */
 		if (ConfigReloadPending)
 		{
@@ -2416,32 +2411,18 @@ do_autovacuum(void)
 			continue;
 		}
 
-		/*
-		 * Remember the prevailing values of the vacuum cost GUCs.  We have to
-		 * restore these at the bottom of the loop, else we'll compute wrong
-		 * values in the next iteration of autovac_balance_cost().
-		 */
-		stdVacuumCostDelay = VacuumCostDelay;
-		stdVacuumCostLimit = VacuumCostLimit;
+		av_table_option_cost_limit = tab->at_table_option_vac_cost_limit;
+		av_table_option_cost_delay = tab->at_table_option_vac_cost_delay;
 
 		/* Must hold AutovacuumLock while mucking with cost balance info */
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-
-		/* advertise my cost delay parameters for the balancing algorithm */
 		MyWorkerInfo->wi_dobalance = tab->at_dobalance;
-		MyWorkerInfo->wi_cost_delay = tab->at_vacuum_cost_delay;
-		MyWorkerInfo->wi_cost_limit = tab->at_vacuum_cost_limit;
-		MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
-
-		/* do a balance */
 		autovac_balance_cost();
+		LWLockRelease(AutovacuumLock);
 
-		/* set the active cost parameters from the result of that */
+		AutoVacuumUpdateLimit();
 		AutoVacuumUpdateDelay();
 
-		/* done */
-		LWLockRelease(AutovacuumLock);
-
 		/* clean up memory before each iteration */
 		MemoryContextResetAndDeleteChildren(PortalContext);
 
@@ -2534,10 +2515,6 @@ deleted:
 		MyWorkerInfo->wi_tableoid = InvalidOid;
 		MyWorkerInfo->wi_sharedrel = false;
 		LWLockRelease(AutovacuumScheduleLock);
-
-		/* restore vacuum cost GUCs for the next iteration */
-		VacuumCostDelay = stdVacuumCostDelay;
-		VacuumCostLimit = stdVacuumCostLimit;
 	}
 
 	/*
@@ -2569,6 +2546,8 @@ deleted:
 		{
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			AutoVacuumUpdateDelay();
+			AutoVacuumUpdateLimit();
 		}
 
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
@@ -2801,8 +2780,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		int			freeze_table_age;
 		int			multixact_freeze_min_age;
 		int			multixact_freeze_table_age;
-		int			vac_cost_limit;
-		double		vac_cost_delay;
 		int			log_min_duration;
 
 		/*
@@ -2812,20 +2789,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 * defaults, autovacuum's own first and plain vacuum second.
 		 */
 
-		/* -1 in autovac setting means use plain vacuum_cost_delay */
-		vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
-			? avopts->vacuum_cost_delay
-			: (autovacuum_vac_cost_delay >= 0)
-			? autovacuum_vac_cost_delay
-			: VacuumCostDelay;
-
-		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
-		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
-			? avopts->vacuum_cost_limit
-			: (autovacuum_vac_cost_limit > 0)
-			? autovacuum_vac_cost_limit
-			: VacuumCostLimit;
-
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
 			? avopts->log_min_duration
@@ -2881,8 +2844,10 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age;
 		tab->at_params.is_wraparound = wraparound;
 		tab->at_params.log_min_duration = log_min_duration;
-		tab->at_vacuum_cost_limit = vac_cost_limit;
-		tab->at_vacuum_cost_delay = vac_cost_delay;
+		tab->at_table_option_vac_cost_limit = avopts ?
+			avopts->vacuum_cost_limit : 0;
+		tab->at_table_option_vac_cost_delay = avopts ?
+			avopts->vacuum_cost_delay : -1;
 		tab->at_relname = NULL;
 		tab->at_nspname = NULL;
 		tab->at_datname = NULL;
@@ -3374,10 +3339,15 @@ AutoVacuumShmemInit(void)
 		worker = (WorkerInfo) ((char *) AutoVacuumShmem +
 							   MAXALIGN(sizeof(AutoVacuumShmemStruct)));
 
+
 		/* initialize the WorkerInfo free list */
 		for (i = 0; i < autovacuum_max_workers; i++)
 			dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 							&worker[i].wi_links);
+
+		/* initialize to 1, as it should be a minimum of 1 */
+		pg_atomic_init_u32(&AutoVacuumShmem->nworkers_for_balance, 1);
+
 	}
 	else
 		Assert(found);
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index c140371b51..558358911c 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -66,6 +66,8 @@ extern void AutoVacWorkerFailed(void);
 /* autovacuum cost-delay balancer */
 extern void AutoVacuumUpdateDelay(void);
 
+extern void AutoVacuumUpdateLimit(void);
+
 #ifdef EXEC_BACKEND
 extern void AutoVacLauncherMain(int argc, char *argv[]) pg_attribute_noreturn();
 extern void AutoVacWorkerMain(int argc, char *argv[]) pg_attribute_noreturn();
-- 
2.37.2

#24 Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#23)
1 attachment(s)
Re: Should vacuum process config file reload more often

On Sat, Mar 18, 2023 at 6:47 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Wed, Mar 15, 2023 at 1:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sat, Mar 11, 2023 at 8:11 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Also not sure how the patch interacts with failsafe autovac and parallel
vacuum.

Good point.

When entering the failsafe mode, we disable the vacuum delays (see
lazy_check_wraparound_failsafe()). We need to keep disabling the
vacuum delays even after reloading the config file. One idea is to
have another global variable indicating we're in the failsafe mode.
vacuum_delay_point() doesn't update VacuumCostActive if the flag is
true.

I think we might not need to do this. Other than in
lazy_check_wraparound_failsafe(), VacuumCostActive is only updated in
two places:

1) in vacuum() which autovacuum will call per table. And failsafe is
reset per table as well.

2) in vacuum_delay_point(), but, since VacuumCostActive will already be
false when we enter vacuum_delay_point() the next time after
lazy_check_wraparound_failsafe(), we won't set VacuumCostActive there.

Indeed. But does it mean that there is no code path to turn
vacuum-delay on, even when vacuum_cost_delay is updated from 0 to
non-0?

Ah yes! Good point. This is true.
I'm not sure how to cheaply allow for re-enabling delays after disabling
them in the middle of a table vacuum.

I don't see a way around checking if we need to reload the config file
on every call to vacuum_delay_point() (currently, we are only doing this
when we have to wait anyway). It seems expensive to do this check every
time. If we do do this, we would update VacuumCostActive when updating
VacuumCostDelay, and we would need a global variable keeping the
failsafe status, as you mentioned.

It could be okay to say that you can only disable cost-based delays in
the middle of vacuuming a table (i.e. you cannot enable them if they are
already disabled until you start vacuuming the next table). Though maybe
it is weird that you can increase the delay but not re-enable it...

So, I thought about it some more, and I think it is a bit odd that you
can increase the delay and limit but not re-enable them if they were
disabled. Perhaps it would be okay to check ConfigReloadPending at the
top of vacuum_delay_point() instead of only after sleeping; it is just
one more branch. After reloading the config file (if a reload was
pending), we can check whether VacuumCostActive is false and, if so,
return early. I've implemented that in the attached v6.
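
As a rough illustration of that control flow, here is a minimal,
self-contained sketch. It is a toy model, not the patch itself: the
lower-case variables are stand-ins for the backend globals,
reload_config() is a hypothetical substitute for ProcessConfigFile()
followed by AutoVacuumUpdateDelay(), cost_delay_in_file is an invented
placeholder for "whatever the reloaded file says", and the
InterruptPending check is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for backend state; these are not the real globals. */
static bool config_reload_pending = false;	/* models ConfigReloadPending */
static bool failsafe_active = false;		/* models VacuumFailsafeActive */
static bool cost_active = false;			/* models VacuumCostActive */
static bool analyze_in_outer_xact = false;
static double cost_delay = 0.0;				/* models VacuumCostDelay (ms) */
static double cost_delay_in_file = 0.0;		/* hypothetical: value a reload reads */

/* Hypothetical stand-in for ProcessConfigFile() + AutoVacuumUpdateDelay(). */
void
reload_config(void)
{
	assert(cost_delay_in_file >= 0);
	cost_delay = cost_delay_in_file;

	/* Never turn delays back on while the failsafe is active. */
	if (!failsafe_active)
		cost_active = (cost_delay > 0);
}

/* Returns true if the caller would continue to the balance/sleep logic. */
bool
delay_point(void)
{
	/* Failsafe is on, or delays are off and no reload is waiting. */
	if (failsafe_active || (!cost_active && !config_reload_pending))
		return false;

	/* Reload at the top, so a 0 -> non-0 delay change can take effect. */
	if (config_reload_pending && !analyze_in_outer_xact)
	{
		config_reload_pending = false;
		reload_config();
	}

	/* The reload may have disabled delays; if so, return early. */
	return cost_active;
}
```

In this model, a pending reload that raises the delay from 0 to a
positive value re-enables delays on the next delay_point() call, while
setting failsafe_active makes the function return immediately whether or
not a reload is pending.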

I also added the global we discussed, VacuumFailsafeActive. If we keep
it, we can probably remove the failsafe_active flag in LVRelState, as it
seems redundant. Let me know what you think.
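
For readers skimming the attachment, the per-worker limit that
AutoVacuumUpdateLimit() computes in v6 can be summarized by the sketch
below. The function name effective_cost_limit() and its parameters are
hypothetical; in the patch the inputs come from the table options, the
autovacuum_vacuum_cost_limit and vacuum_cost_limit GUCs, and the shared
nworkers_for_balance counter. Max and Min are defined locally here to
keep the example standalone.

```c
#include <assert.h>

#define Max(a, b) ((a) > (b) ? (a) : (b))
#define Min(a, b) ((a) < (b) ? (a) : (b))

/*
 * Hypothetical helper mirroring the arithmetic in the patch:
 *   - a positive table-level limit is used as-is;
 *   - otherwise the autovacuum limit (or, if unset, the plain vacuum
 *     limit) is divided evenly among the balanced workers;
 *   - the result is clamped to the range [1, base limit].
 */
int
effective_cost_limit(int table_option_limit, int autovac_limit,
					 int vacuum_limit, int nworkers_for_balance)
{
	int			base;
	int			balanced;

	/* The shared counter is initialized to 1 and never set lower. */
	assert(nworkers_for_balance >= 1);

	if (table_option_limit > 0)
		return table_option_limit;

	base = autovac_limit > 0 ? autovac_limit : vacuum_limit;
	balanced = base / nworkers_for_balance;

	return Max(Min(balanced, base), 1);
}
```

For example, with no table option, autovacuum_vacuum_cost_limit left at
-1, vacuum_cost_limit at 200 and three balanced workers, each worker
gets 200 / 3 = 66 under integer division.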

- Melanie

Attachments:

v6-0001-auto-vacuum-reloads-config-file-more-often.patch (text/x-patch; charset=US-ASCII)
From 1218c1852794a1310d25359a37b87d068282500e Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 18 Mar 2023 15:42:39 -0400
Subject: [PATCH v6] [auto]vacuum reloads config file more often

Previously, VACUUM and autovacuum workers would reload the configuration
file only between vacuuming tables. This precluded user updates to
cost-based delay parameters from taking effect while vacuuming a table.

Check if a reload is pending roughly once per block now, when checking
if we need to delay.

In order for this change to have the intended effect on autovacuum,
autovacuum workers must start updating their cost delay more frequently
as well.

With this new paradigm, balancing the cost limit amongst workers also
must work differently. Previously, a worker's wi_cost_limit was set only
at the beginning of vacuuming a table, after reloading the config file.
Therefore, at the time that autovac_balance_cost() is called, workers
vacuuming tables with no table options could still have different values
for their wi_cost_limit_base and wi_cost_delay. With this change,
workers will (within some margin of error) have no reason to have
different values for cost limit and cost delay (in the absence of table
options). Thus, remove cost limit and cost delay from shared memory and
keep track only of the number of workers actively vacuuming tables with
no cost-related table options. Then, use this value to determine what
each worker's effective cost limit should be.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_buP5wzsho3qNw5o9_R0pF69FRM5hgCmr-mvXmGXwdA7A%40mail.gmail.com#5e6771d4cdca4db6efc2acec2dce0bc7
---
 src/backend/access/heap/vacuumlazy.c  |   1 +
 src/backend/commands/vacuum.c         |  55 +++++--
 src/backend/commands/vacuumparallel.c |   1 +
 src/backend/postmaster/autovacuum.c   | 209 +++++++++++---------------
 src/backend/utils/init/globals.c      |   1 +
 src/include/miscadmin.h               |   1 +
 src/include/postmaster/autovacuum.h   |   2 +
 7 files changed, 142 insertions(+), 128 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8f14cf85f3..38dce9ae66 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2622,6 +2622,7 @@ lazy_check_wraparound_failsafe(LVRelState *vacrel)
 	if (unlikely(vacuum_xid_failsafe_check(&vacrel->cutoffs)))
 	{
 		vacrel->failsafe_active = true;
+		VacuumFailsafeActive = true;
 
 		/* Disable index vacuuming, index cleanup, and heap rel truncation */
 		vacrel->do_index_vacuuming = false;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index c54360a6a0..a5eb22c5ca 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -48,6 +48,7 @@
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/interrupt.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/pmsignal.h"
@@ -76,6 +77,7 @@ int			vacuum_multixact_failsafe_age;
 /* A few variables that don't seem worth passing around as parameters */
 static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
+static bool analyze_in_outer_xact = false;
 
 
 /*
@@ -314,8 +316,7 @@ vacuum(List *relations, VacuumParams *params,
 	static bool in_vacuum = false;
 
 	const char *stmttype;
-	volatile bool in_outer_xact,
-				use_own_xacts;
+	volatile bool use_own_xacts;
 
 	Assert(params != NULL);
 
@@ -332,10 +333,10 @@ vacuum(List *relations, VacuumParams *params,
 	if (params->options & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
-		in_outer_xact = false;
+		analyze_in_outer_xact = false;
 	}
 	else
-		in_outer_xact = IsInTransactionBlock(isTopLevel);
+		analyze_in_outer_xact = IsInTransactionBlock(isTopLevel);
 
 	/*
 	 * Due to static variables vac_context, anl_context and vac_strategy,
@@ -457,7 +458,7 @@ vacuum(List *relations, VacuumParams *params,
 		Assert(params->options & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
-		else if (in_outer_xact)
+		else if (analyze_in_outer_xact)
 			use_own_xacts = false;
 		else if (list_length(relations) > 1)
 			use_own_xacts = true;
@@ -475,7 +476,7 @@ vacuum(List *relations, VacuumParams *params,
 	 */
 	if (use_own_xacts)
 	{
-		Assert(!in_outer_xact);
+		Assert(!analyze_in_outer_xact);
 
 		/* ActiveSnapshot is not set by autovacuum */
 		if (ActiveSnapshotSet())
@@ -492,6 +493,7 @@ vacuum(List *relations, VacuumParams *params,
 
 		in_vacuum = true;
 		VacuumCostActive = (VacuumCostDelay > 0);
+		VacuumFailsafeActive = false;
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
@@ -527,7 +529,7 @@ vacuum(List *relations, VacuumParams *params,
 				}
 
 				analyze_rel(vrel->oid, vrel->relation, params,
-							vrel->va_cols, in_outer_xact, vac_strategy);
+							vrel->va_cols, analyze_in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
 				{
@@ -550,6 +552,7 @@ vacuum(List *relations, VacuumParams *params,
 	{
 		in_vacuum = false;
 		VacuumCostActive = false;
+		analyze_in_outer_xact = false;
 	}
 	PG_END_TRY();
 
@@ -1850,6 +1853,8 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params, bool skip_privs)
 
 	Assert(params != NULL);
 
+	VacuumFailsafeActive = false;
+
 	/* Begin a transaction for vacuuming this relation */
 	StartTransactionCommand();
 
@@ -2215,9 +2220,33 @@ vacuum_delay_point(void)
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	if (!VacuumCostActive || InterruptPending)
+	if (InterruptPending || VacuumFailsafeActive ||
+		(!VacuumCostActive && !ConfigReloadPending))
 		return;
 
+	/*
+	 * Reload the configuration file if requested. This allows changes to
+	 * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay to
+	 * take effect while a table is being vacuumed or analyzed.
+	 */
+	if (ConfigReloadPending && !analyze_in_outer_xact)
+	{
+		ConfigReloadPending = false;
+		ProcessConfigFile(PGC_SIGHUP);
+		AutoVacuumUpdateDelay();
+		AutoVacuumUpdateLimit();
+	}
+
+	/*
+	 * If we disabled cost-based delays after reloading the config file,
+	 * return
+	 */
+	if (!VacuumCostActive)
+	{
+		VacuumCostBalance = 0;
+		return;
+	}
+
 	/*
 	 * For parallel vacuum, the delay is computed based on the shared cost
 	 * balance.  See compute_parallel_delay.
@@ -2248,8 +2277,14 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		/* update balance values for workers */
-		AutoVacuumUpdateDelay();
+		/*
+		 * Update limit values for workers. We must always do this in case the
+		 * autovacuum launcher or another autovacuum worker has recalculated
+		 * the number of workers across which we must balance the limit. This
+		 * is done by the launcher when launching a new worker and by workers
+		 * before vacuuming each table.
+		 */
+		AutoVacuumUpdateLimit();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index bcd40c80a1..57188500d0 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -990,6 +990,7 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												 false);
 
 	/* Set cost-based vacuum delay */
+	VacuumFailsafeActive = false;
 	VacuumCostActive = (VacuumCostDelay > 0);
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c0e2e00a7e..1033e6db62 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -139,6 +139,9 @@ int			Log_autovacuum_min_duration = 600000;
 static bool am_autovacuum_launcher = false;
 static bool am_autovacuum_worker = false;
 
+static double av_table_option_cost_delay = -1;
+static int	av_table_option_cost_limit = 0;
+
 /* Flags set by signal handlers */
 static volatile sig_atomic_t got_SIGUSR2 = false;
 
@@ -189,8 +192,8 @@ typedef struct autovac_table
 {
 	Oid			at_relid;
 	VacuumParams at_params;
-	double		at_vacuum_cost_delay;
-	int			at_vacuum_cost_limit;
+	double		at_table_option_vac_cost_delay;
+	int			at_table_option_vac_cost_limit;
 	bool		at_dobalance;
 	bool		at_sharedrel;
 	char	   *at_relname;
@@ -209,7 +212,7 @@ typedef struct autovac_table
  * wi_sharedrel flag indicating whether table is marked relisshared
  * wi_proc		pointer to PGPROC of the running worker, NULL if not started
  * wi_launchtime Time at which this worker was launched
- * wi_cost_*	Vacuum cost-based delay parameters current in this worker
+ * wi_dobalance Whether this worker should be included in balance calculations
  *
  * All fields are protected by AutovacuumLock, except for wi_tableoid and
  * wi_sharedrel which are protected by AutovacuumScheduleLock (note these
@@ -225,9 +228,6 @@ typedef struct WorkerInfoData
 	TimestampTz wi_launchtime;
 	bool		wi_dobalance;
 	bool		wi_sharedrel;
-	double		wi_cost_delay;
-	int			wi_cost_limit;
-	int			wi_cost_limit_base;
 } WorkerInfoData;
 
 typedef struct WorkerInfoData *WorkerInfo;
@@ -286,6 +286,7 @@ typedef struct
 	dlist_head	av_runningWorkers;
 	WorkerInfo	av_startingWorker;
 	AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+	pg_atomic_uint32 nworkers_for_balance;
 } AutoVacuumShmemStruct;
 
 static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -820,7 +821,7 @@ HandleAutoVacLauncherInterrupts(void)
 			AutoVacLauncherShutdown();
 
 		/* rebalance in case the default cost parameters changed */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
 		autovac_balance_cost();
 		LWLockRelease(AutovacuumLock);
 
@@ -1756,9 +1757,6 @@ FreeWorkerInfo(int code, Datum arg)
 		MyWorkerInfo->wi_proc = NULL;
 		MyWorkerInfo->wi_launchtime = 0;
 		MyWorkerInfo->wi_dobalance = false;
-		MyWorkerInfo->wi_cost_delay = 0;
-		MyWorkerInfo->wi_cost_limit = 0;
-		MyWorkerInfo->wi_cost_limit_base = 0;
 		dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 						&MyWorkerInfo->wi_links);
 		/* not mine anymore */
@@ -1774,99 +1772,98 @@ FreeWorkerInfo(int code, Datum arg)
 }
 
 /*
- * Update the cost-based delay parameters, so that multiple workers consume
- * each a fraction of the total available I/O.
+ * Update VacuumCostDelay with the correct value for an autovacuum worker,
+ * given the value of other relevant cost-based delay parameters. Autovacuum
+ * workers should call this after every config reload, in case VacuumCostDelay
+ * was overwritten.
  */
 void
 AutoVacuumUpdateDelay(void)
 {
-	if (MyWorkerInfo)
+	if (!am_autovacuum_worker)
+		return;
+
+	if (av_table_option_cost_delay >= 0)
+		VacuumCostDelay = av_table_option_cost_delay;
+	else
+		VacuumCostDelay = autovacuum_vac_cost_delay >= 0 ?
+			autovacuum_vac_cost_delay : VacuumCostDelay;
+
+	if (!VacuumFailsafeActive)
+		VacuumCostActive = (VacuumCostDelay > 0);
+}
+
+/*
+ * Update VacuumCostLimit with the correct value for an autovacuum worker,
+ * given the value of other relevant cost limit parameters and the number of
+ * workers across which the limit must be balanced. Autovacuum workers must
+ * call this regularly in case nworkers_for_balance has been updated by another
+ * worker or by the autovacuum launcher. They also must call this after every
+ * config reload, in case VacuumCostLimit was overwritten.
+ */
+void
+AutoVacuumUpdateLimit(void)
+{
+	if (!am_autovacuum_worker)
+		return;
+
+	/*
+	 * note: in cost_limit, zero also means use value from elsewhere, because
+	 * zero is not a valid value.
+	 */
+	if (av_table_option_cost_limit > 0)
+		VacuumCostLimit = av_table_option_cost_limit;
+	else
 	{
-		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
-		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+		int			vac_cost_limit = autovacuum_vac_cost_limit > 0 ?
+		autovacuum_vac_cost_limit : VacuumCostLimit;
+
+		int			balanced_cost_limit = vac_cost_limit /
+		pg_atomic_read_u32(&AutoVacuumShmem->nworkers_for_balance);
+
+		VacuumCostLimit = Max(Min(balanced_cost_limit, vac_cost_limit), 1);
 	}
 }
 
 /*
  * autovac_balance_cost
- *		Recalculate the cost limit setting for each active worker.
+ *		Recalculate the number of workers to consider, given table options and
+ *		the current number of active workers.
  *
- * Caller must hold the AutovacuumLock in exclusive mode.
+ * Caller must hold the AutovacuumLock in at least shared mode.
  */
 static void
 autovac_balance_cost(void)
 {
-	/*
-	 * The idea here is that we ration out I/O equally.  The amount of I/O
-	 * that a worker can consume is determined by cost_limit/cost_delay, so we
-	 * try to equalize those ratios rather than the raw limit settings.
-	 *
-	 * note: in cost_limit, zero also means use value from elsewhere, because
-	 * zero is not a valid value.
-	 */
-	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : VacuumCostLimit);
-	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : VacuumCostDelay);
-	double		cost_total;
-	double		cost_avail;
 	dlist_iter	iter;
+	int			orig_nworkers_for_balance;
+	int			nworkers_for_balance = 0;
 
-	/* not set? nothing to do */
-	if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
+	if (autovacuum_vac_cost_delay == 0 ||
+		(autovacuum_vac_cost_delay == -1 && VacuumCostDelay == 0))
 		return;
 
-	/* calculate the total base cost limit of participating active workers */
-	cost_total = 0.0;
-	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
-	{
-		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
-
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-			cost_total +=
-				(double) worker->wi_cost_limit_base / worker->wi_cost_delay;
-	}
-
-	/* there are no cost limits -- nothing to do */
-	if (cost_total <= 0)
+	if (autovacuum_vac_cost_limit <= 0 && VacuumCostLimit <= 0)
 		return;
 
-	/*
-	 * Adjust cost limit of each active worker to balance the total of cost
-	 * limit to autovacuum_vacuum_cost_limit.
-	 */
-	cost_avail = (double) vac_cost_limit / vac_cost_delay;
+	orig_nworkers_for_balance =
+		pg_atomic_read_u32(&AutoVacuumShmem->nworkers_for_balance);
+
 	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
 	{
 		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-		{
-			int			limit = (int)
-			(cost_avail * worker->wi_cost_limit_base / cost_total);
-
-			/*
-			 * We put a lower bound of 1 on the cost_limit, to avoid division-
-			 * by-zero in the vacuum code.  Also, in case of roundoff trouble
-			 * in these calculations, let's be sure we don't ever set
-			 * cost_limit to more than the base value.
-			 */
-			worker->wi_cost_limit = Max(Min(limit,
-											worker->wi_cost_limit_base),
-										1);
-		}
+		if (worker->wi_proc == NULL || !worker->wi_dobalance)
+			continue;
 
-		if (worker->wi_proc != NULL)
-			elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
-				 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
-				 worker->wi_dobalance ? "yes" : "no",
-				 worker->wi_cost_limit, worker->wi_cost_limit_base,
-				 worker->wi_cost_delay);
+		nworkers_for_balance++;
 	}
+
+	nworkers_for_balance = Max(nworkers_for_balance, 1);
+
+	if (nworkers_for_balance != orig_nworkers_for_balance)
+		pg_atomic_write_u32(&AutoVacuumShmem->nworkers_for_balance,
+							nworkers_for_balance);
 }
 
 /*
@@ -2312,14 +2309,15 @@ do_autovacuum(void)
 		autovac_table *tab;
 		bool		isshared;
 		bool		skipit;
-		double		stdVacuumCostDelay;
-		int			stdVacuumCostLimit;
 		dlist_iter	iter;
 
 		CHECK_FOR_INTERRUPTS();
 
 		/*
 		 * Check for config changes before processing each collected table.
+		 * Autovacuum workers must update VacuumCostDelay and VacuumCostLimit
+		 * in case they were overridden by the reload. However, we will do
+		 * this as soon as we check table options a bit later.
 		 */
 		if (ConfigReloadPending)
 		{
@@ -2416,32 +2414,18 @@ do_autovacuum(void)
 			continue;
 		}
 
-		/*
-		 * Remember the prevailing values of the vacuum cost GUCs.  We have to
-		 * restore these at the bottom of the loop, else we'll compute wrong
-		 * values in the next iteration of autovac_balance_cost().
-		 */
-		stdVacuumCostDelay = VacuumCostDelay;
-		stdVacuumCostLimit = VacuumCostLimit;
+		av_table_option_cost_limit = tab->at_table_option_vac_cost_limit;
+		av_table_option_cost_delay = tab->at_table_option_vac_cost_delay;
 
 		/* Must hold AutovacuumLock while mucking with cost balance info */
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-
-		/* advertise my cost delay parameters for the balancing algorithm */
 		MyWorkerInfo->wi_dobalance = tab->at_dobalance;
-		MyWorkerInfo->wi_cost_delay = tab->at_vacuum_cost_delay;
-		MyWorkerInfo->wi_cost_limit = tab->at_vacuum_cost_limit;
-		MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
-
-		/* do a balance */
 		autovac_balance_cost();
+		LWLockRelease(AutovacuumLock);
 
-		/* set the active cost parameters from the result of that */
+		AutoVacuumUpdateLimit();
 		AutoVacuumUpdateDelay();
 
-		/* done */
-		LWLockRelease(AutovacuumLock);
-
 		/* clean up memory before each iteration */
 		MemoryContextResetAndDeleteChildren(PortalContext);
 
@@ -2534,10 +2518,6 @@ deleted:
 		MyWorkerInfo->wi_tableoid = InvalidOid;
 		MyWorkerInfo->wi_sharedrel = false;
 		LWLockRelease(AutovacuumScheduleLock);
-
-		/* restore vacuum cost GUCs for the next iteration */
-		VacuumCostDelay = stdVacuumCostDelay;
-		VacuumCostLimit = stdVacuumCostLimit;
 	}
 
 	/*
@@ -2569,6 +2549,8 @@ deleted:
 		{
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			AutoVacuumUpdateDelay();
+			AutoVacuumUpdateLimit();
 		}
 
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
@@ -2801,8 +2783,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		int			freeze_table_age;
 		int			multixact_freeze_min_age;
 		int			multixact_freeze_table_age;
-		int			vac_cost_limit;
-		double		vac_cost_delay;
 		int			log_min_duration;
 
 		/*
@@ -2812,20 +2792,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 * defaults, autovacuum's own first and plain vacuum second.
 		 */
 
-		/* -1 in autovac setting means use plain vacuum_cost_delay */
-		vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
-			? avopts->vacuum_cost_delay
-			: (autovacuum_vac_cost_delay >= 0)
-			? autovacuum_vac_cost_delay
-			: VacuumCostDelay;
-
-		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
-		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
-			? avopts->vacuum_cost_limit
-			: (autovacuum_vac_cost_limit > 0)
-			? autovacuum_vac_cost_limit
-			: VacuumCostLimit;
-
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
 			? avopts->log_min_duration
@@ -2881,8 +2847,10 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age;
 		tab->at_params.is_wraparound = wraparound;
 		tab->at_params.log_min_duration = log_min_duration;
-		tab->at_vacuum_cost_limit = vac_cost_limit;
-		tab->at_vacuum_cost_delay = vac_cost_delay;
+		tab->at_table_option_vac_cost_limit = avopts ?
+			avopts->vacuum_cost_limit : 0;
+		tab->at_table_option_vac_cost_delay = avopts ?
+			avopts->vacuum_cost_delay : -1;
 		tab->at_relname = NULL;
 		tab->at_nspname = NULL;
 		tab->at_datname = NULL;
@@ -3374,10 +3342,15 @@ AutoVacuumShmemInit(void)
 		worker = (WorkerInfo) ((char *) AutoVacuumShmem +
 							   MAXALIGN(sizeof(AutoVacuumShmemStruct)));
 
+
 		/* initialize the WorkerInfo free list */
 		for (i = 0; i < autovacuum_max_workers; i++)
 			dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 							&worker[i].wi_links);
+
+		/* initialize to 1, as it should be a minimum of 1 */
+		pg_atomic_init_u32(&AutoVacuumShmem->nworkers_for_balance, 1);
+
 	}
 	else
 		Assert(found);
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 1b1d814254..aeb8ed0e46 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -151,3 +151,4 @@ int64		VacuumPageDirty = 0;
 
 int			VacuumCostBalance = 0;	/* working state for vacuum */
 bool		VacuumCostActive = false;
+bool		VacuumFailsafeActive = false;
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 06a86f9ac1..b1297677d3 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -275,6 +275,7 @@ extern PGDLLIMPORT int64 VacuumPageDirty;
 
 extern PGDLLIMPORT int VacuumCostBalance;
 extern PGDLLIMPORT bool VacuumCostActive;
+extern PGDLLIMPORT bool VacuumFailsafeActive;
 
 
 /* in tcop/postgres.c */
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index c140371b51..558358911c 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -66,6 +66,8 @@ extern void AutoVacWorkerFailed(void);
 /* autovacuum cost-delay balancer */
 extern void AutoVacuumUpdateDelay(void);
 
+extern void AutoVacuumUpdateLimit(void);
+
 #ifdef EXEC_BACKEND
 extern void AutoVacLauncherMain(int argc, char *argv[]) pg_attribute_noreturn();
 extern void AutoVacWorkerMain(int argc, char *argv[]) pg_attribute_noreturn();
-- 
2.37.2

#25Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Melanie Plageman (#23)
Re: Should vacuum process config file reload more often

On Sun, Mar 19, 2023 at 7:47 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Wed, Mar 15, 2023 at 1:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sat, Mar 11, 2023 at 8:11 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

I've implemented the atomic cost limit in the attached patch. Though,
I'm pretty unsure about how I initialized the atomics in
AutoVacuumShmemInit()...

+
/* initialize the WorkerInfo free list */
for (i = 0; i < autovacuum_max_workers; i++)
dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
&worker[i].wi_links);
+
+                dlist_foreach(iter, &AutoVacuumShmem->av_freeWorkers)
+                        pg_atomic_init_u32(
+                                &(dlist_container(WorkerInfoData, wi_links, iter.cur))->wi_cost_limit,
+                                0);
+

I think we can do it like this:

/* initialize the WorkerInfo free list */
for (i = 0; i < autovacuum_max_workers; i++)
{
dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
&worker[i].wi_links);
pg_atomic_init_u32(&(worker[i].wi_cost_limit), 0);
}

Ah, yes, I was distracted by the variable name "worker" (as opposed to
"workers").

If the consensus is that it is simply too confusing to take
wi_cost_delay out of WorkerInfo, we might be able to afford using a
shared lock to access it because we won't call AutoVacuumUpdateDelay()
on every invocation of vacuum_delay_point() -- only when we've reloaded
the config file.

One potential option to avoid taking a shared lock on every call to
AutoVacuumUpdateDelay() is to set a global variable to indicate that we
did update it (since we are the only ones updating it) and then only
take the shared LWLock in AutoVacuumUpdateDelay() if that flag is true.

If we remove wi_cost_delay from WorkerInfo, probably we don't need to
acquire the lwlock in AutoVacuumUpdateDelay()? The shared field we
access in that function will be only wi_dobalance, but this field is
updated only by its owner autovacuum worker.

I realized that we cannot use dobalance to decide whether or not to
update wi_cost_delay because dobalance could be false because of table
option cost limit being set (with no table option cost delay) and we
would still need to update VacuumCostDelay and wi_cost_delay with the
new value of autovacuum_vacuum_cost_delay.

But v5 skirts around this issue altogether.

---
void
AutoVacuumUpdateDelay(void)
{
-        if (MyWorkerInfo)
+        /*
+         * We are using autovacuum-related GUCs to update VacuumCostDelay, so we
+         * only want autovacuum workers and autovacuum launcher to do this.
+         */
+        if (!(am_autovacuum_worker || am_autovacuum_launcher))
+                return;

Is there any case where the autovacuum launcher calls
AutoVacuumUpdateDelay() function?

I had meant to add it to HandleAutoVacLauncherInterrupts() after
reloading the config file (done in attached patch). When using the
global variables for cost delay (instead of wi_cost_delay in worker
info), the autovac launcher also has to do the check in the else branch
of AutoVacuumUpdateDelay()

VacuumCostDelay = autovacuum_vac_cost_delay >= 0 ?
autovacuum_vac_cost_delay : VacuumCostDelay;

to make sure VacuumCostDelay is correct for when it calls
autovac_balance_cost().

But doesn't the launcher do a similar thing at the beginning of
autovac_balance_cost()?

double vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
autovacuum_vac_cost_delay : VacuumCostDelay);

Ah, yes. You are right.

Related to this point, I think autovac_balance_cost() should use
globally-set cost_limit and cost_delay values to calculate worker's
vacuum-delay parameters. IOW, vac_cost_limit and vac_cost_delay should
come from the config file setting, not table option etc:

int vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
autovacuum_vac_cost_limit : VacuumCostLimit);
double vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
autovacuum_vac_cost_delay : VacuumCostDelay);

If my understanding is right, the following change is not right;
AutoVacuumUpdateLimit() updates the VacuumCostLimit based on the value in
MyWorkerInfo:

MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
+ AutoVacuumUpdateLimit();

/* do a balance */
autovac_balance_cost();

- /* set the active cost parameters from the result of that */
- AutoVacuumUpdateDelay();

Also, even when using the global variables for cost delay, the
launcher doesn't need to check the global variable. It should always
be able to use either autovacuum_vac_cost_delay/limit or
VacuumCostDelay/Limit.

Yes, that is true. But, I actually think we can do something more
radical, which relates to this point as well as the issue with
cost_limit_base below.

This also made me think about whether or not we still need cost_limit_base.
It is used to ensure that autovac_balance_cost() never ends up setting
workers' wi_cost_limits above the current autovacuum_vacuum_cost_limit
(or VacuumCostLimit). However, the launcher and all the workers should
know what the value is without cost_limit_base, no?

Yeah, the current balancing algorithm looks to respect the cost_limit
value set when starting to vacuum the table. The proportion of the
amount of I/O that a worker can consume is calculated based on the
base value and the new worker's cost_limit value cannot exceed the
base value. Given that we're trying to dynamically tune worker's cost
parameters (delay and limit), this concept seems to need to be
updated.

In master, autovacuum workers reload the config file at most once per
table vacuumed. And that is the same time that they update their
wi_cost_limit_base and wi_cost_delay. Thus, when autovac_balance_cost()
is called, there is a good chance that different workers will have
different values for wi_cost_limit_base and wi_cost_delay (and we are
only talking about workers not vacuuming a table with table option
cost-related gucs). So, it made sense that the balancing algorithm tried
to use a ratio to determine what to set the cost limit of each worker
to. It is clamped to the base value, as you say, but it also gives
workers a proportion of the new limit equal to what proportion their base
cost represents of the total cost.
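
For reference, the ratio-based scheme described above can be sketched in standalone C. This is only an illustration under stated assumptions: WorkerSketch and balance_cost_sketch() are hypothetical names, a plain array stands in for the shared running-worker list, and all locking is omitted. The arithmetic follows the lines that the patch removes from autovac_balance_cost().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the cost fields of one autovacuum worker. */
typedef struct WorkerSketch
{
	int			dobalance;			/* does this worker take part in balancing? */
	int			cost_limit_base;	/* limit in effect when its table vacuum began */
	double		cost_delay;			/* delay in effect when its table vacuum began */
	int			cost_limit;			/* output: balanced limit */
} WorkerSketch;

static void
balance_cost_sketch(WorkerSketch *workers, size_t nworkers,
					int vac_cost_limit, double vac_cost_delay)
{
	double		cost_total = 0.0;
	double		cost_avail;
	size_t		i;

	assert(workers != NULL || nworkers == 0);

	/* Assumption: with no global limit or delay there is nothing to balance. */
	if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
		return;

	/* Sum each eligible worker's I/O rate (base limit per unit of delay). */
	for (i = 0; i < nworkers; i++)
	{
		if (workers[i].dobalance &&
			workers[i].cost_limit_base > 0 && workers[i].cost_delay > 0)
			cost_total += (double) workers[i].cost_limit_base / workers[i].cost_delay;
	}

	/* No eligible workers: nothing to do. */
	if (cost_total <= 0)
		return;

	/* The globally allowed I/O rate, shared out in proportion to each base. */
	cost_avail = (double) vac_cost_limit / vac_cost_delay;

	for (i = 0; i < nworkers; i++)
	{
		int			limit;

		if (!(workers[i].dobalance &&
			  workers[i].cost_limit_base > 0 && workers[i].cost_delay > 0))
			continue;

		limit = (int) (cost_avail * workers[i].cost_limit_base / cost_total);

		/* Clamp to [1, base]: never zero, never above the worker's own base. */
		if (limit > workers[i].cost_limit_base)
			limit = workers[i].cost_limit_base;
		if (limit < 1)
			limit = 1;
		workers[i].cost_limit = limit;
	}
}
```

With two balanced workers that both started at a base limit of 200 and a delay of 2ms, and a global setting of 200 and 2ms, each worker ends up with a limit of 100. A worker with dobalance unset is left untouched.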

I think all of this doesn't matter anymore now that everyone can reload
the config file often and dynamically change these values.

Thus, in the attached v5, I have removed both wi_cost_limit and wi_cost_delay
from WorkerInfo. I've added a new variable to AutoVacuumShmem called
nworkers_for_balance. Now, autovac_balance_cost() only recalculates this
number and updates it if it has changed. Then, in
AutoVacuumUpdateLimit() workers read from this atomic value and divide
the value of the cost limit gucs by that number to get their own cost limit.
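
As a rough sketch of the idea in the paragraph above, the following standalone C uses C11 atomics in place of PostgreSQL's pg_atomic_uint32. The helper names publish_nworkers() and worker_cost_limit() are hypothetical, and the lower bound of 1 on the resulting limit is an assumption carried over from the old balancing code rather than something stated for v5.

```c
#include <assert.h>
#include <stdatomic.h>

/* Stand-in for the shared counter; PostgreSQL would use pg_atomic_uint32. */
static atomic_uint nworkers_for_balance = 1;

/* Hypothetical helper: publish a freshly counted number of balanced workers. */
static void
publish_nworkers(unsigned int n)
{
	/* Keep the counter at 1 or more so readers can always divide by it. */
	if (n < 1)
		n = 1;

	/* Only write when the value changed, as the patch does. */
	if (atomic_load(&nworkers_for_balance) != n)
		atomic_store(&nworkers_for_balance, n);
}

/* Hypothetical helper: a worker derives its own share of the GUC limit. */
static int
worker_cost_limit(int guc_cost_limit)
{
	unsigned int n = atomic_load(&nworkers_for_balance);
	int			limit;

	assert(n >= 1);
	limit = guc_cost_limit / (int) n;

	/* Assumption: never hand out a limit below 1. */
	return limit < 1 ? 1 : limit;
}
```

Readers need no lock here because the only shared state is a single 32-bit counter. Each worker combines it with its own backend-local view of the GUCs.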

I keep the table option value of cost limit and cost delay in
backend-local memory to reference when updating the worker cost limit.

One nice thing is autovac_balance_cost() only requires an access shared
lock now (though most callers are updating other members before calling
it and still take an exclusive lock).

What do you think?

I think this is a good idea.

Do we need to calculate the number of workers running with
nworkers_for_balance by iterating over the running worker list? I
guess autovacuum workers can increment/decrement it at the beginning
and end of vacuum.

Also not sure how the patch interacts with failsafe autovac and parallel
vacuum.

Good point.

When entering the failsafe mode, we disable the vacuum delays (see
lazy_check_wraparound_failsafe()). We need to keep disabling the
vacuum delays even after reloading the config file. One idea is to
have another global variable indicating we're in the failsafe mode.
vacuum_delay_point() doesn't update VacuumCostActive if the flag is
true.

I think we might not need to do this. Other than in
lazy_check_wraparound_failsafe(), VacuumCostActive is only updated in
two places:

1) in vacuum() which autovacuum will call per table. And failsafe is
reset per table as well.

2) in vacuum_delay_point(), but, since VacuumCostActive will already be
false when we enter vacuum_delay_point() the next time after
lazy_check_wraparound_failsafe(), we won't set VacuumCostActive there.

Indeed. But does it mean that there is no code path to turn
vacuum-delay on, even when vacuum_cost_delay is updated from 0 to
non-0?

Ah yes! Good point. This is true.
I'm not sure how to cheaply allow for re-enabling delays after disabling
them in the middle of a table vacuum.

I don't see a way around checking if we need to reload the config file
on every call to vacuum_delay_point() (currently, we are only doing this
when we have to wait anyway). It seems expensive to do this check every
time. If we do do this, we would update VacuumCostActive when updating
VacuumCostDelay, and we would need a global variable keeping the
failsafe status, as you mentioned.

It could be okay to say that you can only disable cost-based delays in
the middle of vacuuming a table (i.e. you cannot enable them if they are
already disabled until you start vacuuming the next table). Though maybe
it is weird that you can increase the delay but not re-enable it...

On Mon, Mar 20, 2023 at 1:48 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

So, I thought about it some more, and I think it is a bit odd that you
can increase the delay and limit but not re-enable them if they were
disabled. And, perhaps it would be okay to check ConfigReloadPending at
the top of vacuum_delay_point() instead of only after sleeping. It is
just one more branch. We can check if VacuumCostActive is false after
checking if we should reload and doing so if needed and return early.
I've implemented that in attached v6.

I added in the global we discussed for VacuumFailsafeActive. If we keep
it, we can probably remove the one in LVRelState -- as it seems
redundant. Let me know what you think.

I think the following change is related:

-        if (!VacuumCostActive || InterruptPending)
+        if (InterruptPending || VacuumFailsafeActive ||
+                (!VacuumCostActive && !ConfigReloadPending))
                 return;
+        /*
+         * Reload the configuration file if requested. This allows changes to
+         * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay to
+         * take effect while a table is being vacuumed or analyzed.
+         */
+        if (ConfigReloadPending && !analyze_in_outer_xact)
+        {
+                ConfigReloadPending = false;
+                ProcessConfigFile(PGC_SIGHUP);
+                AutoVacuumUpdateDelay();
+                AutoVacuumUpdateLimit();
+        }

It makes sense to me that we need to reload the config file even when
vacuum-delay is disabled. But I think it's not convenient for users
that we don't reload the configuration file once the failsafe is
triggered. I think users might want to change some GUCs such as
log_autovacuum_min_duration.
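
One way to read this suggestion is sketched below in standalone C: reload first whenever a reload is pending, and only then decide whether to skip the delay because the failsafe is active or cost-based delay is off. This is not the code from any version of the patch. The globals are local stand-ins that reuse the names from the thread, reload_config_sketch() and pending_cost_delay are hypothetical, and the analyze_in_outer_xact check is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the backend globals named in the thread. */
static bool InterruptPending = false;
static bool ConfigReloadPending = false;
static bool VacuumFailsafeActive = false;
static bool VacuumCostActive = false;
static double VacuumCostDelay = 0.0;

/* Hypothetical: the delay value currently written in the config file. */
static double pending_cost_delay = 0.0;
static int	reload_count = 0;

/* Hypothetical stand-in for ProcessConfigFile() plus the cost updates. */
static void
reload_config_sketch(void)
{
	VacuumCostDelay = pending_cost_delay;
	reload_count++;
}

/* Returns true if the caller would go on to do cost accounting and sleep. */
static bool
delay_point_sketch(void)
{
	if (InterruptPending)
		return false;

	/* Reload even if delays are off or the failsafe has triggered. */
	if (ConfigReloadPending)
	{
		ConfigReloadPending = false;
		reload_config_sketch();

		/* The failsafe keeps delays disabled whatever the new setting is. */
		VacuumCostActive = !VacuumFailsafeActive && VacuumCostDelay > 0;
	}

	assert(!(VacuumFailsafeActive && VacuumCostActive));

	if (!VacuumCostActive)
		return false;

	return true;
}
```

With this ordering, other GUCs changed by the reload take effect during a failsafe vacuum, while the cost delay itself stays off until the failsafe flag is cleared.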

On an unrelated note, I was wondering if there were any docs anywhere
that should be updated to go along with this.

The current patch improves the internal mechanism of (re)balancing
vacuum-cost but doesn't change user-visible behavior. I don't have any
idea so far that we should update somewhere in the doc.

And, I was wondering if it was worth trying to split up the part that
reloads the config file and all of the autovacuum stuff. The reloading
of the config file by itself won't actually result in autovacuum workers
having updated cost delays because of them overwriting it with
wi_cost_delay, but it will allow VACUUM to have those updated values.

It makes sense to me to have changes for overhauling the rebalance
mechanism in a separate patch.

Looking back at the original concern you mentioned[1]:

speed up long-running vacuum of a large table by
decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
config file is only reloaded between tables (for autovacuum) or after
the statement (for explicit vacuum).

does it make sense to have autovac_balance_cost() update workers'
wi_cost_delay too? Autovacuum launcher already reloads the config file
and does the rebalance. So I thought autovac_balance_cost() can update
the cost_delay as well, and this might be a minimal change to deal
with your concern. This doesn't have the effect for manual VACUUM but
since vacuum delay is disabled by default it won't be a big problem.
As for manual VACUUMs, we would need to reload the config file in
vacuum_delay_point() as the part of your patch does. Overhauling the
rebalance mechanism would be another patch to improve it further.

Regards,

[1]: /messages/by-id/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7+DXhrjDA0gw@mail.gmail.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#26Daniel Gustafsson
daniel@yesql.se
In reply to: Masahiko Sawada (#25)
Re: Should vacuum process config file reload more often

On 23 Mar 2023, at 07:08, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sun, Mar 19, 2023 at 7:47 AM Melanie Plageman <melanieplageman@gmail.com> wrote:

It makes sense to me that we need to reload the config file even when
vacuum-delay is disabled. But I think it's not convenient for users
that we don't reload the configuration file once the failsafe is
triggered. I think users might want to change some GUCs such as
log_autovacuum_min_duration.

I agree with this.

On an unrelated note, I was wondering if there were any docs anywhere
that should be updated to go along with this.

The current patch improves the internal mechanism of (re)balancing
vacuum-cost but doesn't change user-visible behavior. I don't have any
idea so far that we should update somewhere in the doc.

I had a look as well and can't really spot anywhere where the current behavior
is detailed, so there is little to update. On top of that, I also don't think
it's worth adding this to the docs.

And, I was wondering if it was worth trying to split up the part that
reloads the config file and all of the autovacuum stuff. The reloading
of the config file by itself won't actually result in autovacuum workers
having updated cost delays because of them overwriting it with
wi_cost_delay, but it will allow VACUUM to have those updated values.

It makes sense to me to have changes for overhauling the rebalance
mechanism in a separate patch.

It would for sure be worth considering.

+bool VacuumFailsafeActive = false;
This needs documentation, how it's used and how it relates to failsafe_active
in LVRelState (which it might replace(?), but until then).

+ pg_atomic_uint32 nworkers_for_balance;
This needs a short oneline documentation update to the struct comment.

- double wi_cost_delay;
- int wi_cost_limit;
- int wi_cost_limit_base;
This change makes the below comment in do_autovacuum in need of an update:
/*
* Remove my info from shared memory. We could, but intentionally
* don't, clear wi_cost_limit and friends --- this is on the
* assumption that we probably have more to do with similar cost
* settings, so we don't want to give up our share of I/O for a very
* short interval and thereby thrash the global balance.
*/

+   if (av_table_option_cost_delay >= 0)
+       VacuumCostDelay = av_table_option_cost_delay;
+   else
+       VacuumCostDelay = autovacuum_vac_cost_delay >= 0 ?
+           autovacuum_vac_cost_delay : VacuumCostDelay;
While it's a matter of personal preference, I for one would like if we reduced
the number of ternary operators in the vacuum code, especially those mixed into
if statements.  The vacuum code is full of this already though so this is
less of an objection (as it follows style) than an observation.
+    * note: in cost_limit, zero also means use value from elsewhere, because
+    * zero is not a valid value.
...
+       int         vac_cost_limit = autovacuum_vac_cost_limit > 0 ?
+       autovacuum_vac_cost_limit : VacuumCostLimit;
Not mentioning the fact that a magic value in a GUC means it's using the value
from another GUC (which is not great IMHO), it seems we are using zero as well
as -1 as that magic value here?  (not introduced in this patch.) The docs do
AFAICT only specify -1 as that value though.  Am I missing something or is the
code and documentation slightly out of sync?

I need another few readthroughs to figure out if VacuumFailsafeActive does what
I think it does, and should be doing, but in general I think this is a good
idea and a patch in good condition close to being committable.

--
Daniel Gustafsson

#27Melanie Plageman
melanieplageman@gmail.com
In reply to: Daniel Gustafsson (#26)
2 attachment(s)
Re: Should vacuum process config file reload more often

On Thu, Mar 23, 2023 at 2:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sun, Mar 19, 2023 at 7:47 AM Melanie Plageman <melanieplageman@gmail.com> wrote:
Do we need to calculate the number of workers running with
nworkers_for_balance by iterating over the running worker list? I
guess autovacuum workers can increment/decrement it at the beginning
and end of vacuum.

I don't think we can do that because if a worker crashes, we have no way
of knowing if it had incremented or decremented the number, so we can't
adjust for it.
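
To illustrate the difference, here is a minimal standalone sketch of the recount approach. The names WorkerSlotSketch and count_workers_for_balance() are hypothetical, and in_use stands in for the wi_proc != NULL test in the patch. Because the count is derived from the current state of the worker list on every call, it cannot drift the way a counter that each worker increments and decrements could if a worker exited without undoing its increment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry in the running-worker list. */
typedef struct WorkerSlotSketch
{
	bool		in_use;			/* stands in for wi_proc != NULL */
	bool		dobalance;		/* worker takes part in cost balancing */
} WorkerSlotSketch;

/* Recount balanced workers from scratch; the result is never below 1. */
static unsigned int
count_workers_for_balance(const WorkerSlotSketch *slots, size_t nslots)
{
	unsigned int n = 0;
	size_t		i;

	assert(slots != NULL || nslots == 0);

	for (i = 0; i < nslots; i++)
	{
		if (slots[i].in_use && slots[i].dobalance)
			n++;
	}

	return n < 1 ? 1 : n;
}
```

The cost is a walk over at most autovacuum_max_workers entries each time the balance is recomputed.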

Also not sure how the patch interacts with failsafe autovac and parallel
vacuum.

Good point.

When entering the failsafe mode, we disable the vacuum delays (see
lazy_check_wraparound_failsafe()). We need to keep disabling the
vacuum delays even after reloading the config file. One idea is to
have another global variable indicating we're in the failsafe mode.
vacuum_delay_point() doesn't update VacuumCostActive if the flag is
true.

I think we might not need to do this. Other than in
lazy_check_wraparound_failsafe(), VacuumCostActive is only updated in
two places:

1) in vacuum() which autovacuum will call per table. And failsafe is
reset per table as well.

2) in vacuum_delay_point(), but, since VacuumCostActive will already be
false when we enter vacuum_delay_point() the next time after
lazy_check_wraparound_failsafe(), we won't set VacuumCostActive there.

Indeed. But does it mean that there is no code path to turn
vacuum-delay on, even when vacuum_cost_delay is updated from 0 to
non-0?

Ah yes! Good point. This is true.
I'm not sure how to cheaply allow for re-enabling delays after disabling
them in the middle of a table vacuum.

I don't see a way around checking if we need to reload the config file
on every call to vacuum_delay_point() (currently, we are only doing this
when we have to wait anyway). It seems expensive to do this check every
time. If we do do this, we would update VacuumCostActive when updating
VacuumCostDelay, and we would need a global variable keeping the
failsafe status, as you mentioned.

It could be okay to say that you can only disable cost-based delays in
the middle of vacuuming a table (i.e. you cannot enable them if they are
already disabled until you start vacuuming the next table). Though maybe
it is weird that you can increase the delay but not re-enable it...

On Mon, Mar 20, 2023 at 1:48 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

So, I thought about it some more, and I think it is a bit odd that you
can increase the delay and limit but not re-enable them if they were
disabled. And, perhaps it would be okay to check ConfigReloadPending at
the top of vacuum_delay_point() instead of only after sleeping. It is
just one more branch. We can check if VacuumCostActive is false after
checking if we should reload and doing so if needed and return early.
I've implemented that in attached v6.

I added in the global we discussed for VacuumFailsafeActive. If we keep
it, we can probably remove the one in LVRelState -- as it seems
redundant. Let me know what you think.

I think the following change is related:

-        if (!VacuumCostActive || InterruptPending)
+        if (InterruptPending || VacuumFailsafeActive ||
+                (!VacuumCostActive && !ConfigReloadPending))
return;
+        /*
+         * Reload the configuration file if requested. This allows changes to
+         * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay to
+         * take effect while a table is being vacuumed or analyzed.
+         */
+        if (ConfigReloadPending && !analyze_in_outer_xact)
+        {
+                ConfigReloadPending = false;
+                ProcessConfigFile(PGC_SIGHUP);
+                AutoVacuumUpdateDelay();
+                AutoVacuumUpdateLimit();
+        }

It makes sense to me that we need to reload the config file even when
vacuum-delay is disabled. But I think it's not convenient for users
that we don't reload the configuration file once the failsafe is
triggered. I think users might want to change some GUCs such as
log_autovacuum_min_duration.

Ah, okay. Attached v7 has this change (it reloads even if failsafe is
active).

And, I was wondering if it was worth trying to split up the part that
reloads the config file and all of the autovacuum stuff. The reloading
of the config file by itself won't actually result in autovacuum workers
having updated cost delays because of them overwriting it with
wi_cost_delay, but it will allow VACUUM to have those updated values.

It makes sense to me to have changes for overhauling the rebalance
mechanism in a separate patch.

Looking back at the original concern you mentioned[1]:

speed up long-running vacuum of a large table by
decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
config file is only reloaded between tables (for autovacuum) or after
the statement (for explicit vacuum).

does it make sense to have autovac_balance_cost() update workers'
wi_cost_delay too? Autovacuum launcher already reloads the config file
and does the rebalance. So I thought autovac_balance_cost() can update
the cost_delay as well, and this might be a minimal change to deal
with your concern. This doesn't have the effect for manual VACUUM but
since vacuum delay is disabled by default it won't be a big problem.
As for manual VACUUMs, we would need to reload the config file in
vacuum_delay_point() as the part of your patch does. Overhauling the
rebalance mechanism would be another patch to improve it further.

So, we can't do this without acquiring an access shared lock on every
call to vacuum_delay_point() because cost delay is a double.

I will work on a patchset with separate commits for reloading the config
file, though (with autovac not benefitting in the first commit).

On Thu, Mar 23, 2023 at 12:24 PM Daniel Gustafsson <daniel@yesql.se> wrote:

+bool VacuumFailsafeActive = false;
This needs documentation, how it's used and how it relates to failsafe_active
in LVRelState (which it might replace(?), but until then).

Thanks! I've removed LVRelState->failsafe_active.

I've also separated the VacuumFailsafeActive change into its own commit.
I will say that that commit message needs some work.

+ pg_atomic_uint32 nworkers_for_balance;
This needs a short oneline documentation update to the struct comment.

Done. I also prefixed with av to match the other members. I am thinking
that this variable name could be better. I want to convey that it is the
number of workers sharing a cost limit, so I considered
av_limit_sharers or something like that. I am looking to convey that
it is the number of workers amongst whom we must split the cost limit.

- double wi_cost_delay;
- int wi_cost_limit;
- int wi_cost_limit_base;
This change makes the below comment in do_autovacuum in need of an update:
/*
* Remove my info from shared memory. We could, but intentionally
* don't, clear wi_cost_limit and friends --- this is on the
* assumption that we probably have more to do with similar cost
* settings, so we don't want to give up our share of I/O for a very
* short interval and thereby thrash the global balance.
*/

Updated to mention wi_dobalance instead.
On the topic of wi_dobalance, should we bother making it an atomic flag
instead? We would avoid taking a lock a few times, though probably not
frequently enough to matter. I was wondering if making it atomically
accessible would be less confusing than acquiring a lock only to set
one member in do_autovacuum() (and otherwise it is only read). I think
if I had to make it an atomic flag, I would reverse the logic and make
it wi_skip_balance or something like that.

+   if (av_table_option_cost_delay >= 0)
+       VacuumCostDelay = av_table_option_cost_delay;
+   else
+       VacuumCostDelay = autovacuum_vac_cost_delay >= 0 ?
+           autovacuum_vac_cost_delay : VacuumCostDelay;
While it's a matter of personal preference, I for one would like if we reduced
the number of ternary operators in the vacuum code, especially those mixed into
if statements.  The vacuum code is full of this already though so this is
less of an objection (as it follows style) than an observation.

I agree. This one was better served as an "else if" anyway -- updated!

+    * note: in cost_limit, zero also means use value from elsewhere, because
+    * zero is not a valid value.
...
+       int         vac_cost_limit = autovacuum_vac_cost_limit > 0 ?
+       autovacuum_vac_cost_limit : VacuumCostLimit;
Not mentioning the fact that a magic value in a GUC means it's using the value
from another GUC (which is not great IMHO), it seems we are using zero as well
as -1 as that magic value here?  (not introduced in this patch.) The docs do
AFAICT only specify -1 as that value though.  Am I missing something or is the
code and documentation slightly out of sync?

I copied that comment from elsewhere, but, yes it is a weird situation.
So, you can set autovacuum_vacuum_cost_limit to 0, -1 or a
positive number. You can only set vacuum_cost_limit to a positive value.
The documentation mentions that setting autovacuum_vacuum_cost_limit to
-1, the default, will have it use vacuum_cost_limit. However, it says
nothing about what setting it to 0 does. Everywhere in the code assumes
that if autovacuum_vacuum_cost_limit is 0 OR -1, vacuum_cost_limit is used.

This is in contrast to autovacuum_vacuum_cost_delay, for which 0 means
to disable it -- so setting autovacuum_vacuum_cost_delay to 0 will
specifically not fall back to vacuum_cost_delay.

I think the problem is that 0 is not a valid cost limit (i.e. it has no
meaning like infinity/no limit), so we basically don't want to allow the
cost limit to be set to 0, but GUC values have to be a range with a max
and a min, so we can't just exclude 0 if we want to allow -1 (as far as
I know). I think it would be nice to be able to specify multiple valid
ranges for GUCs to the GUC machinery.

So, to answer your question, yes, the code and docs are a bit
out-of-sync.
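
The asymmetry described above can be sketched as two small resolver functions. This is an illustration only: the sketch_* names are hypothetical, and the functions take the GUC values as plain arguments rather than reading the real globals. The fallback rules themselves are the ones stated in this message (limit: 0 or -1 falls back; delay: only -1 falls back, 0 disables).

```c
#include <assert.h>

/*
 * Effective cost limit for an autovacuum worker: both 0 and -1 in
 * autovacuum_vacuum_cost_limit mean "use vacuum_cost_limit".
 */
static int
sketch_effective_cost_limit(int autovacuum_vac_cost_limit,
                            int vacuum_cost_limit)
{
    /* vacuum_cost_limit itself can only be set to a positive value */
    assert(vacuum_cost_limit > 0);

    if (autovacuum_vac_cost_limit > 0)
        return autovacuum_vac_cost_limit;
    return vacuum_cost_limit;
}

/*
 * Effective cost delay for an autovacuum worker: only -1 falls back to
 * vacuum_cost_delay.  Zero is a real setting meaning "no delay".
 */
static double
sketch_effective_cost_delay(double autovacuum_vac_cost_delay,
                            double vacuum_cost_delay)
{
    if (autovacuum_vac_cost_delay >= 0)
        return autovacuum_vac_cost_delay;
    return vacuum_cost_delay;
}
```

For example, with vacuum_cost_limit = 200, an autovacuum limit of either 0 or -1 resolves to 200, while an autovacuum delay of 0 stays 0 regardless of vacuum_cost_delay.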

I need another few readthroughs to figure out if VacuumFailsafeActive does what
I think it does, and should be doing, but in general I think this is a good
idea and a patch in good condition close to being committable.

I will take a pass at splitting up the main commit into two. However, I
have attached a new version with the other specific updates discussed in
this thread. Feel free to provide review on this version in the meantime.

- Melanie

Attachments:

v7-0001-Make-failsafe_active-global.patch (text/x-patch; charset=US-ASCII)
From c48b9cc5da87d980fff7a0131da72c28865ef310 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Thu, 23 Mar 2023 18:21:24 -0400
Subject: [PATCH v7 1/2] Make failsafe_active global

In preparation for future work to update the cost-based delay parameters
more frequently during vacuum, move the failsafe_active status into a
global variable which can be accessed from all parts of vacuum code. It
will be used in combination with VacuumCostDelay to keep
VacuumCostActive up-to-date during failsafe vacuuming.
---
 src/backend/access/heap/vacuumlazy.c  | 16 +++++++---------
 src/backend/commands/vacuum.c         |  4 ++++
 src/backend/commands/vacuumparallel.c |  1 +
 src/include/commands/vacuum.h         |  3 +++
 4 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8f14cf85f3..f4755bcc4b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -153,8 +153,6 @@ typedef struct LVRelState
 	bool		aggressive;
 	/* Use visibility map to skip? (disabled by DISABLE_PAGE_SKIPPING) */
 	bool		skipwithvm;
-	/* Wraparound failsafe has been triggered? */
-	bool		failsafe_active;
 	/* Consider index vacuuming bypass optimization? */
 	bool		consider_bypass_optimization;
 
@@ -391,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
 	Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
 		   params->truncate != VACOPTVALUE_AUTO);
-	vacrel->failsafe_active = false;
+	VacuumFailsafeActive = false;
 	vacrel->consider_bypass_optimization = true;
 	vacrel->do_index_vacuuming = true;
 	vacrel->do_index_cleanup = true;
@@ -709,7 +707,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 			}
 			else
 			{
-				if (!vacrel->failsafe_active)
+				if (!VacuumFailsafeActive)
 					appendStringInfoString(&buf, _("index scan bypassed: "));
 				else
 					appendStringInfoString(&buf, _("index scan bypassed by failsafe: "));
@@ -2293,7 +2291,7 @@ lazy_vacuum(LVRelState *vacrel)
 		 * vacuuming or heap vacuuming.  This VACUUM operation won't end up
 		 * back here again.
 		 */
-		Assert(vacrel->failsafe_active);
+		Assert(VacuumFailsafeActive);
 	}
 
 	/*
@@ -2374,7 +2372,7 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	 */
 	Assert(vacrel->num_index_scans > 0 ||
 		   vacrel->dead_items->num_items == vacrel->lpdead_items);
-	Assert(allindexes || vacrel->failsafe_active);
+	Assert(allindexes || VacuumFailsafeActive);
 
 	/*
 	 * Increase and report the number of index scans.
@@ -2616,12 +2614,12 @@ static bool
 lazy_check_wraparound_failsafe(LVRelState *vacrel)
 {
 	/* Don't warn more than once per VACUUM */
-	if (vacrel->failsafe_active)
+	if (VacuumFailsafeActive)
 		return true;
 
 	if (unlikely(vacuum_xid_failsafe_check(&vacrel->cutoffs)))
 	{
-		vacrel->failsafe_active = true;
+		VacuumFailsafeActive = true;
 
 		/* Disable index vacuuming, index cleanup, and heap rel truncation */
 		vacrel->do_index_vacuuming = false;
@@ -2811,7 +2809,7 @@ should_attempt_truncation(LVRelState *vacrel)
 {
 	BlockNumber possibly_freeable;
 
-	if (!vacrel->do_rel_truncate || vacrel->failsafe_active ||
+	if (!vacrel->do_rel_truncate || VacuumFailsafeActive ||
 		old_snapshot_threshold >= 0)
 		return false;
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index c54360a6a0..2d5ea570a2 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -77,6 +77,7 @@ int			vacuum_multixact_failsafe_age;
 static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
 
+bool		VacuumFailsafeActive = false;
 
 /*
  * Variables for cost-based parallel vacuum.  See comments atop
@@ -492,6 +493,7 @@ vacuum(List *relations, VacuumParams *params,
 
 		in_vacuum = true;
 		VacuumCostActive = (VacuumCostDelay > 0);
+		VacuumFailsafeActive = false;
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
@@ -1850,6 +1852,8 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params, bool skip_privs)
 
 	Assert(params != NULL);
 
+	VacuumFailsafeActive = false;
+
 	/* Begin a transaction for vacuuming this relation */
 	StartTransactionCommand();
 
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index bcd40c80a1..57188500d0 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -990,6 +990,7 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												 false);
 
 	/* Set cost-based vacuum delay */
+	VacuumFailsafeActive = false;
 	VacuumCostActive = (VacuumCostDelay > 0);
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index bdfd96cfec..a1bf3bfaa5 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -301,6 +301,9 @@ extern PGDLLIMPORT int vacuum_multixact_freeze_table_age;
 extern PGDLLIMPORT int vacuum_failsafe_age;
 extern PGDLLIMPORT int vacuum_multixact_failsafe_age;
 
+/* Indicates if wraparound failsafe has been triggered */
+extern bool VacuumFailsafeActive;
+
 /* Variables for cost-based parallel vacuum */
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
-- 
2.37.2

v7-0002-auto-vacuum-reloads-config-file-more-often.patch (text/x-patch; charset=US-ASCII)
From 37b9db4404d51c7a4e695472f9ea9f96392f9dd1 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Thu, 23 Mar 2023 18:41:10 -0400
Subject: [PATCH v7 2/2] [auto]vacuum reloads config file more often

Previously, VACUUM and autovacuum workers would reload the configuration
file only between vacuuming tables. This precluded user updates to
cost-based delay parameters from taking effect while vacuuming a table.

Check if a reload is pending roughly once per block now, when checking
if we need to delay.

In order for this change to have the intended effect on autovacuum,
autovacuum workers must start updating their cost delay more frequently
as well.

With this new paradigm, balancing the cost limit amongst workers also
must work differently. Previously, a worker's wi_cost_limit was set only
at the beginning of vacuuming a table, after reloading the config file.
Therefore, at the time that autovac_balance_cost() is called, workers
vacuuming tables with no table options could still have different values
for their wi_cost_limit_base and wi_cost_delay. With this change,
workers will (within some margin of error) have no reason to have
different values for cost limit and cost delay (in the absence of table
options). Thus, remove cost limit and cost delay from shared memory and
keep track only of the number of workers actively vacuuming tables with
no cost-related table options. Then, use this value to determine what
each worker's effective cost limit should be.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_buP5wzsho3qNw5o9_R0pF69FRM5hgCmr-mvXmGXwdA7A%40mail.gmail.com#5e6771d4cdca4db6efc2acec2dce0bc7
---
 src/backend/commands/vacuum.c       |  52 +++++--
 src/backend/postmaster/autovacuum.c | 218 ++++++++++++----------------
 src/include/postmaster/autovacuum.h |   2 +
 3 files changed, 140 insertions(+), 132 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 2d5ea570a2..c4c361ae94 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -48,6 +48,7 @@
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/interrupt.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/pmsignal.h"
@@ -76,6 +77,7 @@ int			vacuum_multixact_failsafe_age;
 /* A few variables that don't seem worth passing around as parameters */
 static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
+static bool analyze_in_outer_xact = false;
 
 bool		VacuumFailsafeActive = false;
 
@@ -315,8 +317,7 @@ vacuum(List *relations, VacuumParams *params,
 	static bool in_vacuum = false;
 
 	const char *stmttype;
-	volatile bool in_outer_xact,
-				use_own_xacts;
+	volatile bool use_own_xacts;
 
 	Assert(params != NULL);
 
@@ -333,10 +334,10 @@ vacuum(List *relations, VacuumParams *params,
 	if (params->options & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
-		in_outer_xact = false;
+		analyze_in_outer_xact = false;
 	}
 	else
-		in_outer_xact = IsInTransactionBlock(isTopLevel);
+		analyze_in_outer_xact = IsInTransactionBlock(isTopLevel);
 
 	/*
 	 * Due to static variables vac_context, anl_context and vac_strategy,
@@ -458,7 +459,7 @@ vacuum(List *relations, VacuumParams *params,
 		Assert(params->options & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
-		else if (in_outer_xact)
+		else if (analyze_in_outer_xact)
 			use_own_xacts = false;
 		else if (list_length(relations) > 1)
 			use_own_xacts = true;
@@ -476,7 +477,7 @@ vacuum(List *relations, VacuumParams *params,
 	 */
 	if (use_own_xacts)
 	{
-		Assert(!in_outer_xact);
+		Assert(!analyze_in_outer_xact);
 
 		/* ActiveSnapshot is not set by autovacuum */
 		if (ActiveSnapshotSet())
@@ -529,7 +530,7 @@ vacuum(List *relations, VacuumParams *params,
 				}
 
 				analyze_rel(vrel->oid, vrel->relation, params,
-							vrel->va_cols, in_outer_xact, vac_strategy);
+							vrel->va_cols, analyze_in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
 				{
@@ -552,6 +553,7 @@ vacuum(List *relations, VacuumParams *params,
 	{
 		in_vacuum = false;
 		VacuumCostActive = false;
+		analyze_in_outer_xact = false;
 	}
 	PG_END_TRY();
 
@@ -2219,9 +2221,33 @@ vacuum_delay_point(void)
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	if (!VacuumCostActive || InterruptPending)
+	if (InterruptPending ||
+		(!VacuumCostActive && !ConfigReloadPending))
 		return;
 
+	/*
+	 * Reload the configuration file if requested. This allows changes to
+	 * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay to
+	 * take effect while a table is being vacuumed or analyzed.
+	 */
+	if (ConfigReloadPending && !analyze_in_outer_xact)
+	{
+		ConfigReloadPending = false;
+		ProcessConfigFile(PGC_SIGHUP);
+		AutoVacuumUpdateDelay();
+		AutoVacuumUpdateLimit();
+	}
+
+	/*
+	 * If we disabled cost-based delays after reloading the config file,
+	 * return
+	 */
+	if (!VacuumCostActive)
+	{
+		VacuumCostBalance = 0;
+		return;
+	}
+
 	/*
 	 * For parallel vacuum, the delay is computed based on the shared cost
 	 * balance.  See compute_parallel_delay.
@@ -2252,8 +2278,14 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		/* update balance values for workers */
-		AutoVacuumUpdateDelay();
+		/*
+		 * Update limit values for workers. We must always do this in case the
+		 * autovacuum launcher or another autovacuum worker has recalculated
+		 * the number of workers across which we must balance the limit. This
+		 * is done by the launcher when launching a new worker and by workers
+		 * before vacuuming each table.
+		 */
+		AutoVacuumUpdateLimit();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c0e2e00a7e..ad872d8c73 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -139,6 +139,9 @@ int			Log_autovacuum_min_duration = 600000;
 static bool am_autovacuum_launcher = false;
 static bool am_autovacuum_worker = false;
 
+static double av_table_option_cost_delay = -1;
+static int	av_table_option_cost_limit = 0;
+
 /* Flags set by signal handlers */
 static volatile sig_atomic_t got_SIGUSR2 = false;
 
@@ -189,8 +192,8 @@ typedef struct autovac_table
 {
 	Oid			at_relid;
 	VacuumParams at_params;
-	double		at_vacuum_cost_delay;
-	int			at_vacuum_cost_limit;
+	double		at_table_option_vac_cost_delay;
+	int			at_table_option_vac_cost_limit;
 	bool		at_dobalance;
 	bool		at_sharedrel;
 	char	   *at_relname;
@@ -209,7 +212,7 @@ typedef struct autovac_table
  * wi_sharedrel flag indicating whether table is marked relisshared
  * wi_proc		pointer to PGPROC of the running worker, NULL if not started
  * wi_launchtime Time at which this worker was launched
- * wi_cost_*	Vacuum cost-based delay parameters current in this worker
+ * wi_dobalance Whether this worker should be included in balance calculations
  *
  * All fields are protected by AutovacuumLock, except for wi_tableoid and
  * wi_sharedrel which are protected by AutovacuumScheduleLock (note these
@@ -225,9 +228,6 @@ typedef struct WorkerInfoData
 	TimestampTz wi_launchtime;
 	bool		wi_dobalance;
 	bool		wi_sharedrel;
-	double		wi_cost_delay;
-	int			wi_cost_limit;
-	int			wi_cost_limit_base;
 } WorkerInfoData;
 
 typedef struct WorkerInfoData *WorkerInfo;
@@ -273,6 +273,8 @@ typedef struct AutoVacuumWorkItem
  * av_startingWorker pointer to WorkerInfo currently being started (cleared by
  *					the worker itself as soon as it's up and running)
  * av_workItems		work item array
+ * av_nworkers_for_balance the number of autovacuum workers to use when
+ * 					calculating the per worker cost limit
  *
  * This struct is protected by AutovacuumLock, except for av_signal and parts
  * of the worker list (see above).
@@ -286,6 +288,7 @@ typedef struct
 	dlist_head	av_runningWorkers;
 	WorkerInfo	av_startingWorker;
 	AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+	pg_atomic_uint32 av_nworkers_for_balance;
 } AutoVacuumShmemStruct;
 
 static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -820,7 +823,7 @@ HandleAutoVacLauncherInterrupts(void)
 			AutoVacLauncherShutdown();
 
 		/* rebalance in case the default cost parameters changed */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
 		autovac_balance_cost();
 		LWLockRelease(AutovacuumLock);
 
@@ -1756,9 +1759,6 @@ FreeWorkerInfo(int code, Datum arg)
 		MyWorkerInfo->wi_proc = NULL;
 		MyWorkerInfo->wi_launchtime = 0;
 		MyWorkerInfo->wi_dobalance = false;
-		MyWorkerInfo->wi_cost_delay = 0;
-		MyWorkerInfo->wi_cost_limit = 0;
-		MyWorkerInfo->wi_cost_limit_base = 0;
 		dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 						&MyWorkerInfo->wi_links);
 		/* not mine anymore */
@@ -1774,99 +1774,97 @@ FreeWorkerInfo(int code, Datum arg)
 }
 
 /*
- * Update the cost-based delay parameters, so that multiple workers consume
- * each a fraction of the total available I/O.
+ * Update VacuumCostDelay with the correct value for an autovacuum worker,
+ * given the value of other relevant cost-based delay parameters. Autovacuum
+ * workers should call this after every config reload, in case VacuumCostDelay
+ * was overwritten.
  */
 void
 AutoVacuumUpdateDelay(void)
 {
-	if (MyWorkerInfo)
+	if (!am_autovacuum_worker)
+		return;
+
+	if (av_table_option_cost_delay >= 0)
+		VacuumCostDelay = av_table_option_cost_delay;
+	else if (autovacuum_vac_cost_delay >= 0)
+		VacuumCostDelay = autovacuum_vac_cost_delay;
+
+	if (!VacuumFailsafeActive)
+		VacuumCostActive = (VacuumCostDelay > 0);
+}
+
+/*
+ * Update VacuumCostLimit with the correct value for an autovacuum worker,
+ * given the value of other relevant cost limit parameters and the number of
+ * workers across which the limit must be balanced. Autovacuum workers must
+ * call this regularly in case av_nworkers_for_balance has been updated by
+ * another worker or by the autovacuum launcher. They also must call this after
+ * every config reload, in case VacuumCostLimit was overwritten.
+ */
+void
+AutoVacuumUpdateLimit(void)
+{
+	if (!am_autovacuum_worker)
+		return;
+
+	/*
+	 * note: in cost_limit, zero also means use value from elsewhere, because
+	 * zero is not a valid value.
+	 */
+	if (av_table_option_cost_limit > 0)
+		VacuumCostLimit = av_table_option_cost_limit;
+	else
 	{
-		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
-		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+		int			vac_cost_limit = autovacuum_vac_cost_limit > 0 ?
+		autovacuum_vac_cost_limit : VacuumCostLimit;
+
+		int			balanced_cost_limit = vac_cost_limit /
+		pg_atomic_read_u32(&AutoVacuumShmem->av_nworkers_for_balance);
+
+		VacuumCostLimit = Max(Min(balanced_cost_limit, vac_cost_limit), 1);
 	}
 }
 
 /*
  * autovac_balance_cost
- *		Recalculate the cost limit setting for each active worker.
+ *		Recalculate the number of workers to consider, given table options and
+ *		the current number of active workers.
  *
- * Caller must hold the AutovacuumLock in exclusive mode.
+ * Caller must hold the AutovacuumLock in at least shared mode.
  */
 static void
 autovac_balance_cost(void)
 {
-	/*
-	 * The idea here is that we ration out I/O equally.  The amount of I/O
-	 * that a worker can consume is determined by cost_limit/cost_delay, so we
-	 * try to equalize those ratios rather than the raw limit settings.
-	 *
-	 * note: in cost_limit, zero also means use value from elsewhere, because
-	 * zero is not a valid value.
-	 */
-	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : VacuumCostLimit);
-	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : VacuumCostDelay);
-	double		cost_total;
-	double		cost_avail;
 	dlist_iter	iter;
+	int			orig_nworkers_for_balance;
+	int			nworkers_for_balance = 0;
 
-	/* not set? nothing to do */
-	if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
+	if (autovacuum_vac_cost_delay == 0 ||
+		(autovacuum_vac_cost_delay == -1 && VacuumCostDelay == 0))
 		return;
 
-	/* calculate the total base cost limit of participating active workers */
-	cost_total = 0.0;
-	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
-	{
-		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
-
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-			cost_total +=
-				(double) worker->wi_cost_limit_base / worker->wi_cost_delay;
-	}
-
-	/* there are no cost limits -- nothing to do */
-	if (cost_total <= 0)
+	if (autovacuum_vac_cost_limit <= 0 && VacuumCostLimit <= 0)
 		return;
 
-	/*
-	 * Adjust cost limit of each active worker to balance the total of cost
-	 * limit to autovacuum_vacuum_cost_limit.
-	 */
-	cost_avail = (double) vac_cost_limit / vac_cost_delay;
+	orig_nworkers_for_balance =
+		pg_atomic_read_u32(&AutoVacuumShmem->av_nworkers_for_balance);
+
 	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
 	{
 		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-		{
-			int			limit = (int)
-			(cost_avail * worker->wi_cost_limit_base / cost_total);
-
-			/*
-			 * We put a lower bound of 1 on the cost_limit, to avoid division-
-			 * by-zero in the vacuum code.  Also, in case of roundoff trouble
-			 * in these calculations, let's be sure we don't ever set
-			 * cost_limit to more than the base value.
-			 */
-			worker->wi_cost_limit = Max(Min(limit,
-											worker->wi_cost_limit_base),
-										1);
-		}
+		if (worker->wi_proc == NULL || !worker->wi_dobalance)
+			continue;
 
-		if (worker->wi_proc != NULL)
-			elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
-				 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
-				 worker->wi_dobalance ? "yes" : "no",
-				 worker->wi_cost_limit, worker->wi_cost_limit_base,
-				 worker->wi_cost_delay);
+		nworkers_for_balance++;
 	}
+
+	nworkers_for_balance = Max(nworkers_for_balance, 1);
+
+	if (nworkers_for_balance != orig_nworkers_for_balance)
+		pg_atomic_write_u32(&AutoVacuumShmem->av_nworkers_for_balance,
+							nworkers_for_balance);
 }
 
 /*
@@ -2312,14 +2310,15 @@ do_autovacuum(void)
 		autovac_table *tab;
 		bool		isshared;
 		bool		skipit;
-		double		stdVacuumCostDelay;
-		int			stdVacuumCostLimit;
 		dlist_iter	iter;
 
 		CHECK_FOR_INTERRUPTS();
 
 		/*
 		 * Check for config changes before processing each collected table.
+		 * Autovacuum workers must update VacuumCostDelay and VacuumCostLimit
+		 * in case they were overridden by the reload. However, we will do
+		 * this as soon as we check table options a bit later.
 		 */
 		if (ConfigReloadPending)
 		{
@@ -2416,32 +2415,18 @@ do_autovacuum(void)
 			continue;
 		}
 
-		/*
-		 * Remember the prevailing values of the vacuum cost GUCs.  We have to
-		 * restore these at the bottom of the loop, else we'll compute wrong
-		 * values in the next iteration of autovac_balance_cost().
-		 */
-		stdVacuumCostDelay = VacuumCostDelay;
-		stdVacuumCostLimit = VacuumCostLimit;
+		av_table_option_cost_limit = tab->at_table_option_vac_cost_limit;
+		av_table_option_cost_delay = tab->at_table_option_vac_cost_delay;
 
 		/* Must hold AutovacuumLock while mucking with cost balance info */
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-
-		/* advertise my cost delay parameters for the balancing algorithm */
 		MyWorkerInfo->wi_dobalance = tab->at_dobalance;
-		MyWorkerInfo->wi_cost_delay = tab->at_vacuum_cost_delay;
-		MyWorkerInfo->wi_cost_limit = tab->at_vacuum_cost_limit;
-		MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
-
-		/* do a balance */
 		autovac_balance_cost();
+		LWLockRelease(AutovacuumLock);
 
-		/* set the active cost parameters from the result of that */
+		AutoVacuumUpdateLimit();
 		AutoVacuumUpdateDelay();
 
-		/* done */
-		LWLockRelease(AutovacuumLock);
-
 		/* clean up memory before each iteration */
 		MemoryContextResetAndDeleteChildren(PortalContext);
 
@@ -2525,19 +2510,15 @@ deleted:
 
 		/*
 		 * Remove my info from shared memory.  We could, but intentionally
-		 * don't, clear wi_cost_limit and friends --- this is on the
-		 * assumption that we probably have more to do with similar cost
-		 * settings, so we don't want to give up our share of I/O for a very
-		 * short interval and thereby thrash the global balance.
+		 * don't, set wi_dobalance to false on the assumption that we are more
+		 * likely than not to vacuum a table with no table options next, so we
+		 * don't want to give up our share of I/O for a very short interval
+		 * and thereby thrash the global balance.
 		 */
 		LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
 		MyWorkerInfo->wi_tableoid = InvalidOid;
 		MyWorkerInfo->wi_sharedrel = false;
 		LWLockRelease(AutovacuumScheduleLock);
-
-		/* restore vacuum cost GUCs for the next iteration */
-		VacuumCostDelay = stdVacuumCostDelay;
-		VacuumCostLimit = stdVacuumCostLimit;
 	}
 
 	/*
@@ -2569,6 +2550,8 @@ deleted:
 		{
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			AutoVacuumUpdateDelay();
+			AutoVacuumUpdateLimit();
 		}
 
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
@@ -2801,8 +2784,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		int			freeze_table_age;
 		int			multixact_freeze_min_age;
 		int			multixact_freeze_table_age;
-		int			vac_cost_limit;
-		double		vac_cost_delay;
 		int			log_min_duration;
 
 		/*
@@ -2812,20 +2793,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 * defaults, autovacuum's own first and plain vacuum second.
 		 */
 
-		/* -1 in autovac setting means use plain vacuum_cost_delay */
-		vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
-			? avopts->vacuum_cost_delay
-			: (autovacuum_vac_cost_delay >= 0)
-			? autovacuum_vac_cost_delay
-			: VacuumCostDelay;
-
-		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
-		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
-			? avopts->vacuum_cost_limit
-			: (autovacuum_vac_cost_limit > 0)
-			? autovacuum_vac_cost_limit
-			: VacuumCostLimit;
-
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
 			? avopts->log_min_duration
@@ -2881,8 +2848,10 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age;
 		tab->at_params.is_wraparound = wraparound;
 		tab->at_params.log_min_duration = log_min_duration;
-		tab->at_vacuum_cost_limit = vac_cost_limit;
-		tab->at_vacuum_cost_delay = vac_cost_delay;
+		tab->at_table_option_vac_cost_limit = avopts ?
+			avopts->vacuum_cost_limit : 0;
+		tab->at_table_option_vac_cost_delay = avopts ?
+			avopts->vacuum_cost_delay : -1;
 		tab->at_relname = NULL;
 		tab->at_nspname = NULL;
 		tab->at_datname = NULL;
@@ -3374,10 +3343,15 @@ AutoVacuumShmemInit(void)
 		worker = (WorkerInfo) ((char *) AutoVacuumShmem +
 							   MAXALIGN(sizeof(AutoVacuumShmemStruct)));
 
+
 		/* initialize the WorkerInfo free list */
 		for (i = 0; i < autovacuum_max_workers; i++)
 			dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 							&worker[i].wi_links);
+
+		/* initialize to 1, as it should be a minimum of 1 */
+		pg_atomic_init_u32(&AutoVacuumShmem->av_nworkers_for_balance, 1);
+
 	}
 	else
 		Assert(found);
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index c140371b51..558358911c 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -66,6 +66,8 @@ extern void AutoVacWorkerFailed(void);
 /* autovacuum cost-delay balancer */
 extern void AutoVacuumUpdateDelay(void);
 
+extern void AutoVacuumUpdateLimit(void);
+
 #ifdef EXEC_BACKEND
 extern void AutoVacLauncherMain(int argc, char *argv[]) pg_attribute_noreturn();
 extern void AutoVacWorkerMain(int argc, char *argv[]) pg_attribute_noreturn();
-- 
2.37.2

#28Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Melanie Plageman (#27)
Re: Should vacuum process config file reload more often

On Fri, Mar 24, 2023 at 9:27 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Thu, Mar 23, 2023 at 2:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sun, Mar 19, 2023 at 7:47 AM Melanie Plageman <melanieplageman@gmail.com> wrote:
Do we need to calculate the number of workers running with
nworkers_for_balance by iterating over the running worker list? I
guess autovacuum workers can increment/decrement it at the beginning
and end of vacuum.

I don't think we can do that because if a worker crashes, we have no way
of knowing if it had incremented or decremented the number, so we can't
adjust for it.

What kind of crash are you concerned about? If a worker raises an
ERROR, we can catch it in PG_CATCH() block. If it's a FATAL, we can do
that in FreeWorkerInfo(). A PANIC error ends up crashing the entire
server.
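
For reference, the recount approach under discussion can be sketched in isolation. This is a simplified illustration, not the patch code: SketchWorker and the sketch_* functions are hypothetical stand-ins, with "running" playing the role of wi_proc != NULL and "dobalance" the role of wi_dobalance. The count is rebuilt from the worker slots each time, so it does not depend on every worker having decremented a counter on exit, and the per-worker limit is then derived from it with the same clamping the patch uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a WorkerInfo slot. */
typedef struct SketchWorker
{
    bool        running;        /* corresponds to wi_proc != NULL */
    bool        dobalance;      /* corresponds to wi_dobalance */
} SketchWorker;

/* Recount participating workers from scratch; never returns less than 1. */
static int
sketch_recount_workers(const SketchWorker *workers, size_t n)
{
    int         count = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (workers[i].running && workers[i].dobalance)
            count++;
    }
    return count > 1 ? count : 1;
}

/* Split the base limit evenly, clamped to the range [1, base_limit]. */
static int
sketch_balanced_limit(int base_limit, int nworkers_for_balance)
{
    int         limit;

    assert(base_limit > 0 && nworkers_for_balance >= 1);

    limit = base_limit / nworkers_for_balance;
    if (limit > base_limit)
        limit = base_limit;
    if (limit < 1)
        limit = 1;
    return limit;
}
```

An increment/decrement scheme would replace sketch_recount_workers() with a shared counter, at the price of having to guarantee the decrement on every exit path, which is the point being debated here.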

Also not sure how the patch interacts with failsafe autovac and parallel
vacuum.

Good point.

When entering the failsafe mode, we disable the vacuum delays (see
lazy_check_wraparound_failsafe()). We need to keep disabling the
vacuum delays even after reloading the config file. One idea is to
have another global variable indicating we're in the failsafe mode.
vacuum_delay_point() doesn't update VacuumCostActive if the flag is
true.

I think we might not need to do this. Other than in
lazy_check_wraparound_failsafe(), VacuumCostActive is only updated in
two places:

1) in vacuum() which autovacuum will call per table. And failsafe is
reset per table as well.

2) in vacuum_delay_point(), but, since VacuumCostActive will already be
false when we enter vacuum_delay_point() the next time after
lazy_check_wraparound_failsafe(), we won't set VacuumCostActive there.

Indeed. But does it mean that there is no code path to turn
vacuum-delay on, even when vacuum_cost_delay is updated from 0 to
non-0?

Ah yes! Good point. This is true.
I'm not sure how to cheaply allow for re-enabling delays after disabling
them in the middle of a table vacuum.

I don't see a way around checking if we need to reload the config file
on every call to vacuum_delay_point() (currently, we are only doing this
when we have to wait anyway). It seems expensive to do this check every
time. If we do do this, we would update VacuumCostActive when updating
VacuumCostDelay, and we would need a global variable keeping the
failsafe status, as you mentioned.

It could be okay to say that you can only disable cost-based delays in
the middle of vacuuming a table (i.e. you cannot enable them if they are
already disabled until you start vacuuming the next table). Though maybe
it is weird that you can increase the delay but not re-enable it...

On Mon, Mar 20, 2023 at 1:48 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

So, I thought about it some more, and I think it is a bit odd that you
can increase the delay and limit but not re-enable them if they were
disabled. And, perhaps it would be okay to check ConfigReloadPending at
the top of vacuum_delay_point() instead of only after sleeping. It is
just one more branch. We can check if VacuumCostActive is false after
checking if we should reload and doing so if needed and return early.
I've implemented that in attached v6.

I added in the global we discussed for VacuumFailsafeActive. If we keep
it, we can probably remove the one in LVRelState -- as it seems
redundant. Let me know what you think.
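
The control flow described above can be modelled outside the backend as a rough sketch. All sketch_* names are hypothetical stand-ins for the globals mentioned in this thread (ConfigReloadPending, VacuumCostActive, VacuumFailsafeActive, VacuumCostDelay); the "reload" is reduced to copying one value, and the analyze-in-outer-transaction check is left out. It shows the two properties being discussed: a pending reload is processed even when delays are currently off, so they can be re-enabled mid-table, and an active failsafe keeps them off whatever the new setting says.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the backend globals named in the thread. */
static bool sketch_reload_pending = false;      /* ConfigReloadPending */
static bool sketch_cost_active = false;         /* VacuumCostActive */
static bool sketch_failsafe_active = false;     /* VacuumFailsafeActive */
static double sketch_cost_delay = 0;            /* VacuumCostDelay */
static double sketch_file_cost_delay = 0;       /* value a reload installs */

/* Returns true if the caller would go on to the cost-delay accounting. */
static bool
sketch_delay_point(void)
{
    /* Nothing to do only if delays are off AND no reload is pending. */
    if (!sketch_cost_active && !sketch_reload_pending)
        return false;

    if (sketch_reload_pending)
    {
        sketch_reload_pending = false;
        sketch_cost_delay = sketch_file_cost_delay;     /* the "reload" */

        /* The failsafe keeps delays disabled regardless of the new value. */
        if (!sketch_failsafe_active)
            sketch_cost_active = (sketch_cost_delay > 0);
    }

    /* Delays may have been disabled (or left disabled) by the reload. */
    return sketch_cost_active;
}
```

The extra cost on the common path is the one additional branch on the reload flag mentioned above.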

I think the following change is related:

-        if (!VacuumCostActive || InterruptPending)
+        if (InterruptPending || VacuumFailsafeActive ||
+                (!VacuumCostActive && !ConfigReloadPending))
                 return;
+        /*
+         * Reload the configuration file if requested. This allows changes to
+         * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay to
+         * take effect while a table is being vacuumed or analyzed.
+         */
+        if (ConfigReloadPending && !analyze_in_outer_xact)
+        {
+                ConfigReloadPending = false;
+                ProcessConfigFile(PGC_SIGHUP);
+                AutoVacuumUpdateDelay();
+                AutoVacuumUpdateLimit();
+        }

It makes sense to me that we need to reload the config file even when
vacuum-delay is disabled. But I think it's not convenient for users
that we don't reload the configuration file once the failsafe is
triggered. I think users might want to change some GUCs such as
log_autovacuum_min_duration.

Ah, okay. Attached v7 has this change (it reloads even if failsafe is
active).

And, I was wondering if it was worth trying to split up the part that
reloads the config file and all of the autovacuum stuff. The reloading
of the config file by itself won't actually result in autovacuum workers
having updated cost delays because of them overwriting it with
wi_cost_delay, but it will allow VACUUM to have those updated values.

It makes sense to me to have changes for overhauling the rebalance
mechanism in a separate patch.

Looking back at the original concern you mentioned[1]:

speed up long-running vacuum of a large table by
decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
config file is only reloaded between tables (for autovacuum) or after
the statement (for explicit vacuum).

does it make sense to have autovac_balance_cost() update workers'
wi_cost_delay too? Autovacuum launcher already reloads the config file
and does the rebalance. So I thought autovac_balance_cost() can update
the cost_delay as well, and this might be a minimal change to deal
with your concern. This has no effect on manual VACUUM, but
since vacuum delay is disabled by default it won't be a big problem.
As for manual VACUUMs, we would need to reload the config file in
vacuum_delay_point() as part of your patch does. Overhauling the
rebalance mechanism would be another patch to improve it further.

So, we can't do this without acquiring an access shared lock on every
call to vacuum_delay_point() because cost delay is a double.

I will work on a patchset with separate commits for reloading the config
file, though (with autovac not benefitting in the first commit).

On Thu, Mar 23, 2023 at 12:24 PM Daniel Gustafsson <daniel@yesql.se> wrote:

+bool VacuumFailsafeActive = false;
This needs documentation, how it's used and how it relates to failsafe_active
in LVRelState (which it might replace(?), but until then).

Thanks! I've removed LVRelState->failsafe_active.

I've also separated the VacuumFailsafeActive change into its own commit.

@@ -492,6 +493,7 @@ vacuum(List *relations, VacuumParams *params,

in_vacuum = true;
VacuumCostActive = (VacuumCostDelay > 0);
+ VacuumFailsafeActive = false;
VacuumCostBalance = 0;
VacuumPageHit = 0;
VacuumPageMiss = 0;

I think we need to reset VacuumFailsafeActive also in PG_FINALLY()
block in vacuum().

One comment on 0002 patch:

+        /*
+         * Reload the configuration file if requested. This allows changes to
+         * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay to
+         * take effect while a table is being vacuumed or analyzed.
+         */
+        if (ConfigReloadPending && !analyze_in_outer_xact)
+        {
+                ConfigReloadPending = false;
+                ProcessConfigFile(PGC_SIGHUP);
+                AutoVacuumUpdateDelay();
+                AutoVacuumUpdateLimit();
+        }

I think we need comments on why we don't reload the config file if
we're analyzing a table in a user transaction.

I need another few readthroughs to figure out if VacuumFailsafeActive does what
I think it does, and should be doing, but in general I think this is a good
idea and a patch in good condition close to being committable.

Another approach would be to make VacuumCostActive a ternary value:
on, off, and never. When we trigger the failsafe mode we switch it to
never, meaning that it never becomes active even after reloading the
config file. An advantage is that we don't need to add a new global
variable, but I'm not sure it's better than the current approach.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#29 Melanie Plageman
melanieplageman@gmail.com
In reply to: Masahiko Sawada (#28)
Re: Should vacuum process config file reload more often

On Fri, Mar 24, 2023 at 1:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Mar 24, 2023 at 9:27 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Thu, Mar 23, 2023 at 2:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sun, Mar 19, 2023 at 7:47 AM Melanie Plageman <melanieplageman@gmail.com> wrote:
Do we need to calculate the number of workers running with
nworkers_for_balance by iterating over the running worker list? I
guess autovacuum workers can increment/decrement it at the beginning
and end of vacuum.

I don't think we can do that because if a worker crashes, we have no way
of knowing if it had incremented or decremented the number, so we can't
adjust for it.

What kind of crash are you concerned about? If a worker raises an
ERROR, we can catch it in PG_CATCH() block. If it's a FATAL, we can do
that in FreeWorkerInfo(). A PANIC error ends up crashing the entire
server.

Yes, but what about a worker that segfaults? Since table AMs can define
relation_vacuum(), this seems like a real possibility.

I'll address your other code feedback in the next version.

I realized nworkers_for_balance should be initialized to 0 and not 1 --
1 is misleading since there are often 0 autovac workers. We just never
want to use nworkers_for_balance when it is 0. But, workers put a floor
of 1 on the number when they divide limit/nworkers_for_balance (since
they know there must be at least one worker right now since they are a
worker). I thought about whether or not they should call
autovac_balance_cost() if they find that nworkers_for_balance is 0 when
updating their own limit, but I'm not sure.
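A minimal sketch of the floor-of-1 division described above, assuming a hypothetical helper named balanced_cost_limit(). It is not the patch's code; it only illustrates that a worker reading a shared count of 0 treats it as 1 (itself) and that the resulting limit never drops below 1.

```c
#include <assert.h>

/*
 * Hypothetical helper (name and signature are assumptions for illustration).
 * Splits a base cost limit across the workers being balanced.
 */
int
balanced_cost_limit(int base_limit, unsigned int nworkers_for_balance)
{
    /* The caller is itself a worker, so treat a count of 0 as 1. */
    unsigned int nworkers = nworkers_for_balance > 0 ? nworkers_for_balance : 1;
    int          limit;

    assert(base_limit > 0);

    limit = base_limit / (int) nworkers;

    /* Keep a lower bound of 1 so later code never divides by zero. */
    return limit < 1 ? 1 : limit;
}
```

With a base limit of 200, a stored count of 0 yields the full 200, while a count of 3 yields 66 because of integer division.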

I need another few readthroughs to figure out if VacuumFailsafeActive does what
I think it does, and should be doing, but in general I think this is a good
idea and a patch in good condition close to being committable.

Another approach would be to make VacuumCostActive a ternary value:
on, off, and never. When we trigger the failsafe mode we switch it to
never, meaning that it never becomes active even after reloading the
config file. An advantage is that we don't need to add a new global
variable, but I'm not sure it's better than the current approach.

Hmm, this is interesting. I don't love the word "never" since it kind of
implies a duration longer than the current table being vacuumed. But we
could find a different word or just document it well. For clarity, we
might want to call it failsafe_mode or something.

I wonder if the primary drawback to converting
LVRelState->failsafe_active to a global VacuumFailsafeActive is just the
general rule of limiting scope to the minimum needed.

- Melanie

#30 Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#29)
4 attachment(s)
Re: Should vacuum process config file reload more often

On Thu, Mar 23, 2023 at 8:27 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Thu, Mar 23, 2023 at 2:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

And, I was wondering if it was worth trying to split up the part that
reloads the config file and all of the autovacuum stuff. The reloading
of the config file by itself won't actually result in autovacuum workers
having updated cost delays because of them overwriting it with
wi_cost_delay, but it will allow VACUUM to have those updated values.

It makes sense to me to have changes for overhauling the rebalance
mechanism in a separate patch.

Looking back at the original concern you mentioned[1]:

speed up long-running vacuum of a large table by
decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
config file is only reloaded between tables (for autovacuum) or after
the statement (for explicit vacuum).

does it make sense to have autovac_balance_cost() update workers'
wi_cost_delay too? Autovacuum launcher already reloads the config file
and does the rebalance. So I thought autovac_balance_cost() can update
the cost_delay as well, and this might be a minimal change to deal
with your concern. This has no effect on manual VACUUM, but
since vacuum delay is disabled by default it won't be a big problem.
As for manual VACUUMs, we would need to reload the config file in
vacuum_delay_point() as part of your patch does. Overhauling the
rebalance mechanism would be another patch to improve it further.

So, we can't do this without acquiring an access shared lock on every
call to vacuum_delay_point() because cost delay is a double.

I will work on a patchset with separate commits for reloading the config
file, though (with autovac not benefitting in the first commit).

So, I realized we could actually do as you say and have autovac workers
update their wi_cost_delay and keep the balance changes in a separate
commit. I've done this in attached v8.

Workers take the exclusive lock to update their wi_cost_delay and
wi_cost_limit only when there is a config reload. So, there is one
commit that implements this behavior and a separate commit to revise the
worker rebalancing.

Note that we must have the workers also update wi_cost_limit_base and
then call autovac_balance_cost() when they reload the config file
(instead of waiting for launcher to call autovac_balance_cost()) to
avoid potentially calculating the sleep with a new value of cost delay
and an old value of cost limit.
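To illustrate why the delay and the limit should be refreshed together, here is a rough sketch that assumes the I/O budget is approximated by cost_limit / cost_delay and ignores the time spent doing the actual work. The function cost_units_per_second() is hypothetical and exists only for this example.

```c
#include <assert.h>

/*
 * Hypothetical helper: approximate upper bound on cost units a vacuum can
 * accumulate per second, assuming it sleeps cost_delay_ms each time its
 * balance reaches cost_limit. Time spent on real work is ignored.
 */
double
cost_units_per_second(int cost_limit, double cost_delay_ms)
{
    assert(cost_limit > 0);
    assert(cost_delay_ms > 0);

    return cost_limit * 1000.0 / cost_delay_ms;
}
```

Suppose the settings change from (limit 200, delay 2ms) to (limit 100, delay 1ms), which keeps the same approximate rate. A worker that picked up the new delay but still used the old limit would compute with (200, 1ms) and run at roughly twice the intended rate until the limit was refreshed.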

In the commit which revises the worker rebalancing, I'm still wondering
if wi_dobalance should be an atomic flag -- probably not worth it,
right?

On Fri, Mar 24, 2023 at 1:27 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

I realized nworkers_for_balance should be initialized to 0 and not 1 --
1 is misleading since there are often 0 autovac workers. We just never
want to use nworkers_for_balance when it is 0. But, workers put a floor
of 1 on the number when they divide limit/nworkers_for_balance (since
they know there must be at least one worker right now since they are a
worker). I thought about whether or not they should call
autovac_balance_cost() if they find that nworkers_for_balance is 0 when
updating their own limit, but I'm not sure.

I've gone ahead and updated this. I haven't made the workers call
autovac_balance_cost() if they find that nworkers_for_balance is 0 when
they try and use it when updating their limit because I'm not sure if
this can happen. I would be interested in input here.

I'm also still interested in feedback on the variable name
av_nworkers_for_balance.

On Fri, Mar 24, 2023 at 1:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I need another few readthroughs to figure out if VacuumFailsafeActive does what
I think it does, and should be doing, but in general I think this is a good
idea and a patch in good condition close to being committable.

Another approach would be to make VacuumCostActive a ternary value:
on, off, and never. When we trigger the failsafe mode we switch it to
never, meaning that it never becomes active even after reloading the
config file. An advantage is that we don't need to add a new global
variable, but I'm not sure it's better than the current approach.

Hmm, this is interesting. I don't love the word "never" since it kind of
implies a duration longer than the current table being vacuumed. But we
could find a different word or just document it well. For clarity, we
might want to call it failsafe_mode or something.

I wonder if the primary drawback to converting
LVRelState->failsafe_active to a global VacuumFailsafeActive is just the
general rule of limiting scope to the minimum needed.

Okay, so I've changed my mind about this. I like having a ternary for
VacuumCostActive and keeping failsafe_active in LVRelState. What I
didn't like was having non-vacuum code have to care about the
distinction between failsafe + inactive and just inactive. To handle
this, I converted VacuumCostActive to VacuumCostInactive since there are
two inactive cases (inactive due to failsafe, and plain inactive) and only
one active case. Then, I defined VacuumCostInactive as an int but use
enum values for it in vacuum code to distinguish between failsafe +
inactive and just inactive (I call it VACUUM_COST_INACTIVE_AND_LOCKED
and VACUUM_COST_INACTIVE_AND_UNLOCKED). Non-vacuum code only needs to
check if VacuumCostInactive is 0 like if (!VacuumCostInactive). I'm
happy with the result, and I think it employs only well-defined C
behavior.
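A minimal sketch of the three-state idea described above, not the code from the attached patches. The enum constant names come from the description; the numeric values other than VACUUM_COST_ACTIVE being 0, and the helpers update_cost_state(), trigger_failsafe() and should_skip_delay(), are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum VacuumCostStatus
{
    VACUUM_COST_ACTIVE = 0,             /* cost-based delay is on */
    VACUUM_COST_INACTIVE_AND_UNLOCKED,  /* off; a reload may turn it on */
    VACUUM_COST_INACTIVE_AND_LOCKED     /* off due to failsafe; stays off */
} VacuumCostStatus;

/* Declared as int so code outside vacuum can test it like a boolean. */
int VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;

/* Hypothetical: re-evaluate the state after a config reload. */
void
update_cost_state(double cost_delay)
{
    /* Once failsafe has locked delays off, a reload must not re-enable them. */
    if (VacuumCostInactive == VACUUM_COST_INACTIVE_AND_LOCKED)
        return;

    assert(cost_delay >= 0);

    VacuumCostInactive = (cost_delay > 0) ?
        VACUUM_COST_ACTIVE : VACUUM_COST_INACTIVE_AND_UNLOCKED;
}

/* Hypothetical: called when the failsafe triggers for the current table. */
void
trigger_failsafe(void)
{
    VacuumCostInactive = VACUUM_COST_INACTIVE_AND_LOCKED;
}

/* What non-vacuum code would check: any nonzero value means "inactive". */
bool
should_skip_delay(void)
{
    return VacuumCostInactive != 0;
}
```

The point of the int declaration is that callers outside vacuum never need to know about the locked/unlocked distinction; only vacuum code compares against the enum values.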

- Melanie

Attachments:

v8-0003-auto-vacuum-reloads-config-file-more-often.patch (text/x-patch; charset=US-ASCII)
From 8b9fcf7c10353dcacb4ac16515aad0ce34565566 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Fri, 24 Mar 2023 13:38:20 -0400
Subject: [PATCH v8 3/4] [auto]vacuum reloads config file more often

Previously, VACUUM and autovacuum workers would reload the configuration
file only between vacuuming tables. This precluded user updates to
cost-based delay parameters from taking effect while vacuuming a table.

Check if a reload is pending roughly once per block now, when checking
if we need to delay.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_buP5wzsho3qNw5o9_R0pF69FRM5hgCmr-mvXmGXwdA7A%40mail.gmail.com#5e6771d4cdca4db6efc2acec2dce0bc7
---
 src/backend/commands/vacuum.c       | 72 +++++++++++++++++++++++++----
 src/backend/postmaster/autovacuum.c | 30 +++++++++++-
 src/include/postmaster/autovacuum.h |  3 +-
 3 files changed, 92 insertions(+), 13 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index eb126f2247..cb32078c19 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -48,6 +48,7 @@
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/interrupt.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/pmsignal.h"
@@ -76,6 +77,7 @@ int			vacuum_multixact_failsafe_age;
 /* A few variables that don't seem worth passing around as parameters */
 static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
+static bool analyze_in_outer_xact = false;
 
 
 /*
@@ -314,8 +316,7 @@ vacuum(List *relations, VacuumParams *params,
 	static bool in_vacuum = false;
 
 	const char *stmttype;
-	volatile bool in_outer_xact,
-				use_own_xacts;
+	volatile bool use_own_xacts;
 
 	Assert(params != NULL);
 
@@ -332,10 +333,10 @@ vacuum(List *relations, VacuumParams *params,
 	if (params->options & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
-		in_outer_xact = false;
+		analyze_in_outer_xact = false;
 	}
 	else
-		in_outer_xact = IsInTransactionBlock(isTopLevel);
+		analyze_in_outer_xact = IsInTransactionBlock(isTopLevel);
 
 	/*
 	 * Due to static variables vac_context, anl_context and vac_strategy,
@@ -457,7 +458,7 @@ vacuum(List *relations, VacuumParams *params,
 		Assert(params->options & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
-		else if (in_outer_xact)
+		else if (analyze_in_outer_xact)
 			use_own_xacts = false;
 		else if (list_length(relations) > 1)
 			use_own_xacts = true;
@@ -475,7 +476,7 @@ vacuum(List *relations, VacuumParams *params,
 	 */
 	if (use_own_xacts)
 	{
-		Assert(!in_outer_xact);
+		Assert(!analyze_in_outer_xact);
 
 		/* ActiveSnapshot is not set by autovacuum */
 		if (ActiveSnapshotSet())
@@ -544,7 +545,7 @@ vacuum(List *relations, VacuumParams *params,
 				}
 
 				analyze_rel(vrel->oid, vrel->relation, params,
-							vrel->va_cols, in_outer_xact, vac_strategy);
+							vrel->va_cols, analyze_in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
 				{
@@ -568,6 +569,7 @@ vacuum(List *relations, VacuumParams *params,
 		in_vacuum = false;
 		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
 		VacuumCostBalance = 0;
+		analyze_in_outer_xact = false;
 	}
 	PG_END_TRY();
 
@@ -2233,7 +2235,52 @@ vacuum_delay_point(void)
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	if (VacuumCostInactive || InterruptPending)
+	if (InterruptPending ||
+		(VacuumCostInactive && !ConfigReloadPending))
+		return;
+
+	/*
+	 * Reload the configuration file if requested. This allows changes to
+	 * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay to
+	 * take effect while a table is being vacuumed or analyzed. Analyze should
+	 * not reload configuration file if it is in an outer transaction, as GUC
+	 * values shouldn't be allowed to refer to some uncommitted state (e.g.
+	 * database objects created in this transaction).
+	 */
+	if (ConfigReloadPending && !analyze_in_outer_xact)
+	{
+		ConfigReloadPending = false;
+		ProcessConfigFile(PGC_SIGHUP);
+
+		/*
+		 * Autovacuum workers must restore the correct values of
+		 * VacuumCostLimit and VacuumCostDelay in case they were overwritten
+		 * by reload.
+		 */
+		AutoVacuumUpdateCosts();
+		AutoVacuumOverrideCosts();
+
+		/*
+		 * If configuration changes are allowed to impact VacuumCostInactive,
+		 * make sure it is updated.
+		 */
+		if (VacuumCostInactive == VACUUM_COST_INACTIVE_AND_LOCKED)
+			return;
+
+		if (VacuumCostDelay > 0)
+			VacuumCostInactive = VACUUM_COST_ACTIVE;
+		else
+		{
+			VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
+			VacuumCostBalance = 0;
+		}
+	}
+
+	/*
+	 * If we disabled cost-based delays after reloading the config file,
+	 * return.
+	 */
+	if (VacuumCostInactive)
 		return;
 
 	/*
@@ -2266,8 +2313,13 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		/* update balance values for workers */
-		AutoVacuumUpdateDelay();
+		/*
+		 * For autovacuum workers, someone may have called
+		 * autovac_balance_cost() since they last updated their
+		 * VacuumCostLimit above. Do so again now to ensure they have a
+		 * current value.
+		 */
+		AutoVacuumOverrideCosts();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c0e2e00a7e..8ac14a44c8 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1778,7 +1778,7 @@ FreeWorkerInfo(int code, Datum arg)
  * each a fraction of the total available I/O.
  */
 void
-AutoVacuumUpdateDelay(void)
+AutoVacuumOverrideCosts(void)
 {
 	if (MyWorkerInfo)
 	{
@@ -1787,6 +1787,29 @@ AutoVacuumUpdateDelay(void)
 	}
 }
 
+/*
+ * Caller must not already hold the AutovacuumLock
+ */
+void
+AutoVacuumUpdateCosts(void)
+{
+	/*
+	 * Even though this autovacuum worker may be vacuuming a table with a cost
+	 * limit table option and not a cost delay table option, we still don't
+	 * refresh the cost delay value.
+	 */
+	if (!MyWorkerInfo || !MyWorkerInfo->wi_dobalance)
+		return;
+
+	LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+	MyWorkerInfo->wi_cost_delay = autovacuum_vac_cost_delay >= 0 ?
+		autovacuum_vac_cost_delay : VacuumCostDelay;
+	MyWorkerInfo->wi_cost_limit_base = autovacuum_vac_cost_limit > 0 ?
+		autovacuum_vac_cost_limit : VacuumCostLimit;
+	autovac_balance_cost();
+	LWLockRelease(AutovacuumLock);
+}
+
 /*
  * autovac_balance_cost
  *		Recalculate the cost limit setting for each active worker.
@@ -2320,6 +2343,9 @@ do_autovacuum(void)
 
 		/*
 		 * Check for config changes before processing each collected table.
+		 * Autovacuum workers must update VacuumCostDelay and VacuumCostLimit
+		 * in case they were overridden by the reload. However, we will do
+		 * this as soon as we check table options a bit later.
 		 */
 		if (ConfigReloadPending)
 		{
@@ -2437,7 +2463,7 @@ do_autovacuum(void)
 		autovac_balance_cost();
 
 		/* set the active cost parameters from the result of that */
-		AutoVacuumUpdateDelay();
+		AutoVacuumOverrideCosts();
 
 		/* done */
 		LWLockRelease(AutovacuumLock);
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index c140371b51..ee48e7123d 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,7 +64,8 @@ extern int	StartAutoVacWorker(void);
 extern void AutoVacWorkerFailed(void);
 
 /* autovacuum cost-delay balancer */
-extern void AutoVacuumUpdateDelay(void);
+extern void AutoVacuumOverrideCosts(void);
+extern void AutoVacuumUpdateCosts(void);
 
 #ifdef EXEC_BACKEND
 extern void AutoVacLauncherMain(int argc, char *argv[]) pg_attribute_noreturn();
-- 
2.37.2

v8-0001-Zero-out-VacuumCostBalance.patch (text/x-patch; charset=US-ASCII)
From 9ade10882c6b2daafb846be667de04225046e157 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 25 Mar 2023 11:59:33 -0400
Subject: [PATCH v8 1/4] Zero out VacuumCostBalance

Though it is unlikely to matter, we should zero out VacuumCostBalance
whenever we may be transitioning the state of VacuumCostActive to false.
---
 src/backend/commands/vacuum.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index c54360a6a0..7d01df7e48 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -550,6 +550,7 @@ vacuum(List *relations, VacuumParams *params,
 	{
 		in_vacuum = false;
 		VacuumCostActive = false;
+		VacuumCostBalance = 0;
 	}
 	PG_END_TRY();
 
-- 
2.37.2

v8-0004-Improve-autovacuum-worker-cost-balancing.patch (text/x-patch; charset=US-ASCII)
From 0a546a538b75ee1a736ff9a09a31412c0b323082 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 25 Mar 2023 14:14:55 -0400
Subject: [PATCH v8 4/4] Improve autovacuum worker cost balancing

Before the prior commit, an autovacuum worker's wi_cost_limit was set
only at the beginning of vacuuming a table, after reloading the config
file. Therefore, at the time that autovac_balance_cost() is called,
workers vacuuming tables with no table options could still have
different values for their wi_cost_limit_base and wi_cost_delay.

Now that the cost parameters can be updated while vacuuming a table,
workers will (within some margin of error) have no reason to have
different values for cost limit and cost delay (in the absence of table
options). This removes the rationale for keeping cost limit and cost
delay in shared memory. Balancing the cost limit requires only the
number of active autovacuum workers vacuuming a table with no cost-based
table options.
---
 src/backend/commands/vacuum.c       |  18 ++-
 src/backend/postmaster/autovacuum.c | 237 ++++++++++++----------------
 src/include/postmaster/autovacuum.h |   4 +-
 3 files changed, 112 insertions(+), 147 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index cb32078c19..54ad76a729 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2257,12 +2257,13 @@ vacuum_delay_point(void)
 		 * VacuumCostLimit and VacuumCostDelay in case they were overwritten
 		 * by reload.
 		 */
-		AutoVacuumUpdateCosts();
-		AutoVacuumOverrideCosts();
+		AutoVacuumUpdateDelay();
+		AutoVacuumUpdateLimit();
 
 		/*
 		 * If configuration changes are allowed to impact VacuumCostInactive,
-		 * make sure it is updated.
+		 * make sure it is updated. Autovacuum workers will have already done
+		 * this in AutoVacuumUpdateDelay()
 		 */
 		if (VacuumCostInactive == VACUUM_COST_INACTIVE_AND_LOCKED)
 			return;
@@ -2314,12 +2315,13 @@ vacuum_delay_point(void)
 		VacuumCostBalance = 0;
 
 		/*
-		 * For autovacuum workers, someone may have called
-		 * autovac_balance_cost() since they last updated their
-		 * VacuumCostLimit above. Do so again now to ensure they have a
-		 * current value.
+		 * Update limit values for autovacuum workers. We must always do this
+		 * in case the autovacuum launcher or another autovacuum worker has
+		 * recalculated the number of workers across which we must balance the
+		 * limit. This is done by the launcher when launching a new worker and
+		 * by workers before vacuuming each table.
 		 */
-		AutoVacuumOverrideCosts();
+		AutoVacuumUpdateLimit();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 8ac14a44c8..0c20442fb1 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -139,6 +139,9 @@ int			Log_autovacuum_min_duration = 600000;
 static bool am_autovacuum_launcher = false;
 static bool am_autovacuum_worker = false;
 
+static double av_table_option_cost_delay = -1;
+static int	av_table_option_cost_limit = 0;
+
 /* Flags set by signal handlers */
 static volatile sig_atomic_t got_SIGUSR2 = false;
 
@@ -189,8 +192,8 @@ typedef struct autovac_table
 {
 	Oid			at_relid;
 	VacuumParams at_params;
-	double		at_vacuum_cost_delay;
-	int			at_vacuum_cost_limit;
+	double		at_table_option_vac_cost_delay;
+	int			at_table_option_vac_cost_limit;
 	bool		at_dobalance;
 	bool		at_sharedrel;
 	char	   *at_relname;
@@ -209,7 +212,7 @@ typedef struct autovac_table
  * wi_sharedrel flag indicating whether table is marked relisshared
  * wi_proc		pointer to PGPROC of the running worker, NULL if not started
  * wi_launchtime Time at which this worker was launched
- * wi_cost_*	Vacuum cost-based delay parameters current in this worker
+ * wi_dobalance Whether this worker should be included in balance calculations
  *
  * All fields are protected by AutovacuumLock, except for wi_tableoid and
  * wi_sharedrel which are protected by AutovacuumScheduleLock (note these
@@ -225,9 +228,6 @@ typedef struct WorkerInfoData
 	TimestampTz wi_launchtime;
 	bool		wi_dobalance;
 	bool		wi_sharedrel;
-	double		wi_cost_delay;
-	int			wi_cost_limit;
-	int			wi_cost_limit_base;
 } WorkerInfoData;
 
 typedef struct WorkerInfoData *WorkerInfo;
@@ -273,6 +273,8 @@ typedef struct AutoVacuumWorkItem
  * av_startingWorker pointer to WorkerInfo currently being started (cleared by
  *					the worker itself as soon as it's up and running)
  * av_workItems		work item array
+ * av_nworkers_for_balance the number of autovacuum workers to use when
+ * 					calculating the per worker cost limit
  *
  * This struct is protected by AutovacuumLock, except for av_signal and parts
  * of the worker list (see above).
@@ -286,6 +288,7 @@ typedef struct
 	dlist_head	av_runningWorkers;
 	WorkerInfo	av_startingWorker;
 	AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+	pg_atomic_uint32 av_nworkers_for_balance;
 } AutoVacuumShmemStruct;
 
 static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -820,7 +823,7 @@ HandleAutoVacLauncherInterrupts(void)
 			AutoVacLauncherShutdown();
 
 		/* rebalance in case the default cost parameters changed */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
 		autovac_balance_cost();
 		LWLockRelease(AutovacuumLock);
 
@@ -1756,9 +1759,6 @@ FreeWorkerInfo(int code, Datum arg)
 		MyWorkerInfo->wi_proc = NULL;
 		MyWorkerInfo->wi_launchtime = 0;
 		MyWorkerInfo->wi_dobalance = false;
-		MyWorkerInfo->wi_cost_delay = 0;
-		MyWorkerInfo->wi_cost_limit = 0;
-		MyWorkerInfo->wi_cost_limit_base = 0;
 		dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 						&MyWorkerInfo->wi_links);
 		/* not mine anymore */
@@ -1773,123 +1773,114 @@ FreeWorkerInfo(int code, Datum arg)
 	}
 }
 
+
 /*
- * Update the cost-based delay parameters, so that multiple workers consume
- * each a fraction of the total available I/O.
+ * Update VacuumCostDelay with the correct value for an autovacuum worker,
+ * given the value of other relevant cost-based delay parameters. Autovacuum
+ * workers should call this after every config reload, in case VacuumCostDelay
+ * was overwritten.
  */
 void
-AutoVacuumOverrideCosts(void)
+AutoVacuumUpdateDelay(void)
 {
-	if (MyWorkerInfo)
+	if (!am_autovacuum_worker)
+		return;
+
+	if (av_table_option_cost_delay >= 0)
+		VacuumCostDelay = av_table_option_cost_delay;
+	else if (autovacuum_vac_cost_delay >= 0)
+		VacuumCostDelay = autovacuum_vac_cost_delay;
+
+	/*
+	 * If configuration changes are allowed to impact VacuumCostInactive, make
+	 * sure it is updated.
+	 */
+	if (VacuumCostInactive == VACUUM_COST_INACTIVE_AND_LOCKED)
+		return;
+
+	if (VacuumCostDelay > 0)
+		VacuumCostInactive = VACUUM_COST_ACTIVE;
+	else
 	{
-		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
-		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
+		VacuumCostBalance = 0;
 	}
 }
 
+
 /*
- * Caller must not already hold the AutovacuumLock
+ * Update VacuumCostLimit with the correct value for an autovacuum worker,
+ * given the value of other relevant cost limit parameters and the number of
+ * workers across which the limit must be balanced. Autovacuum workers must
+ * call this regularly in case av_nworkers_for_balance has been updated by
+ * another worker or by the autovacuum launcher. They also must call this after
+ * every config reload, in case VacuumCostLimit was overwritten.
  */
 void
-AutoVacuumUpdateCosts(void)
+AutoVacuumUpdateLimit(void)
 {
+	if (!am_autovacuum_worker)
+		return;
+
 	/*
-	 * Even though this autovacuum worker may be vacuuming a table with a cost
-	 * limit table option and not a cost delay table option, we still don't
-	 * refresh the cost delay value.
+	 * note: in cost_limit, zero also means use value from elsewhere, because
+	 * zero is not a valid value.
 	 */
-	if (!MyWorkerInfo || !MyWorkerInfo->wi_dobalance)
-		return;
+	if (av_table_option_cost_limit > 0)
+		VacuumCostLimit = av_table_option_cost_limit;
+	else
+	{
+		/* There is at least 1 autovac worker (this worker). */
+		int			nworkers_for_balance = Max(pg_atomic_read_u32(
+					&AutoVacuumShmem->av_nworkers_for_balance), 1);
 
-	LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-	MyWorkerInfo->wi_cost_delay = autovacuum_vac_cost_delay >= 0 ?
-		autovacuum_vac_cost_delay : VacuumCostDelay;
-	MyWorkerInfo->wi_cost_limit_base = autovacuum_vac_cost_limit > 0 ?
+		int			vac_cost_limit = autovacuum_vac_cost_limit > 0 ?
 		autovacuum_vac_cost_limit : VacuumCostLimit;
-	autovac_balance_cost();
-	LWLockRelease(AutovacuumLock);
+
+		int			balanced_cost_limit = vac_cost_limit / nworkers_for_balance;
+
+		VacuumCostLimit = Max(Min(balanced_cost_limit, vac_cost_limit), 1);
+	}
 }
 
+
 /*
  * autovac_balance_cost
- *		Recalculate the cost limit setting for each active worker.
+ *		Recalculate the number of workers to consider, given table options and
+ *		the current number of active workers.
  *
- * Caller must hold the AutovacuumLock in exclusive mode.
+ * Caller must hold the AutovacuumLock in at least shared mode.
  */
 static void
 autovac_balance_cost(void)
 {
-	/*
-	 * The idea here is that we ration out I/O equally.  The amount of I/O
-	 * that a worker can consume is determined by cost_limit/cost_delay, so we
-	 * try to equalize those ratios rather than the raw limit settings.
-	 *
-	 * note: in cost_limit, zero also means use value from elsewhere, because
-	 * zero is not a valid value.
-	 */
-	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : VacuumCostLimit);
-	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : VacuumCostDelay);
-	double		cost_total;
-	double		cost_avail;
 	dlist_iter	iter;
+	int			orig_nworkers_for_balance;
+	int			nworkers_for_balance = 0;
 
-	/* not set? nothing to do */
-	if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
+	if (autovacuum_vac_cost_delay == 0 ||
+		(autovacuum_vac_cost_delay == -1 && VacuumCostDelay == 0))
 		return;
 
-	/* calculate the total base cost limit of participating active workers */
-	cost_total = 0.0;
-	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
-	{
-		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
-
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-			cost_total +=
-				(double) worker->wi_cost_limit_base / worker->wi_cost_delay;
-	}
-
-	/* there are no cost limits -- nothing to do */
-	if (cost_total <= 0)
+	if (autovacuum_vac_cost_limit <= 0 && VacuumCostLimit <= 0)
 		return;
 
-	/*
-	 * Adjust cost limit of each active worker to balance the total of cost
-	 * limit to autovacuum_vacuum_cost_limit.
-	 */
-	cost_avail = (double) vac_cost_limit / vac_cost_delay;
+	orig_nworkers_for_balance =
+		pg_atomic_read_u32(&AutoVacuumShmem->av_nworkers_for_balance);
+
 	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
 	{
 		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-		{
-			int			limit = (int)
-			(cost_avail * worker->wi_cost_limit_base / cost_total);
-
-			/*
-			 * We put a lower bound of 1 on the cost_limit, to avoid division-
-			 * by-zero in the vacuum code.  Also, in case of roundoff trouble
-			 * in these calculations, let's be sure we don't ever set
-			 * cost_limit to more than the base value.
-			 */
-			worker->wi_cost_limit = Max(Min(limit,
-											worker->wi_cost_limit_base),
-										1);
-		}
+		if (worker->wi_proc == NULL || !worker->wi_dobalance)
+			continue;
 
-		if (worker->wi_proc != NULL)
-			elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
-				 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
-				 worker->wi_dobalance ? "yes" : "no",
-				 worker->wi_cost_limit, worker->wi_cost_limit_base,
-				 worker->wi_cost_delay);
+		nworkers_for_balance++;
 	}
+
+	if (nworkers_for_balance != orig_nworkers_for_balance)
+		pg_atomic_write_u32(&AutoVacuumShmem->av_nworkers_for_balance,
+							nworkers_for_balance);
 }
 
 /*
@@ -2335,8 +2326,6 @@ do_autovacuum(void)
 		autovac_table *tab;
 		bool		isshared;
 		bool		skipit;
-		double		stdVacuumCostDelay;
-		int			stdVacuumCostLimit;
 		dlist_iter	iter;
 
 		CHECK_FOR_INTERRUPTS();
@@ -2442,32 +2431,18 @@ do_autovacuum(void)
 			continue;
 		}
 
-		/*
-		 * Remember the prevailing values of the vacuum cost GUCs.  We have to
-		 * restore these at the bottom of the loop, else we'll compute wrong
-		 * values in the next iteration of autovac_balance_cost().
-		 */
-		stdVacuumCostDelay = VacuumCostDelay;
-		stdVacuumCostLimit = VacuumCostLimit;
+		av_table_option_cost_limit = tab->at_table_option_vac_cost_limit;
+		av_table_option_cost_delay = tab->at_table_option_vac_cost_delay;
 
 		/* Must hold AutovacuumLock while mucking with cost balance info */
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-
-		/* advertise my cost delay parameters for the balancing algorithm */
 		MyWorkerInfo->wi_dobalance = tab->at_dobalance;
-		MyWorkerInfo->wi_cost_delay = tab->at_vacuum_cost_delay;
-		MyWorkerInfo->wi_cost_limit = tab->at_vacuum_cost_limit;
-		MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
-
-		/* do a balance */
 		autovac_balance_cost();
-
-		/* set the active cost parameters from the result of that */
-		AutoVacuumOverrideCosts();
-
-		/* done */
 		LWLockRelease(AutovacuumLock);
 
+		AutoVacuumUpdateDelay();
+		AutoVacuumUpdateLimit();
+
 		/* clean up memory before each iteration */
 		MemoryContextResetAndDeleteChildren(PortalContext);
 
@@ -2551,19 +2526,15 @@ deleted:
 
 		/*
 		 * Remove my info from shared memory.  We could, but intentionally
-		 * don't, clear wi_cost_limit and friends --- this is on the
-		 * assumption that we probably have more to do with similar cost
-		 * settings, so we don't want to give up our share of I/O for a very
-		 * short interval and thereby thrash the global balance.
+		 * don't, set wi_dobalance to false on the assumption that we are more
+		 * likely than not to vacuum a table with no table options next, so we
+		 * don't want to give up our share of I/O for a very short interval
+		 * and thereby thrash the global balance.
 		 */
 		LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
 		MyWorkerInfo->wi_tableoid = InvalidOid;
 		MyWorkerInfo->wi_sharedrel = false;
 		LWLockRelease(AutovacuumScheduleLock);
-
-		/* restore vacuum cost GUCs for the next iteration */
-		VacuumCostDelay = stdVacuumCostDelay;
-		VacuumCostLimit = stdVacuumCostLimit;
 	}
 
 	/*
@@ -2595,6 +2566,8 @@ deleted:
 		{
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			AutoVacuumUpdateDelay();
+			AutoVacuumUpdateLimit();
 		}
 
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
@@ -2827,8 +2800,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		int			freeze_table_age;
 		int			multixact_freeze_min_age;
 		int			multixact_freeze_table_age;
-		int			vac_cost_limit;
-		double		vac_cost_delay;
 		int			log_min_duration;
 
 		/*
@@ -2838,20 +2809,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 * defaults, autovacuum's own first and plain vacuum second.
 		 */
 
-		/* -1 in autovac setting means use plain vacuum_cost_delay */
-		vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
-			? avopts->vacuum_cost_delay
-			: (autovacuum_vac_cost_delay >= 0)
-			? autovacuum_vac_cost_delay
-			: VacuumCostDelay;
-
-		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
-		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
-			? avopts->vacuum_cost_limit
-			: (autovacuum_vac_cost_limit > 0)
-			? autovacuum_vac_cost_limit
-			: VacuumCostLimit;
-
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
 			? avopts->log_min_duration
@@ -2907,8 +2864,10 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age;
 		tab->at_params.is_wraparound = wraparound;
 		tab->at_params.log_min_duration = log_min_duration;
-		tab->at_vacuum_cost_limit = vac_cost_limit;
-		tab->at_vacuum_cost_delay = vac_cost_delay;
+		tab->at_table_option_vac_cost_limit = avopts ?
+			avopts->vacuum_cost_limit : 0;
+		tab->at_table_option_vac_cost_delay = avopts ?
+			avopts->vacuum_cost_delay : -1;
 		tab->at_relname = NULL;
 		tab->at_nspname = NULL;
 		tab->at_datname = NULL;
@@ -3400,10 +3359,14 @@ AutoVacuumShmemInit(void)
 		worker = (WorkerInfo) ((char *) AutoVacuumShmem +
 							   MAXALIGN(sizeof(AutoVacuumShmemStruct)));
 
+
 		/* initialize the WorkerInfo free list */
 		for (i = 0; i < autovacuum_max_workers; i++)
 			dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 							&worker[i].wi_links);
+
+		pg_atomic_init_u32(&AutoVacuumShmem->av_nworkers_for_balance, 0);
+
 	}
 	else
 		Assert(found);
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index ee48e7123d..7b462866c9 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,8 +64,8 @@ extern int	StartAutoVacWorker(void);
 extern void AutoVacWorkerFailed(void);
 
 /* autovacuum cost-delay balancer */
-extern void AutoVacuumOverrideCosts(void);
-extern void AutoVacuumUpdateCosts(void);
+extern void AutoVacuumUpdateDelay(void);
+extern void AutoVacuumUpdateLimit(void);
 
 #ifdef EXEC_BACKEND
 extern void AutoVacLauncherMain(int argc, char *argv[]) pg_attribute_noreturn();
-- 
2.37.2

v8-0002-Make-VacuumCostActive-failsafe-aware.patch (text/x-patch; charset=US-ASCII)
From 3a9492b1f3eaacafc730aea0d53085046886ecad Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 25 Mar 2023 12:05:18 -0400
Subject: [PATCH v8 2/4] Make VacuumCostActive failsafe-aware

While vacuuming a table in failsafe mode, VacuumCostActive should not be
re-enabled. This currently isn't a problem because vacuum cost
parameters are only refreshed in between vacuuming tables and failsafe
status is reset for every table. In preparation for allowing vacuum cost
parameters to be updated more frequently, make vacuum cost status more
expressive.

VacuumCostActive is now VacuumCostInactive, as it can only be active in
one way but it can be inactive in two ways. If performing a failsafe
vacuum, the vacuum cost status cannot be enabled and is effectively
"locked". If performing a non-failsafe vacuum, the vacuum cost status
may be active or inactive. To express this, VacuumCostInactive can be
one of three statuses: VACUUM_COST_INACTIVE_AND_LOCKED,
VACUUM_COST_INACTIVE_AND_UNLOCKED, and VACUUM_COST_ACTIVE.

VacuumCostInactive is defined as an integer because we do not want
non-vacuum code concerning itself with the distinction between the three
statuses -- only with whether or not VacuumCostInactive == 0.
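
As an illustration only, the status handling described above can be sketched
in standalone C. The enum mirrors the one this patch adds to vacuum.h;
next_cost_status() and cost_accounting_enabled() are hypothetical helpers that
do not appear in the patch, and the sketch assumes the status is recomputed
from the cost delay alone.

```c
#include <assert.h>

/* Mirrors the enum this patch adds to vacuum.h. */
typedef enum VacuumCostStatus
{
	VACUUM_COST_INACTIVE_AND_LOCKED = -1,
	VACUUM_COST_ACTIVE = 0,
	VACUUM_COST_INACTIVE_AND_UNLOCKED = 1
} VacuumCostStatus;

/*
 * Hypothetical helper (not in the patch): return the status that should
 * follow a change to the cost delay.  A locked status, which is set when
 * the failsafe triggers, is never overwritten.
 */
static int
next_cost_status(int current, double cost_delay)
{
	assert(cost_delay >= 0);

	if (current == VACUUM_COST_INACTIVE_AND_LOCKED)
		return current;

	return cost_delay > 0 ? VACUUM_COST_ACTIVE :
		VACUUM_COST_INACTIVE_AND_UNLOCKED;
}

/*
 * Hypothetical helper: the only question non-vacuum code needs to ask.
 * Both inactive statuses are nonzero, so a plain truth test is enough.
 */
static int
cost_accounting_enabled(int status)
{
	return !status;
}
```

Because both inactive statuses are nonzero, callers such as bufmgr.c can keep
a single `if (!VacuumCostInactive)` test and never need to name the enum
values.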
---
 src/backend/access/heap/vacuumlazy.c  |  2 +-
 src/backend/commands/vacuum.c         | 23 ++++++++++++++++++++---
 src/backend/commands/vacuumparallel.c |  8 ++++++--
 src/backend/storage/buffer/bufmgr.c   |  8 ++++----
 src/backend/utils/init/globals.c      |  2 +-
 src/include/commands/vacuum.h         |  8 ++++++++
 src/include/miscadmin.h               |  3 +--
 7 files changed, 41 insertions(+), 13 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8f14cf85f3..040a4e931b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2637,7 +2637,7 @@ lazy_check_wraparound_failsafe(LVRelState *vacrel)
 						 "You might also need to consider other ways for VACUUM to keep up with the allocation of transaction IDs.")));
 
 		/* Stop applying cost limits from this point on */
-		VacuumCostActive = false;
+		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_LOCKED;
 		VacuumCostBalance = 0;
 
 		return true;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7d01df7e48..eb126f2247 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -491,7 +491,6 @@ vacuum(List *relations, VacuumParams *params,
 		ListCell   *cur;
 
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
@@ -507,6 +506,24 @@ vacuum(List *relations, VacuumParams *params,
 		{
 			VacuumRelation *vrel = lfirst_node(VacuumRelation, cur);
 
+			/*
+			 * failsafe_active is reset per relation, so we must be sure that
+			 * VacuumCostInactive is set to either VACUUM_COST_ACTIVE or
+			 * VACUUM_COST_INACTIVE_AND_UNLOCKED in between vacuuming
+			 * relations.
+			 */
+			VacuumCostInactive = VacuumCostDelay > 0 ? VACUUM_COST_ACTIVE :
+				VACUUM_COST_INACTIVE_AND_UNLOCKED;
+
+			/*
+			 * We should not have transitioned VacuumCostInactive from
+			 * VACUUM_COST_ACTIVE to VACUUM_COST_INACTIVE_AND_UNLOCKED above,
+			 * as that should have happened when we changed the value of
+			 * VacuumCostDelay.
+			 */
+			Assert(VacuumCostInactive == VACUUM_COST_ACTIVE ||
+				   VacuumCostBalance == 0);
+
 			if (params->options & VACOPT_VACUUM)
 			{
 				if (!vacuum_rel(vrel->oid, vrel->relation, params, false))
@@ -549,7 +566,7 @@ vacuum(List *relations, VacuumParams *params,
 	PG_FINALLY();
 	{
 		in_vacuum = false;
-		VacuumCostActive = false;
+		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
 		VacuumCostBalance = 0;
 	}
 	PG_END_TRY();
@@ -2216,7 +2233,7 @@ vacuum_delay_point(void)
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	if (!VacuumCostActive || InterruptPending)
+	if (VacuumCostInactive || InterruptPending)
 		return;
 
 	/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index bcd40c80a1..266bf6bb4c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -989,8 +989,12 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												 PARALLEL_VACUUM_KEY_DEAD_ITEMS,
 												 false);
 
-	/* Set cost-based vacuum delay */
-	VacuumCostActive = (VacuumCostDelay > 0);
+	/*
+	 * Set cost-based vacuum delay. Parallel vacuum workers will not execute
+	 * failsafe VACUUM.
+	 */
+	VacuumCostInactive = VacuumCostDelay > 0 ? VACUUM_COST_ACTIVE :
+		VACUUM_COST_INACTIVE_AND_UNLOCKED;
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 95212a3941..6d3dd26fc7 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -893,7 +893,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 			*hit = true;
 			VacuumPageHit++;
 
-			if (VacuumCostActive)
+			if (!VacuumCostInactive)
 				VacuumCostBalance += VacuumCostPageHit;
 
 			TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
@@ -1098,7 +1098,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 	}
 
 	VacuumPageMiss++;
-	if (VacuumCostActive)
+	if (!VacuumCostInactive)
 		VacuumCostBalance += VacuumCostPageMiss;
 
 	TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
@@ -1672,7 +1672,7 @@ MarkBufferDirty(Buffer buffer)
 	{
 		VacuumPageDirty++;
 		pgBufferUsage.shared_blks_dirtied++;
-		if (VacuumCostActive)
+		if (!VacuumCostInactive)
 			VacuumCostBalance += VacuumCostPageDirty;
 	}
 }
@@ -4199,7 +4199,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
 		{
 			VacuumPageDirty++;
 			pgBufferUsage.shared_blks_dirtied++;
-			if (VacuumCostActive)
+			if (!VacuumCostInactive)
 				VacuumCostBalance += VacuumCostPageDirty;
 		}
 	}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 1b1d814254..608ebb9182 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -150,4 +150,4 @@ int64		VacuumPageMiss = 0;
 int64		VacuumPageDirty = 0;
 
 int			VacuumCostBalance = 0;	/* working state for vacuum */
-bool		VacuumCostActive = false;
+int			VacuumCostInactive = 1;
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index bdfd96cfec..5c3e250b06 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -302,6 +302,14 @@ extern PGDLLIMPORT int vacuum_failsafe_age;
 extern PGDLLIMPORT int vacuum_multixact_failsafe_age;
 
 /* Variables for cost-based parallel vacuum */
+
+typedef enum VacuumCostStatus
+{
+	VACUUM_COST_INACTIVE_AND_LOCKED = -1,
+	VACUUM_COST_ACTIVE = 0,
+	VACUUM_COST_INACTIVE_AND_UNLOCKED = 1,
+}			VacuumCostStatus;
+
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 06a86f9ac1..33e22733ae 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -274,8 +274,7 @@ extern PGDLLIMPORT int64 VacuumPageMiss;
 extern PGDLLIMPORT int64 VacuumPageDirty;
 
 extern PGDLLIMPORT int VacuumCostBalance;
-extern PGDLLIMPORT bool VacuumCostActive;
-
+extern PGDLLIMPORT int VacuumCostInactive;
 
 /* in tcop/postgres.c */
 
-- 
2.37.2

#31 Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#30)
4 attachment(s)
Re: Should vacuum process config file reload more often

On Sat, Mar 25, 2023 at 3:03 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Thu, Mar 23, 2023 at 8:27 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Thu, Mar 23, 2023 at 2:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

And, I was wondering if it was worth trying to split up the part that
reloads the config file and all of the autovacuum stuff. The reloading
of the config file by itself won't actually result in autovacuum workers
having updated cost delays because of them overwriting it with
wi_cost_delay, but it will allow VACUUM to have those updated values.

It makes sense to me to have changes for overhauling the rebalance
mechanism in a separate patch.

Looking back at the original concern you mentioned[1]:

speed up long-running vacuum of a large table by
decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the
config file is only reloaded between tables (for autovacuum) or after
the statement (for explicit vacuum).

does it make sense to have autovac_balance_cost() update workers'
wi_cost_delay too? Autovacuum launcher already reloads the config file
and does the rebalance. So I thought autovac_balance_cost() can update
the cost_delay as well, and this might be a minimal change to deal
with your concern. This doesn't have the effect for manual VACUUM but
since vacuum delay is disabled by default it won't be a big problem.
As for manual VACUUMs, we would need to reload the config file in
vacuum_delay_point() as the part of your patch does. Overhauling the
rebalance mechanism would be another patch to improve it further.

So, we can't do this without acquiring an access shared lock on every
call to vacuum_delay_point() because cost delay is a double.

I will work on a patchset with separate commits for reloading the config
file, though (with autovac not benefitting in the first commit).

So, I realized we could actually do as you say and have autovac workers
update their wi_cost_delay and keep the balance changes in a separate
commit. I've done this in attached v8.

Workers take the exclusive lock to update their wi_cost_delay and
wi_cost_limit only when there is a config reload. So, there is one
commit that implements this behavior and a separate commit to revise the
worker rebalancing.

So, I've attached an alternate version of the patchset which takes the
approach of having one commit which only enables cost-based delay GUC
refresh for VACUUM and another commit which enables it for autovacuum
and makes the changes to balancing variables.

I still think the commit which has workers updating their own
wi_cost_delay in vacuum_delay_point() is a bit weird. It relies on no one
else emulating our bad behavior and reading from wi_cost_delay without a
lock and on no one else deciding to ever write to wi_cost_delay (even
though it is in shared memory [this is the same as master]). It is only
safe because our process is the only one (right now) writing to
wi_cost_delay, so when we read from it without a lock, we know it isn't
being written to. And everyone else takes a lock when reading from
wi_cost_delay right now. So, it seems...not great.

This approach also introduces a function that is only around for
one commit until the next commit obsoletes it, which seems a bit silly.

Basically, I think it is probably better to just have one commit
enabling guc refresh for VACUUM and then another which correctly
implements what is needed for autovacuum to do the same.
Attached v9 does this.
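
As a rough sketch of what the autovacuum commit in v9 shares between
processes: only a count of the workers taking part in balancing, kept in an
atomic so that readers need no lock. The code below uses C11 <stdatomic.h> as
a stand-in for pg_atomic_uint32, which is an assumption made to keep it
standalone; publish_balance_count() and read_balance_count() are hypothetical
names, not functions from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for AutoVacuumShmem->av_nworkers_for_balance. */
static atomic_uint nworkers_for_balance;

/*
 * Hypothetical analogue of autovac_balance_cost(): count the worker slots
 * that take part in balancing and publish the count only if it changed.
 */
static void
publish_balance_count(const bool *dobalance, int nslots)
{
	unsigned int count = 0;

	assert(nslots >= 0);

	for (int i = 0; i < nslots; i++)
	{
		if (dobalance[i])
			count++;
	}

	if (count != atomic_load(&nworkers_for_balance))
		atomic_store(&nworkers_for_balance, count);
}

/* Readers take no lock; they simply load the current count. */
static unsigned int
read_balance_count(void)
{
	return atomic_load(&nworkers_for_balance);
}
```

In the patch itself the counting still happens under AutovacuumLock, since it
walks the running-workers list; only the read of the published count is
lock-free.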

I've provided complete versions of both approaches (v9 and v8).

- Melanie

Attachments:

v9-0001-Zero-out-VacuumCostBalance.patch (text/x-patch; charset=US-ASCII)
From 94c08c1b764619ad6cc3a0c75295f416e1863b26 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 25 Mar 2023 11:59:33 -0400
Subject: [PATCH v9 1/4] Zero out VacuumCostBalance

Though it is unlikely to matter, we should zero out VacuumCostBalance
whenever we may be transitioning the state of VacuumCostActive to false.
---
 src/backend/commands/vacuum.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index c54360a6a0..7d01df7e48 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -550,6 +550,7 @@ vacuum(List *relations, VacuumParams *params,
 	{
 		in_vacuum = false;
 		VacuumCostActive = false;
+		VacuumCostBalance = 0;
 	}
 	PG_END_TRY();
 
-- 
2.37.2

v9-0004-Autovacuum-refreshes-cost-based-delay-params-more.patch (text/x-patch; charset=US-ASCII)
From 3bdf9414c5840bcc94a9c0292f25ec41d2e95b69 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 25 Mar 2023 14:14:55 -0400
Subject: [PATCH v9 4/4] Autovacuum refreshes cost-based delay params more
 often

The previous commit allowed VACUUM to reload the config file more often
so that cost-based delay parameters could take effect while VACUUMing a
relation. Autovacuum, however, did not benefit from this change.

In order for autovacuum workers to safely update their own cost delay
and cost limit parameters without impacting performance, we had to
rethink when and how these values were accessed.

Previously, an autovacuum worker's wi_cost_limit was set only at the
beginning of vacuuming a table, after reloading the config file.
Therefore, at the time that autovac_balance_cost() is called, workers
vacuuming tables with no table options could still have different values
for their wi_cost_limit_base and wi_cost_delay.

Now that the cost parameters can be updated while vacuuming a table,
workers will (within some margin of error) have no reason to have
different values for cost limit and cost delay (in the absence of table
options). This removes the rationale for keeping cost limit and cost
delay in shared memory. Balancing the cost limit requires only the
number of active autovacuum workers vacuuming a table with no cost-based
table options.
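
For illustration, the per-worker arithmetic can be sketched as a standalone
function. balanced_cost_limit() is a hypothetical name, and the Max/Min macros
are redefined locally so the sketch compiles outside the PostgreSQL tree; the
formula itself follows AutoVacuumUpdateLimit() in this patch.

```c
#include <assert.h>

#define Max(x, y) ((x) > (y) ? (x) : (y))
#define Min(x, y) ((x) < (y) ? (x) : (y))

/*
 * Hypothetical standalone version of the arithmetic in
 * AutoVacuumUpdateLimit(): split base_limit evenly across the workers that
 * take part in balancing, never dropping below 1.
 */
static int
balanced_cost_limit(int base_limit, unsigned int nworkers_for_balance)
{
	int			nworkers;
	int			limit;

	assert(base_limit > 0);

	/* The calling worker always counts, even if the shared count is 0. */
	nworkers = (int) Max(nworkers_for_balance, 1u);
	limit = base_limit / nworkers;

	return Max(Min(limit, base_limit), 1);
}
```

With a base limit of 200 and three balancing workers, each worker gets
200 / 3 = 66 by integer division; with a base limit of 2 and five workers the
lower bound of 1 applies.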
---
 src/backend/commands/vacuum.c       |  24 ++-
 src/backend/postmaster/autovacuum.c | 228 +++++++++++++---------------
 src/include/postmaster/autovacuum.h |   1 +
 3 files changed, 124 insertions(+), 129 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 0e686c94b2..8e39a13285 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2241,11 +2241,11 @@ vacuum_delay_point(void)
 
 	/*
 	 * Reload the configuration file if requested. This allows changes to
-	 * vacuum_cost_limit and vacuum_cost_delay to take effect while a table is
-	 * being vacuumed or analyzed. Analyze should not reload configuration
-	 * file if it is in an outer transaction, as GUC values shouldn't be
-	 * allowed to refer to some uncommitted state (e.g. database objects
-	 * created in this transaction).
+	 * [autovacuum]_vacuum_cost_limit and [autovacuum]_vacuum_cost_delay to
+	 * take effect while a table is being vacuumed or analyzed. Analyze should
+	 * not reload configuration file if it is in an outer transaction, as GUC
+	 * values shouldn't be allowed to refer to some uncommitted state (e.g.
+	 * database objects created in this transaction).
 	 */
 	if (ConfigReloadPending && !analyze_in_outer_xact)
 	{
@@ -2258,10 +2258,12 @@ vacuum_delay_point(void)
 		 * by reload.
 		 */
 		AutoVacuumUpdateDelay();
+		AutoVacuumUpdateLimit();
 
 		/*
 		 * If configuration changes are allowed to impact VacuumCostInactive,
-		 * make sure it is updated.
+		 * make sure it is updated. Autovacuum workers will have already done
+		 * this in AutoVacuumUpdateDelay().
 		 */
 		if (VacuumCostInactive == VACUUM_COST_INACTIVE_AND_LOCKED)
 			return;
@@ -2312,8 +2314,14 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		/* update balance values for workers */
-		AutoVacuumUpdateDelay();
+		/*
+		 * Update limit values for autovacuum workers. We must always do this
+		 * in case the autovacuum launcher or another autovacuum worker has
+		 * recalculated the number of workers across which we must balance the
+		 * limit. This is done by the launcher when launching a new worker and
+		 * by workers before vacuuming each table.
+		 */
+		AutoVacuumUpdateLimit();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 585d28148c..6fe16aca3a 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -139,6 +139,9 @@ int			Log_autovacuum_min_duration = 600000;
 static bool am_autovacuum_launcher = false;
 static bool am_autovacuum_worker = false;
 
+static double av_table_option_cost_delay = -1;
+static int	av_table_option_cost_limit = 0;
+
 /* Flags set by signal handlers */
 static volatile sig_atomic_t got_SIGUSR2 = false;
 
@@ -189,8 +192,8 @@ typedef struct autovac_table
 {
 	Oid			at_relid;
 	VacuumParams at_params;
-	double		at_vacuum_cost_delay;
-	int			at_vacuum_cost_limit;
+	double		at_table_option_vac_cost_delay;
+	int			at_table_option_vac_cost_limit;
 	bool		at_dobalance;
 	bool		at_sharedrel;
 	char	   *at_relname;
@@ -209,7 +212,7 @@ typedef struct autovac_table
  * wi_sharedrel flag indicating whether table is marked relisshared
  * wi_proc		pointer to PGPROC of the running worker, NULL if not started
  * wi_launchtime Time at which this worker was launched
- * wi_cost_*	Vacuum cost-based delay parameters current in this worker
+ * wi_dobalance Whether this worker should be included in balance calculations
  *
  * All fields are protected by AutovacuumLock, except for wi_tableoid and
  * wi_sharedrel which are protected by AutovacuumScheduleLock (note these
@@ -225,9 +228,6 @@ typedef struct WorkerInfoData
 	TimestampTz wi_launchtime;
 	bool		wi_dobalance;
 	bool		wi_sharedrel;
-	double		wi_cost_delay;
-	int			wi_cost_limit;
-	int			wi_cost_limit_base;
 } WorkerInfoData;
 
 typedef struct WorkerInfoData *WorkerInfo;
@@ -273,6 +273,8 @@ typedef struct AutoVacuumWorkItem
  * av_startingWorker pointer to WorkerInfo currently being started (cleared by
  *					the worker itself as soon as it's up and running)
  * av_workItems		work item array
+ * av_nworkers_for_balance the number of autovacuum workers to use when
+ * 					calculating the per worker cost limit
  *
  * This struct is protected by AutovacuumLock, except for av_signal and parts
  * of the worker list (see above).
@@ -286,6 +288,7 @@ typedef struct
 	dlist_head	av_runningWorkers;
 	WorkerInfo	av_startingWorker;
 	AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+	pg_atomic_uint32 av_nworkers_for_balance;
 } AutoVacuumShmemStruct;
 
 static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -820,7 +823,7 @@ HandleAutoVacLauncherInterrupts(void)
 			AutoVacLauncherShutdown();
 
 		/* rebalance in case the default cost parameters changed */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
 		autovac_balance_cost();
 		LWLockRelease(AutovacuumLock);
 
@@ -1756,9 +1759,6 @@ FreeWorkerInfo(int code, Datum arg)
 		MyWorkerInfo->wi_proc = NULL;
 		MyWorkerInfo->wi_launchtime = 0;
 		MyWorkerInfo->wi_dobalance = false;
-		MyWorkerInfo->wi_cost_delay = 0;
-		MyWorkerInfo->wi_cost_limit = 0;
-		MyWorkerInfo->wi_cost_limit_base = 0;
 		dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 						&MyWorkerInfo->wi_links);
 		/* not mine anymore */
@@ -1773,100 +1773,114 @@ FreeWorkerInfo(int code, Datum arg)
 	}
 }
 
+
 /*
- * Update the cost-based delay parameters, so that multiple workers consume
- * each a fraction of the total available I/O.
+ * Update VacuumCostDelay with the correct value for an autovacuum worker,
+ * given the value of other relevant cost-based delay parameters. Autovacuum
+ * workers should call this after every config reload, in case VacuumCostDelay
+ * was overwritten.
  */
 void
 AutoVacuumUpdateDelay(void)
 {
-	if (MyWorkerInfo)
+	if (!am_autovacuum_worker)
+		return;
+
+	if (av_table_option_cost_delay >= 0)
+		VacuumCostDelay = av_table_option_cost_delay;
+	else if (autovacuum_vac_cost_delay >= 0)
+		VacuumCostDelay = autovacuum_vac_cost_delay;
+
+	/*
+	 * If configuration changes are allowed to impact VacuumCostInactive, make
+	 * sure it is updated.
+	 */
+	if (VacuumCostInactive == VACUUM_COST_INACTIVE_AND_LOCKED)
+		return;
+
+	if (VacuumCostDelay > 0)
+		VacuumCostInactive = VACUUM_COST_ACTIVE;
+	else
 	{
-		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
-		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
+		VacuumCostBalance = 0;
 	}
 }
 
+
 /*
- * autovac_balance_cost
- *		Recalculate the cost limit setting for each active worker.
- *
- * Caller must hold the AutovacuumLock in exclusive mode.
+ * Update VacuumCostLimit with the correct value for an autovacuum worker,
+ * given the value of other relevant cost limit parameters and the number of
+ * workers across which the limit must be balanced. Autovacuum workers must
+ * call this regularly in case av_nworkers_for_balance has been updated by
+ * another worker or by the autovacuum launcher. They also must call this after
+ * every config reload, in case VacuumCostLimit was overwritten.
  */
-static void
-autovac_balance_cost(void)
+void
+AutoVacuumUpdateLimit(void)
 {
+	if (!am_autovacuum_worker)
+		return;
+
 	/*
-	 * The idea here is that we ration out I/O equally.  The amount of I/O
-	 * that a worker can consume is determined by cost_limit/cost_delay, so we
-	 * try to equalize those ratios rather than the raw limit settings.
-	 *
 	 * note: in cost_limit, zero also means use value from elsewhere, because
 	 * zero is not a valid value.
 	 */
-	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : VacuumCostLimit);
-	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : VacuumCostDelay);
-	double		cost_total;
-	double		cost_avail;
-	dlist_iter	iter;
+	if (av_table_option_cost_limit > 0)
+		VacuumCostLimit = av_table_option_cost_limit;
+	else
+	{
+		/* There is at least 1 autovac worker (this worker). */
+		int			nworkers_for_balance = Max(pg_atomic_read_u32(
+								&AutoVacuumShmem->av_nworkers_for_balance), 1);
 
-	/* not set? nothing to do */
-	if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
-		return;
+		int			vac_cost_limit = autovacuum_vac_cost_limit > 0 ?
+		autovacuum_vac_cost_limit : VacuumCostLimit;
 
-	/* calculate the total base cost limit of participating active workers */
-	cost_total = 0.0;
-	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
-	{
-		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
+		int			balanced_cost_limit = vac_cost_limit / nworkers_for_balance;
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-			cost_total +=
-				(double) worker->wi_cost_limit_base / worker->wi_cost_delay;
+		VacuumCostLimit = Max(Min(balanced_cost_limit, vac_cost_limit), 1);
 	}
+}
 
-	/* there are no cost limits -- nothing to do */
-	if (cost_total <= 0)
+
+/*
+ * autovac_balance_cost
+ *		Recalculate the number of workers to consider, given table options and
+ *		the current number of active workers.
+ *
+ * Caller must hold the AutovacuumLock in at least shared mode.
+ */
+static void
+autovac_balance_cost(void)
+{
+	dlist_iter	iter;
+	int			orig_nworkers_for_balance;
+	int			nworkers_for_balance = 0;
+
+	if (autovacuum_vac_cost_delay == 0 ||
+		(autovacuum_vac_cost_delay == -1 && VacuumCostDelay == 0))
 		return;
 
-	/*
-	 * Adjust cost limit of each active worker to balance the total of cost
-	 * limit to autovacuum_vacuum_cost_limit.
-	 */
-	cost_avail = (double) vac_cost_limit / vac_cost_delay;
+	if (autovacuum_vac_cost_limit <= 0 && VacuumCostLimit <= 0)
+		return;
+
+	orig_nworkers_for_balance =
+		pg_atomic_read_u32(&AutoVacuumShmem->av_nworkers_for_balance);
+
 	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
 	{
 		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-		{
-			int			limit = (int)
-			(cost_avail * worker->wi_cost_limit_base / cost_total);
-
-			/*
-			 * We put a lower bound of 1 on the cost_limit, to avoid division-
-			 * by-zero in the vacuum code.  Also, in case of roundoff trouble
-			 * in these calculations, let's be sure we don't ever set
-			 * cost_limit to more than the base value.
-			 */
-			worker->wi_cost_limit = Max(Min(limit,
-											worker->wi_cost_limit_base),
-										1);
-		}
+		if (worker->wi_proc == NULL || !worker->wi_dobalance)
+			continue;
 
-		if (worker->wi_proc != NULL)
-			elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
-				 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
-				 worker->wi_dobalance ? "yes" : "no",
-				 worker->wi_cost_limit, worker->wi_cost_limit_base,
-				 worker->wi_cost_delay);
+		nworkers_for_balance++;
 	}
+
+	if (nworkers_for_balance != orig_nworkers_for_balance)
+		pg_atomic_write_u32(&AutoVacuumShmem->av_nworkers_for_balance,
+							nworkers_for_balance);
 }
 
 /*
@@ -2312,8 +2326,6 @@ do_autovacuum(void)
 		autovac_table *tab;
 		bool		isshared;
 		bool		skipit;
-		double		stdVacuumCostDelay;
-		int			stdVacuumCostLimit;
 		dlist_iter	iter;
 
 		CHECK_FOR_INTERRUPTS();
@@ -2416,31 +2428,17 @@ do_autovacuum(void)
 			continue;
 		}
 
-		/*
-		 * Remember the prevailing values of the vacuum cost GUCs.  We have to
-		 * restore these at the bottom of the loop, else we'll compute wrong
-		 * values in the next iteration of autovac_balance_cost().
-		 */
-		stdVacuumCostDelay = VacuumCostDelay;
-		stdVacuumCostLimit = VacuumCostLimit;
+		av_table_option_cost_limit = tab->at_table_option_vac_cost_limit;
+		av_table_option_cost_delay = tab->at_table_option_vac_cost_delay;
 
 		/* Must hold AutovacuumLock while mucking with cost balance info */
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-
-		/* advertise my cost delay parameters for the balancing algorithm */
 		MyWorkerInfo->wi_dobalance = tab->at_dobalance;
-		MyWorkerInfo->wi_cost_delay = tab->at_vacuum_cost_delay;
-		MyWorkerInfo->wi_cost_limit = tab->at_vacuum_cost_limit;
-		MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
-
-		/* do a balance */
 		autovac_balance_cost();
+		LWLockRelease(AutovacuumLock);
 
-		/* set the active cost parameters from the result of that */
 		AutoVacuumUpdateDelay();
-
-		/* done */
-		LWLockRelease(AutovacuumLock);
+		AutoVacuumUpdateLimit();
 
 		/* clean up memory before each iteration */
 		MemoryContextResetAndDeleteChildren(PortalContext);
@@ -2525,19 +2523,15 @@ deleted:
 
 		/*
 		 * Remove my info from shared memory.  We could, but intentionally
-		 * don't, clear wi_cost_limit and friends --- this is on the
-		 * assumption that we probably have more to do with similar cost
-		 * settings, so we don't want to give up our share of I/O for a very
-		 * short interval and thereby thrash the global balance.
+		 * don't, set wi_dobalance to false on the assumption that we are more
+		 * likely than not to vacuum a table with no table options next, so we
+		 * don't want to give up our share of I/O for a very short interval
+		 * and thereby thrash the global balance.
 		 */
 		LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
 		MyWorkerInfo->wi_tableoid = InvalidOid;
 		MyWorkerInfo->wi_sharedrel = false;
 		LWLockRelease(AutovacuumScheduleLock);
-
-		/* restore vacuum cost GUCs for the next iteration */
-		VacuumCostDelay = stdVacuumCostDelay;
-		VacuumCostLimit = stdVacuumCostLimit;
 	}
 
 	/*
@@ -2569,6 +2563,8 @@ deleted:
 		{
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			AutoVacuumUpdateDelay();
+			AutoVacuumUpdateLimit();
 		}
 
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
@@ -2804,8 +2800,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		int			freeze_table_age;
 		int			multixact_freeze_min_age;
 		int			multixact_freeze_table_age;
-		int			vac_cost_limit;
-		double		vac_cost_delay;
 		int			log_min_duration;
 
 		/*
@@ -2815,20 +2809,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 * defaults, autovacuum's own first and plain vacuum second.
 		 */
 
-		/* -1 in autovac setting means use plain vacuum_cost_delay */
-		vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
-			? avopts->vacuum_cost_delay
-			: (autovacuum_vac_cost_delay >= 0)
-			? autovacuum_vac_cost_delay
-			: VacuumCostDelay;
-
-		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
-		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
-			? avopts->vacuum_cost_limit
-			: (autovacuum_vac_cost_limit > 0)
-			? autovacuum_vac_cost_limit
-			: VacuumCostLimit;
-
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
 			? avopts->log_min_duration
@@ -2884,8 +2864,10 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age;
 		tab->at_params.is_wraparound = wraparound;
 		tab->at_params.log_min_duration = log_min_duration;
-		tab->at_vacuum_cost_limit = vac_cost_limit;
-		tab->at_vacuum_cost_delay = vac_cost_delay;
+		tab->at_table_option_vac_cost_limit = avopts ?
+			avopts->vacuum_cost_limit : 0;
+		tab->at_table_option_vac_cost_delay = avopts ?
+			avopts->vacuum_cost_delay : -1;
 		tab->at_relname = NULL;
 		tab->at_nspname = NULL;
 		tab->at_datname = NULL;
@@ -3377,10 +3359,14 @@ AutoVacuumShmemInit(void)
 		worker = (WorkerInfo) ((char *) AutoVacuumShmem +
 							   MAXALIGN(sizeof(AutoVacuumShmemStruct)));
 
+
 		/* initialize the WorkerInfo free list */
 		for (i = 0; i < autovacuum_max_workers; i++)
 			dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 							&worker[i].wi_links);
+
+		pg_atomic_init_u32(&AutoVacuumShmem->av_nworkers_for_balance, 0);
+
 	}
 	else
 		Assert(found);
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index c140371b51..7b462866c9 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -65,6 +65,7 @@ extern void AutoVacWorkerFailed(void);
 
 /* autovacuum cost-delay balancer */
 extern void AutoVacuumUpdateDelay(void);
+extern void AutoVacuumUpdateLimit(void);
 
 #ifdef EXEC_BACKEND
 extern void AutoVacLauncherMain(int argc, char *argv[]) pg_attribute_noreturn();
-- 
2.37.2

v9-0003-VACUUM-reloads-config-file-more-often.patch (text/x-patch; charset=US-ASCII)
From 28d33a9d5cab0d384a18dbc2bb4ad95da61741c3 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Mon, 27 Mar 2023 13:33:19 -0400
Subject: [PATCH v9 3/4] VACUUM reloads config file more often

Previously, VACUUM would not reload the configuration file. So, changes
to cost-based delay parameters could only take effect on the next
invocation of VACUUM.

Check if a reload is pending roughly once per block now, when checking
if we need to delay.

Note that autovacuum is unaffected by this change. Autovacuum workers
overwrite the value of VacuumCostLimit and VacuumCostDelay with their
own WorkerInfo->wi_cost_limit and wi_cost_delay. Writing to their
wi_cost_delay more often makes reading wi_cost_delay without a lock to
update VacuumCostDelay an even worse idea.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_buP5wzsho3qNw5o9_R0pF69FRM5hgCmr-mvXmGXwdA7A%40mail.gmail.com#5e6771d4cdca4db6efc2acec2dce0bc7
---
 src/backend/commands/vacuum.c | 62 ++++++++++++++++++++++++++++++-----
 1 file changed, 54 insertions(+), 8 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index eb126f2247..0e686c94b2 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -48,6 +48,7 @@
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/interrupt.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/pmsignal.h"
@@ -76,6 +77,7 @@ int			vacuum_multixact_failsafe_age;
 /* A few variables that don't seem worth passing around as parameters */
 static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
+static bool analyze_in_outer_xact = false;
 
 
 /*
@@ -314,8 +316,7 @@ vacuum(List *relations, VacuumParams *params,
 	static bool in_vacuum = false;
 
 	const char *stmttype;
-	volatile bool in_outer_xact,
-				use_own_xacts;
+	volatile bool use_own_xacts;
 
 	Assert(params != NULL);
 
@@ -332,10 +333,10 @@ vacuum(List *relations, VacuumParams *params,
 	if (params->options & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
-		in_outer_xact = false;
+		analyze_in_outer_xact = false;
 	}
 	else
-		in_outer_xact = IsInTransactionBlock(isTopLevel);
+		analyze_in_outer_xact = IsInTransactionBlock(isTopLevel);
 
 	/*
 	 * Due to static variables vac_context, anl_context and vac_strategy,
@@ -457,7 +458,7 @@ vacuum(List *relations, VacuumParams *params,
 		Assert(params->options & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
-		else if (in_outer_xact)
+		else if (analyze_in_outer_xact)
 			use_own_xacts = false;
 		else if (list_length(relations) > 1)
 			use_own_xacts = true;
@@ -475,7 +476,7 @@ vacuum(List *relations, VacuumParams *params,
 	 */
 	if (use_own_xacts)
 	{
-		Assert(!in_outer_xact);
+		Assert(!analyze_in_outer_xact);
 
 		/* ActiveSnapshot is not set by autovacuum */
 		if (ActiveSnapshotSet())
@@ -544,7 +545,7 @@ vacuum(List *relations, VacuumParams *params,
 				}
 
 				analyze_rel(vrel->oid, vrel->relation, params,
-							vrel->va_cols, in_outer_xact, vac_strategy);
+							vrel->va_cols, analyze_in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
 				{
@@ -568,6 +569,7 @@ vacuum(List *relations, VacuumParams *params,
 		in_vacuum = false;
 		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
 		VacuumCostBalance = 0;
+		analyze_in_outer_xact = false;
 	}
 	PG_END_TRY();
 
@@ -2233,7 +2235,51 @@ vacuum_delay_point(void)
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	if (VacuumCostInactive || InterruptPending)
+	if (InterruptPending ||
+		(VacuumCostInactive && !ConfigReloadPending))
+		return;
+
+	/*
+	 * Reload the configuration file if requested. This allows changes to
+	 * vacuum_cost_limit and vacuum_cost_delay to take effect while a table is
+	 * being vacuumed or analyzed. Analyze should not reload configuration
+	 * file if it is in an outer transaction, as GUC values shouldn't be
+	 * allowed to refer to some uncommitted state (e.g. database objects
+	 * created in this transaction).
+	 */
+	if (ConfigReloadPending && !analyze_in_outer_xact)
+	{
+		ConfigReloadPending = false;
+		ProcessConfigFile(PGC_SIGHUP);
+
+		/*
+		 * Autovacuum workers must restore the correct values of
+		 * VacuumCostLimit and VacuumCostDelay in case they were overwritten
+		 * by reload.
+		 */
+		AutoVacuumUpdateDelay();
+
+		/*
+		 * If configuration changes are allowed to impact VacuumCostInactive,
+		 * make sure it is updated.
+		 */
+		if (VacuumCostInactive == VACUUM_COST_INACTIVE_AND_LOCKED)
+			return;
+
+		if (VacuumCostDelay > 0)
+			VacuumCostInactive = VACUUM_COST_ACTIVE;
+		else
+		{
+			VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
+			VacuumCostBalance = 0;
+		}
+	}
+
+	/*
+	 * If we disabled cost-based delays after reloading the config file,
+	 * return.
+	 */
+	if (VacuumCostInactive)
 		return;
 
 	/*
-- 
2.37.2

v9-0002-Make-VacuumCostActive-failsafe-aware.patch (text/x-patch; charset=US-ASCII)
From 57a5284abc43fa5531574f998ac2ab8d038264d1 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 25 Mar 2023 12:05:18 -0400
Subject: [PATCH v9 2/4] Make VacuumCostActive failsafe-aware

While vacuuming a table in failsafe mode, VacuumCostActive should not be
re-enabled. This currently isn't a problem because vacuum cost
parameters are only refreshed in between vacuuming tables and failsafe
status is reset for every table. In preparation for allowing vacuum cost
parameters to be updated more frequently, make vacuum cost status more
expressive.

VacuumCostActive is now VacuumCostInactive, as it can only be active in
one way but it can be inactive in two ways. If performing a failsafe
vacuum, the vacuum cost status cannot be enabled and is effectively
"locked". If performing a non-failsafe vacuum, the vacuum cost status
may be active or inactive. To express this, VacuumCostInactive can be
one of three statuses: VACUUM_COST_INACTIVE_AND_LOCKED,
VACUUM_COST_INACTIVE_AND_UNLOCKED, and VACUUM_COST_ACTIVE.

VacuumCostInactive is defined as an integer because we do not want
non-vacuum code concerning itself with the distinction between the three
statuses -- only with whether or not VacuumCostInactive == 0.
---
 src/backend/access/heap/vacuumlazy.c  |  2 +-
 src/backend/commands/vacuum.c         | 23 ++++++++++++++++++++---
 src/backend/commands/vacuumparallel.c |  8 ++++++--
 src/backend/storage/buffer/bufmgr.c   |  8 ++++----
 src/backend/utils/init/globals.c      |  2 +-
 src/include/commands/vacuum.h         |  8 ++++++++
 src/include/miscadmin.h               |  3 +--
 7 files changed, 41 insertions(+), 13 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8f14cf85f3..040a4e931b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2637,7 +2637,7 @@ lazy_check_wraparound_failsafe(LVRelState *vacrel)
 						 "You might also need to consider other ways for VACUUM to keep up with the allocation of transaction IDs.")));
 
 		/* Stop applying cost limits from this point on */
-		VacuumCostActive = false;
+		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_LOCKED;
 		VacuumCostBalance = 0;
 
 		return true;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7d01df7e48..eb126f2247 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -491,7 +491,6 @@ vacuum(List *relations, VacuumParams *params,
 		ListCell   *cur;
 
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
@@ -507,6 +506,24 @@ vacuum(List *relations, VacuumParams *params,
 		{
 			VacuumRelation *vrel = lfirst_node(VacuumRelation, cur);
 
+			/*
+			 * failsafe_active is reset per relation, so we must be sure that
+			 * VacuumCostInactive is set to either VACUUM_COST_ACTIVE or
+			 * VACUUM_COST_INACTIVE_AND_UNLOCKED in between vacuuming
+			 * relations.
+			 */
+			VacuumCostInactive = VacuumCostDelay > 0 ? VACUUM_COST_ACTIVE :
+				VACUUM_COST_INACTIVE_AND_UNLOCKED;
+
+			/*
+			 * We should not have transitioned VacuumCostInactive from
+			 * VACUUM_COST_ACTIVE to VACUUM_COST_INACTIVE_AND_UNLOCKED above,
+			 * as that should have happened when we changed the value of
+			 * VacuumCostDelay.
+			 */
+			Assert(VacuumCostInactive == VACUUM_COST_ACTIVE ||
+				   VacuumCostBalance == 0);
+
 			if (params->options & VACOPT_VACUUM)
 			{
 				if (!vacuum_rel(vrel->oid, vrel->relation, params, false))
@@ -549,7 +566,7 @@ vacuum(List *relations, VacuumParams *params,
 	PG_FINALLY();
 	{
 		in_vacuum = false;
-		VacuumCostActive = false;
+		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
 		VacuumCostBalance = 0;
 	}
 	PG_END_TRY();
@@ -2216,7 +2233,7 @@ vacuum_delay_point(void)
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	if (!VacuumCostActive || InterruptPending)
+	if (VacuumCostInactive || InterruptPending)
 		return;
 
 	/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index bcd40c80a1..266bf6bb4c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -989,8 +989,12 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												 PARALLEL_VACUUM_KEY_DEAD_ITEMS,
 												 false);
 
-	/* Set cost-based vacuum delay */
-	VacuumCostActive = (VacuumCostDelay > 0);
+	/*
+	 * Set cost-based vacuum delay. Parallel vacuum workers will not execute
+	 * failsafe VACUUM.
+	 */
+	VacuumCostInactive = VacuumCostDelay > 0 ? VACUUM_COST_ACTIVE :
+		VACUUM_COST_INACTIVE_AND_UNLOCKED;
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 95212a3941..6d3dd26fc7 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -893,7 +893,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 			*hit = true;
 			VacuumPageHit++;
 
-			if (VacuumCostActive)
+			if (!VacuumCostInactive)
 				VacuumCostBalance += VacuumCostPageHit;
 
 			TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
@@ -1098,7 +1098,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 	}
 
 	VacuumPageMiss++;
-	if (VacuumCostActive)
+	if (!VacuumCostInactive)
 		VacuumCostBalance += VacuumCostPageMiss;
 
 	TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
@@ -1672,7 +1672,7 @@ MarkBufferDirty(Buffer buffer)
 	{
 		VacuumPageDirty++;
 		pgBufferUsage.shared_blks_dirtied++;
-		if (VacuumCostActive)
+		if (!VacuumCostInactive)
 			VacuumCostBalance += VacuumCostPageDirty;
 	}
 }
@@ -4199,7 +4199,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
 		{
 			VacuumPageDirty++;
 			pgBufferUsage.shared_blks_dirtied++;
-			if (VacuumCostActive)
+			if (!VacuumCostInactive)
 				VacuumCostBalance += VacuumCostPageDirty;
 		}
 	}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 1b1d814254..608ebb9182 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -150,4 +150,4 @@ int64		VacuumPageMiss = 0;
 int64		VacuumPageDirty = 0;
 
 int			VacuumCostBalance = 0;	/* working state for vacuum */
-bool		VacuumCostActive = false;
+int			VacuumCostInactive = 1;
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index bdfd96cfec..5c3e250b06 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -302,6 +302,14 @@ extern PGDLLIMPORT int vacuum_failsafe_age;
 extern PGDLLIMPORT int vacuum_multixact_failsafe_age;
 
 /* Variables for cost-based parallel vacuum */
+
+typedef enum VacuumCostStatus
+{
+	VACUUM_COST_INACTIVE_AND_LOCKED = -1,
+	VACUUM_COST_ACTIVE = 0,
+	VACUUM_COST_INACTIVE_AND_UNLOCKED = 1,
+}			VacuumCostStatus;
+
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 06a86f9ac1..33e22733ae 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -274,8 +274,7 @@ extern PGDLLIMPORT int64 VacuumPageMiss;
 extern PGDLLIMPORT int64 VacuumPageDirty;
 
 extern PGDLLIMPORT int VacuumCostBalance;
-extern PGDLLIMPORT bool VacuumCostActive;
-
+extern PGDLLIMPORT int VacuumCostInactive;
 
 /* in tcop/postgres.c */
 
-- 
2.37.2

#32Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Melanie Plageman (#31)
Re: Should vacuum process config file reload more often

At Mon, 27 Mar 2023 14:12:03 -0400, Melanie Plageman <melanieplageman@gmail.com> wrote in

So, I've attached an alternate version of the patchset which takes the
approach of having one commit which only enables cost-based delay GUC
refresh for VACUUM and another commit which enables it for autovacuum
and makes the changes to balancing variables.

I still think the commit which has workers updating their own
wi_cost_delay in vacuum_delay_point() is a bit weird. It relies on no one
else emulating our bad behavior and reading from wi_cost_delay without a
lock and on no one else deciding to ever write to wi_cost_delay (even
though it is in shared memory [this is the same as master]). It is only
safe because our process is the only one (right now) writing to
wi_cost_delay, so when we read from it without a lock, we know it isn't
being written to. And everyone else takes a lock when reading from
wi_cost_delay right now. So, it seems...not great.

This approach also introduces a function that is only around for
one commit until the next commit obsoletes it, which seems a bit silly.

(I'm not sure what this refers to, though..) I don't think it's silly
if a later patch removes something that the preceding patches
introduced, as long as that contributes to readability. Ultimately,
they will be merged together on committing.

Basically, I think it is probably better to just have one commit
enabling guc refresh for VACUUM and then another which correctly
implements what is needed for autovacuum to do the same.
Attached v9 does this.

I've provided complete versions of both approaches (v9 and v8).

I took a look at v9 and have a few comments.

0001:

I don't believe it is necessary, as mentioned in the commit
message. It appears that we are resetting it at the appropriate times.

0002:

I felt a bit uneasy about this. It seems somewhat complex (and makes the
succeeding patches complex), has confusing names, and doesn't seem
self-contained. I think it'd be simpler to add a global boolean
(maybe VacuumCostActiveForceDisable or such) that forces
VacuumCostActive to be false and set VacuumCostActive using a setter
function that follows the boolean.
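
To make the suggestion concrete, here is a minimal standalone sketch of the "force-disable flag plus setter" idea. It is only an illustration: the globals are local stand-ins for the backend variables, and the helper names vacuum_update_cost_active() and vacuum_force_disable_cost() are hypothetical, not part of any posted patch.

```c
#include <stdbool.h>

/* Local stand-ins for the backend globals discussed in this thread. */
static bool		VacuumCostActive = false;
static bool		VacuumCostActiveForceDisable = false;	/* suggested flag */
static double	VacuumCostDelay = 0;
static int		VacuumCostBalance = 0;

/*
 * Hypothetical setter: the only place that assigns VacuumCostActive.
 * The force-disable flag always wins over the configured delay.
 */
static void
vacuum_update_cost_active(void)
{
	VacuumCostActive = !VacuumCostActiveForceDisable && VacuumCostDelay > 0;

	/* Zero the balance whenever cost-based delay is off. */
	if (!VacuumCostActive)
		VacuumCostBalance = 0;
}

/* Hypothetical helper: what a failsafe trigger could call. */
static void
vacuum_force_disable_cost(void)
{
	VacuumCostActiveForceDisable = true;
	vacuum_update_cost_active();
}
```

With this shape, a config reload that raises VacuumCostDelay would call the setter again, and the flag would keep cost-based delay off for the rest of a failsafe vacuum.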

0003:

+	 * Reload the configuration file if requested. This allows changes to
+	 * vacuum_cost_limit and vacuum_cost_delay to take effect while a table is
+	 * being vacuumed or analyzed. Analyze should not reload configuration
+	 * file if it is in an outer transaction, as GUC values shouldn't be
+	 * allowed to refer to some uncommitted state (e.g. database objects
+	 * created in this transaction).

I'm not sure GUC reload is or should be related to transactions. For
instance, work_mem can be changed by a reload during a transaction
unless it has been set in the current transaction. I don't think we
need to deliberately suppress changes in variables caused by reloads
during transactions only for analyze. If analyze doesn't like changes
to certain GUC variables, their values should be snapshotted before
starting the process.

0004:
-	double		at_vacuum_cost_delay;
-	int			at_vacuum_cost_limit;
+	double		at_table_option_vac_cost_delay;
+	int			at_table_option_vac_cost_limit;

We call those options "relopt(ion)". I don't think there's any reason
to use different names.

dlist_head av_runningWorkers;
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+ pg_atomic_uint32 av_nworkers_for_balance;

The name of the new member doesn't seem to follow the surrounding
convention. (However, I don't think the member is needed. See below.)

-		/*
-		 * Remember the prevailing values of the vacuum cost GUCs.  We have to
-		 * restore these at the bottom of the loop, else we'll compute wrong
-		 * values in the next iteration of autovac_balance_cost().
-		 */
-		stdVacuumCostDelay = VacuumCostDelay;
-		stdVacuumCostLimit = VacuumCostLimit;
+		av_table_option_cost_limit = tab->at_table_option_vac_cost_limit;
+		av_table_option_cost_delay = tab->at_table_option_vac_cost_delay;

I think this requires a comment.

+		/* There is at least 1 autovac worker (this worker). */
+		int			nworkers_for_balance = Max(pg_atomic_read_u32(
+								&AutoVacuumShmem->av_nworkers_for_balance), 1);

I think it *must* be greater than 0. However, to begin with, I don't
think we need that variable to be shared. I don't believe it matters
if we count involved workers every time we calculate the delay.

+/*
+ * autovac_balance_cost
+ *		Recalculate the number of workers to consider, given table options and
+ *		the current number of active workers.
+ *
+ * Caller must hold the AutovacuumLock in at least shared mode.

The function name doesn't seem to align with what it does. However, I
mentioned above that it might be unnecessary.

+AutoVacuumUpdateLimit(void)

If I'm not missing anything, this function does something quite
different from the original autovac_balance_cost(). The original
function distributes the total cost based on the GUC variables among
workers proportionally according to each worker's cost
parameters. However, this function distributes the total cost
equally.

+		int			vac_cost_limit = autovacuum_vac_cost_limit > 0 ?
+		autovacuum_vac_cost_limit : VacuumCostLimit;
...
+		int			balanced_cost_limit = vac_cost_limit / nworkers_for_balance;
...
+		VacuumCostLimit = Max(Min(balanced_cost_limit, vac_cost_limit), 1);
 	}

This seems to repeatedly divide VacuumCostLimit by
nworkers_for_balance. I'm not sure, but this function might only be
called after a reload. If that's the case, I don't think it's safe
coding, even if it works.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#33Melanie Plageman
melanieplageman@gmail.com
In reply to: Kyotaro Horiguchi (#32)
3 attachment(s)
Re: Should vacuum process config file reload more often

On Tue, Mar 28, 2023 at 4:21 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Mon, 27 Mar 2023 14:12:03 -0400, Melanie Plageman <melanieplageman@gmail.com> wrote in

So, I've attached an alternate version of the patchset which takes the
approach of having one commit which only enables cost-based delay GUC
refresh for VACUUM and another commit which enables it for autovacuum
and makes the changes to balancing variables.

I still think the commit which has workers updating their own
wi_cost_delay in vacuum_delay_point() is a bit weird. It relies on no one
else emulating our bad behavior and reading from wi_cost_delay without a
lock and on no one else deciding to ever write to wi_cost_delay (even
though it is in shared memory [this is the same as master]). It is only
safe because our process is the only one (right now) writing to
wi_cost_delay, so when we read from it without a lock, we know it isn't
being written to. And everyone else takes a lock when reading from
wi_cost_delay right now. So, it seems...not great.

This approach also introduces a function that is only around for
one commit until the next commit obsoletes it, which seems a bit silly.

(I'm not sure what this refers to, though..) I don't think it's silly
if a later patch removes something that the preceding patches
introduced, as long as that contributes to readability. Ultimately,
they will be merged together on committing.

I was under the impression that reviewers thought config reload and
worker balance changes should be committed in separate commits.

Either way, the ephemeral function is not my primary concern. I felt
more uncomfortable with increasing how often we update a double in
shared memory which is read without acquiring a lock.

Basically, I think it is probably better to just have one commit
enabling guc refresh for VACUUM and then another which correctly
implements what is needed for autovacuum to do the same.
Attached v9 does this.

I've provided complete versions of both approaches (v9 and v8).

I took a look at v9 and have a few comments.

0001:

I don't believe it is necessary, as mentioned in the commit
message. It appears that we are resetting it at the appropriate times.

VacuumCostBalance must be zeroed out when we disable vacuum cost.
Previously, we did not reenable VacuumCostActive once it was disabled,
but now that we do, I think it is good practice to always zero out
VacuumCostBalance when we disable vacuum cost. I will squash this commit
into the one introducing VacuumCostInactive, though.

0002:

I felt a bit uneasy about this. It seems somewhat complex (and makes the
succeeding patches complex),

Even if we introduced a second global variable to indicate that failsafe
mode has been engaged, we would still require the additional checks
of VacuumCostInactive.

has confusing names,

I would be happy to rename the values of the enum to make them less
confusing. Are you thinking "force" instead of "locked"?
maybe:
VACUUM_COST_FORCE_INACTIVE and
VACUUM_COST_INACTIVE
?

and doesn't seem self-contained.

By changing the variable from VacuumCostActive to VacuumCostInactive, I
have kept all non-vacuum code from having to distinguish between it
being inactive due to failsafe mode or due to user settings.
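
As a rough standalone illustration of that point: the enum values below are the ones from the 0002 patch, while the two helper functions are hypothetical stand-ins (the first for a caller like the buffer manager, the second for the post-reload logic in vacuum_delay_point()).

```c
typedef enum VacuumCostStatus
{
	VACUUM_COST_INACTIVE_AND_LOCKED = -1,
	VACUUM_COST_ACTIVE = 0,
	VACUUM_COST_INACTIVE_AND_UNLOCKED = 1,
} VacuumCostStatus;

/* Local stand-ins for the backend globals. */
static int	VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
static int	VacuumCostBalance = 0;

/*
 * Hypothetical non-vacuum caller: it only tests zero vs. nonzero and never
 * needs to know why cost accounting is inactive.
 */
static void
charge_page_cost(int cost)
{
	if (!VacuumCostInactive)
		VacuumCostBalance += cost;
}

/*
 * Hypothetical vacuum-side refresh after a config reload: the locked
 * (failsafe) state is never overridden by the new delay setting.
 */
static void
refresh_cost_status(double cost_delay)
{
	if (VacuumCostInactive == VACUUM_COST_INACTIVE_AND_LOCKED)
		return;

	if (cost_delay > 0)
		VacuumCostInactive = VACUUM_COST_ACTIVE;
	else
	{
		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
		VacuumCostBalance = 0;
	}
}
```

Only the vacuum-side function compares against specific enum values; everything else treats the variable as a plain truth value.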

I think it'd be simpler to add a global boolean (maybe
VacuumCostActiveForceDisable or such) that forces VacuumCostActive to
be false and set VacuumCostActive using a setter function that follows
the boolean.

I think maintaining an additional global variable is more brittle than
including the information in a single variable.

0003:

+        * Reload the configuration file if requested. This allows changes to
+        * vacuum_cost_limit and vacuum_cost_delay to take effect while a table is
+        * being vacuumed or analyzed. Analyze should not reload configuration
+        * file if it is in an outer transaction, as GUC values shouldn't be
+        * allowed to refer to some uncommitted state (e.g. database objects
+        * created in this transaction).

I'm not sure GUC reload is or should be related to transactions. For
instance, work_mem can be changed by a reload during a transaction
unless it has been set in the current transaction. I don't think we
need to deliberately suppress changes in variables caused by reloads
during transactions only for analyze. If analyze doesn't like changes
to certain GUC variables, their values should be snapshotted before
starting the process.

Currently, we only reload the config file in top-level statements. We
don't reload the configuration file from within a nested transaction
command. BEGIN starts a transaction but not a transaction command. So
BEGIN; ANALYZE; probably wouldn't violate this rule. But it is simpler
to just forbid reloading when it is not a top-level transaction command.
I have updated the comment to reflect this.

0004:
-       double          at_vacuum_cost_delay;
-       int                     at_vacuum_cost_limit;
+       double          at_table_option_vac_cost_delay;
+       int                     at_table_option_vac_cost_limit;

We call those options "relopt(ion)". I don't think there's any reason
to use different names.

I've updated the names.

dlist_head av_runningWorkers;
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+ pg_atomic_uint32 av_nworkers_for_balance;

The name of the new member doesn't seem to follow the surrounding
convention. (However, I don't think the member is needed. See below.)

I've updated the name to fit the convention better.

-               /*
-                * Remember the prevailing values of the vacuum cost GUCs.  We have to
-                * restore these at the bottom of the loop, else we'll compute wrong
-                * values in the next iteration of autovac_balance_cost().
-                */
-               stdVacuumCostDelay = VacuumCostDelay;
-               stdVacuumCostLimit = VacuumCostLimit;
+               av_table_option_cost_limit = tab->at_table_option_vac_cost_limit;
+               av_table_option_cost_delay = tab->at_table_option_vac_cost_delay;

I think this requires a comment.

I've added one.

+               /* There is at least 1 autovac worker (this worker). */
+               int                     nworkers_for_balance = Max(pg_atomic_read_u32(
+                                                               &AutoVacuumShmem->av_nworkers_for_balance), 1);

I think it *must* be greater than 0. However, to begin with, I don't
think we need that variable to be shared. I don't believe it matters
if we count involved workers every time we calculate the delay.

We are not calculating the delay but the cost limit. The cost limit must
be balanced across all of the workers currently actively vacuuming
tables without cost-related table options.

There shouldn't be a way for this to be zero, since this worker calls
autovac_balance_cost() before it starts vacuuming the table. I wanted to
rule out any possibility of a divide by 0 issue. I have changed it to an
assert instead.

+/*
+ * autovac_balance_cost
+ *             Recalculate the number of workers to consider, given table options and
+ *             the current number of active workers.
+ *
+ * Caller must hold the AutovacuumLock in at least shared mode.

The function name doesn't seem to align with what it does. However, I
mentioned above that it might be unnecessary.

This is the same name the function had previously. However, I think
it does make sense to rename it. The cost limit must be balanced across
the workers, and this function calculates how many workers the cost
limit should be balanced across. I renamed it to
autovac_recalculate_workers_for_balance().
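
As a rough illustration of what the renamed function computes, here is a
standalone sketch (not the patch code). SketchWorker and
sketch_count_workers_for_balance() are hypothetical names; the two flags
stand in for "wi_proc != NULL" and wi_dobalance, and a plain array
replaces the shared-memory worker list and its locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for WorkerInfoData. */
typedef struct SketchWorker
{
    bool        running;    /* stands in for wi_proc != NULL */
    bool        dobalance;  /* stands in for wi_dobalance */
} SketchWorker;

/*
 * Count the workers that take part in balancing: those that are running
 * and whose current table has no cost-related table options.
 */
int
sketch_count_workers_for_balance(const SketchWorker *workers, size_t nworkers)
{
    int         count = 0;

    for (size_t i = 0; i < nworkers; i++)
    {
        if (!workers[i].running || !workers[i].dobalance)
            continue;

        count++;
    }

    return count;
}
```

In this sketch a worker that is not running, or that is vacuuming a table
with its own cost options, simply does not contribute to the count.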

+AutoVacuumUpdateLimit(void)

If I'm not missing anything, this function does something quite
different from the original autovac_balance_cost(). The original
function distributes the total cost based on the GUC variables among
workers proportionally according to each worker's cost
parameters. However, this function distributes the total cost
equally.

Yes, as I mentioned in the commit message, because all the workers now
have no reason to have different cost parameters (due to reloading the
config file on almost every page), there is no reason to use ratios.
Workers vacuuming a table with no cost-related table options simply need
to divide the limit equally amongst themselves because they all will
have the same limit and delay values.
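
To make the equal split concrete, here is a minimal standalone sketch
(not the patch code). sketch_equal_share() and the SKETCH_* macros are
hypothetical names; the clamping mirrors the Max(Min(...), 1) expression
quoted in this thread, and the asserts mirror the assumption that both
inputs are positive.

```c
#include <assert.h>

/* Hypothetical stand-ins for PostgreSQL's Max() and Min() macros. */
#define SKETCH_MAX(a, b) ((a) > (b) ? (a) : (b))
#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Divide base_cost_limit equally among nworkers. The result is never
 * above the base limit and never below 1, so later code that divides by
 * the limit cannot hit a zero.
 */
int
sketch_equal_share(int base_cost_limit, int nworkers)
{
    int         balanced;

    assert(base_cost_limit > 0);
    assert(nworkers > 0);

    balanced = base_cost_limit / nworkers;

    return SKETCH_MAX(SKETCH_MIN(balanced, base_cost_limit), 1);
}
```

With a base limit of 200 and three balanced workers, each worker would
get 66 under integer division; with more workers than cost units, each
worker is clamped to 1.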

+               int                     vac_cost_limit = autovacuum_vac_cost_limit > 0 ?
+               autovacuum_vac_cost_limit : VacuumCostLimit;
...
+               int                     balanced_cost_limit = vac_cost_limit / nworkers_for_balance;
...
+               VacuumCostLimit = Max(Min(balanced_cost_limit, vac_cost_limit), 1);
}

This seems to repeatedly divide VacuumCostLimit by
nworkers_for_balance. I'm not sure, but this function might only be
called after a reload. If that's the case, I don't think it's safe
coding, even if it works.

Good point about repeatedly dividing VacuumCostLimit by
nworkers_for_balance. I've added a variable to keep track of the base
cost limit and split updating the limit into two functions:
AutoVacuumUpdateLimit(), which is only meant to be called after a reload
and reads VacuumCostLimit to set av_base_cost_limit, and
AutoVacuumBalanceLimit(), which overwrites VacuumCostLimit using only
av_base_cost_limit.

I've noted in the comments that AutoVacuumBalanceLimit() should be
called to adjust to a potential change in nworkers_for_balance
(currently every time after we sleep in vacuum_delay_point()) and
AutoVacuumUpdateLimit() should only be called once after a config
reload, as it references VacuumCostLimit.
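
The difference between the two call patterns can be sketched in
standalone C as follows. This is an illustration under simplifying
assumptions, not the patch itself: sketch_effective_limit stands in for
VacuumCostLimit, sketch_base_limit for av_base_cost_limit, and all
function names are hypothetical. Table options and the autovacuum GUC
fallback are left out.

```c
#include <assert.h>

/* Hypothetical stand-ins for VacuumCostLimit and av_base_cost_limit. */
int         sketch_effective_limit = 0;
int         sketch_base_limit = 0;

/*
 * Rebalance from the saved base. Safe to call any number of times: for
 * a given worker count the result is always the same.
 */
void
sketch_balance(int nworkers)
{
    int         balanced;

    assert(sketch_base_limit > 0);
    assert(nworkers > 0);

    balanced = sketch_base_limit / nworkers;
    sketch_effective_limit = balanced > 1 ? balanced : 1;
}

/* Call once per config reload: record the reloaded value as the base. */
void
sketch_after_reload(int reloaded_limit, int nworkers)
{
    sketch_base_limit = reloaded_limit;
    sketch_balance(nworkers);
}

/*
 * The pattern to avoid: dividing the already-divided effective limit,
 * which shrinks it further on every call.
 */
void
sketch_naive_balance(int nworkers)
{
    int         balanced = sketch_effective_limit / nworkers;

    sketch_effective_limit = balanced > 1 ? balanced : 1;
}
```

Calling sketch_balance() repeatedly with the same worker count leaves the
effective limit unchanged, whereas each call to sketch_naive_balance()
halves it again when there are two workers.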

I will note that this problem also exists in master, as
autovac_balance_cost references VacuumCostLimit in order to set worker
cost limits and then AutoVacuumUpdateDelay() overrides VacuumCostLimit
with the value calculated in autovac_balance_cost() from
VacuumCostLimit.

v10 attached with mentioned updates.

- Melanie

Attachments:

v10-0001-Make-VacuumCostActive-failsafe-aware.patch (text/x-patch; charset=US-ASCII)
From cfa73c9086f737ec103b3caa03175b837c5565cb Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 25 Mar 2023 12:05:18 -0400
Subject: [PATCH v10 1/3] Make VacuumCostActive failsafe-aware

While vacuuming a table in failsafe mode, VacuumCostActive should not be
re-enabled. This currently isn't a problem because vacuum cost
parameters are only refreshed in between vacuuming tables and failsafe
status is reset for every table. In preparation for allowing vacuum cost
parameters to be updated more frequently, make vacuum cost status more
expressive.

VacuumCostActive is now VacuumCostInactive, as it can only be active in
one way but it can be inactive in two ways. If performing a failsafe
vacuum, the vacuum cost status cannot be enabled and is effectively
"locked". If performing a non-failsafe vacuum, the vacuum cost status
may be active or inactive. To express this, VacuumCostInactive can be
one of three statuses: VACUUM_COST_INACTIVE_AND_LOCKED,
VACUUM_COST_ACTIVE, and VACUUM_COST_INACTIVE_AND_UNLOCKED.

VacuumCostInactive is defined as an integer because we do not want
non-vacuum code concerning itself with the distinction between the three
statuses -- only with whether or not VacuumCostInactive == 0.
---
 src/backend/access/heap/vacuumlazy.c  |  2 +-
 src/backend/commands/vacuum.c         | 24 +++++++++++++++++++++---
 src/backend/commands/vacuumparallel.c |  8 ++++++--
 src/backend/storage/buffer/bufmgr.c   |  8 ++++----
 src/backend/utils/init/globals.c      |  2 +-
 src/include/commands/vacuum.h         |  8 ++++++++
 src/include/miscadmin.h               |  3 +--
 7 files changed, 42 insertions(+), 13 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8f14cf85f3..040a4e931b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2637,7 +2637,7 @@ lazy_check_wraparound_failsafe(LVRelState *vacrel)
 						 "You might also need to consider other ways for VACUUM to keep up with the allocation of transaction IDs.")));
 
 		/* Stop applying cost limits from this point on */
-		VacuumCostActive = false;
+		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_LOCKED;
 		VacuumCostBalance = 0;
 
 		return true;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index c54360a6a0..eb126f2247 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -491,7 +491,6 @@ vacuum(List *relations, VacuumParams *params,
 		ListCell   *cur;
 
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
@@ -507,6 +506,24 @@ vacuum(List *relations, VacuumParams *params,
 		{
 			VacuumRelation *vrel = lfirst_node(VacuumRelation, cur);
 
+			/*
+			 * failsafe_active is reset per relation, so we must be sure that
+			 * VacuumCostInactive is set to either VACUUM_COST_ACTIVE or
+			 * VACUUM_COST_INACTIVE_AND_UNLOCKED in between vacuuming
+			 * relations.
+			 */
+			VacuumCostInactive = VacuumCostDelay > 0 ? VACUUM_COST_ACTIVE :
+				VACUUM_COST_INACTIVE_AND_UNLOCKED;
+
+			/*
+			 * We should not have transitioned VacuumCostInactive from
+			 * VACUUM_COST_ACTIVE to VACUUM_COST_INACTIVE_AND_UNLOCKED above,
+			 * as that should have happened when we changed the value of
+			 * VacuumCostDelay.
+			 */
+			Assert(VacuumCostInactive == VACUUM_COST_ACTIVE ||
+				   VacuumCostBalance == 0);
+
 			if (params->options & VACOPT_VACUUM)
 			{
 				if (!vacuum_rel(vrel->oid, vrel->relation, params, false))
@@ -549,7 +566,8 @@ vacuum(List *relations, VacuumParams *params,
 	PG_FINALLY();
 	{
 		in_vacuum = false;
-		VacuumCostActive = false;
+		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
+		VacuumCostBalance = 0;
 	}
 	PG_END_TRY();
 
@@ -2215,7 +2233,7 @@ vacuum_delay_point(void)
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	if (!VacuumCostActive || InterruptPending)
+	if (VacuumCostInactive || InterruptPending)
 		return;
 
 	/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index bcd40c80a1..266bf6bb4c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -989,8 +989,12 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												 PARALLEL_VACUUM_KEY_DEAD_ITEMS,
 												 false);
 
-	/* Set cost-based vacuum delay */
-	VacuumCostActive = (VacuumCostDelay > 0);
+	/*
+	 * Set cost-based vacuum delay. Parallel vacuum workers will not execute
+	 * failsafe VACUUM.
+	 */
+	VacuumCostInactive = VacuumCostDelay > 0 ? VACUUM_COST_ACTIVE :
+		VACUUM_COST_INACTIVE_AND_UNLOCKED;
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 95212a3941..6d3dd26fc7 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -893,7 +893,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 			*hit = true;
 			VacuumPageHit++;
 
-			if (VacuumCostActive)
+			if (!VacuumCostInactive)
 				VacuumCostBalance += VacuumCostPageHit;
 
 			TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
@@ -1098,7 +1098,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 	}
 
 	VacuumPageMiss++;
-	if (VacuumCostActive)
+	if (!VacuumCostInactive)
 		VacuumCostBalance += VacuumCostPageMiss;
 
 	TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
@@ -1672,7 +1672,7 @@ MarkBufferDirty(Buffer buffer)
 	{
 		VacuumPageDirty++;
 		pgBufferUsage.shared_blks_dirtied++;
-		if (VacuumCostActive)
+		if (!VacuumCostInactive)
 			VacuumCostBalance += VacuumCostPageDirty;
 	}
 }
@@ -4199,7 +4199,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
 		{
 			VacuumPageDirty++;
 			pgBufferUsage.shared_blks_dirtied++;
-			if (VacuumCostActive)
+			if (!VacuumCostInactive)
 				VacuumCostBalance += VacuumCostPageDirty;
 		}
 	}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 1b1d814254..608ebb9182 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -150,4 +150,4 @@ int64		VacuumPageMiss = 0;
 int64		VacuumPageDirty = 0;
 
 int			VacuumCostBalance = 0;	/* working state for vacuum */
-bool		VacuumCostActive = false;
+int			VacuumCostInactive = 1;
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index bdfd96cfec..5c3e250b06 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -302,6 +302,14 @@ extern PGDLLIMPORT int vacuum_failsafe_age;
 extern PGDLLIMPORT int vacuum_multixact_failsafe_age;
 
 /* Variables for cost-based parallel vacuum */
+
+typedef enum VacuumCostStatus
+{
+	VACUUM_COST_INACTIVE_AND_LOCKED = -1,
+	VACUUM_COST_ACTIVE = 0,
+	VACUUM_COST_INACTIVE_AND_UNLOCKED = 1,
+}			VacuumCostStatus;
+
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 06a86f9ac1..33e22733ae 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -274,8 +274,7 @@ extern PGDLLIMPORT int64 VacuumPageMiss;
 extern PGDLLIMPORT int64 VacuumPageDirty;
 
 extern PGDLLIMPORT int VacuumCostBalance;
-extern PGDLLIMPORT bool VacuumCostActive;
-
+extern PGDLLIMPORT int VacuumCostInactive;
 
 /* in tcop/postgres.c */
 
-- 
2.37.2

v10-0003-Autovacuum-refreshes-cost-based-delay-params-mor.patch (text/x-patch; charset=US-ASCII)
From b441bc56a827e0dd5c5078fdf83db79bd9c938ec Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 25 Mar 2023 14:14:55 -0400
Subject: [PATCH v10 3/3] Autovacuum refreshes cost-based delay params more
 often

The previous commit allowed VACUUM to reload the config file more often
so that cost-based delay parameters could take effect while VACUUMing a
relation. Autovacuum, however, did not benefit from this change.

In order for autovacuum workers to safely update their own cost delay
and cost limit parameters without impacting performance, we had to
rethink when and how these values were accessed.

Previously, an autovacuum worker's wi_cost_limit was set only at the
beginning of vacuuming a table, after reloading the config file.
Therefore, at the time that autovac_balance_cost() is called, workers
vacuuming tables with no table options could still have different values
for their wi_cost_limit_base and wi_cost_delay.

Now that the cost parameters can be updated while vacuuming a table,
workers will (within some margin of error) have no reason to have
different values for cost limit and cost delay (in the absence of table
options). This removes the rationale for keeping cost limit and cost
delay in shared memory. Balancing the cost limit requires only the
number of active autovacuum workers vacuuming a table with no cost-based
table options.
---
 src/backend/commands/vacuum.c       |  22 ++-
 src/backend/postmaster/autovacuum.c | 271 +++++++++++++++-------------
 src/include/postmaster/autovacuum.h |   2 +
 3 files changed, 165 insertions(+), 130 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7e3a8e404e..e9b683805a 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2241,10 +2241,10 @@ vacuum_delay_point(void)
 
 	/*
 	 * Reload the configuration file if requested. This allows changes to
-	 * vacuum_cost_limit and vacuum_cost_delay to take effect while a table is
-	 * being vacuumed or analyzed. Analyze should not reload configuration file
-	 * if it is in an outer transaction, as we currently only allow
-	 * configuration reload when in top-level statements.
+	 * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay to
+	 * take effect while a table is being vacuumed or analyzed. Analyze should
+	 * not reload configuration file if it is in an outer transaction, as we
+	 * currently only allow configuration reload when in top-level statements.
 	 */
 	if (ConfigReloadPending && !analyze_in_outer_xact)
 	{
@@ -2257,10 +2257,12 @@ vacuum_delay_point(void)
 		 * by reload.
 		 */
 		AutoVacuumUpdateDelay();
+		AutoVacuumUpdateLimit();
 
 		/*
 		 * If configuration changes are allowed to impact VacuumCostInactive,
-		 * make sure it is updated.
+		 * make sure it is updated. Autovacuum workers will have already done
+		 * this in AutoVacuumUpdateDelay().
 		 */
 		if (VacuumCostInactive == VACUUM_COST_INACTIVE_AND_LOCKED)
 			return;
@@ -2311,8 +2313,14 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		/* update balance values for workers */
-		AutoVacuumUpdateDelay();
+		/*
+		 * Balance and update limit values for autovacuum workers. We must
+		 * always do this in case the autovacuum launcher or another
+		 * autovacuum worker has recalculated the number of workers across
+		 * which we must balance the limit. This is done by the launcher when
+		 * launching a new worker and by workers before vacuuming each table.
+		 */
+		AutoVacuumBalanceLimit();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 585d28148c..c8dae5465a 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -139,6 +139,10 @@ int			Log_autovacuum_min_duration = 600000;
 static bool am_autovacuum_launcher = false;
 static bool am_autovacuum_worker = false;
 
+static double av_relopt_cost_delay = -1;
+static int	av_relopt_cost_limit = 0;
+static int	av_base_cost_limit = 0;
+
 /* Flags set by signal handlers */
 static volatile sig_atomic_t got_SIGUSR2 = false;
 
@@ -189,8 +193,8 @@ typedef struct autovac_table
 {
 	Oid			at_relid;
 	VacuumParams at_params;
-	double		at_vacuum_cost_delay;
-	int			at_vacuum_cost_limit;
+	double		at_relopt_vac_cost_delay;
+	int			at_relopt_vac_cost_limit;
 	bool		at_dobalance;
 	bool		at_sharedrel;
 	char	   *at_relname;
@@ -209,7 +213,7 @@ typedef struct autovac_table
  * wi_sharedrel flag indicating whether table is marked relisshared
  * wi_proc		pointer to PGPROC of the running worker, NULL if not started
  * wi_launchtime Time at which this worker was launched
- * wi_cost_*	Vacuum cost-based delay parameters current in this worker
+ * wi_dobalance Whether this worker should be included in balance calculations
  *
  * All fields are protected by AutovacuumLock, except for wi_tableoid and
  * wi_sharedrel which are protected by AutovacuumScheduleLock (note these
@@ -225,9 +229,6 @@ typedef struct WorkerInfoData
 	TimestampTz wi_launchtime;
 	bool		wi_dobalance;
 	bool		wi_sharedrel;
-	double		wi_cost_delay;
-	int			wi_cost_limit;
-	int			wi_cost_limit_base;
 } WorkerInfoData;
 
 typedef struct WorkerInfoData *WorkerInfo;
@@ -273,6 +274,8 @@ typedef struct AutoVacuumWorkItem
  * av_startingWorker pointer to WorkerInfo currently being started (cleared by
  *					the worker itself as soon as it's up and running)
  * av_workItems		work item array
+ * av_nworkersForBalance the number of autovacuum workers to use when
+ * 					calculating the per worker cost limit
  *
  * This struct is protected by AutovacuumLock, except for av_signal and parts
  * of the worker list (see above).
@@ -286,6 +289,7 @@ typedef struct
 	dlist_head	av_runningWorkers;
 	WorkerInfo	av_startingWorker;
 	AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+	pg_atomic_uint32 av_nworkersForBalance;
 } AutoVacuumShmemStruct;
 
 static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -319,7 +323,7 @@ static void launch_worker(TimestampTz now);
 static List *get_database_list(void);
 static void rebuild_database_list(Oid newdb);
 static int	db_comparator(const void *a, const void *b);
-static void autovac_balance_cost(void);
+static void autovac_recalculate_workers_for_balance(void);
 
 static void do_autovacuum(void);
 static void FreeWorkerInfo(int code, Datum arg);
@@ -670,7 +674,7 @@ AutoVacLauncherMain(int argc, char *argv[])
 			{
 				LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 				AutoVacuumShmem->av_signal[AutoVacRebalance] = false;
-				autovac_balance_cost();
+				autovac_recalculate_workers_for_balance();
 				LWLockRelease(AutovacuumLock);
 			}
 
@@ -820,8 +824,8 @@ HandleAutoVacLauncherInterrupts(void)
 			AutoVacLauncherShutdown();
 
 		/* rebalance in case the default cost parameters changed */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-		autovac_balance_cost();
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
+		autovac_recalculate_workers_for_balance();
 		LWLockRelease(AutovacuumLock);
 
 		/* rebuild the list in case the naptime changed */
@@ -1756,9 +1760,6 @@ FreeWorkerInfo(int code, Datum arg)
 		MyWorkerInfo->wi_proc = NULL;
 		MyWorkerInfo->wi_launchtime = 0;
 		MyWorkerInfo->wi_dobalance = false;
-		MyWorkerInfo->wi_cost_delay = 0;
-		MyWorkerInfo->wi_cost_limit = 0;
-		MyWorkerInfo->wi_cost_limit_base = 0;
 		dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 						&MyWorkerInfo->wi_links);
 		/* not mine anymore */
@@ -1773,100 +1774,142 @@ FreeWorkerInfo(int code, Datum arg)
 	}
 }
 
+
 /*
- * Update the cost-based delay parameters, so that multiple workers consume
- * each a fraction of the total available I/O.
+ * Update VacuumCostDelay with the correct value for an autovacuum worker,
+ * given the value of other relevant cost-based delay parameters. Autovacuum
+ * workers should call this after every config reload, in case VacuumCostDelay
+ * was overwritten.
  */
 void
 AutoVacuumUpdateDelay(void)
 {
-	if (MyWorkerInfo)
+	if (!am_autovacuum_worker)
+		return;
+
+	if (av_relopt_cost_delay >= 0)
+		VacuumCostDelay = av_relopt_cost_delay;
+	else if (autovacuum_vac_cost_delay >= 0)
+		VacuumCostDelay = autovacuum_vac_cost_delay;
+
+	/*
+	 * If configuration changes are allowed to impact VacuumCostInactive, make
+	 * sure it is updated.
+	 */
+	if (VacuumCostInactive == VACUUM_COST_INACTIVE_AND_LOCKED)
+		return;
+
+	if (VacuumCostDelay > 0)
+		VacuumCostInactive = VACUUM_COST_ACTIVE;
+	else
 	{
-		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
-		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
+		VacuumCostBalance = 0;
 	}
 }
 
+
 /*
- * autovac_balance_cost
- *		Recalculate the cost limit setting for each active worker.
- *
- * Caller must hold the AutovacuumLock in exclusive mode.
+ * This must be called directly after a config reload, before using the value
+ * of VacuumCostLimit and before calling AutoVacuumBalanceLimit(), as it uses
+ * the value of VacuumCostLimit to determine what av_base_cost_limit should
+ * be. AutoVacuumBalanceLimit() will override the value of VacuumCostLimit,
+ * so calling AutoVacuumUpdateLimit() more than once after a single config
+ * reload is incorrect.
  */
-static void
-autovac_balance_cost(void)
+void
+AutoVacuumUpdateLimit(void)
 {
+	if (!am_autovacuum_worker)
+		return;
+
 	/*
-	 * The idea here is that we ration out I/O equally.  The amount of I/O
-	 * that a worker can consume is determined by cost_limit/cost_delay, so we
-	 * try to equalize those ratios rather than the raw limit settings.
-	 *
 	 * note: in cost_limit, zero also means use value from elsewhere, because
 	 * zero is not a valid value.
 	 */
-	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : VacuumCostLimit);
-	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : VacuumCostDelay);
-	double		cost_total;
-	double		cost_avail;
-	dlist_iter	iter;
+	if (av_relopt_cost_limit > 0)
+		VacuumCostLimit = av_relopt_cost_limit;
+	else
+	{
+		av_base_cost_limit = autovacuum_vac_cost_limit > 0 ?
+			autovacuum_vac_cost_limit : VacuumCostLimit;
+
+		AutoVacuumBalanceLimit();
+	}
+}
+
+/*
+ * Update VacuumCostLimit with the correct value for an autovacuum worker,
+ * given the value of other relevant cost limit parameters and the number of
+ * workers across which the limit must be balanced. Autovacuum workers must
+ * call this regularly in case av_nworkersForBalance has been updated by
+ * another worker or by the autovacuum launcher. After a config reload, they
+ * must call AutoVacuumUpdateLimit() which will call AutoVacuumBalanceLimit(),
+ * in case VacuumCostLimit was overwritten.
+ */
+void
+AutoVacuumBalanceLimit(void)
+{
+	int			nworkers_for_balance;
+	int			total_cost_limit;
+	int			balanced_cost_limit;
 
-	/* not set? nothing to do */
-	if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
+	if (!am_autovacuum_worker)
 		return;
 
-	/* calculate the total base cost limit of participating active workers */
-	cost_total = 0.0;
-	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
-	{
-		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
+	Assert(av_base_cost_limit > 0);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-			cost_total +=
-				(double) worker->wi_cost_limit_base / worker->wi_cost_delay;
-	}
+	nworkers_for_balance = pg_atomic_read_u32(
+							&AutoVacuumShmem->av_nworkersForBalance);
+
+	/* There is at least 1 autovac worker (this worker). */
+	Assert(nworkers_for_balance > 0);
+
+	total_cost_limit = autovacuum_vac_cost_limit > 0 ?
+		autovacuum_vac_cost_limit : av_base_cost_limit;
 
-	/* there are no cost limits -- nothing to do */
-	if (cost_total <= 0)
+	balanced_cost_limit = total_cost_limit / nworkers_for_balance;
+
+	VacuumCostLimit = Max(Min(balanced_cost_limit, total_cost_limit), 1);
+}
+
+/*
+ * autovac_recalculate_workers_for_balance
+ *		Recalculate the number of workers to consider, given table options and
+ *		the current number of active workers.
+ *
+ * Caller must hold the AutovacuumLock in at least shared mode.
+ */
+static void
+autovac_recalculate_workers_for_balance(void)
+{
+	dlist_iter	iter;
+	int			orig_nworkers_for_balance;
+	int			nworkers_for_balance = 0;
+
+	if (autovacuum_vac_cost_delay == 0 ||
+		(autovacuum_vac_cost_delay == -1 && VacuumCostDelay == 0))
 		return;
 
-	/*
-	 * Adjust cost limit of each active worker to balance the total of cost
-	 * limit to autovacuum_vacuum_cost_limit.
-	 */
-	cost_avail = (double) vac_cost_limit / vac_cost_delay;
+	if (autovacuum_vac_cost_limit <= 0 && VacuumCostLimit <= 0)
+		return;
+
+	orig_nworkers_for_balance =
+		pg_atomic_read_u32(&AutoVacuumShmem->av_nworkersForBalance);
+
 	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
 	{
 		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-		{
-			int			limit = (int)
-			(cost_avail * worker->wi_cost_limit_base / cost_total);
-
-			/*
-			 * We put a lower bound of 1 on the cost_limit, to avoid division-
-			 * by-zero in the vacuum code.  Also, in case of roundoff trouble
-			 * in these calculations, let's be sure we don't ever set
-			 * cost_limit to more than the base value.
-			 */
-			worker->wi_cost_limit = Max(Min(limit,
-											worker->wi_cost_limit_base),
-										1);
-		}
+		if (worker->wi_proc == NULL || !worker->wi_dobalance)
+			continue;
 
-		if (worker->wi_proc != NULL)
-			elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
-				 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
-				 worker->wi_dobalance ? "yes" : "no",
-				 worker->wi_cost_limit, worker->wi_cost_limit_base,
-				 worker->wi_cost_delay);
+		nworkers_for_balance++;
 	}
+
+	if (nworkers_for_balance != orig_nworkers_for_balance)
+		pg_atomic_write_u32(&AutoVacuumShmem->av_nworkersForBalance,
+							nworkers_for_balance);
 }
 
 /*
@@ -2312,8 +2355,6 @@ do_autovacuum(void)
 		autovac_table *tab;
 		bool		isshared;
 		bool		skipit;
-		double		stdVacuumCostDelay;
-		int			stdVacuumCostLimit;
 		dlist_iter	iter;
 
 		CHECK_FOR_INTERRUPTS();
@@ -2417,30 +2458,26 @@ do_autovacuum(void)
 		}
 
 		/*
-		 * Remember the prevailing values of the vacuum cost GUCs.  We have to
-		 * restore these at the bottom of the loop, else we'll compute wrong
-		 * values in the next iteration of autovac_balance_cost().
+		 * Save the cost-related table options in global variables for
+		 * reference when updating VacuumCostLimit and VacuumCostDelay during
+		 * vacuuming this table.
 		 */
-		stdVacuumCostDelay = VacuumCostDelay;
-		stdVacuumCostLimit = VacuumCostLimit;
+		av_relopt_cost_limit = tab->at_relopt_vac_cost_limit;
+		av_relopt_cost_delay = tab->at_relopt_vac_cost_delay;
 
 		/* Must hold AutovacuumLock while mucking with cost balance info */
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-
-		/* advertise my cost delay parameters for the balancing algorithm */
 		MyWorkerInfo->wi_dobalance = tab->at_dobalance;
-		MyWorkerInfo->wi_cost_delay = tab->at_vacuum_cost_delay;
-		MyWorkerInfo->wi_cost_limit = tab->at_vacuum_cost_limit;
-		MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
-
-		/* do a balance */
-		autovac_balance_cost();
+		autovac_recalculate_workers_for_balance();
+		LWLockRelease(AutovacuumLock);
 
-		/* set the active cost parameters from the result of that */
+		/*
+		 * We wait until this point to update cost delay and cost limit
+		 * values, even though we reloaded the configuration file above, so
+		 * that we can take into account the cost-related table options.
+		 */
 		AutoVacuumUpdateDelay();
-
-		/* done */
-		LWLockRelease(AutovacuumLock);
+		AutoVacuumUpdateLimit();
 
 		/* clean up memory before each iteration */
 		MemoryContextResetAndDeleteChildren(PortalContext);
@@ -2525,19 +2562,15 @@ deleted:
 
 		/*
 		 * Remove my info from shared memory.  We could, but intentionally
-		 * don't, clear wi_cost_limit and friends --- this is on the
-		 * assumption that we probably have more to do with similar cost
-		 * settings, so we don't want to give up our share of I/O for a very
-		 * short interval and thereby thrash the global balance.
+		 * don't, set wi_dobalance to false on the assumption that we are more
+		 * likely than not to vacuum a table with no table options next, so we
+		 * don't want to give up our share of I/O for a very short interval
+		 * and thereby thrash the global balance.
 		 */
 		LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
 		MyWorkerInfo->wi_tableoid = InvalidOid;
 		MyWorkerInfo->wi_sharedrel = false;
 		LWLockRelease(AutovacuumScheduleLock);
-
-		/* restore vacuum cost GUCs for the next iteration */
-		VacuumCostDelay = stdVacuumCostDelay;
-		VacuumCostLimit = stdVacuumCostLimit;
 	}
 
 	/*
@@ -2569,6 +2602,8 @@ deleted:
 		{
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			AutoVacuumUpdateDelay();
+			AutoVacuumUpdateLimit();
 		}
 
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
@@ -2804,8 +2839,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		int			freeze_table_age;
 		int			multixact_freeze_min_age;
 		int			multixact_freeze_table_age;
-		int			vac_cost_limit;
-		double		vac_cost_delay;
 		int			log_min_duration;
 
 		/*
@@ -2815,20 +2848,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 * defaults, autovacuum's own first and plain vacuum second.
 		 */
 
-		/* -1 in autovac setting means use plain vacuum_cost_delay */
-		vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
-			? avopts->vacuum_cost_delay
-			: (autovacuum_vac_cost_delay >= 0)
-			? autovacuum_vac_cost_delay
-			: VacuumCostDelay;
-
-		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
-		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
-			? avopts->vacuum_cost_limit
-			: (autovacuum_vac_cost_limit > 0)
-			? autovacuum_vac_cost_limit
-			: VacuumCostLimit;
-
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
 			? avopts->log_min_duration
@@ -2884,8 +2903,10 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age;
 		tab->at_params.is_wraparound = wraparound;
 		tab->at_params.log_min_duration = log_min_duration;
-		tab->at_vacuum_cost_limit = vac_cost_limit;
-		tab->at_vacuum_cost_delay = vac_cost_delay;
+		tab->at_relopt_vac_cost_limit = avopts ?
+			avopts->vacuum_cost_limit : 0;
+		tab->at_relopt_vac_cost_delay = avopts ?
+			avopts->vacuum_cost_delay : -1;
 		tab->at_relname = NULL;
 		tab->at_nspname = NULL;
 		tab->at_datname = NULL;
@@ -3377,10 +3398,14 @@ AutoVacuumShmemInit(void)
 		worker = (WorkerInfo) ((char *) AutoVacuumShmem +
 							   MAXALIGN(sizeof(AutoVacuumShmemStruct)));
 
+
 		/* initialize the WorkerInfo free list */
 		for (i = 0; i < autovacuum_max_workers; i++)
 			dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 							&worker[i].wi_links);
+
+		pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
+
 	}
 	else
 		Assert(found);
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index c140371b51..80bdfb2cc0 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,7 +64,9 @@ extern int	StartAutoVacWorker(void);
 extern void AutoVacWorkerFailed(void);
 
 /* autovacuum cost-delay balancer */
+extern void AutoVacuumBalanceLimit(void);
 extern void AutoVacuumUpdateDelay(void);
+extern void AutoVacuumUpdateLimit(void);
 
 #ifdef EXEC_BACKEND
 extern void AutoVacLauncherMain(int argc, char *argv[]) pg_attribute_noreturn();
-- 
2.37.2

v10-0002-VACUUM-reloads-config-file-more-often.patch (text/x-patch; charset=US-ASCII)
From df724144b6fbdc804fb91033bd88df0f82ba6f30 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Mon, 27 Mar 2023 13:33:19 -0400
Subject: [PATCH v10 2/3] VACUUM reloads config file more often

Previously, VACUUM would not reload the configuration file. So, changes
to cost-based delay parameters could only take effect on the next
invocation of VACUUM.

Check if a reload is pending roughly once per block now, when checking
if we need to delay.

Note that autovacuum is unaffected by this change. Autovacuum workers
overwrite the value of VacuumCostLimit and VacuumCostDelay with their
own WorkerInfo->wi_cost_limit and wi_cost_delay. Writing to their
wi_cost_delay more often makes reading wi_cost_delay without a lock to
update VacuumCostDelay an even worse idea.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_buP5wzsho3qNw5o9_R0pF69FRM5hgCmr-mvXmGXwdA7A%40mail.gmail.com#5e6771d4cdca4db6efc2acec2dce0bc7
---
 src/backend/commands/vacuum.c | 61 ++++++++++++++++++++++++++++++-----
 1 file changed, 53 insertions(+), 8 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index eb126f2247..7e3a8e404e 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -48,6 +48,7 @@
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/interrupt.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/pmsignal.h"
@@ -76,6 +77,7 @@ int			vacuum_multixact_failsafe_age;
 /* A few variables that don't seem worth passing around as parameters */
 static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
+static bool analyze_in_outer_xact = false;
 
 
 /*
@@ -314,8 +316,7 @@ vacuum(List *relations, VacuumParams *params,
 	static bool in_vacuum = false;
 
 	const char *stmttype;
-	volatile bool in_outer_xact,
-				use_own_xacts;
+	volatile bool use_own_xacts;
 
 	Assert(params != NULL);
 
@@ -332,10 +333,10 @@ vacuum(List *relations, VacuumParams *params,
 	if (params->options & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
-		in_outer_xact = false;
+		analyze_in_outer_xact = false;
 	}
 	else
-		in_outer_xact = IsInTransactionBlock(isTopLevel);
+		analyze_in_outer_xact = IsInTransactionBlock(isTopLevel);
 
 	/*
 	 * Due to static variables vac_context, anl_context and vac_strategy,
@@ -457,7 +458,7 @@ vacuum(List *relations, VacuumParams *params,
 		Assert(params->options & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
-		else if (in_outer_xact)
+		else if (analyze_in_outer_xact)
 			use_own_xacts = false;
 		else if (list_length(relations) > 1)
 			use_own_xacts = true;
@@ -475,7 +476,7 @@ vacuum(List *relations, VacuumParams *params,
 	 */
 	if (use_own_xacts)
 	{
-		Assert(!in_outer_xact);
+		Assert(!analyze_in_outer_xact);
 
 		/* ActiveSnapshot is not set by autovacuum */
 		if (ActiveSnapshotSet())
@@ -544,7 +545,7 @@ vacuum(List *relations, VacuumParams *params,
 				}
 
 				analyze_rel(vrel->oid, vrel->relation, params,
-							vrel->va_cols, in_outer_xact, vac_strategy);
+							vrel->va_cols, analyze_in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
 				{
@@ -568,6 +569,7 @@ vacuum(List *relations, VacuumParams *params,
 		in_vacuum = false;
 		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
 		VacuumCostBalance = 0;
+		analyze_in_outer_xact = false;
 	}
 	PG_END_TRY();
 
@@ -2233,7 +2235,50 @@ vacuum_delay_point(void)
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	if (VacuumCostInactive || InterruptPending)
+	if (InterruptPending ||
+		(VacuumCostInactive && !ConfigReloadPending))
+		return;
+
+	/*
+	 * Reload the configuration file if requested. This allows changes to
+	 * vacuum_cost_limit and vacuum_cost_delay to take effect while a table is
+	 * being vacuumed or analyzed. Analyze should not reload configuration file
+	 * if it is in an outer transaction, as we currently only allow
+	 * configuration reload when in top-level statements.
+	 */
+	if (ConfigReloadPending && !analyze_in_outer_xact)
+	{
+		ConfigReloadPending = false;
+		ProcessConfigFile(PGC_SIGHUP);
+
+		/*
+		 * Autovacuum workers must restore the correct values of
+		 * VacuumCostLimit and VacuumCostDelay in case they were overwritten
+		 * by reload.
+		 */
+		AutoVacuumUpdateDelay();
+
+		/*
+		 * If configuration changes are allowed to impact VacuumCostInactive,
+		 * make sure it is updated.
+		 */
+		if (VacuumCostInactive == VACUUM_COST_INACTIVE_AND_LOCKED)
+			return;
+
+		if (VacuumCostDelay > 0)
+			VacuumCostInactive = VACUUM_COST_ACTIVE;
+		else
+		{
+			VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
+			VacuumCostBalance = 0;
+		}
+	}
+
+	/*
+	 * If we disabled cost-based delays after reloading the config file,
+	 * return.
+	 */
+	if (VacuumCostInactive)
 		return;
 
 	/*
-- 
2.37.2

#34Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Melanie Plageman (#33)
1 attachment(s)
Re: Should vacuum process config file reload more often

At Tue, 28 Mar 2023 20:35:28 -0400, Melanie Plageman <melanieplageman@gmail.com> wrote in

On Tue, Mar 28, 2023 at 4:21 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Mon, 27 Mar 2023 14:12:03 -0400, Melanie Plageman <melanieplageman@gmail.com> wrote in

So, I've attached an alternate version of the patchset which takes the
approach of having one commit which only enables cost-based delay GUC
refresh for VACUUM and another commit which enables it for autovacuum
and makes the changes to balancing variables.

I still think the commit which has workers updating their own
wi_cost_delay in vacuum_delay_point() is a bit weird. It relies on no one
else emulating our bad behavior and reading from wi_cost_delay without a
lock and on no one else deciding to ever write to wi_cost_delay (even
though it is in shared memory [this is the same as master]). It is only
safe because our process is the only one (right now) writing to
wi_cost_delay, so when we read from it without a lock, we know it isn't
being written to. And everyone else takes a lock when reading from
wi_cost_delay right now. So, it seems...not great.

This approach also introduces a function that is only around for
one commit until the next commit obsoletes it, which seems a bit silly.

(I'm not sure what this refers to, though..) I don't think it's silly
if a later patch removes something that the preceding patches
introduced, as long as that contributes to readability. Ultimately,
they will be merged together on committing.

I was under the impression that reviewers thought config reload and
worker balance changes should be committed in separate commits.

Either way, the ephemeral function is not my primary concern. I felt
more uncomfortable with increasing how often we update a double in
shared memory which is read without acquiring a lock.

Basically, I think it is probably better to just have one commit
enabling guc refresh for VACUUM and then another which correctly
implements what is needed for autovacuum to do the same.
Attached v9 does this.

I've provided both complete versions of both approaches (v9 and v8).

I took a look at v9 and have a few comments.

0001:

I don't believe it is necessary, as mentioned in the commit
message. It appears that we are resetting it at the appropriate times.

VacuumCostBalance must be zeroed out when we disable vacuum cost.
Previously, we did not reenable VacuumCostActive once it was disabled,
but now that we do, I think it is good practice to always zero out
VacuumCostBalance when we disable vacuum cost. I will squash this commit
into the one introducing VacuumCostInactive, though.

0002:

I felt a bit uneasy on this. It seems somewhat complex (and makes the
succeeding patches complex),

Even if we introduced a second global variable to indicate that failsafe
mode has been engaged, we would still require the additional checks
of VacuumCostInactive.

has confusing names,

I would be happy to rename the values of the enum to make them less
confusing. Are you thinking "force" instead of "locked"?
maybe:
VACUUM_COST_FORCE_INACTIVE and
VACUUM_COST_INACTIVE
?

and doesn't seem self-contained.

By changing the variable from VacuumCostActive to VacuumCostInactive, I
have kept all non-vacuum code from having to distinguish between it
being inactive due to failsafe mode or due to user settings.

My concern is that VacuumCostActive is logic-inverted and turned into
a ternary variable in a subtle way. The expression
"!VacuumCostInactive" is quite confusing. (I sometimes feel the same
way about "!XLogRecPtrIsInvalid(lsn)", and I believe most people write
it with another macro like "lsn != InvalidXLogRecPtr"). Additionally,
the constraint in this patch will be implemented as open code. So I
wanted to suggest something like the attached. The main idea is to use
a wrapper function to enforce the restriction, and by doing so, we
eliminated the need to make the variable into a ternary without a good
reason.

I think it'd be simpler to add a global boolean (maybe
VacuumCostActiveForceDisable or such) that forces VacuumCostActive to
be false and set VacuumCostActive using a setter function that follows
the boolean.

I think maintaining an additional global variable is more brittle than
including the information in a single variable.

0003:

+        * Reload the configuration file if requested. This allows changes to
+        * vacuum_cost_limit and vacuum_cost_delay to take effect while a table is
+        * being vacuumed or analyzed. Analyze should not reload configuration
+        * file if it is in an outer transaction, as GUC values shouldn't be
+        * allowed to refer to some uncommitted state (e.g. database objects
+        * created in this transaction).

I'm not sure GUC reload is or should be related to transactions. For
instance, work_mem can be changed by a reload during a transaction
unless it has been set in the current transaction. I don't think we
need to deliberately suppress changes in variables caused by reloads
during transactions only for analyze. If analyze doesn't like changes
to certain GUC variables, their values should be snapshotted before
starting the process.
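
As an illustration of the snapshotting idea mentioned above, here is a minimal sketch. All names (sketch_cost_limit, sketch_cost_delay, CostSnapshot, take_cost_snapshot, simulate_reload) are hypothetical and are not PostgreSQL symbols; the sketch only assumes that a reload may overwrite global settings at any time, and that code wanting stable values copies them once at the start.

```c
/* Hypothetical globals, standing in for variables assigned by a config reload. */
int		sketch_cost_limit = 200;
double	sketch_cost_delay = 2.0;

/* Private copy of the settings, taken once before the work starts. */
typedef struct CostSnapshot
{
	int		cost_limit;
	double	cost_delay;
} CostSnapshot;

/* Copy the current global values into a snapshot. */
CostSnapshot
take_cost_snapshot(void)
{
	CostSnapshot snap;

	snap.cost_limit = sketch_cost_limit;
	snap.cost_delay = sketch_cost_delay;
	return snap;
}

/* Stand-in for a reload that changes the globals mid-operation. */
void
simulate_reload(int new_limit, double new_delay)
{
	sketch_cost_limit = new_limit;
	sketch_cost_delay = new_delay;
}
```

Code that reads only from the snapshot is unaffected by a later simulate_reload() call, while code that reads the globals picks up the new values immediately.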

Currently, we only reload the config file in top-level statements. We
don't reload the configuration file from within a nested transaction
command. BEGIN starts a transaction but not a transaction command. So
BEGIN; ANALYZE; probably wouldn't violate this rule. But it is simpler
to just forbid reloading when it is not a top-level transaction command.
I have updated the comment to reflect this.

I feel it's a bit fragile. We may not be able to manage the reload
timing perfectly. I think we might accidentally add another reload
point. In that case, the assumption could break. In most cases, I
think we use snapshotting in various ways to avoid unintended variable
changes. (And I believe the analyze code also does that.)

+               /* There is at least 1 autovac worker (this worker). */
+               int                     nworkers_for_balance = Max(pg_atomic_read_u32(
+                                                               &AutoVacuumShmem->av_nworkers_for_balance), 1);

I think it *must* be greater than 0. However, to begin with, I don't
think we need that variable to be shared. I don't believe it matters
if we count involved workers every time we calculate the delay.

We are not calculating the delay but the cost limit. The cost limit must

Ah, right, it's limit, but my main point still stands.

be balanced across all of the workers currently actively vacuuming
tables without cost-related table options.

The purpose of the old autovac_balance_cost() is to distribute the
cost among all involved tables, proportionally based on each worker's
cost specification. Adjusting the limit just for tables affected by
reloads disrupts the cost balance.

If I'm not missing anything, this function does something quite
different from the original autovac_balance_cost(). The original
function distributes the total cost based on the GUC variables among
workers proportionally according to each worker's cost
parameters. However, this function distributes the total cost
equally.

Yes, as I mentioned in the commit message, because all the workers now
have no reason to have different cost parameters (due to reloading the
config file on almost every page), there is no reason to use ratios.
Workers vacuuming a table with no cost-related table options simply need
to divide the limit equally amongst themselves because they all will
have the same limit and delay values.

I'm not sure about the assumption in the commit message. For instance,
if the total cost limit drops significantly, it's possible that the
workers left out of this calculation might end up using all the
reduced cost. Wouldn't this imply that all workers should recompute
their individual limits?

+               int                     vac_cost_limit = autovacuum_vac_cost_limit > 0 ?
+               autovacuum_vac_cost_limit : VacuumCostLimit;
...
+               int                     balanced_cost_limit = vac_cost_limit / nworkers_for_balance;
...
+               VacuumCostLimit = Max(Min(balanced_cost_limit, vac_cost_limit), 1);
}

This seems to repeatedly divide VacuumCostLimit by
nworkers_for_balance. I'm not sure, but this function might only be
called after a reload. If that's the case, I don't think it's safe
coding, even if it works.

Good point about repeatedly dividing VacuumCostLimit by
nworkers_for_balance. I've added a variable to keep track of the base
cost limit and separated the functionality of updating the limit into
two parts -- one AutoVacuumUpdateLimit() which is only meant to be
called after reload and references VacuumCostLimit to set the
av_base_cost_limit and another, AutoVacuumBalanceLimit(), which only
overrides VacuumCostLimit but uses av_base_cost_limit.
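
The repeated-division hazard and the base-limit fix described above can be sketched as follows. This is a simplified model, not the patch's code: effective_cost_limit and base_cost_limit are hypothetical stand-ins for VacuumCostLimit and av_base_cost_limit, and the function names are invented for illustration.

```c
/* Hypothetical stand-ins for the variables discussed above. */
int		effective_cost_limit = 200;	/* plays the role of VacuumCostLimit */
int		base_cost_limit = 200;		/* plays the role of av_base_cost_limit */

/*
 * Problematic pattern: the new value is derived from the previously derived
 * value, so every call shrinks the limit further.
 */
void
balance_in_place(int nworkers)
{
	if (nworkers < 1)
		nworkers = 1;
	effective_cost_limit = effective_cost_limit / nworkers;
	if (effective_cost_limit < 1)
		effective_cost_limit = 1;
}

/* Meant to be called once after a reload: remember the configured value. */
void
update_base_limit(int configured_limit)
{
	base_cost_limit = configured_limit;
}

/*
 * Safe to call any number of times: always derives the effective limit from
 * the untouched base value.
 */
void
balance_from_base(int nworkers)
{
	int		balanced;

	if (nworkers < 1)
		nworkers = 1;
	balanced = base_cost_limit / nworkers;
	effective_cost_limit = balanced < 1 ? 1 : balanced;
}
```

With a configured limit of 200 and 4 workers, calling balance_in_place() twice yields 12, whereas calling balance_from_base() twice still yields 50.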

Sorry, but will check this later.

I've noted in the comments that AutoVacuumBalanceLimit() should be
called to adjust to a potential change in nworkers_for_balance
(currently every time after we sleep in vacuum_delay_point()) and
AutoVacuumUpdateLimit() should only be called once after a config
reload, as it references VacuumCostLimit.

I will note that this problem also exists in master, as
autovac_balance_cost references VacuumCostLimit in order to set worker
cost limits and then AutoVacuumUpdateDelay() overrides VacuumCostLimit
with the value calculated in autovac_balance_cost() from
VacuumCostLimit.

v10 attached with mentioned updates.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

allow_disabling_vacuum_cost_active_change.txttext/plain; charset=us-asciiDownload
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8f14cf85f3..343500dbe2 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2637,7 +2637,8 @@ lazy_check_wraparound_failsafe(LVRelState *vacrel)
 						 "You might also need to consider other ways for VACUUM to keep up with the allocation of transaction IDs.")));
 
 		/* Stop applying cost limits from this point on */
-		VacuumCostActive = false;
+		VacuumCostActiveForceDisable = true;
+		SetVacuumCostActive(false);
 		VacuumCostBalance = 0;
 
 		return true;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index c54360a6a0..f78219bd1e 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -491,7 +491,7 @@ vacuum(List *relations, VacuumParams *params,
 		ListCell   *cur;
 
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
+		SetVacuumCostActive(VacuumCostDelay > 0);
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
@@ -549,7 +549,7 @@ vacuum(List *relations, VacuumParams *params,
 	PG_FINALLY();
 	{
 		in_vacuum = false;
-		VacuumCostActive = false;
+		SetVacuumCostActive(false);
 	}
 	PG_END_TRY();
 
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index bcd40c80a1..47e0c2c354 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -990,7 +990,7 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												 false);
 
 	/* Set cost-based vacuum delay */
-	VacuumCostActive = (VacuumCostDelay > 0);
+	SetVacuumCostActive(VacuumCostDelay > 0);
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 1b1d814254..1191a39618 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -150,4 +150,13 @@ int64		VacuumPageMiss = 0;
 int64		VacuumPageDirty = 0;
 
 int			VacuumCostBalance = 0;	/* working state for vacuum */
-bool		VacuumCostActive = false;
+bool		VacuumCostActive = false; /* should be set using SetVacuumCostActive() */
+bool		VacuumCostActiveForceDisable = false;
+
+/* Set VacuumCostActive following VacuumCostActiveForceDisable */
+void
+SetVacuumCostActive(bool value)
+{
+	VacuumCostActive = value && !VacuumCostActiveForceDisable;
+}
+
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 06a86f9ac1..97fa3896c0 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -275,7 +275,9 @@ extern PGDLLIMPORT int64 VacuumPageDirty;
 
 extern PGDLLIMPORT int VacuumCostBalance;
 extern PGDLLIMPORT bool VacuumCostActive;
+extern PGDLLIMPORT bool VacuumCostActiveForceDisable;
 
+extern void SetVacuumCostActive(bool value);
 
 /* in tcop/postgres.c */
 
#35Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#34)
Re: Should vacuum process config file reload more often

At Wed, 29 Mar 2023 12:09:08 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

timing perfectly. I think we might accidentally add another reload
point. In that case, the assumption could break. In most cases, I
think we use snapshotting in various ways to avoid unintended variable
changes. (And I believe the analyze code also does that.)

Okay, I was missing the following code.

autovacuum.c:2893
/*
* If any of the cost delay parameters has been set individually for
* this table, disable the balancing algorithm.
*/
tab->at_dobalance =
!(avopts && (avopts->vacuum_cost_limit > 0 ||
avopts->vacuum_cost_delay > 0));

So, sorry for the noise. I'll review it while taking this into consideration.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#36Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#35)
Re: Should vacuum process config file reload more often

At Wed, 29 Mar 2023 13:21:55 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

autovacuum.c:2893
/*
* If any of the cost delay parameters has been set individually for
* this table, disable the balancing algorithm.
*/
tab->at_dobalance =
!(avopts && (avopts->vacuum_cost_limit > 0 ||
avopts->vacuum_cost_delay > 0));

So, sorry for the noise. I'll review it while taking this into consideration.

Then I found that the code is quite confusing as it is.

For the tables that don't have cost_delay and cost_limit specified
individually, at_vacuum_cost_limit and _delay store the system global
values determined by GUCs. wi_cost_delay, _limit and _limit_base store
the same values. As a result, I concluded that
autovac_balance_cost() does exactly what Melanie's patch does, except
that nworkers_for_balance is not stored in shared memory.

I discovered that commit 1021bd6a89 brought in do_balance.

Since the mechanism is already complicated, just disable it for those
cases rather than trying to make it cope. There are undesirable

After reading this, I get why the code is so complex. It is a remnant
of when balancing was done with tables that had individually specified
cost parameters. And I found the following description in the doc.

https://www.postgresql.org/docs/devel/routine-vacuuming.html

When multiple workers are running, the autovacuum cost delay
parameters (see Section 20.4.4) are “balanced” among all the running
workers, so that the total I/O impact on the system is the same
regardless of the number of workers actually running. However, any
workers processing tables whose per-table
autovacuum_vacuum_cost_delay or autovacuum_vacuum_cost_limit storage
parameters have been set are not considered in the balancing
algorithm.
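
The balancing rule quoted from the documentation can be sketched roughly as below. This is a simplified model under stated assumptions, not the autovacuum implementation: SketchWorker and both function names are hypothetical, and the equal split follows the approach discussed in this thread rather than the proportional scheme of the original autovac_balance_cost().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one autovacuum worker slot. */
typedef struct SketchWorker
{
	bool	running;				/* is the worker vacuuming a table? */
	bool	has_table_cost_options;	/* per-table cost delay/limit set? */
} SketchWorker;

/*
 * Count the workers that take part in balancing: running workers whose
 * current table has no per-table cost options.
 */
int
count_workers_for_balance(const SketchWorker *workers, size_t nworkers)
{
	int		count = 0;

	for (size_t i = 0; i < nworkers; i++)
	{
		if (workers[i].running && !workers[i].has_table_cost_options)
			count++;
	}
	return count;
}

/*
 * Split the total limit equally among the participating workers, treating
 * the count as at least 1 and never returning a share below 1.
 */
int
balanced_cost_limit(int total_limit, int nworkers_for_balance)
{
	int		share;

	if (nworkers_for_balance < 1)
		nworkers_for_balance = 1;
	share = total_limit / nworkers_for_balance;
	return share < 1 ? 1 : share;
}
```

For example, with four slots of which two are running without per-table options, each of those two workers would get half of the total limit, and the worker with per-table options keeps its own setting.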

The initial balancing mechanism was brought in by e2a186b03c back in
2007. The balancing code has had that unnecessary complexity ever
since.

Since I can't think of a better idea than Melanie's proposal for
handling this code, I'll keep reviewing it with that approach in mind.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#37Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#35)
1 attachment(s)
Re: Should vacuum process config file reload more often

At Wed, 29 Mar 2023 13:21:55 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

So, sorry for the noise. I'll review it while taking this into consideration.

0003:

It's not this patch's fault, but I don't like the fact that the
variables used for GUC, VacuumCostDelay and VacuumCostLimit, are
updated outside the GUC mechanism. Also I don't like the incorrect
sorting of variables, where some working variables are referred to as
GUC parameters or vice versa.

Although it's somewhat unrelated to the goal of this patch, I think we
should tidy up the code before proceeding. Shouldn't we separate
the actual parameters from the GUC base variables, and sort out
all the related variables? (something like the attached, on top of your
patch.)

I have some comments on 0003 as-is.

+		tab->at_relopt_vac_cost_limit = avopts ?
+			avopts->vacuum_cost_limit : 0;
+		tab->at_relopt_vac_cost_delay = avopts ?
+			avopts->vacuum_cost_delay : -1;

The value is not used when do_balance is false, so I don't see a
specific reason for these variables to be different when avopts is
null.

+autovac_recalculate_workers_for_balance(void)
+{
+	dlist_iter	iter;
+	int			orig_nworkers_for_balance;
+	int			nworkers_for_balance = 0;
+
+	if (autovacuum_vac_cost_delay == 0 ||
+		(autovacuum_vac_cost_delay == -1 && VacuumCostDelay == 0))
 		return;
+	if (autovacuum_vac_cost_limit <= 0 && VacuumCostLimit <= 0)
+		return;
+

I'm not quite sure how these conditions relate to the need to count
workers that share the global I/O cost. (Though I still believe this
function might not be necessary.)

+	if (av_relopt_cost_limit > 0)
+		VacuumCostLimit = av_relopt_cost_limit;
+	else
+	{
+		av_base_cost_limit = autovacuum_vac_cost_limit > 0 ?
+			autovacuum_vac_cost_limit : VacuumCostLimit;
+
+		AutoVacuumBalanceLimit();

I think each worker should use MyWorkerInfo->wi_dobalance to identify
whether the worker needs to use balanced cost values.

+void
+AutoVacuumBalanceLimit(void)

I'm not sure this function needs to be a separate function.

(Sorry, timed out..)

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

refactor_vacuum_variable_defenitions.txttext/plain; charset=us-asciiDownload
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index e9b683805a..f7ef7860ac 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -72,8 +72,22 @@ int			vacuum_multixact_freeze_min_age;
 int			vacuum_multixact_freeze_table_age;
 int			vacuum_failsafe_age;
 int			vacuum_multixact_failsafe_age;
+int			vacuum_cost_page_hit = 1;
+int			vacuum_cost_page_miss = 2;
+int			vacuum_cost_page_dirty = 20;
+int			vacuum_cost_limit = 200;
+double		vacuum_cost_delay = 0;
 
 
+/* working state for vacuum */
+int			VacuumCostBalance = 0;
+int			VacuumCostInactive = 1;
+int			VacuumCostLimit = 200;
+double		VacuumCostDelay = 0;
+int64		VacuumPageHit = 0;
+int64		VacuumPageMiss = 0;
+int64		VacuumPageDirty = 0;
+
 /* A few variables that don't seem worth passing around as parameters */
 static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c8dae5465a..b475db9bfe 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1832,7 +1832,7 @@ AutoVacuumUpdateLimit(void)
 	else
 	{
 		av_base_cost_limit = autovacuum_vac_cost_limit > 0 ?
-			autovacuum_vac_cost_limit : VacuumCostLimit;
+			autovacuum_vac_cost_limit : vacuum_cost_limit;
 
 		AutoVacuumBalanceLimit();
 	}
@@ -1866,7 +1866,7 @@ AutoVacuumBalanceLimit(void)
 	Assert(nworkers_for_balance > 0);
 
 	total_cost_limit = autovacuum_vac_cost_limit > 0 ?
-		autovacuum_vac_cost_limit : av_base_cost_limit;
+		autovacuum_vac_cost_limit : vacuum_cost_limit;
 
 	balanced_cost_limit = total_cost_limit / nworkers_for_balance;
 
@@ -1888,10 +1888,10 @@ autovac_recalculate_workers_for_balance(void)
 	int			nworkers_for_balance = 0;
 
 	if (autovacuum_vac_cost_delay == 0 ||
-		(autovacuum_vac_cost_delay == -1 && VacuumCostDelay == 0))
+		(autovacuum_vac_cost_delay == -1 && vacuum_cost_delay == 0))
 		return;
 
-	if (autovacuum_vac_cost_limit <= 0 && VacuumCostLimit <= 0)
+	if (autovacuum_vac_cost_limit <= 0 && vacuum_cost_limit <= 0)
 		return;
 
 	orig_nworkers_for_balance =
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 6d3dd26fc7..4524df23c4 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -39,6 +39,7 @@
 #include "catalog/catalog.h"
 #include "catalog/storage.h"
 #include "catalog/storage_xlog.h"
+#include "commands/vacuum.h"
 #include "executor/instrument.h"
 #include "lib/binaryheap.h"
 #include "miscadmin.h"
@@ -894,7 +895,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 			VacuumPageHit++;
 
 			if (!VacuumCostInactive)
-				VacuumCostBalance += VacuumCostPageHit;
+				VacuumCostBalance += vacuum_cost_page_hit;
 
 			TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
 											  smgr->smgr_rlocator.locator.spcOid,
@@ -1099,7 +1100,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 
 	VacuumPageMiss++;
 	if (!VacuumCostInactive)
-		VacuumCostBalance += VacuumCostPageMiss;
+		VacuumCostBalance += vacuum_cost_page_miss;
 
 	TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
 									  smgr->smgr_rlocator.locator.spcOid,
@@ -1673,7 +1674,7 @@ MarkBufferDirty(Buffer buffer)
 		VacuumPageDirty++;
 		pgBufferUsage.shared_blks_dirtied++;
 		if (!VacuumCostInactive)
-			VacuumCostBalance += VacuumCostPageDirty;
+			VacuumCostBalance += vacuum_cost_page_dirty;
 	}
 }
 
@@ -4200,7 +4201,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
 			VacuumPageDirty++;
 			pgBufferUsage.shared_blks_dirtied++;
 			if (!VacuumCostInactive)
-				VacuumCostBalance += VacuumCostPageDirty;
+				VacuumCostBalance += vacuum_cost_page_dirty;
 		}
 	}
 }
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 608ebb9182..dbf463c6e1 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -138,16 +138,3 @@ int			MaxConnections = 100;
 int			max_worker_processes = 8;
 int			max_parallel_workers = 8;
 int			MaxBackends = 0;
-
-int			VacuumCostPageHit = 1;	/* GUC parameters for vacuum */
-int			VacuumCostPageMiss = 2;
-int			VacuumCostPageDirty = 20;
-int			VacuumCostLimit = 200;
-double		VacuumCostDelay = 0;
-
-int64		VacuumPageHit = 0;
-int64		VacuumPageMiss = 0;
-int64		VacuumPageDirty = 0;
-
-int			VacuumCostBalance = 0;	/* working state for vacuum */
-int			VacuumCostInactive = 1;
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8062589efd..179f39ab9c 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2379,7 +2379,7 @@ struct config_int ConfigureNamesInt[] =
 			gettext_noop("Vacuum cost for a page found in the buffer cache."),
 			NULL
 		},
-		&VacuumCostPageHit,
+		&vacuum_cost_page_hit,
 		1, 0, 10000,
 		NULL, NULL, NULL
 	},
@@ -2389,7 +2389,7 @@ struct config_int ConfigureNamesInt[] =
 			gettext_noop("Vacuum cost for a page not found in the buffer cache."),
 			NULL
 		},
-		&VacuumCostPageMiss,
+		&vacuum_cost_page_miss,
 		2, 0, 10000,
 		NULL, NULL, NULL
 	},
@@ -2399,7 +2399,7 @@ struct config_int ConfigureNamesInt[] =
 			gettext_noop("Vacuum cost for a page dirtied by vacuum."),
 			NULL
 		},
-		&VacuumCostPageDirty,
+		&vacuum_cost_page_dirty,
 		20, 0, 10000,
 		NULL, NULL, NULL
 	},
@@ -2409,7 +2409,7 @@ struct config_int ConfigureNamesInt[] =
 			gettext_noop("Vacuum cost amount available before napping."),
 			NULL
 		},
-		&VacuumCostLimit,
+		&vacuum_cost_limit,
 		200, 1, 10000,
 		NULL, NULL, NULL
 	},
@@ -3701,7 +3701,7 @@ struct config_real ConfigureNamesReal[] =
 			NULL,
 			GUC_UNIT_MS
 		},
-		&VacuumCostDelay,
+		&vacuum_cost_delay,
 		0, 0, 100,
 		NULL, NULL, NULL
 	},
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 5c3e250b06..54842a75f9 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,11 @@ extern PGDLLIMPORT int vacuum_multixact_freeze_min_age;
 extern PGDLLIMPORT int vacuum_multixact_freeze_table_age;
 extern PGDLLIMPORT int vacuum_failsafe_age;
 extern PGDLLIMPORT int vacuum_multixact_failsafe_age;
+extern PGDLLIMPORT int vacuum_cost_limit;
+extern PGDLLIMPORT double vacuum_cost_delay;
+extern PGDLLIMPORT int vacuum_cost_page_hit;
+extern PGDLLIMPORT int vacuum_cost_page_miss;
+extern PGDLLIMPORT int vacuum_cost_page_dirty;
 
 /* Variables for cost-based parallel vacuum */
 
@@ -312,7 +317,15 @@ typedef enum VacuumCostStatus
 
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
+
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
+extern PGDLLIMPORT int VacuumCostBalance;
+extern PGDLLIMPORT int VacuumCostInactive;
+extern PGDLLIMPORT int VacuumCostLimit;
+extern PGDLLIMPORT double VacuumCostDelay;
+extern PGDLLIMPORT int64 VacuumPageHit;
+extern PGDLLIMPORT int64 VacuumPageMiss;
+extern PGDLLIMPORT int64 VacuumPageDirty;
 
 
 /* in commands/vacuum.c */
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 33e22733ae..cf0cc919cf 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -263,18 +263,10 @@ extern PGDLLIMPORT double hash_mem_multiplier;
 extern PGDLLIMPORT int maintenance_work_mem;
 extern PGDLLIMPORT int max_parallel_maintenance_workers;
 
-extern PGDLLIMPORT int VacuumCostPageHit;
-extern PGDLLIMPORT int VacuumCostPageMiss;
-extern PGDLLIMPORT int VacuumCostPageDirty;
-extern PGDLLIMPORT int VacuumCostLimit;
-extern PGDLLIMPORT double VacuumCostDelay;
 
-extern PGDLLIMPORT int64 VacuumPageHit;
-extern PGDLLIMPORT int64 VacuumPageMiss;
-extern PGDLLIMPORT int64 VacuumPageDirty;
-
-extern PGDLLIMPORT int VacuumCostBalance;
-extern PGDLLIMPORT int VacuumCostInactive;
+/*****************************************************************************
+ *	  globals.h --															 *
+ *****************************************************************************/
 
 /* in tcop/postgres.c */
 
#38Melanie Plageman
melanieplageman@gmail.com
In reply to: Kyotaro Horiguchi (#37)
3 attachment(s)
Re: Should vacuum process config file reload more often

Thanks for the detailed review!

On Tue, Mar 28, 2023 at 11:09 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Tue, 28 Mar 2023 20:35:28 -0400, Melanie Plageman <melanieplageman@gmail.com> wrote in

On Tue, Mar 28, 2023 at 4:21 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:

At Mon, 27 Mar 2023 14:12:03 -0400, Melanie Plageman <melanieplageman@gmail.com> wrote in

0002:

I felt a bit uneasy on this. It seems somewhat complex (and makes the
succeeding patches complex),

Even if we introduced a second global variable to indicate that failsafe
mode has been engaged, we would still require the additional checks
of VacuumCostInactive.

has confusing names,

I would be happy to rename the values of the enum to make them less
confusing. Are you thinking "force" instead of "locked"?
maybe:
VACUUM_COST_FORCE_INACTIVE and
VACUUM_COST_INACTIVE
?

and doesn't seem self-contained.

By changing the variable from VacuumCostActive to VacuumCostInactive, I
have kept all non-vacuum code from having to distinguish between it
being inactive due to failsafe mode or due to user settings.

My concern is that VacuumCostActive is logic-inverted and turned into
a ternary variable in a subtle way. The expression
"!VacuumCostInactive" is quite confusing. (I sometimes feel the same
way about "!XLogRecPtrIsInvalid(lsn)", and I believe most people write
it with another macro like "lsn != InvalidXLogRecPtr"). Additionally,
the constraint in this patch will be implemented as open code. So I
wanted to suggest something like the attached. The main idea is to use
a wrapper function to enforce the restriction, and by doing so, we
eliminated the need to make the variable into a ternary without a good
reason.

So, the rationale for making it a ternary is that the variable is the
combination of two pieces of information which has only 3 valid
states:
failsafe inactive + cost active = cost active
failsafe inactive + cost inactive = cost inactive
failsafe active + cost inactive = cost inactive and locked
the fourth is invalid
failsafe active + cost active = invalid
That is harder to enforce with two variables.
Also, the two pieces of information are not meaningful individually.
So, I thought it made sense to make a single variable.
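
To make the three states concrete, here is a minimal sketch (not the patch
itself). The value names follow the ones used in the patch; the enum type
name, the numeric ordering with "active" as 0, and both helper functions are
assumptions made for illustration. The invalid fourth combination is folded
into "inactive and locked", since failsafe wins.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative only. ACTIVE is assumed to be 0 so that code which does not
 * care about failsafe can keep treating the variable as a boolean.
 */
typedef enum VacuumCostStatus
{
	VACUUM_COST_ACTIVE = 0,
	VACUUM_COST_INACTIVE_AND_UNLOCKED,
	VACUUM_COST_INACTIVE_AND_LOCKED
} VacuumCostStatus;

/* Hypothetical helper: fold the two pieces of information into one status. */
VacuumCostStatus
combine_cost_status(bool failsafe_active, bool cost_delay_enabled)
{
	/* Failsafe always wins, so "failsafe + cost active" cannot be produced. */
	if (failsafe_active)
		return VACUUM_COST_INACTIVE_AND_LOCKED;

	return cost_delay_enabled ? VACUUM_COST_ACTIVE
		: VACUUM_COST_INACTIVE_AND_UNLOCKED;
}

/* Hypothetical helper: the only question non-vacuum code needs to ask. */
bool
cost_delay_active(VacuumCostStatus status)
{
	assert(status <= VACUUM_COST_INACTIVE_AND_LOCKED);
	return status == VACUUM_COST_ACTIVE;
}
```

With a single variable there is simply no representation for the invalid
state, which is the property that is hard to get with two booleans.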

Your suggested patch introduces an additional variable which shadows
LVRelState->failsafe_active but doesn't actually get set/reset at all of
the correct places. If we did introduce a second global variable, I
don't think we should also keep LVRelState->failsafe_active, as keeping
them in sync will be difficult.

As for the double negative (!VacuumCostInactive), I agree that it is not
ideal, however, if we use a ternary and keep VacuumCostActive, there is
no way for non-vacuum code to treat it as a boolean.
With the ternary VacuumCostInactive, only vacuum code has to know about
the distinction between inactive+failsafe active and inactive+failsafe
inactive.

As for the setter function, I think that having a function to set
VacuumCostActive based on failsafe_active is actually doing more harm
than good. Only vacuum code has to know about the distinction as it is,
so we aren't really saving any trouble (there would really only be two
callers of the suggested function). And, since the function hides
whether or not VacuumCostActive was actually set to the passed-in value,
we can't easily do other necessary maintenance -- like zero out
VacuumCostBalance if we disabled vacuum cost.

0003:

+        * Reload the configuration file if requested. This allows changes to
+        * vacuum_cost_limit and vacuum_cost_delay to take effect while a table is
+        * being vacuumed or analyzed. Analyze should not reload configuration
+        * file if it is in an outer transaction, as GUC values shouldn't be
+        * allowed to refer to some uncommitted state (e.g. database objects
+        * created in this transaction).

I'm not sure GUC reload is or should be related to transactions. For
instance, work_mem can be changed by a reload during a transaction
unless it has been set in the current transaction. I don't think we
need to deliberately suppress changes in variables caused by reloads
during transactions only for analyze. If analyze doesn't like changes
to certain GUC variables, their values should be snapshotted before
starting the process.

Currently, we only reload the config file in top-level statements. We
don't reload the configuration file from within a nested transaction
command. BEGIN starts a transaction but not a transaction command. So
BEGIN; ANALYZE; probably wouldn't violate this rule. But it is simpler
to just forbid reloading when it is not a top-level transaction command.
I have updated the comment to reflect this.
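
As a rough standalone model of that rule (not the actual vacuum code), the
guard looks like the following. config_reload_pending stands in for
ConfigReloadPending, and fake_process_config_file() and delay_point() are
hypothetical stand-ins for ProcessConfigFile(PGC_SIGHUP) and
vacuum_delay_point().

```c
#include <assert.h>
#include <stdbool.h>

bool		config_reload_pending = false;	/* stand-in for ConfigReloadPending */
int			reload_count = 0;				/* how many reloads actually ran */

/* Stand-in for ProcessConfigFile(PGC_SIGHUP); only counts invocations. */
void
fake_process_config_file(void)
{
	/* The pending flag is cleared before the reload is processed. */
	assert(!config_reload_pending);
	reload_count++;
}

/*
 * Simplified delay point: reload only when a reload was requested and we
 * are not an ANALYZE running inside an outer (user) transaction.
 */
void
delay_point(bool analyze_in_outer_xact)
{
	if (config_reload_pending && !analyze_in_outer_xact)
	{
		config_reload_pending = false;
		fake_process_config_file();
	}
}
```

In this model, a pending reload is left untouched by an ANALYZE inside an
outer transaction, so it is picked up later at the usual place.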

I feel it's a bit fragile. We may not be able to manage the reload
timing perfectly. I think we might accidentally add a reload
timing. In that case, the assumption could break. In most cases, I
think we use snapshotting in various ways to avoid unintended variable
changes. (And I believe the analyze code also does that.)

I'm not sure I fully understand the problem you are thinking of. What do
you mean about managing the reload timing? Are you suggesting there is a
problem with excluding analyze in an outer transaction from doing the
reload or with doing the reload during vacuum and analyze when they are
top-level statements?

And, by snapshotting do you mean how vacuum_rel() and do_analyze_rel() do
NewGUCNestLevel() so that they can then do AtEOXact_GUC() and rollback
guc changes done during that operation?
How are you envisioning that being used here?

On Wed, Mar 29, 2023 at 2:00 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Wed, 29 Mar 2023 13:21:55 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

autovacuum.c:2893
/*
* If any of the cost delay parameters has been set individually for
* this table, disable the balancing algorithm.
*/
tab->at_dobalance =
!(avopts && (avopts->vacuum_cost_limit > 0 ||
avopts->vacuum_cost_delay > 0));

So, sorry for the noise. I'll review it while taking this into consideration.

Then I found that the code is quite confusing as it is.

For the tables that don't have cost_delay and cost_limit specified
individually, at_vacuum_cost_limit and _delay store the system global
values determined by GUCs. wi_cost_delay, _limit and _limit_base store
the same values. As a result I concluded that
autovac_balance_cost() does exactly what Melanie's patch does, except
that nworkers_for_balance is not stored in shared memory.

I discovered that commit 1021bd6a89 brought in do_balance.

Since the mechanism is already complicated, just disable it for those
cases rather than trying to make it cope. There are undesirable

After reading this, I get why the code is so complex. It is a remnant
of when balancing was done with tables that had individually specified
cost parameters. And I found the following description in the doc.

https://www.postgresql.org/docs/devel/routine-vacuuming.html

When multiple workers are running, the autovacuum cost delay
parameters (see Section 20.4.4) are “balanced” among all the running
workers, so that the total I/O impact on the system is the same
regardless of the number of workers actually running. However, any
workers processing tables whose per-table
autovacuum_vacuum_cost_delay or autovacuum_vacuum_cost_limit storage
parameters have been set are not considered in the balancing
algorithm.

The initial balancing mechanism was brought in by e2a186b03c back in
2007. The balancing code has had that unnecessary complexity ever
since.

Since I can't think of a better idea than Melanie's proposal for
handling this code, I'll keep reviewing it with that approach in mind.

Thanks for doing this archaeology. I didn't know the history of dobalance
and hadn't looked into 1021bd6a89.
I was a bit confused by why dobalance was false even if only table
option cost delay is set and not table option cost limit.

I think we can retain this behavior for now, but it may be worth
re-examining in the future.

On Wed, Mar 29, 2023 at 4:35 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Wed, 29 Mar 2023 13:21:55 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

So, sorry for the noise. I'll review it while taking this into consideration.

0003:

It's not this patch's fault, but I don't like the fact that the
variables used for GUC, VacuumCostDelay and VacuumCostLimit, are
updated outside the GUC mechanism. Also I don't like the incorrect
sorting of variables, where some working variables are referred to as
GUC parameters or vice versa.

Although it's somewhat unrelated to the goal of this patch, I think we
should tidy up the code before proceeding. Shouldn't we separate
the actual parameters from the GUC base variables, and sort out
all the related variables? (something like the attached, on top of your
patch.)

So, I agree we should separate the parameters used in the code from the
GUC variables -- since there are multiple users with different needs
(autovac workers, parallel vac workers, and vacuum). However, I was
hesitant to tackle that here.

I'm not sure how these changes will impact extensions that rely on
these vacuum parameters and their direct relationship to the guc values.

In your patch, you didn't update the parameter with the guc value of
vacuum_cost_limit and vacuum_cost_delay, but were we to do so, we would
need to make sure it was updated every time after a config reload. This
isn't hard to do in the current code, but I'm not sure how we can ensure
that future callers of ProcessConfigFile() in vacuum code always update
these values afterward. Perhaps we could add some after_reload hook?
Which does seem like a larger project.

I have some comments on 0003 as-is.

+               tab->at_relopt_vac_cost_limit = avopts ?
+                       avopts->vacuum_cost_limit : 0;
+               tab->at_relopt_vac_cost_delay = avopts ?
+                       avopts->vacuum_cost_delay : -1;

The value is not used when do_balance is false, so I don't see a
specific reason for these variables to be different when avopts is
null.

Actually we need to set these to 0 and -1, because we set
av_relopt_cost_limit and av_relopt_cost_delay with them and those values
are checked regardless of wi_dobalance.

We need to do this because we want to use the correct value to override
VacuumCostLimit and VacuumCostDelay. wi_dobalance may be false because
we have a table option cost delay but we have no table option cost
limit. When we override VacuumCostDelay, we want to use the table option
value but when we override VacuumCostLimit, we want to use the regular
value. We need these initialized to values that will allow us to do
that.
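
Here is a small standalone sketch of the fallback order I have in mind,
before any balancing is applied. The struct and both function names are made
up for illustration; only the sentinel conventions (-1 for an unset delay, a
non-positive value for an unset limit) come from the discussion above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the values discussed above. */
typedef struct CostSettings
{
	double		relopt_cost_delay;	/* table option; -1 means not set */
	int			relopt_cost_limit;	/* table option; <= 0 means not set */
	double		autovac_cost_delay; /* autovacuum_vacuum_cost_delay; -1 = fall back */
	int			autovac_cost_limit; /* autovacuum_vacuum_cost_limit; <= 0 = fall back */
	double		vacuum_cost_delay;	/* vacuum_cost_delay */
	int			vacuum_cost_limit;	/* vacuum_cost_limit */
} CostSettings;

/* Table option first, then the autovacuum GUC, then the plain vacuum GUC. */
double
effective_cost_delay(const CostSettings *s)
{
	assert(s != NULL);
	if (s->relopt_cost_delay >= 0)
		return s->relopt_cost_delay;
	if (s->autovac_cost_delay >= 0)
		return s->autovac_cost_delay;
	return s->vacuum_cost_delay;
}

/* Same order for the limit; zero is not a valid limit, so it means "unset". */
int
effective_cost_limit(const CostSettings *s)
{
	assert(s != NULL);
	if (s->relopt_cost_limit > 0)
		return s->relopt_cost_limit;
	if (s->autovac_cost_limit > 0)
		return s->autovac_cost_limit;
	return s->vacuum_cost_limit;
}
```

For a table with only a cost delay table option, the delay comes from the
table option while the limit still falls through to the regular settings,
which is the case that requires the 0 and -1 initial values.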

+autovac_recalculate_workers_for_balance(void)
+{
+       dlist_iter      iter;
+       int                     orig_nworkers_for_balance;
+       int                     nworkers_for_balance = 0;
+
+       if (autovacuum_vac_cost_delay == 0 ||
+               (autovacuum_vac_cost_delay == -1 && VacuumCostDelay == 0))
+               return;
+       if (autovacuum_vac_cost_limit <= 0 && VacuumCostLimit <= 0)
+               return;
+

I'm not quite sure how these conditions relate to the need to count
workers that shares the global I/O cost.

Ah, this is a good point, we should still keep this number up-to-date
even if the costs are disabled at the time we are checking it in case
cost-based delays are re-enabled later before we recalculate this
number. I had this code originally because autovac_balance_cost() would
exit early if cost-based delays were disabled -- but this only worked
because they couldn't be re-enabled during vacuuming a table and
autovac_balance_cost() was called always in between vacuuming tables.

I've removed these lines.

And perhaps there is an argument for calling
autovac_recalculate_workers_for_balance() in vacuum_delay_point() after
reloading the config file...
I have not done so in attached version.

(Though I still believe this function might not be necessary.)

I don't see how we can do without this function. We need an up-to-date
count of the number of autovacuum workers vacuuming tables which do not
have vacuum cost-related table options.
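
To illustrate why the count is all we need, here is a simplified standalone
sketch of the two steps: counting the participating workers and dividing the
limit among them. WorkerSlot and both function names are hypothetical; the
real code walks av_runningWorkers under AutovacuumLock and uses atomics,
which this sketch leaves out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a worker slot. */
typedef struct WorkerSlot
{
	bool		has_proc;		/* worker is running (wi_proc != NULL) */
	bool		dobalance;		/* no cost-related table options in effect */
} WorkerSlot;

/* Count the running workers that take part in balancing. */
int
count_workers_for_balance(const WorkerSlot *workers, int nworkers)
{
	int			count = 0;

	for (int i = 0; i < nworkers; i++)
	{
		if (workers[i].has_proc && workers[i].dobalance)
			count++;
	}
	return count;
}

/* Split the total limit evenly, clamped to the range [1, total]. */
int
balanced_cost_limit(int total_cost_limit, int nworkers_for_balance)
{
	int			balanced;

	assert(total_cost_limit > 0);
	assert(nworkers_for_balance > 0);

	balanced = total_cost_limit / nworkers_for_balance;
	if (balanced > total_cost_limit)
		balanced = total_cost_limit;
	if (balanced < 1)
		balanced = 1;
	return balanced;
}
```

Each worker can redo the division locally whenever the shared count changes,
so nothing else has to live in shared memory.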

+       if (av_relopt_cost_limit > 0)
+               VacuumCostLimit = av_relopt_cost_limit;
+       else
+       {
+               av_base_cost_limit = autovacuum_vac_cost_limit > 0 ?
+                       autovacuum_vac_cost_limit : VacuumCostLimit;
+
+               AutoVacuumBalanceLimit();

I think each worker should use MyWorkerInfo->wi_dobalance to identify
whether the worker needs to use balanced cost values.

Ah, there is a bug here. I have fixed it by making wi_dobalance an
atomic flag so that we can check it before calling
AutoVacuumBalanceLimit() (without taking a lock).

I don't see other (non-test code) callers using atomic flags, so I can't
tell if we need to loop to ensure that pg_atomic_test_set_flag() returns
true.
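
My current reading (an assumption on my part, to be checked against the
comments in atomics.h) is that pg_atomic_test_set_flag() leaves the flag set
either way and its return value only reports whether this call was the one
that set it. If so, no loop would be needed just to set the flag. The sketch
below models that convention with C11 atomic_flag as a stand-in;
flag_test_set() is a made-up helper, not the PostgreSQL implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: set the flag and return true only if it was clear
 * before this call. C11's atomic_flag_test_and_set() returns the previous
 * value, so the result is inverted here.
 */
bool
flag_test_set(atomic_flag *flag)
{
	assert(flag != NULL);
	return !atomic_flag_test_and_set(flag);
}
```

In this model the flag is set after the call regardless of the return value,
so a caller that only wants "make sure it is set" can ignore the result.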

+void
+AutoVacuumBalanceLimit(void)

I'm not sure this function needs to be a separate function.

We need to call it more often than we can call AutoVacuumUpdateLimit(),
so the logic needs to be separate. Are you suggesting we inline the
logic in the two places it is needed?

v11 attached with updates mentioned above.

- Melanie

Attachments:

v11-0003-Autovacuum-refreshes-cost-based-delay-params-mor.patchtext/x-patch; charset=US-ASCII; name=v11-0003-Autovacuum-refreshes-cost-based-delay-params-mor.patchDownload
From a1db301c122641acd297a05d29bcb32bc7b769e2 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 25 Mar 2023 14:14:55 -0400
Subject: [PATCH v11 3/3] Autovacuum refreshes cost-based delay params more
 often

The previous commit allowed VACUUM to reload the config file more often
so that cost-based delay parameters could take effect while VACUUMing a
relation. Autovacuum, however, did not benefit from this change.

In order for autovacuum workers to safely update their own cost delay
and cost limit parameters without impacting performance, we had to
rethink when and how these values were accessed.

Previously, an autovacuum worker's wi_cost_limit was set only at the
beginning of vacuuming a table, after reloading the config file.
Therefore, at the time that autovac_balance_cost() is called, workers
vacuuming tables with no table options could still have different values
for their wi_cost_limit_base and wi_cost_delay.

Now that the cost parameters can be updated while vacuuming a table,
workers will (within some margin of error) have no reason to have
different values for cost limit and cost delay (in the absence of table
options). This removes the rationale for keeping cost limit and cost
delay in shared memory. Balancing the cost limit requires only the
number of active autovacuum workers vacuuming a table with no cost-based
table options.

Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
---
 src/backend/commands/vacuum.c       |  22 ++-
 src/backend/postmaster/autovacuum.c | 286 +++++++++++++++-------------
 src/include/postmaster/autovacuum.h |   2 +
 3 files changed, 175 insertions(+), 135 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7e3a8e404e..e9b683805a 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2241,10 +2241,10 @@ vacuum_delay_point(void)
 
 	/*
 	 * Reload the configuration file if requested. This allows changes to
-	 * vacuum_cost_limit and vacuum_cost_delay to take effect while a table is
-	 * being vacuumed or analyzed. Analyze should not reload configuration file
-	 * if it is in an outer transaction, as we currently only allow
-	 * configuration reload when in top-level statements.
+	 * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay to
+	 * take effect while a table is being vacuumed or analyzed. Analyze should
+	 * not reload configuration file if it is in an outer transaction, as we
+	 * currently only allow configuration reload when in top-level statements.
 	 */
 	if (ConfigReloadPending && !analyze_in_outer_xact)
 	{
@@ -2257,10 +2257,12 @@ vacuum_delay_point(void)
 		 * by reload.
 		 */
 		AutoVacuumUpdateDelay();
+		AutoVacuumUpdateLimit();
 
 		/*
 		 * If configuration changes are allowed to impact VacuumCostInactive,
-		 * make sure it is updated.
+		 * make sure it is updated. Autovacuum workers will have already done
+		 * this in AutoVacuumUpdateDelay()
 		 */
 		if (VacuumCostInactive == VACUUM_COST_INACTIVE_AND_LOCKED)
 			return;
@@ -2311,8 +2313,14 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		/* update balance values for workers */
-		AutoVacuumUpdateDelay();
+		/*
+		 * Balance and update limit values for autovacuum workers. We must
+		 * always do this in case the autovacuum launcher or another
+		 * autovacuum worker has recalculated the number of workers across
+		 * which we must balance the limit. This is done by the launcher when
+		 * launching a new worker and by workers before vacuuming each table.
+		 */
+		AutoVacuumBalanceLimit();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 585d28148c..9775764fc4 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -139,6 +139,10 @@ int			Log_autovacuum_min_duration = 600000;
 static bool am_autovacuum_launcher = false;
 static bool am_autovacuum_worker = false;
 
+static double av_relopt_cost_delay = -1;
+static int	av_relopt_cost_limit = 0;
+static int	av_base_cost_limit = 0;
+
 /* Flags set by signal handlers */
 static volatile sig_atomic_t got_SIGUSR2 = false;
 
@@ -189,8 +193,8 @@ typedef struct autovac_table
 {
 	Oid			at_relid;
 	VacuumParams at_params;
-	double		at_vacuum_cost_delay;
-	int			at_vacuum_cost_limit;
+	double		at_relopt_vac_cost_delay;
+	int			at_relopt_vac_cost_limit;
 	bool		at_dobalance;
 	bool		at_sharedrel;
 	char	   *at_relname;
@@ -209,7 +213,7 @@ typedef struct autovac_table
  * wi_sharedrel flag indicating whether table is marked relisshared
  * wi_proc		pointer to PGPROC of the running worker, NULL if not started
  * wi_launchtime Time at which this worker was launched
- * wi_cost_*	Vacuum cost-based delay parameters current in this worker
+ * wi_dobalance Whether this worker should be included in balance calculations
  *
  * All fields are protected by AutovacuumLock, except for wi_tableoid and
  * wi_sharedrel which are protected by AutovacuumScheduleLock (note these
@@ -223,11 +227,8 @@ typedef struct WorkerInfoData
 	Oid			wi_tableoid;
 	PGPROC	   *wi_proc;
 	TimestampTz wi_launchtime;
-	bool		wi_dobalance;
+	pg_atomic_flag wi_dobalance;
 	bool		wi_sharedrel;
-	double		wi_cost_delay;
-	int			wi_cost_limit;
-	int			wi_cost_limit_base;
 } WorkerInfoData;
 
 typedef struct WorkerInfoData *WorkerInfo;
@@ -273,6 +274,8 @@ typedef struct AutoVacuumWorkItem
  * av_startingWorker pointer to WorkerInfo currently being started (cleared by
  *					the worker itself as soon as it's up and running)
  * av_workItems		work item array
+ * av_nworkersForBalance the number of autovacuum workers to use when
+ * 					calculating the per worker cost limit
  *
  * This struct is protected by AutovacuumLock, except for av_signal and parts
  * of the worker list (see above).
@@ -286,6 +289,7 @@ typedef struct
 	dlist_head	av_runningWorkers;
 	WorkerInfo	av_startingWorker;
 	AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+	pg_atomic_uint32 av_nworkersForBalance;
 } AutoVacuumShmemStruct;
 
 static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -319,7 +323,7 @@ static void launch_worker(TimestampTz now);
 static List *get_database_list(void);
 static void rebuild_database_list(Oid newdb);
 static int	db_comparator(const void *a, const void *b);
-static void autovac_balance_cost(void);
+static void autovac_recalculate_workers_for_balance(void);
 
 static void do_autovacuum(void);
 static void FreeWorkerInfo(int code, Datum arg);
@@ -670,7 +674,7 @@ AutoVacLauncherMain(int argc, char *argv[])
 			{
 				LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 				AutoVacuumShmem->av_signal[AutoVacRebalance] = false;
-				autovac_balance_cost();
+				autovac_recalculate_workers_for_balance();
 				LWLockRelease(AutovacuumLock);
 			}
 
@@ -820,8 +824,8 @@ HandleAutoVacLauncherInterrupts(void)
 			AutoVacLauncherShutdown();
 
 		/* rebalance in case the default cost parameters changed */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-		autovac_balance_cost();
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
+		autovac_recalculate_workers_for_balance();
 		LWLockRelease(AutovacuumLock);
 
 		/* rebuild the list in case the naptime changed */
@@ -1755,10 +1759,7 @@ FreeWorkerInfo(int code, Datum arg)
 		MyWorkerInfo->wi_sharedrel = false;
 		MyWorkerInfo->wi_proc = NULL;
 		MyWorkerInfo->wi_launchtime = 0;
-		MyWorkerInfo->wi_dobalance = false;
-		MyWorkerInfo->wi_cost_delay = 0;
-		MyWorkerInfo->wi_cost_limit = 0;
-		MyWorkerInfo->wi_cost_limit_base = 0;
+		pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
 		dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 						&MyWorkerInfo->wi_links);
 		/* not mine anymore */
@@ -1773,100 +1774,140 @@ FreeWorkerInfo(int code, Datum arg)
 	}
 }
 
+
 /*
- * Update the cost-based delay parameters, so that multiple workers consume
- * each a fraction of the total available I/O.
+ * Update VacuumCostDelay with the correct value for an autovacuum worker,
+ * given the value of other relevant cost-based delay parameters. Autovacuum
+ * workers should call this after every config reload, in case VacuumCostDelay
+ * was overwritten.
  */
 void
 AutoVacuumUpdateDelay(void)
 {
-	if (MyWorkerInfo)
+	if (!am_autovacuum_worker)
+		return;
+
+	if (av_relopt_cost_delay >= 0)
+		VacuumCostDelay = av_relopt_cost_delay;
+	else if (autovacuum_vac_cost_delay >= 0)
+		VacuumCostDelay = autovacuum_vac_cost_delay;
+
+	/*
+	 * If configuration changes are allowed to impact VacuumCostInactive, make
+	 * sure it is updated.
+	 */
+	if (VacuumCostInactive == VACUUM_COST_INACTIVE_AND_LOCKED)
+		return;
+
+	if (VacuumCostDelay > 0)
+		VacuumCostInactive = VACUUM_COST_ACTIVE;
+	else
 	{
-		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
-		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
+		VacuumCostBalance = 0;
 	}
 }
 
+
 /*
- * autovac_balance_cost
- *		Recalculate the cost limit setting for each active worker.
- *
- * Caller must hold the AutovacuumLock in exclusive mode.
+ * This must be called directly after a config reload before using the value of
+ * VacuumCostLimit and before calling AutoVacuumBalanceLimit(), as it uses the
+ * value of VacuumCostLimit to determine what the base av_base_cost_limit
+ * should be. AutoVacuumBalanceLimit() will override the value of
+ * VacuumCostLimit, so calling it multiple times after a config reload is
+ * incorrect.
  */
-static void
-autovac_balance_cost(void)
+void
+AutoVacuumUpdateLimit(void)
 {
+	if (!am_autovacuum_worker)
+		return;
+
 	/*
-	 * The idea here is that we ration out I/O equally.  The amount of I/O
-	 * that a worker can consume is determined by cost_limit/cost_delay, so we
-	 * try to equalize those ratios rather than the raw limit settings.
-	 *
 	 * note: in cost_limit, zero also means use value from elsewhere, because
 	 * zero is not a valid value.
 	 */
-	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : VacuumCostLimit);
-	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : VacuumCostDelay);
-	double		cost_total;
-	double		cost_avail;
-	dlist_iter	iter;
-
-	/* not set? nothing to do */
-	if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
-		return;
-
-	/* calculate the total base cost limit of participating active workers */
-	cost_total = 0.0;
-	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
+	if (av_relopt_cost_limit > 0)
+		VacuumCostLimit = av_relopt_cost_limit;
+	else
 	{
-		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
+		av_base_cost_limit = autovacuum_vac_cost_limit > 0 ?
+			autovacuum_vac_cost_limit : VacuumCostLimit;
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-			cost_total +=
-				(double) worker->wi_cost_limit_base / worker->wi_cost_delay;
+		AutoVacuumBalanceLimit();
 	}
+}
 
-	/* there are no cost limits -- nothing to do */
-	if (cost_total <= 0)
+/*
+ * Update VacuumCostLimit with the correct value for an autovacuum worker,
+ * given the value of other relevant cost limit parameters and the number of
+ * workers across which the limit must be balanced. Autovacuum workers must
+ * call this regularly in case av_nworkers_for_balance has been updated by
+ * another worker or by the autovacuum launcher. After a config reload, they
+ * must call AutoVacuumUpdateLimit() which will call AutoVacuumBalanceLimit(),
+ * in case VacuumCostLimit was overwritten.
+ */
+void
+AutoVacuumBalanceLimit(void)
+{
+	int			nworkers_for_balance;
+	int			total_cost_limit;
+	int			balanced_cost_limit;
+
+	if (!MyWorkerInfo)
 		return;
 
-	/*
-	 * Adjust cost limit of each active worker to balance the total of cost
-	 * limit to autovacuum_vacuum_cost_limit.
-	 */
-	cost_avail = (double) vac_cost_limit / vac_cost_delay;
+	if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
+		return;
+
+	Assert(av_base_cost_limit > 0);
+
+	nworkers_for_balance = pg_atomic_read_u32(
+							&AutoVacuumShmem->av_nworkersForBalance);
+
+	/* There is at least 1 autovac worker (this worker). */
+	Assert(nworkers_for_balance > 0);
+
+	total_cost_limit = autovacuum_vac_cost_limit > 0 ?
+		autovacuum_vac_cost_limit : av_base_cost_limit;
+
+	balanced_cost_limit = total_cost_limit / nworkers_for_balance;
+
+	VacuumCostLimit = Max(Min(balanced_cost_limit, total_cost_limit), 1);
+}
+
+/*
+ * autovac_recalculate_workers_for_balance
+ *		Recalculate the number of workers to consider, given table options and
+ *		the current number of active workers.
+ *
+ * Caller must hold the AutovacuumLock in at least shared mode to access
+ * worker->wi_proc.
+ */
+static void
+autovac_recalculate_workers_for_balance(void)
+{
+	dlist_iter	iter;
+	int			orig_nworkers_for_balance;
+	int			nworkers_for_balance = 0;
+
+	orig_nworkers_for_balance =
+		pg_atomic_read_u32(&AutoVacuumShmem->av_nworkersForBalance);
+
 	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
 	{
 		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-		{
-			int			limit = (int)
-			(cost_avail * worker->wi_cost_limit_base / cost_total);
-
-			/*
-			 * We put a lower bound of 1 on the cost_limit, to avoid division-
-			 * by-zero in the vacuum code.  Also, in case of roundoff trouble
-			 * in these calculations, let's be sure we don't ever set
-			 * cost_limit to more than the base value.
-			 */
-			worker->wi_cost_limit = Max(Min(limit,
-											worker->wi_cost_limit_base),
-										1);
-		}
+		if (worker->wi_proc == NULL ||
+			pg_atomic_unlocked_test_flag(&worker->wi_dobalance))
+			continue;
 
-		if (worker->wi_proc != NULL)
-			elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
-				 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
-				 worker->wi_dobalance ? "yes" : "no",
-				 worker->wi_cost_limit, worker->wi_cost_limit_base,
-				 worker->wi_cost_delay);
+		nworkers_for_balance++;
 	}
+
+	if (nworkers_for_balance != orig_nworkers_for_balance)
+		pg_atomic_write_u32(&AutoVacuumShmem->av_nworkersForBalance,
+							nworkers_for_balance);
 }
 
 /*
@@ -2312,8 +2353,6 @@ do_autovacuum(void)
 		autovac_table *tab;
 		bool		isshared;
 		bool		skipit;
-		double		stdVacuumCostDelay;
-		int			stdVacuumCostLimit;
 		dlist_iter	iter;
 
 		CHECK_FOR_INTERRUPTS();
@@ -2417,30 +2456,29 @@ do_autovacuum(void)
 		}
 
 		/*
-		 * Remember the prevailing values of the vacuum cost GUCs.  We have to
-		 * restore these at the bottom of the loop, else we'll compute wrong
-		 * values in the next iteration of autovac_balance_cost().
+		 * Save the cost-related table options in global variables for
+		 * reference when updating VacuumCostLimit and VacuumCostDelay during
+		 * vacuuming this table.
 		 */
-		stdVacuumCostDelay = VacuumCostDelay;
-		stdVacuumCostLimit = VacuumCostLimit;
+		av_relopt_cost_limit = tab->at_relopt_vac_cost_limit;
+		av_relopt_cost_delay = tab->at_relopt_vac_cost_delay;
 
-		/* Must hold AutovacuumLock while mucking with cost balance info */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-
-		/* advertise my cost delay parameters for the balancing algorithm */
-		MyWorkerInfo->wi_dobalance = tab->at_dobalance;
-		MyWorkerInfo->wi_cost_delay = tab->at_vacuum_cost_delay;
-		MyWorkerInfo->wi_cost_limit = tab->at_vacuum_cost_limit;
-		MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
+		if (tab->at_dobalance)
+			pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
+		else
+			pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
 
-		/* do a balance */
-		autovac_balance_cost();
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
+		autovac_recalculate_workers_for_balance();
+		LWLockRelease(AutovacuumLock);
 
-		/* set the active cost parameters from the result of that */
+		/*
+		 * We wait until this point to update cost delay and cost limit
+		 * values, even though we reloaded the configuration file above, so
+		 * that we can take into account the cost-related table options.
+		 */
 		AutoVacuumUpdateDelay();
-
-		/* done */
-		LWLockRelease(AutovacuumLock);
+		AutoVacuumUpdateLimit();
 
 		/* clean up memory before each iteration */
 		MemoryContextResetAndDeleteChildren(PortalContext);
@@ -2525,19 +2563,15 @@ deleted:
 
 		/*
 		 * Remove my info from shared memory.  We could, but intentionally
-		 * don't, clear wi_cost_limit and friends --- this is on the
-		 * assumption that we probably have more to do with similar cost
-		 * settings, so we don't want to give up our share of I/O for a very
-		 * short interval and thereby thrash the global balance.
+		 * don't, unset wi_dobalance on the assumption that we are more likely
+		 * than not to vacuum a table with no table options next, so we don't
+		 * want to give up our share of I/O for a very short interval and
+		 * thereby thrash the global balance.
 		 */
 		LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
 		MyWorkerInfo->wi_tableoid = InvalidOid;
 		MyWorkerInfo->wi_sharedrel = false;
 		LWLockRelease(AutovacuumScheduleLock);
-
-		/* restore vacuum cost GUCs for the next iteration */
-		VacuumCostDelay = stdVacuumCostDelay;
-		VacuumCostLimit = stdVacuumCostLimit;
 	}
 
 	/*
@@ -2569,6 +2603,8 @@ deleted:
 		{
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			AutoVacuumUpdateDelay();
+			AutoVacuumUpdateLimit();
 		}
 
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
@@ -2804,8 +2840,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		int			freeze_table_age;
 		int			multixact_freeze_min_age;
 		int			multixact_freeze_table_age;
-		int			vac_cost_limit;
-		double		vac_cost_delay;
 		int			log_min_duration;
 
 		/*
@@ -2815,20 +2849,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 * defaults, autovacuum's own first and plain vacuum second.
 		 */
 
-		/* -1 in autovac setting means use plain vacuum_cost_delay */
-		vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
-			? avopts->vacuum_cost_delay
-			: (autovacuum_vac_cost_delay >= 0)
-			? autovacuum_vac_cost_delay
-			: VacuumCostDelay;
-
-		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
-		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
-			? avopts->vacuum_cost_limit
-			: (autovacuum_vac_cost_limit > 0)
-			? autovacuum_vac_cost_limit
-			: VacuumCostLimit;
-
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
 			? avopts->log_min_duration
@@ -2884,8 +2904,10 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age;
 		tab->at_params.is_wraparound = wraparound;
 		tab->at_params.log_min_duration = log_min_duration;
-		tab->at_vacuum_cost_limit = vac_cost_limit;
-		tab->at_vacuum_cost_delay = vac_cost_delay;
+		tab->at_relopt_vac_cost_limit = avopts ?
+			avopts->vacuum_cost_limit : 0;
+		tab->at_relopt_vac_cost_delay = avopts ?
+			avopts->vacuum_cost_delay : -1;
 		tab->at_relname = NULL;
 		tab->at_nspname = NULL;
 		tab->at_datname = NULL;
@@ -3377,10 +3399,18 @@ AutoVacuumShmemInit(void)
 		worker = (WorkerInfo) ((char *) AutoVacuumShmem +
 							   MAXALIGN(sizeof(AutoVacuumShmemStruct)));
 
+
 		/* initialize the WorkerInfo free list */
 		for (i = 0; i < autovacuum_max_workers; i++)
+		{
 			dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 							&worker[i].wi_links);
+
+			pg_atomic_init_flag(&worker[i].wi_dobalance);
+		}
+
+		pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
+
 	}
 	else
 		Assert(found);
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index c140371b51..80bdfb2cc0 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,7 +64,9 @@ extern int	StartAutoVacWorker(void);
 extern void AutoVacWorkerFailed(void);
 
 /* autovacuum cost-delay balancer */
+extern void AutoVacuumBalanceLimit(void);
 extern void AutoVacuumUpdateDelay(void);
+extern void AutoVacuumUpdateLimit(void);
 
 #ifdef EXEC_BACKEND
 extern void AutoVacLauncherMain(int argc, char *argv[]) pg_attribute_noreturn();
-- 
2.37.2

v11-0001-Make-VacuumCostActive-failsafe-aware.patch (text/x-patch; charset=US-ASCII)
From a46f13517d165479ec518b46fe98591aa75f35d6 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 25 Mar 2023 12:05:18 -0400
Subject: [PATCH v11 1/3] Make VacuumCostActive failsafe-aware

While vacuuming a table in failsafe mode, VacuumCostActive should not be
re-enabled. This currently isn't a problem because vacuum cost
parameters are only refreshed in between vacuuming tables and failsafe
status is reset for every table. In preparation for allowing vacuum cost
parameters to be updated more frequently, make vacuum cost status more
expressive.

VacuumCostActive is now VacuumCostInactive, as it can only be active in
one way but it can be inactive in two ways. If performing a failsafe
vacuum, the vacuum cost status cannot be enabled and is effectively
"locked". If performing a non-failsafe vacuum, the vacuum cost status
may be active or inactive. To express this, VacuumCostInactive can be
one of three statuses: VACUUM_COST_INACTIVE_AND_LOCKED,
VACUUM_COST_INACTIVE_AND_UNLOCKED, and VACUUM_COST_ACTIVE.

VacuumCostInactive is defined as an integer because we do not want
non-vacuum code concerning itself with the distinction between the three
statuses -- only with whether or not VacuumCostInactive == 0.
---
 src/backend/access/heap/vacuumlazy.c  |  2 +-
 src/backend/commands/vacuum.c         | 24 +++++++++++++++++++++---
 src/backend/commands/vacuumparallel.c |  8 ++++++--
 src/backend/storage/buffer/bufmgr.c   |  8 ++++----
 src/backend/utils/init/globals.c      |  2 +-
 src/include/commands/vacuum.h         |  8 ++++++++
 src/include/miscadmin.h               |  3 +--
 7 files changed, 42 insertions(+), 13 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8f14cf85f3..040a4e931b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2637,7 +2637,7 @@ lazy_check_wraparound_failsafe(LVRelState *vacrel)
 						 "You might also need to consider other ways for VACUUM to keep up with the allocation of transaction IDs.")));
 
 		/* Stop applying cost limits from this point on */
-		VacuumCostActive = false;
+		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_LOCKED;
 		VacuumCostBalance = 0;
 
 		return true;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index c54360a6a0..eb126f2247 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -491,7 +491,6 @@ vacuum(List *relations, VacuumParams *params,
 		ListCell   *cur;
 
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
@@ -507,6 +506,24 @@ vacuum(List *relations, VacuumParams *params,
 		{
 			VacuumRelation *vrel = lfirst_node(VacuumRelation, cur);
 
+			/*
+			 * failsafe_active is reset per relation, so we must be sure that
+			 * VacuumCostInactive is set to either VACUUM_COST_ACTIVE or
+			 * VACUUM_COST_INACTIVE_AND_UNLOCKED in between vacuuming
+			 * relations.
+			 */
+			VacuumCostInactive = VacuumCostDelay > 0 ? VACUUM_COST_ACTIVE :
+				VACUUM_COST_INACTIVE_AND_UNLOCKED;
+
+			/*
+			 * We should not have transitioned VacuumCostInactive from
+			 * VACUUM_COST_ACTIVE to VACUUM_COST_INACTIVE_AND_UNLOCKED above,
+			 * as that should have happened when we changed the value of
+			 * VacuumCostDelay.
+			 */
+			Assert(VacuumCostInactive == VACUUM_COST_ACTIVE ||
+				   VacuumCostBalance == 0);
+
 			if (params->options & VACOPT_VACUUM)
 			{
 				if (!vacuum_rel(vrel->oid, vrel->relation, params, false))
@@ -549,7 +566,8 @@ vacuum(List *relations, VacuumParams *params,
 	PG_FINALLY();
 	{
 		in_vacuum = false;
-		VacuumCostActive = false;
+		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
+		VacuumCostBalance = 0;
 	}
 	PG_END_TRY();
 
@@ -2215,7 +2233,7 @@ vacuum_delay_point(void)
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	if (!VacuumCostActive || InterruptPending)
+	if (VacuumCostInactive || InterruptPending)
 		return;
 
 	/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index bcd40c80a1..266bf6bb4c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -989,8 +989,12 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												 PARALLEL_VACUUM_KEY_DEAD_ITEMS,
 												 false);
 
-	/* Set cost-based vacuum delay */
-	VacuumCostActive = (VacuumCostDelay > 0);
+	/*
+	 * Set cost-based vacuum delay. Parallel vacuum workers will not execute
+	 * failsafe VACUUM.
+	 */
+	VacuumCostInactive = VacuumCostDelay > 0 ? VACUUM_COST_ACTIVE :
+		VACUUM_COST_INACTIVE_AND_UNLOCKED;
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index fe029d2ea6..495ee8f815 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -893,7 +893,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 			*hit = true;
 			VacuumPageHit++;
 
-			if (VacuumCostActive)
+			if (!VacuumCostInactive)
 				VacuumCostBalance += VacuumCostPageHit;
 
 			TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
@@ -1098,7 +1098,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 	}
 
 	VacuumPageMiss++;
-	if (VacuumCostActive)
+	if (!VacuumCostInactive)
 		VacuumCostBalance += VacuumCostPageMiss;
 
 	TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
@@ -1672,7 +1672,7 @@ MarkBufferDirty(Buffer buffer)
 	{
 		VacuumPageDirty++;
 		pgBufferUsage.shared_blks_dirtied++;
-		if (VacuumCostActive)
+		if (!VacuumCostInactive)
 			VacuumCostBalance += VacuumCostPageDirty;
 	}
 }
@@ -4189,7 +4189,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
 		{
 			VacuumPageDirty++;
 			pgBufferUsage.shared_blks_dirtied++;
-			if (VacuumCostActive)
+			if (!VacuumCostInactive)
 				VacuumCostBalance += VacuumCostPageDirty;
 		}
 	}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 1b1d814254..608ebb9182 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -150,4 +150,4 @@ int64		VacuumPageMiss = 0;
 int64		VacuumPageDirty = 0;
 
 int			VacuumCostBalance = 0;	/* working state for vacuum */
-bool		VacuumCostActive = false;
+int			VacuumCostInactive = 1;
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index bdfd96cfec..5c3e250b06 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -302,6 +302,14 @@ extern PGDLLIMPORT int vacuum_failsafe_age;
 extern PGDLLIMPORT int vacuum_multixact_failsafe_age;
 
 /* Variables for cost-based parallel vacuum */
+
+typedef enum VacuumCostStatus
+{
+	VACUUM_COST_INACTIVE_AND_LOCKED = -1,
+	VACUUM_COST_ACTIVE = 0,
+	VACUUM_COST_INACTIVE_AND_UNLOCKED = 1,
+}			VacuumCostStatus;
+
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 06a86f9ac1..33e22733ae 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -274,8 +274,7 @@ extern PGDLLIMPORT int64 VacuumPageMiss;
 extern PGDLLIMPORT int64 VacuumPageDirty;
 
 extern PGDLLIMPORT int VacuumCostBalance;
-extern PGDLLIMPORT bool VacuumCostActive;
-
+extern PGDLLIMPORT int VacuumCostInactive;
 
 /* in tcop/postgres.c */
 
-- 
2.37.2

v11-0002-VACUUM-reloads-config-file-more-often.patch (text/x-patch; charset=US-ASCII)
From 3f4a39ff366f491ecdf2361ae11704bc98175a30 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Mon, 27 Mar 2023 13:33:19 -0400
Subject: [PATCH v11 2/3] VACUUM reloads config file more often

Previously, VACUUM would not reload the configuration file. So, changes
to cost-based delay parameters could only take effect on the next
invocation of VACUUM.

Check if a reload is pending roughly once per block now, when checking
if we need to delay.

Note that autovacuum is unaffected by this change. Autovacuum workers
overwrite the value of VacuumCostLimit and VacuumCostDelay with their
own WorkerInfo->wi_cost_limit and wi_cost_delay. Writing to their
wi_cost_delay more often makes reading wi_cost_delay without a lock to
update VacuumCostDelay an even worse idea.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_buP5wzsho3qNw5o9_R0pF69FRM5hgCmr-mvXmGXwdA7A%40mail.gmail.com#5e6771d4cdca4db6efc2acec2dce0bc7
---
 src/backend/commands/vacuum.c | 61 ++++++++++++++++++++++++++++++-----
 1 file changed, 53 insertions(+), 8 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index eb126f2247..7e3a8e404e 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -48,6 +48,7 @@
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/interrupt.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/pmsignal.h"
@@ -76,6 +77,7 @@ int			vacuum_multixact_failsafe_age;
 /* A few variables that don't seem worth passing around as parameters */
 static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
+static bool analyze_in_outer_xact = false;
 
 
 /*
@@ -314,8 +316,7 @@ vacuum(List *relations, VacuumParams *params,
 	static bool in_vacuum = false;
 
 	const char *stmttype;
-	volatile bool in_outer_xact,
-				use_own_xacts;
+	volatile bool use_own_xacts;
 
 	Assert(params != NULL);
 
@@ -332,10 +333,10 @@ vacuum(List *relations, VacuumParams *params,
 	if (params->options & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
-		in_outer_xact = false;
+		analyze_in_outer_xact = false;
 	}
 	else
-		in_outer_xact = IsInTransactionBlock(isTopLevel);
+		analyze_in_outer_xact = IsInTransactionBlock(isTopLevel);
 
 	/*
 	 * Due to static variables vac_context, anl_context and vac_strategy,
@@ -457,7 +458,7 @@ vacuum(List *relations, VacuumParams *params,
 		Assert(params->options & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
-		else if (in_outer_xact)
+		else if (analyze_in_outer_xact)
 			use_own_xacts = false;
 		else if (list_length(relations) > 1)
 			use_own_xacts = true;
@@ -475,7 +476,7 @@ vacuum(List *relations, VacuumParams *params,
 	 */
 	if (use_own_xacts)
 	{
-		Assert(!in_outer_xact);
+		Assert(!analyze_in_outer_xact);
 
 		/* ActiveSnapshot is not set by autovacuum */
 		if (ActiveSnapshotSet())
@@ -544,7 +545,7 @@ vacuum(List *relations, VacuumParams *params,
 				}
 
 				analyze_rel(vrel->oid, vrel->relation, params,
-							vrel->va_cols, in_outer_xact, vac_strategy);
+							vrel->va_cols, analyze_in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
 				{
@@ -568,6 +569,7 @@ vacuum(List *relations, VacuumParams *params,
 		in_vacuum = false;
 		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
 		VacuumCostBalance = 0;
+		analyze_in_outer_xact = false;
 	}
 	PG_END_TRY();
 
@@ -2233,7 +2235,50 @@ vacuum_delay_point(void)
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	if (VacuumCostInactive || InterruptPending)
+	if (InterruptPending ||
+		(VacuumCostInactive && !ConfigReloadPending))
+		return;
+
+	/*
+	 * Reload the configuration file if requested. This allows changes to
+	 * vacuum_cost_limit and vacuum_cost_delay to take effect while a table is
+	 * being vacuumed or analyzed. Analyze should not reload configuration file
+	 * if it is in an outer transaction, as we currently only allow
+	 * configuration reload when in top-level statements.
+	 */
+	if (ConfigReloadPending && !analyze_in_outer_xact)
+	{
+		ConfigReloadPending = false;
+		ProcessConfigFile(PGC_SIGHUP);
+
+		/*
+		 * Autovacuum workers must restore the correct values of
+		 * VacuumCostLimit and VacuumCostDelay in case they were overwritten
+		 * by reload.
+		 */
+		AutoVacuumUpdateDelay();
+
+		/*
+		 * If configuration changes are allowed to impact VacuumCostInactive,
+		 * make sure it is updated.
+		 */
+		if (VacuumCostInactive == VACUUM_COST_INACTIVE_AND_LOCKED)
+			return;
+
+		if (VacuumCostDelay > 0)
+			VacuumCostInactive = VACUUM_COST_ACTIVE;
+		else
+		{
+			VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
+			VacuumCostBalance = 0;
+		}
+	}
+
+	/*
+	 * If we disabled cost-based delays after reloading the config file,
+	 * return.
+	 */
+	if (VacuumCostInactive)
 		return;
 
 	/*
-- 
2.37.2

#39Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Melanie Plageman (#38)
Re: Should vacuum process config file reload more often

Hi,

Thank you for updating the patches.

On Thu, Mar 30, 2023 at 5:01 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Thanks for the detailed review!

On Tue, Mar 28, 2023 at 11:09 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Tue, 28 Mar 2023 20:35:28 -0400, Melanie Plageman <melanieplageman@gmail.com> wrote in

On Tue, Mar 28, 2023 at 4:21 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:

At Mon, 27 Mar 2023 14:12:03 -0400, Melanie Plageman <melanieplageman@gmail.com> wrote in

0002:

I felt a bit uneasy on this. It seems somewhat complex (and makes the
succeeding patches complex),

Even if we introduced a second global variable to indicate that failsafe
mode has been engaged, we would still require the additional checks
of VacuumCostInactive.

has confusing names,

I would be happy to rename the values of the enum to make them less
confusing. Are you thinking "force" instead of "locked"?
maybe:
VACUUM_COST_FORCE_INACTIVE and
VACUUM_COST_INACTIVE
?

and doesn't seem self-contained.

By changing the variable from VacuumCostActive to VacuumCostInactive, I
have kept all non-vacuum code from having to distinguish between it
being inactive due to failsafe mode or due to user settings.

My concern is that VacuumCostActive is logic-inverted and turned into
a ternary variable in a subtle way. The expression
"!VacuumCostInactive" is quite confusing. (I sometimes feel the same
way about "!XLogRecPtrIsInvalid(lsn)", and I believe most people write
it with another macro like "lsn != InvalidXLogRecPtr"). Additionally,
the constraint in this patch will be implemented as open code. So I
wanted to suggest something like the attached. The main idea is to use
a wrapper function to enforce the restriction, and by doing so, we
eliminated the need to make the variable into a ternary without a good
reason.

So, the rationale for making it a ternary is that the variable is the
combination of two pieces of information which has only 3 valid
states:
failsafe inactive + cost active = cost active
failsafe inactive + cost inactive = cost inactive
failsafe active + cost inactive = cost inactive and locked
the fourth is invalid
failsafe active + cost active = invalid
That is harder to enforce with two variables.
Also, the two pieces of information are not meaningful individually.
So, I thought it made sense to make a single variable.

Your suggested patch introduces an additional variable which shadows
LVRelState->failsafe_active but doesn't actually get set/reset at all of
the correct places. If we did introduce a second global variable, I
don't think we should also keep LVRelState->failsafe_active, as keeping
them in sync will be difficult.

As for the double negative (!VacuumCostInactive), I agree that it is not
ideal, however, if we use a ternary and keep VacuumCostActive, there is
no way for non-vacuum code to treat it as a boolean.
With the ternary VacuumCostInactive, only vacuum code has to know about
the distinction between inactive+failsafe active and inactive+failsafe
inactive.

As another idea, why don't we use macros for that? For example,
suppose VacuumCostStatus is like:

typedef enum VacuumCostStatus
{
VACUUM_COST_INACTIVE_LOCKED = 0,
VACUUM_COST_INACTIVE,
VACUUM_COST_ACTIVE,
} VacuumCostStatus;
VacuumCostStatus VacuumCost;

non-vacuum code can use the following macros:

#define VacuumCostActive() (VacuumCost == VACUUM_COST_ACTIVE)
#define VacuumCostInactive() (VacuumCost <= VACUUM_COST_INACTIVE) // or we can use !VacuumCostActive() instead.

Or is there any reason why we need to keep VacuumCostActive and treat
it as a boolean?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
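
The enum-plus-macros idea proposed in this message can be sketched as a small standalone C fragment. This is only an illustration under stated assumptions, not code from any posted patch: the enum and macro names follow the proposal above, while charge_page_cost() is a hypothetical helper standing in for the buffer manager call sites that add to VacuumCostBalance.

```c
#include <stdbool.h>

/*
 * Three-state cost status, as proposed in the message above. The ordering
 * matters: both inactive states sort at or below VACUUM_COST_INACTIVE.
 */
typedef enum VacuumCostStatus
{
	VACUUM_COST_INACTIVE_LOCKED = 0,	/* failsafe engaged; must stay off */
	VACUUM_COST_INACTIVE,				/* off because the cost delay is 0 */
	VACUUM_COST_ACTIVE					/* cost-based delay is in effect */
} VacuumCostStatus;

static VacuumCostStatus VacuumCost = VACUUM_COST_INACTIVE;

/* Boolean views for code that does not care why the delay is off */
#define VacuumCostActive()   (VacuumCost == VACUUM_COST_ACTIVE)
#define VacuumCostInactive() (VacuumCost <= VACUUM_COST_INACTIVE)

/*
 * Hypothetical stand-in for a non-vacuum call site: charge page_cost to the
 * running balance only while cost-based delay is active.
 */
static int
charge_page_cost(int balance, int page_cost)
{
	return VacuumCostActive() ? balance + page_cost : balance;
}
```

With this shape, non-vacuum code reads only the two macros, and only vacuum code assigns to VacuumCost directly.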

#40Daniel Gustafsson
daniel@yesql.se
In reply to: Masahiko Sawada (#39)
Re: Should vacuum process config file reload more often

On 30 Mar 2023, at 04:57, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

As another idea, why don't we use macros for that? For example,
suppose VacuumCostStatus is like:

typedef enum VacuumCostStatus
{
VACUUM_COST_INACTIVE_LOCKED = 0,
VACUUM_COST_INACTIVE,
VACUUM_COST_ACTIVE,
} VacuumCostStatus;
VacuumCostStatus VacuumCost;

non-vacuum code can use the following macros:

#define VacuumCostActive() (VacuumCost == VACUUM_COST_ACTIVE)
#define VacuumCostInactive() (VacuumCost <= VACUUM_COST_INACTIVE) // or we can use !VacuumCostActive() instead.

I'm in favor of something along these lines. A variable with a name that
implies a boolean value (active/inactive) but actually contains a tri-value is
easily misunderstood. A VacuumCostState tri-value variable (or a better name)
with a set of convenient macros for extracting the boolean active/inactive that
most of the code needs to be concerned with would make for more readable code I
think.

--
Daniel Gustafsson

#41Melanie Plageman
melanieplageman@gmail.com
In reply to: Daniel Gustafsson (#40)
Re: Should vacuum process config file reload more often

On Thu, Mar 30, 2023 at 3:26 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 30 Mar 2023, at 04:57, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

As another idea, why don't we use macros for that? For example,
suppose VacuumCostStatus is like:

typedef enum VacuumCostStatus
{
VACUUM_COST_INACTIVE_LOCKED = 0,
VACUUM_COST_INACTIVE,
VACUUM_COST_ACTIVE,
} VacuumCostStatus;
VacuumCostStatus VacuumCost;

non-vacuum code can use the following macros:

#define VacuumCostActive() (VacuumCost == VACUUM_COST_ACTIVE)
#define VacuumCostInactive() (VacuumCost <= VACUUM_COST_INACTIVE) // or we can use !VacuumCostActive() instead.

I'm in favor of something along these lines. A variable with a name that
implies a boolean value (active/inactive) but actually contains a tri-value is
easily misunderstood. A VacuumCostState tri-value variable (or a better name)
with a set of convenient macros for extracting the boolean active/inactive that
most of the code needs to be concerned with would make for more readable code I
think.

The macros are very error-prone. I was just implementing this idea and
mistakenly tried to set the macro instead of the variable in multiple
places. Avoiding this involves another set of macros, and, in the end, I
think the complexity is much worse. Given the reviewers' uniform dislike
of VacuumCostInactive, I favor going back to two variables
(VacuumCostActive + VacuumFailsafeActive) and moving
LVRelState->failsafe_active to the global VacuumFailsafeActive.

I will reimplement this in the next version.
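
The two-variable approach described above (VacuumCostActive plus a global VacuumFailsafeActive) might look roughly like the following. This is a minimal sketch under stated assumptions, not the reimplementation itself: the variable names come from this thread, but update_cost_active() and enter_failsafe() are hypothetical helper names, and the cost delay is passed in as a plain argument rather than read from a GUC.

```c
#include <stdbool.h>

/* Names taken from the thread; defined locally so the sketch stands alone */
static bool VacuumCostActive = false;
static bool VacuumFailsafeActive = false;
static int	VacuumCostBalance = 0;

/*
 * Hypothetical helper: recompute VacuumCostActive from a (possibly reloaded)
 * cost delay. Failsafe mode wins, so a reload can never re-enable delays.
 */
static void
update_cost_active(double cost_delay)
{
	if (VacuumFailsafeActive || cost_delay <= 0)
	{
		VacuumCostActive = false;
		VacuumCostBalance = 0;
		return;
	}

	VacuumCostActive = true;
}

/* Hypothetical helper: what triggering the wraparound failsafe would do */
static void
enter_failsafe(void)
{
	VacuumFailsafeActive = true;
	VacuumCostActive = false;
	VacuumCostBalance = 0;
}
```

The invalid fourth state (failsafe active and cost active) is ruled out here by funnelling every update through one function, rather than by the type of the variable.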

On the subject of globals, the next version will implement
Horiguchi-san's proposal to separate GUC variables from the globals used
in the code (quoted below). It should hopefully reduce the complexity of
this patchset.

Although it's somewhat unrelated to the goal of this patch, I think we
should tidy up the code before proceeding. Shouldn't we separate
the actual parameters from the GUC base variables, and sort out
all the related variables? (something like the attached, on top of your
patch.)

- Melanie

#42Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#41)
4 attachment(s)
Re: Should vacuum process config file reload more often

On Fri, Mar 31, 2023 at 10:31 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Thu, Mar 30, 2023 at 3:26 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 30 Mar 2023, at 04:57, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

As another idea, why don't we use macros for that? For example,
suppose VacuumCostStatus is like:

typedef enum VacuumCostStatus
{
VACUUM_COST_INACTIVE_LOCKED = 0,
VACUUM_COST_INACTIVE,
VACUUM_COST_ACTIVE,
} VacuumCostStatus;
VacuumCostStatus VacuumCost;

non-vacuum code can use the following macros:

#define VacuumCostActive() (VacuumCost == VACUUM_COST_ACTIVE)
#define VacuumCostInactive() (VacuumCost <= VACUUM_COST_INACTIVE) // or we can use !VacuumCostActive() instead.

I'm in favor of something along these lines. A variable with a name that
implies a boolean value (active/inactive) but actually contains a tri-value is
easily misunderstood. A VacuumCostState tri-value variable (or a better name)
with a set of convenient macros for extracting the boolean active/inactive that
most of the code needs to be concerned with would make for more readable code I
think.

The macros are very error-prone. I was just implementing this idea and
mistakenly tried to set the macro instead of the variable in multiple
places. Avoiding this involves another set of macros, and, in the end, I
think the complexity is much worse. Given the reviewers' uniform dislike
of VacuumCostInactive, I favor going back to two variables
(VacuumCostActive + VacuumFailsafeActive) and moving
LVRelState->failsafe_active to the global VacuumFailsafeActive.

I will reimplement this in the next version.

On the subject of globals, the next version will implement
Horiguchi-san's proposal to separate GUC variables from the globals used
in the code (quoted below). It should hopefully reduce the complexity of
this patchset.

Although it's somewhat unrelated to the goal of this patch, I think we
should tidy up the code before proceeding. Shouldn't we separate
the actual parameters from the GUC base variables, and sort out
all the related variables? (something like the attached, on top of your
patch.)

Attached is v12. It has a number of updates, including a commit to
separate VacuumCostLimit and VacuumCostDelay from the gucs
vacuum_cost_limit and vacuum_cost_delay, and a return to
VacuumCostActive.
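
For readers skimming the attachments, the control flow that v12 gives vacuum_delay_point() can be approximated by the standalone sketch below. It is a simplification under stated assumptions: reload_config_stub() and update_costs_stub() are hypothetical stand-ins for ProcessConfigFile(PGC_SIGHUP) and VacuumUpdateCosts(), config_file_cost_delay is an invented variable modelling the value currently in the configuration file, and the function returns a bool instead of going on to the sleep logic.

```c
#include <stdbool.h>

static volatile bool ConfigReloadPending = false;
static bool VacuumCostActive = false;
static bool analyze_in_outer_xact = false;

static double vacuum_cost_delay = 0;		/* models the GUC */
static double VacuumCostDelay = 0;			/* working copy used by vacuum */
static double config_file_cost_delay = 0;	/* invented: value "on disk" */

/* Stand-in for ProcessConfigFile(PGC_SIGHUP) */
static void
reload_config_stub(void)
{
	vacuum_cost_delay = config_file_cost_delay;
}

/* Stand-in for VacuumUpdateCosts(): refresh the working copy and the flag */
static void
update_costs_stub(void)
{
	VacuumCostDelay = vacuum_cost_delay;
	VacuumCostActive = (VacuumCostDelay > 0);
}

/* Returns true if the caller would proceed to the cost-balance sleep check */
static bool
delay_point_sketch(bool interrupt_pending)
{
	/* Nothing to do unless delays are active or a reload is waiting */
	if (interrupt_pending || (!VacuumCostActive && !ConfigReloadPending))
		return false;

	/* ANALYZE inside a user transaction must not reload the config file */
	if (ConfigReloadPending && !analyze_in_outer_xact)
	{
		ConfigReloadPending = false;
		reload_config_stub();
		update_costs_stub();
	}

	/* The reload may have switched cost-based delays off */
	return VacuumCostActive;
}
```

Note that the first check has to let a pending reload through even when delays are currently off; otherwise a vacuum started with a cost delay of 0 could never pick up a newly configured non-zero delay.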

- Melanie

Attachments:

v12-0001-Make-vacuum-s-failsafe_active-a-global.patch (text/x-patch; charset=US-ASCII)
From 106f5db65846e0497945b6171bdc29f5727aadc3 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Fri, 31 Mar 2023 10:38:39 -0400
Subject: [PATCH v12 1/4] Make vacuum's failsafe_active a global

While vacuuming a table in failsafe mode, VacuumCostActive should not be
re-enabled. This currently isn't a problem because vacuum cost
parameters are only refreshed in between vacuuming tables and failsafe
status is reset for every table. In preparation for allowing vacuum cost
parameters to be updated more frequently, elevate
LVRelState->failsafe_active to a global, VacuumFailsafeActive, which
will be checked when determining whether or not to re-enable vacuum
cost-related delays.
---
 src/backend/access/heap/vacuumlazy.c  | 16 +++++++---------
 src/backend/commands/vacuum.c         |  1 +
 src/backend/commands/vacuumparallel.c |  1 +
 src/include/commands/vacuum.h         |  1 +
 4 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8f14cf85f3..f4755bcc4b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -153,8 +153,6 @@ typedef struct LVRelState
 	bool		aggressive;
 	/* Use visibility map to skip? (disabled by DISABLE_PAGE_SKIPPING) */
 	bool		skipwithvm;
-	/* Wraparound failsafe has been triggered? */
-	bool		failsafe_active;
 	/* Consider index vacuuming bypass optimization? */
 	bool		consider_bypass_optimization;
 
@@ -391,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
 	Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
 		   params->truncate != VACOPTVALUE_AUTO);
-	vacrel->failsafe_active = false;
+	VacuumFailsafeActive = false;
 	vacrel->consider_bypass_optimization = true;
 	vacrel->do_index_vacuuming = true;
 	vacrel->do_index_cleanup = true;
@@ -709,7 +707,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 			}
 			else
 			{
-				if (!vacrel->failsafe_active)
+				if (!VacuumFailsafeActive)
 					appendStringInfoString(&buf, _("index scan bypassed: "));
 				else
 					appendStringInfoString(&buf, _("index scan bypassed by failsafe: "));
@@ -2293,7 +2291,7 @@ lazy_vacuum(LVRelState *vacrel)
 		 * vacuuming or heap vacuuming.  This VACUUM operation won't end up
 		 * back here again.
 		 */
-		Assert(vacrel->failsafe_active);
+		Assert(VacuumFailsafeActive);
 	}
 
 	/*
@@ -2374,7 +2372,7 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	 */
 	Assert(vacrel->num_index_scans > 0 ||
 		   vacrel->dead_items->num_items == vacrel->lpdead_items);
-	Assert(allindexes || vacrel->failsafe_active);
+	Assert(allindexes || VacuumFailsafeActive);
 
 	/*
 	 * Increase and report the number of index scans.
@@ -2616,12 +2614,12 @@ static bool
 lazy_check_wraparound_failsafe(LVRelState *vacrel)
 {
 	/* Don't warn more than once per VACUUM */
-	if (vacrel->failsafe_active)
+	if (VacuumFailsafeActive)
 		return true;
 
 	if (unlikely(vacuum_xid_failsafe_check(&vacrel->cutoffs)))
 	{
-		vacrel->failsafe_active = true;
+		VacuumFailsafeActive = true;
 
 		/* Disable index vacuuming, index cleanup, and heap rel truncation */
 		vacrel->do_index_vacuuming = false;
@@ -2811,7 +2809,7 @@ should_attempt_truncation(LVRelState *vacrel)
 {
 	BlockNumber possibly_freeable;
 
-	if (!vacrel->do_rel_truncate || vacrel->failsafe_active ||
+	if (!vacrel->do_rel_truncate || VacuumFailsafeActive ||
 		old_snapshot_threshold >= 0)
 		return false;
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index c54360a6a0..0e1dbeec70 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -85,6 +85,7 @@ static BufferAccessStrategy vac_strategy;
 pg_atomic_uint32 *VacuumSharedCostBalance = NULL;
 pg_atomic_uint32 *VacuumActiveNWorkers = NULL;
 int			VacuumCostBalanceLocal = 0;
+bool		VacuumFailsafeActive = false;
 
 /* non-export function prototypes */
 static List *expand_vacuum_rel(VacuumRelation *vrel, int options);
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index bcd40c80a1..57188500d0 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -990,6 +990,7 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												 false);
 
 	/* Set cost-based vacuum delay */
+	VacuumFailsafeActive = false;
 	VacuumCostActive = (VacuumCostDelay > 0);
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index bdfd96cfec..7b8ee21788 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -306,6 +306,7 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
 
+extern bool VacuumFailsafeActive;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
-- 
2.37.2

v12-0003-VACUUM-reloads-config-file-more-often.patch (text/x-patch; charset=US-ASCII)
From 795776af97d5f3ab05e48c7b78367bf6290e7ad4 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Mon, 27 Mar 2023 13:33:19 -0400
Subject: [PATCH v12 3/4] VACUUM reloads config file more often

Previously, VACUUM would not reload the configuration file, so changes
to cost-based delay parameters could only take effect on the next
invocation of VACUUM.

Now, check if a reload is pending roughly once per block, when checking
if we need to delay.

Note that autovacuum is unaffected by this change. Autovacuum workers
overwrite the value of VacuumCostLimit and VacuumCostDelay with their
own WorkerInfo->wi_cost_limit and wi_cost_delay instead of using
potentially refreshed values of autovacuum_vacuum_cost_limit and
autovacuum_vacuum_cost_delay. Locking considerations and needed updates
to the worker balancing logic make enabling this feature for autovacuum
worthy of an independent commit.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/commands/vacuum.c         | 46 +++++++++++++++++++++------
 src/backend/commands/vacuumparallel.c |  3 +-
 src/backend/postmaster/autovacuum.c   | 18 +++++++++++
 3 files changed, 55 insertions(+), 12 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 2c3afd4ff6..a288c402a9 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -48,6 +48,7 @@
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/interrupt.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/pmsignal.h"
@@ -78,6 +79,7 @@ int			vacuum_cost_limit;
 /* A few variables that don't seem worth passing around as parameters */
 static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
+static bool analyze_in_outer_xact = false;
 
 /*
  * Variables for cost-based vacuum delay. The defaults differ between
@@ -325,8 +327,7 @@ vacuum(List *relations, VacuumParams *params,
 	static bool in_vacuum = false;
 
 	const char *stmttype;
-	volatile bool in_outer_xact,
-				use_own_xacts;
+	volatile bool use_own_xacts;
 
 	Assert(params != NULL);
 
@@ -343,10 +344,10 @@ vacuum(List *relations, VacuumParams *params,
 	if (params->options & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
-		in_outer_xact = false;
+		analyze_in_outer_xact = false;
 	}
 	else
-		in_outer_xact = IsInTransactionBlock(isTopLevel);
+		analyze_in_outer_xact = IsInTransactionBlock(isTopLevel);
 
 	/*
 	 * Due to static variables vac_context, anl_context and vac_strategy,
@@ -468,7 +469,7 @@ vacuum(List *relations, VacuumParams *params,
 		Assert(params->options & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
-		else if (in_outer_xact)
+		else if (analyze_in_outer_xact)
 			use_own_xacts = false;
 		else if (list_length(relations) > 1)
 			use_own_xacts = true;
@@ -486,7 +487,7 @@ vacuum(List *relations, VacuumParams *params,
 	 */
 	if (use_own_xacts)
 	{
-		Assert(!in_outer_xact);
+		Assert(!analyze_in_outer_xact);
 
 		/* ActiveSnapshot is not set by autovacuum */
 		if (ActiveSnapshotSet())
@@ -501,9 +502,9 @@ vacuum(List *relations, VacuumParams *params,
 	{
 		ListCell   *cur;
 
-		VacuumUpdateCosts();
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
+		VacuumFailsafeActive = false;
+		VacuumUpdateCosts();
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
@@ -539,7 +540,7 @@ vacuum(List *relations, VacuumParams *params,
 				}
 
 				analyze_rel(vrel->oid, vrel->relation, params,
-							vrel->va_cols, in_outer_xact, vac_strategy);
+							vrel->va_cols, analyze_in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
 				{
@@ -562,6 +563,9 @@ vacuum(List *relations, VacuumParams *params,
 	{
 		in_vacuum = false;
 		VacuumCostActive = false;
+		VacuumFailsafeActive = false;
+		VacuumCostBalance = 0;
+		analyze_in_outer_xact = false;
 	}
 	PG_END_TRY();
 
@@ -2227,7 +2231,29 @@ vacuum_delay_point(void)
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	if (!VacuumCostActive || InterruptPending)
+	if (InterruptPending ||
+		(!VacuumCostActive && !ConfigReloadPending))
+		return;
+
+	/*
+	 * Reload the configuration file if requested. This allows changes to
+	 * vacuum_cost_limit and vacuum_cost_delay to take effect while a table is
+	 * being vacuumed or analyzed. Analyze should not reload configuration
+	 * file if it is in an outer transaction, as we currently only allow
+	 * configuration reload when in top-level statements.
+	 */
+	if (ConfigReloadPending && !analyze_in_outer_xact)
+	{
+		ConfigReloadPending = false;
+		ProcessConfigFile(PGC_SIGHUP);
+		VacuumUpdateCosts();
+	}
+
+	/*
+	 * If we disabled cost-based delays after reloading the config file,
+	 * return.
+	 */
+	if (!VacuumCostActive)
 		return;
 
 	/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 4c3c93b2fd..36963090e8 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -991,9 +991,8 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 
 	/* Set cost-based vacuum delay */
 	VacuumFailsafeActive = false;
-	VacuumUpdateCosts();
-	VacuumCostActive = (VacuumCostDelay > 0);
 	VacuumCostBalance = 0;
+	VacuumUpdateCosts();
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
 	VacuumPageDirty = 0;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index ac54ed4546..e7833cd49e 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1797,6 +1797,24 @@ VacuumUpdateCosts(void)
 		VacuumCostLimit = vacuum_cost_limit;
 		VacuumCostDelay = vacuum_cost_delay;
 	}
+
+	/*
+	 * If configuration changes are allowed to impact VacuumCostActive,
+	 * make sure it is updated.
+	 */
+	if (VacuumFailsafeActive)
+	{
+		Assert(!VacuumCostActive);
+		return;
+	}
+
+	if (VacuumCostDelay > 0)
+		VacuumCostActive = true;
+	else
+	{
+		VacuumCostActive = false;
+		VacuumCostBalance = 0;
+	}
 }
 
 
-- 
2.37.2
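
The control flow this patch adds to vacuum_delay_point() can be sketched in isolation. The following is a minimal model under stated assumptions, not the backend code: interrupt_pending, config_reload_pending, cost_active, in_outer_xact and guc_cost_delay are hypothetical stand-ins for InterruptPending, ConfigReloadPending, VacuumCostActive, analyze_in_outer_xact and vacuum_cost_delay, and reload_and_update_costs() stands in for ProcessConfigFile(PGC_SIGHUP) followed by VacuumUpdateCosts().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for backend state; not the real PostgreSQL globals. */
static bool   interrupt_pending = false;
static bool   config_reload_pending = false;
static bool   cost_active = false;
static bool   in_outer_xact = false;
static double guc_cost_delay = 0.0;   /* models vacuum_cost_delay */
static int    reload_count = 0;       /* counts simulated reloads */

/* Models ProcessConfigFile() followed by VacuumUpdateCosts(). */
static void
reload_and_update_costs(void)
{
    assert(guc_cost_delay >= 0);
    reload_count++;
    cost_active = (guc_cost_delay > 0);
}

/*
 * Returns true if the caller would continue on to the cost-based sleep
 * logic, false if the delay point returns early.
 */
static bool
delay_point_should_sleep(void)
{
    /* Nothing to do unless delays are active or a reload is waiting. */
    if (interrupt_pending || (!cost_active && !config_reload_pending))
        return false;

    /* Reload only outside a user transaction; otherwise leave it pending. */
    if (config_reload_pending && !in_outer_xact)
    {
        config_reload_pending = false;
        reload_and_update_costs();
    }

    /* The reload may have switched cost-based delays off. */
    return cost_active;
}
```

In this model, a reload requested while in_outer_xact is true is left pending, so the caller's usual reload point still sees it later.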

v12-0004-Autovacuum-refreshes-cost-based-delay-params-mor.patch (text/x-patch; charset=US-ASCII)
From 4c1fbb27e18a8e16b9a1a70e17452a9c98c00217 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 25 Mar 2023 14:14:55 -0400
Subject: [PATCH v12 4/4] Autovacuum refreshes cost-based delay params more
 often

The previous commit allowed VACUUM to reload the config file more often
so that cost-based delay parameters could take effect while VACUUMing a
relation. Autovacuum, however, did not benefit from this change.

In order for autovacuum workers to safely update their own cost delay
and cost limit parameters without impacting performance, we had to
rethink when and how these values were accessed.

Previously, an autovacuum worker's wi_cost_limit was set only at the
beginning of vacuuming a table, after reloading the config file.
Therefore, at the time that autovac_balance_cost() is called, workers
vacuuming tables with no table options could still have different values
for their wi_cost_limit_base and wi_cost_delay.

Now that the cost parameters can be updated while vacuuming a table,
workers will (within some margin of error) have no reason to have
different values for cost limit and cost delay (in the absence of table
options). This removes the rationale for keeping cost limit and cost
delay in shared memory. Balancing the cost limit requires only the
number of active autovacuum workers vacuuming a table with no cost-based
table options.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/commands/vacuum.c       |  17 +-
 src/backend/postmaster/autovacuum.c | 235 ++++++++++++++--------------
 src/include/commands/vacuum.h       |   1 +
 3 files changed, 134 insertions(+), 119 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index a288c402a9..774cc5e2b7 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2237,10 +2237,10 @@ vacuum_delay_point(void)
 
 	/*
 	 * Reload the configuration file if requested. This allows changes to
-	 * vacuum_cost_limit and vacuum_cost_delay to take effect while a table is
-	 * being vacuumed or analyzed. Analyze should not reload configuration
-	 * file if it is in an outer transaction, as we currently only allow
-	 * configuration reload when in top-level statements.
+	 * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay to
+	 * take effect while a table is being vacuumed or analyzed. Analyze should
+	 * not reload configuration file if it is in an outer transaction, as we
+	 * currently only allow configuration reload when in top-level statements.
 	 */
 	if (ConfigReloadPending && !analyze_in_outer_xact)
 	{
@@ -2286,7 +2286,14 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		VacuumUpdateCosts();
+		/*
+		 * Balance and update limit values for autovacuum workers. We must
+		 * always do this in case the autovacuum launcher or another
+		 * autovacuum worker has recalculated the number of workers across
+		 * which we must balance the limit. This is done by the launcher when
+		 * launching a new worker and by workers before vacuuming each table.
+		 */
+		AutoVacuumUpdateLimit();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index e7833cd49e..1b5b749371 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -139,6 +139,9 @@ int			Log_autovacuum_min_duration = 600000;
 static bool am_autovacuum_launcher = false;
 static bool am_autovacuum_worker = false;
 
+static double av_relopt_cost_delay = -1;
+static int	av_relopt_cost_limit = 0;
+
 /* Flags set by signal handlers */
 static volatile sig_atomic_t got_SIGUSR2 = false;
 
@@ -189,8 +192,8 @@ typedef struct autovac_table
 {
 	Oid			at_relid;
 	VacuumParams at_params;
-	double		at_vacuum_cost_delay;
-	int			at_vacuum_cost_limit;
+	double		at_relopt_vac_cost_delay;
+	int			at_relopt_vac_cost_limit;
 	bool		at_dobalance;
 	bool		at_sharedrel;
 	char	   *at_relname;
@@ -209,7 +212,7 @@ typedef struct autovac_table
  * wi_sharedrel flag indicating whether table is marked relisshared
  * wi_proc		pointer to PGPROC of the running worker, NULL if not started
  * wi_launchtime Time at which this worker was launched
- * wi_cost_*	Vacuum cost-based delay parameters current in this worker
+ * wi_dobalance Whether this worker should be included in balance calculations
  *
  * All fields are protected by AutovacuumLock, except for wi_tableoid and
  * wi_sharedrel which are protected by AutovacuumScheduleLock (note these
@@ -223,11 +226,8 @@ typedef struct WorkerInfoData
 	Oid			wi_tableoid;
 	PGPROC	   *wi_proc;
 	TimestampTz wi_launchtime;
-	bool		wi_dobalance;
+	pg_atomic_flag wi_dobalance;
 	bool		wi_sharedrel;
-	double		wi_cost_delay;
-	int			wi_cost_limit;
-	int			wi_cost_limit_base;
 } WorkerInfoData;
 
 typedef struct WorkerInfoData *WorkerInfo;
@@ -273,6 +273,8 @@ typedef struct AutoVacuumWorkItem
  * av_startingWorker pointer to WorkerInfo currently being started (cleared by
  *					the worker itself as soon as it's up and running)
  * av_workItems		work item array
+ * av_nworkersForBalance the number of autovacuum workers to use when
+ * 					calculating the per worker cost limit
  *
  * This struct is protected by AutovacuumLock, except for av_signal and parts
  * of the worker list (see above).
@@ -286,6 +288,7 @@ typedef struct
 	dlist_head	av_runningWorkers;
 	WorkerInfo	av_startingWorker;
 	AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+	pg_atomic_uint32 av_nworkersForBalance;
 } AutoVacuumShmemStruct;
 
 static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -319,7 +322,7 @@ static void launch_worker(TimestampTz now);
 static List *get_database_list(void);
 static void rebuild_database_list(Oid newdb);
 static int	db_comparator(const void *a, const void *b);
-static void autovac_balance_cost(void);
+static void autovac_recalculate_workers_for_balance(void);
 
 static void do_autovacuum(void);
 static void FreeWorkerInfo(int code, Datum arg);
@@ -670,7 +673,7 @@ AutoVacLauncherMain(int argc, char *argv[])
 			{
 				LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 				AutoVacuumShmem->av_signal[AutoVacRebalance] = false;
-				autovac_balance_cost();
+				autovac_recalculate_workers_for_balance();
 				LWLockRelease(AutovacuumLock);
 			}
 
@@ -820,8 +823,8 @@ HandleAutoVacLauncherInterrupts(void)
 			AutoVacLauncherShutdown();
 
 		/* rebalance in case the default cost parameters changed */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-		autovac_balance_cost();
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
+		autovac_recalculate_workers_for_balance();
 		LWLockRelease(AutovacuumLock);
 
 		/* rebuild the list in case the naptime changed */
@@ -1755,10 +1758,7 @@ FreeWorkerInfo(int code, Datum arg)
 		MyWorkerInfo->wi_sharedrel = false;
 		MyWorkerInfo->wi_proc = NULL;
 		MyWorkerInfo->wi_launchtime = 0;
-		MyWorkerInfo->wi_dobalance = false;
-		MyWorkerInfo->wi_cost_delay = 0;
-		MyWorkerInfo->wi_cost_limit = 0;
-		MyWorkerInfo->wi_cost_limit_base = 0;
+		pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
 		dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 						&MyWorkerInfo->wi_links);
 		/* not mine anymore */
@@ -1788,14 +1788,21 @@ VacuumUpdateCosts(void)
 
 	if (am_autovacuum_worker)
 	{
-		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
-		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
+		if (av_relopt_cost_delay >= 0)
+			VacuumCostDelay = av_relopt_cost_delay;
+		else if (autovacuum_vac_cost_delay >= 0)
+			VacuumCostDelay = autovacuum_vac_cost_delay;
+		else
+			/* fall back to vacuum_cost_delay */
+			VacuumCostDelay = vacuum_cost_delay;
+
+		AutoVacuumUpdateLimit();
 	}
 	else
 	{
 		/* Must be explicit VACUUM or ANALYZE */
-		VacuumCostLimit = vacuum_cost_limit;
 		VacuumCostDelay = vacuum_cost_delay;
+		VacuumCostLimit = vacuum_cost_limit;
 	}
 
 	/*
@@ -1819,85 +1826,82 @@ VacuumUpdateCosts(void)
 
 
 /*
- * autovac_balance_cost
- *		Recalculate the cost limit setting for each active worker.
- *
- * Caller must hold the AutovacuumLock in exclusive mode.
- */
-static void
-autovac_balance_cost(void)
+ * Update VacuumCostLimit with the correct value for an autovacuum worker, given
+ * the value of other relevant cost limit parameters and the number of workers
+ * across which the limit must be balanced. Autovacuum workers must call this
+ * regularly in case av_nworkersForBalance has been updated by another worker
+ * or by the autovacuum launcher. They must also call it after a config reload.
+ */
+void
+AutoVacuumUpdateLimit(void)
 {
+	if (!am_autovacuum_worker)
+		return;
+
 	/*
-	 * The idea here is that we ration out I/O equally.  The amount of I/O
-	 * that a worker can consume is determined by cost_limit/cost_delay, so we
-	 * try to equalize those ratios rather than the raw limit settings.
-	 *
 	 * note: in cost_limit, zero also means use value from elsewhere, because
 	 * zero is not a valid value.
 	 */
-	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : vacuum_cost_limit);
-	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : vacuum_cost_delay);
-	double		cost_total;
-	double		cost_avail;
-	dlist_iter	iter;
-
-	/* not set? nothing to do */
-	if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
-		return;
 
-	/* calculate the total base cost limit of participating active workers */
-	cost_total = 0.0;
-	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
+	if (av_relopt_cost_limit > 0)
+		VacuumCostLimit = av_relopt_cost_limit;
+	else
 	{
-		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
+		int			nworkers_for_balance;
+
+		if (autovacuum_vac_cost_limit > 0)
+			VacuumCostLimit = autovacuum_vac_cost_limit;
+		else
+			VacuumCostLimit = vacuum_cost_limit;
+
+		/* Only balance limit if no table options specified */
+		if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
+			return;
+
+		Assert(VacuumCostLimit > 0);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-			cost_total +=
-				(double) worker->wi_cost_limit_base / worker->wi_cost_delay;
+		nworkers_for_balance = pg_atomic_read_u32(
+								&AutoVacuumShmem->av_nworkersForBalance);
+
+		/* There is at least 1 autovac worker (this worker). */
+		Assert(nworkers_for_balance > 0);
+
+		VacuumCostLimit = Max(VacuumCostLimit / nworkers_for_balance, 1);
 	}
+}
 
-	/* there are no cost limits -- nothing to do */
-	if (cost_total <= 0)
-		return;
+/*
+ * autovac_recalculate_workers_for_balance
+ *		Recalculate the number of workers to consider, given table options and
+ *		the current number of active workers.
+ *
+ * Caller must hold the AutovacuumLock in at least shared mode to access
+ * worker->wi_proc.
+ */
+static void
+autovac_recalculate_workers_for_balance(void)
+{
+	dlist_iter	iter;
+	int			orig_nworkers_for_balance;
+	int			nworkers_for_balance = 0;
+
+	orig_nworkers_for_balance =
+		pg_atomic_read_u32(&AutoVacuumShmem->av_nworkersForBalance);
 
-	/*
-	 * Adjust cost limit of each active worker to balance the total of cost
-	 * limit to autovacuum_vacuum_cost_limit.
-	 */
-	cost_avail = (double) vac_cost_limit / vac_cost_delay;
 	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
 	{
 		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-		{
-			int			limit = (int)
-			(cost_avail * worker->wi_cost_limit_base / cost_total);
-
-			/*
-			 * We put a lower bound of 1 on the cost_limit, to avoid division-
-			 * by-zero in the vacuum code.  Also, in case of roundoff trouble
-			 * in these calculations, let's be sure we don't ever set
-			 * cost_limit to more than the base value.
-			 */
-			worker->wi_cost_limit = Max(Min(limit,
-											worker->wi_cost_limit_base),
-										1);
-		}
+		if (worker->wi_proc == NULL ||
+			pg_atomic_unlocked_test_flag(&worker->wi_dobalance))
+			continue;
 
-		if (worker->wi_proc != NULL)
-			elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
-				 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
-				 worker->wi_dobalance ? "yes" : "no",
-				 worker->wi_cost_limit, worker->wi_cost_limit_base,
-				 worker->wi_cost_delay);
+		nworkers_for_balance++;
 	}
+
+	if (nworkers_for_balance != orig_nworkers_for_balance)
+		pg_atomic_write_u32(&AutoVacuumShmem->av_nworkersForBalance,
+							nworkers_for_balance);
 }
 
 /*
@@ -2445,23 +2449,31 @@ do_autovacuum(void)
 			continue;
 		}
 
-		/* Must hold AutovacuumLock while mucking with cost balance info */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 
-		/* advertise my cost delay parameters for the balancing algorithm */
-		MyWorkerInfo->wi_dobalance = tab->at_dobalance;
-		MyWorkerInfo->wi_cost_delay = tab->at_vacuum_cost_delay;
-		MyWorkerInfo->wi_cost_limit = tab->at_vacuum_cost_limit;
-		MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
+		/*
+		 * Save the cost-related table options in global variables for
+		 * reference when updating VacuumCostLimit and VacuumCostDelay during
+		 * vacuuming this table.
+		 */
+		av_relopt_cost_limit = tab->at_relopt_vac_cost_limit;
+		av_relopt_cost_delay = tab->at_relopt_vac_cost_delay;
 
-		/* do a balance */
-		autovac_balance_cost();
+		if (tab->at_dobalance)
+			pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
+		else
+			pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
 
-		/* set the active cost parameters from the result of that */
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
+		autovac_recalculate_workers_for_balance();
+		LWLockRelease(AutovacuumLock);
+
+		/*
+		 * We wait until this point to update cost delay and cost limit
+		 * values, even though we reloaded the configuration file above, so
+		 * that we can take into account the cost-related table options.
+		 */
 		VacuumUpdateCosts();
 
-		/* done */
-		LWLockRelease(AutovacuumLock);
 
 		/* clean up memory before each iteration */
 		MemoryContextResetAndDeleteChildren(PortalContext);
@@ -2546,10 +2558,10 @@ deleted:
 
 		/*
 		 * Remove my info from shared memory.  We could, but intentionally
-		 * don't, clear wi_cost_limit and friends --- this is on the
-		 * assumption that we probably have more to do with similar cost
-		 * settings, so we don't want to give up our share of I/O for a very
-		 * short interval and thereby thrash the global balance.
+		 * don't, unset wi_dobalance on the assumption that we are more likely
+		 * than not to vacuum a table with no table options next, so we don't
+		 * want to give up our share of I/O for a very short interval and
+		 * thereby thrash the global balance.
 		 */
 		LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
 		MyWorkerInfo->wi_tableoid = InvalidOid;
@@ -2586,6 +2598,7 @@ deleted:
 		{
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			VacuumUpdateCosts();
 		}
 
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
@@ -2821,8 +2834,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		int			freeze_table_age;
 		int			multixact_freeze_min_age;
 		int			multixact_freeze_table_age;
-		int			vac_cost_limit;
-		double		vac_cost_delay;
 		int			log_min_duration;
 
 		/*
@@ -2832,20 +2843,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 * defaults, autovacuum's own first and plain vacuum second.
 		 */
 
-		/* -1 in autovac setting means use plain vacuum_cost_delay */
-		vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
-			? avopts->vacuum_cost_delay
-			: (autovacuum_vac_cost_delay >= 0)
-			? autovacuum_vac_cost_delay
-			: vacuum_cost_delay;
-
-		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
-		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
-			? avopts->vacuum_cost_limit
-			: (autovacuum_vac_cost_limit > 0)
-			? autovacuum_vac_cost_limit
-			: vacuum_cost_limit;
-
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
 			? avopts->log_min_duration
@@ -2901,8 +2898,10 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age;
 		tab->at_params.is_wraparound = wraparound;
 		tab->at_params.log_min_duration = log_min_duration;
-		tab->at_vacuum_cost_limit = vac_cost_limit;
-		tab->at_vacuum_cost_delay = vac_cost_delay;
+		tab->at_relopt_vac_cost_limit = avopts ?
+			avopts->vacuum_cost_limit : 0;
+		tab->at_relopt_vac_cost_delay = avopts ?
+			avopts->vacuum_cost_delay : -1;
 		tab->at_relname = NULL;
 		tab->at_nspname = NULL;
 		tab->at_datname = NULL;
@@ -3394,10 +3393,18 @@ AutoVacuumShmemInit(void)
 		worker = (WorkerInfo) ((char *) AutoVacuumShmem +
 							   MAXALIGN(sizeof(AutoVacuumShmemStruct)));
 
+
 		/* initialize the WorkerInfo free list */
 		for (i = 0; i < autovacuum_max_workers; i++)
+		{
 			dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 							&worker[i].wi_links);
+
+			pg_atomic_init_flag(&worker[i].wi_dobalance);
+		}
+
+		pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
+
 	}
 	else
 		Assert(found);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index a62dd2e781..6b286037ca 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -351,6 +351,7 @@ extern IndexBulkDeleteResult *vac_cleanup_one_index(IndexVacuumInfo *ivinfo,
 extern Size vac_max_items_to_alloc_size(int max_items);
 
 /* In postmaster/autovacuum.c */
+extern void AutoVacuumUpdateLimit(void);
 extern void VacuumUpdateCosts(void);
 
 /* in commands/vacuumparallel.c */
-- 
2.37.2
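
The precedence and balancing arithmetic in AutoVacuumUpdateLimit() above can be checked with a small standalone function. This is a sketch under stated assumptions: balanced_cost_limit() and its parameters are hypothetical names, with plain values standing in for the table option, the two GUCs, the wi_dobalance flag and the shared av_nworkersForBalance counter.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper modelling the cost limit chosen for one autovacuum
 * worker. A value <= 0 for relopt_limit or autovac_limit means "not set".
 */
static int
balanced_cost_limit(int relopt_limit, int autovac_limit, int vacuum_limit,
                    int nworkers_for_balance, bool dobalance)
{
    int limit;

    /* A table-level option wins outright and is never balanced. */
    if (relopt_limit > 0)
        return relopt_limit;

    /* Otherwise use autovacuum's setting, falling back to plain vacuum's. */
    limit = (autovac_limit > 0) ? autovac_limit : vacuum_limit;

    /* Workers excluded from balancing keep the full limit. */
    if (!dobalance)
        return limit;

    assert(limit > 0);
    assert(nworkers_for_balance > 0);

    /* Split evenly across participating workers, never going below 1. */
    limit = limit / nworkers_for_balance;
    return (limit > 1) ? limit : 1;
}
```

With vacuum_cost_limit at 200 and three balanced workers, each worker gets 200 / 3 = 66 by integer division; a worker vacuuming a table with its own cost limit option keeps that value regardless of the worker count.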

v12-0002-Separate-vacuum-cost-variables-from-gucs.patch (text/x-patch; charset=US-ASCII)
From 71a840acc262a3154905f44743130b9668ec78ab Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Fri, 31 Mar 2023 13:10:26 -0400
Subject: [PATCH v12 2/4] Separate vacuum cost variables from gucs

Vacuum code run both by autovacuum workers and a backend doing
VACUUM/ANALYZE previously used VacuumCostLimit and VacuumCostDelay which
were the global variables for the gucs vacuum_cost_limit and
vacuum_cost_delay. Autovacuum workers needed to override these variables
with their own values, derived from autovacuum_vacuum_cost_limit and
autovacuum_vacuum_cost_delay and worker cost limit balancing logic. This
led to confusing code which, in some cases, both derived and set a new
value of VacuumCostLimit from VacuumCostLimit.

In preparation for refreshing these guc values more often, separate
these variables from the gucs themselves and add a function to update
the global variables using the gucs and existing logic.
---
 src/backend/commands/vacuum.c         | 15 +++++++--
 src/backend/commands/vacuumparallel.c |  1 +
 src/backend/postmaster/autovacuum.c   | 47 +++++++++++++--------------
 src/backend/utils/init/globals.c      |  2 --
 src/backend/utils/misc/guc_tables.c   |  4 +--
 src/include/commands/vacuum.h         |  7 ++++
 src/include/miscadmin.h               |  2 --
 src/include/postmaster/autovacuum.h   |  3 --
 8 files changed, 46 insertions(+), 35 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 0e1dbeec70..2c3afd4ff6 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -71,12 +71,22 @@ int			vacuum_multixact_freeze_min_age;
 int			vacuum_multixact_freeze_table_age;
 int			vacuum_failsafe_age;
 int			vacuum_multixact_failsafe_age;
+double		vacuum_cost_delay;
+int			vacuum_cost_limit;
 
 
 /* A few variables that don't seem worth passing around as parameters */
 static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
 
+/*
+ * Variables for cost-based vacuum delay. The defaults differ between
+ * autovacuum and vacuum. These should be overridden with the appropriate GUC
+ * value in vacuum code.
+ * TODO: should they be initialized to valid or invalid values?
+ */
+int			VacuumCostLimit = 0;
+double		VacuumCostDelay = -1;
 
 /*
  * Variables for cost-based parallel vacuum.  See comments atop
@@ -491,6 +501,7 @@ vacuum(List *relations, VacuumParams *params,
 	{
 		ListCell   *cur;
 
+		VacuumUpdateCosts();
 		in_vacuum = true;
 		VacuumCostActive = (VacuumCostDelay > 0);
 		VacuumCostBalance = 0;
@@ -2249,8 +2260,7 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		/* update balance values for workers */
-		AutoVacuumUpdateDelay();
+		VacuumUpdateCosts();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
@@ -2388,6 +2398,7 @@ vac_max_items_to_alloc_size(int max_items)
 	return offsetof(VacDeadItems, items) + sizeof(ItemPointerData) * max_items;
 }
 
+
 /*
  *	vac_tid_reaped() -- is a particular tid deletable?
  *
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 57188500d0..4c3c93b2fd 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -991,6 +991,7 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 
 	/* Set cost-based vacuum delay */
 	VacuumFailsafeActive = false;
+	VacuumUpdateCosts();
 	VacuumCostActive = (VacuumCostDelay > 0);
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 585d28148c..ac54ed4546 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1773,20 +1773,33 @@ FreeWorkerInfo(int code, Datum arg)
 	}
 }
 
+
 /*
- * Update the cost-based delay parameters, so that multiple workers consume
- * each a fraction of the total available I/O.
+ * Update vacuum cost-based delay-related parameters for autovacuum workers and
+ * backends executing VACUUM or ANALYZE using the value of relevant gucs and
+ * global state. This must be called during setup for vacuum and after every
+ * config reload to ensure up-to-date values.
  */
 void
-AutoVacuumUpdateDelay(void)
+VacuumUpdateCosts(void)
 {
-	if (MyWorkerInfo)
+	if (am_autovacuum_launcher)
+		return;
+
+	if (am_autovacuum_worker)
 	{
-		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
 		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
+	}
+	else
+	{
+		/* Must be explicit VACUUM or ANALYZE */
+		VacuumCostLimit = vacuum_cost_limit;
+		VacuumCostDelay = vacuum_cost_delay;
 	}
 }
 
+
 /*
  * autovac_balance_cost
  *		Recalculate the cost limit setting for each active worker.
@@ -1805,9 +1818,9 @@ autovac_balance_cost(void)
 	 * zero is not a valid value.
 	 */
 	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : VacuumCostLimit);
+								  autovacuum_vac_cost_limit : vacuum_cost_limit);
 	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : VacuumCostDelay);
+								  autovacuum_vac_cost_delay : vacuum_cost_delay);
 	double		cost_total;
 	double		cost_avail;
 	dlist_iter	iter;
@@ -2312,8 +2325,6 @@ do_autovacuum(void)
 		autovac_table *tab;
 		bool		isshared;
 		bool		skipit;
-		double		stdVacuumCostDelay;
-		int			stdVacuumCostLimit;
 		dlist_iter	iter;
 
 		CHECK_FOR_INTERRUPTS();
@@ -2416,14 +2427,6 @@ do_autovacuum(void)
 			continue;
 		}
 
-		/*
-		 * Remember the prevailing values of the vacuum cost GUCs.  We have to
-		 * restore these at the bottom of the loop, else we'll compute wrong
-		 * values in the next iteration of autovac_balance_cost().
-		 */
-		stdVacuumCostDelay = VacuumCostDelay;
-		stdVacuumCostLimit = VacuumCostLimit;
-
 		/* Must hold AutovacuumLock while mucking with cost balance info */
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 
@@ -2437,7 +2440,7 @@ do_autovacuum(void)
 		autovac_balance_cost();
 
 		/* set the active cost parameters from the result of that */
-		AutoVacuumUpdateDelay();
+		VacuumUpdateCosts();
 
 		/* done */
 		LWLockRelease(AutovacuumLock);
@@ -2534,10 +2537,6 @@ deleted:
 		MyWorkerInfo->wi_tableoid = InvalidOid;
 		MyWorkerInfo->wi_sharedrel = false;
 		LWLockRelease(AutovacuumScheduleLock);
-
-		/* restore vacuum cost GUCs for the next iteration */
-		VacuumCostDelay = stdVacuumCostDelay;
-		VacuumCostLimit = stdVacuumCostLimit;
 	}
 
 	/*
@@ -2820,14 +2819,14 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			? avopts->vacuum_cost_delay
 			: (autovacuum_vac_cost_delay >= 0)
 			? autovacuum_vac_cost_delay
-			: VacuumCostDelay;
+			: vacuum_cost_delay;
 
 		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
 		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
 			? avopts->vacuum_cost_limit
 			: (autovacuum_vac_cost_limit > 0)
 			? autovacuum_vac_cost_limit
-			: VacuumCostLimit;
+			: vacuum_cost_limit;
 
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 1b1d814254..8e5b065e8f 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -142,8 +142,6 @@ int			MaxBackends = 0;
 int			VacuumCostPageHit = 1;	/* GUC parameters for vacuum */
 int			VacuumCostPageMiss = 2;
 int			VacuumCostPageDirty = 20;
-int			VacuumCostLimit = 200;
-double		VacuumCostDelay = 0;
 
 int64		VacuumPageHit = 0;
 int64		VacuumPageMiss = 0;
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8062589efd..77db1a146c 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2409,7 +2409,7 @@ struct config_int ConfigureNamesInt[] =
 			gettext_noop("Vacuum cost amount available before napping."),
 			NULL
 		},
-		&VacuumCostLimit,
+		&vacuum_cost_limit,
 		200, 1, 10000,
 		NULL, NULL, NULL
 	},
@@ -3701,7 +3701,7 @@ struct config_real ConfigureNamesReal[] =
 			NULL,
 			GUC_UNIT_MS
 		},
-		&VacuumCostDelay,
+		&vacuum_cost_delay,
 		0, 0, 100,
 		NULL, NULL, NULL
 	},
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 7b8ee21788..a62dd2e781 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,8 @@ extern PGDLLIMPORT int vacuum_multixact_freeze_min_age;
 extern PGDLLIMPORT int vacuum_multixact_freeze_table_age;
 extern PGDLLIMPORT int vacuum_failsafe_age;
 extern PGDLLIMPORT int vacuum_multixact_failsafe_age;
+extern PGDLLIMPORT double vacuum_cost_delay;
+extern PGDLLIMPORT int vacuum_cost_limit;
 
 /* Variables for cost-based parallel vacuum */
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
@@ -307,6 +309,8 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
 
 extern bool VacuumFailsafeActive;
+extern int	VacuumCostLimit;
+extern double VacuumCostDelay;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
@@ -346,6 +350,9 @@ extern IndexBulkDeleteResult *vac_cleanup_one_index(IndexVacuumInfo *ivinfo,
 													IndexBulkDeleteResult *istat);
 extern Size vac_max_items_to_alloc_size(int max_items);
 
+/* In postmaster/autovacuum.c */
+extern void VacuumUpdateCosts(void);
+
 /* in commands/vacuumparallel.c */
 extern ParallelVacuumState *parallel_vacuum_init(Relation rel, Relation *indrels,
 												 int nindexes, int nrequested_workers,
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 06a86f9ac1..66db1b2c69 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -266,8 +266,6 @@ extern PGDLLIMPORT int max_parallel_maintenance_workers;
 extern PGDLLIMPORT int VacuumCostPageHit;
 extern PGDLLIMPORT int VacuumCostPageMiss;
 extern PGDLLIMPORT int VacuumCostPageDirty;
-extern PGDLLIMPORT int VacuumCostLimit;
-extern PGDLLIMPORT double VacuumCostDelay;
 
 extern PGDLLIMPORT int64 VacuumPageHit;
 extern PGDLLIMPORT int64 VacuumPageMiss;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index c140371b51..65afd1ea1e 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -63,9 +63,6 @@ extern int	StartAutoVacWorker(void);
 /* called from postmaster when a worker could not be forked */
 extern void AutoVacWorkerFailed(void);
 
-/* autovacuum cost-delay balancer */
-extern void AutoVacuumUpdateDelay(void);
-
 #ifdef EXEC_BACKEND
 extern void AutoVacLauncherMain(int argc, char *argv[]) pg_attribute_noreturn();
 extern void AutoVacWorkerMain(int argc, char *argv[]) pg_attribute_noreturn();
-- 
2.37.2
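
The separation this patch makes, GUC-backed settings on one side and working values on the other, can be illustrated on its own. The sketch below is not the backend code: guc_cost_limit, guc_cost_delay, work_cost_limit, work_cost_delay, WorkerCosts and update_costs() are hypothetical names, and the initial values (200 and 0 for the settings, 0 and -1 for the working values) are taken from the defaults visible in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* GUC-backed settings: in this sketch only a config reload would write these. */
static int    guc_cost_limit = 200;
static double guc_cost_delay = 0.0;

/* Working values that the delay logic would consult. */
static int    work_cost_limit = 0;
static double work_cost_delay = -1;

/* Hypothetical per-worker values, for example a balanced limit. */
typedef struct WorkerCosts
{
    int    cost_limit;
    double cost_delay;
} WorkerCosts;

/*
 * Refresh the working values. A worker supplies its own costs; any other
 * caller (worker == NULL) copies the settings. The settings themselves are
 * never overwritten here.
 */
static void
update_costs(const WorkerCosts *worker)
{
    if (worker != NULL)
    {
        assert(worker->cost_limit > 0);
        work_cost_limit = worker->cost_limit;
        work_cost_delay = worker->cost_delay;
    }
    else
    {
        work_cost_limit = guc_cost_limit;
        work_cost_delay = guc_cost_delay;
    }
}
```

Because the settings are left untouched, no save-and-restore step is needed around a worker override; calling update_costs(NULL) afterwards simply recovers the configured values.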

#43 Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Melanie Plageman (#42)
Re: Should vacuum process config file reload more often

On Sat, Apr 1, 2023 at 4:09 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Fri, Mar 31, 2023 at 10:31 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Thu, Mar 30, 2023 at 3:26 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 30 Mar 2023, at 04:57, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

As another idea, why don't we use macros for that? For example,
suppose VacuumCostStatus is like:

typedef enum VacuumCostStatus
{
VACUUM_COST_INACTIVE_LOCKED = 0,
VACUUM_COST_INACTIVE,
VACUUM_COST_ACTIVE,
} VacuumCostStatus;
VacuumCostStatus VacuumCost;

non-vacuum code can use the following macros:

#define VacuumCostActive() (VacuumCost == VACUUM_COST_ACTIVE)
#define VacuumCostInactive() (VacuumCost <= VACUUM_COST_INACTIVE) //
or we can use !VacuumCostActive() instead.

I'm in favor of something along these lines. A variable with a name that
implies a boolean value (active/inactive) but actually contains a tri-value is
easily misunderstood. A VacuumCostState tri-value variable (or a better name)
with a set of convenient macros for extracting the boolean active/inactive that
most of the code needs to be concerned with would make for more readable code, I
think.

The macros are very error-prone. I was just implementing this idea and
mistakenly tried to set the macro instead of the variable in multiple
places. Avoiding this involves another set of macros, and, in the end, I
think the complexity is much worse. Given the reviewers' uniform dislike
of VacuumCostInactive, I favor going back to two variables
(VacuumCostActive + VacuumFailsafeActive) and moving
LVRelState->failsafe_active to the global VacuumFailsafeActive.

I will reimplement this in the next version.

Thank you for updating the patches. Here are comments for 0001, 0002,
and 0003 patches:

0001:

@@ -391,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
         Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
         Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
                    params->truncate != VACOPTVALUE_AUTO);
-        vacrel->failsafe_active = false;
+        VacuumFailsafeActive = false;

If we go with the idea of using VacuumCostActive +
VacuumFailsafeActive, we need to make sure that both are cleared at
the end of the vacuum per table. Since the patch clears it only here,
it remains true even after vacuum() if we trigger the failsafe mode
for the last table in the table list.

In addition, to ensure it is cleared in an error case as well, I think we
need to clear it in the PG_FINALLY() block in vacuum() too.
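
A minimal sketch of that concern, not the patch itself: the flag is cleared at the bottom of the per-table loop and again on a cleanup path that runs even after an error. PostgreSQL's PG_TRY()/PG_FINALLY() is emulated here with setjmp/longjmp, and vacuum_one_table(), vacuum_tables() and error_jump are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Stand-in for the global proposed in 0001. */
static bool VacuumFailsafeActive = false;

/* Hypothetical error mechanism standing in for ereport(ERROR). */
static jmp_buf error_jump;

/*
 * Hypothetical per-table routine: it may trip the failsafe and may raise an
 * error before it gets a chance to reset anything.
 */
static void
vacuum_one_table(bool trigger_failsafe, bool raise_error)
{
	if (trigger_failsafe)
		VacuumFailsafeActive = true;
	if (raise_error)
		longjmp(error_jump, 1);
}

/*
 * Simplified vacuum(): returns true on success and false if an error was
 * raised. In both cases the flag is false when the function returns.
 */
static bool
vacuum_tables(int ntables, const bool *trigger_failsafe, const bool *raise_error)
{
	bool		ok = true;

	/* The flag must not leak in from an earlier command. */
	assert(!VacuumFailsafeActive);

	if (setjmp(error_jump) == 0)
	{
		for (int i = 0; i < ntables; i++)
		{
			vacuum_one_table(trigger_failsafe[i], raise_error[i]);

			/* Clear at the end of each table, including the last one. */
			VacuumFailsafeActive = false;
		}
	}
	else
		ok = false;				/* reached only via longjmp() */

	/* Cleanup that runs on both paths, analogous to PG_FINALLY(). */
	VacuumFailsafeActive = false;

	return ok;
}
```

Without the final reset, an error raised while the failsafe is active on any table would leave the global set for whatever the backend runs next.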

---
@@ -306,6 +306,7 @@ extern PGDLLIMPORT pg_atomic_uint32
*VacuumSharedCostBalance;
extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
extern PGDLLIMPORT int VacuumCostBalanceLocal;

+extern bool VacuumFailsafeActive;

Do we need PGDLLIMPORT for VacuumFailsafeActive?

0002:

@@ -2388,6 +2398,7 @@ vac_max_items_to_alloc_size(int max_items)
return offsetof(VacDeadItems, items) +
sizeof(ItemPointerData) * max_items;
}

+
/*
* vac_tid_reaped() -- is a particular tid deletable?
*

Unnecessary new line. There are some other unnecessary new lines in this patch.

---
@@ -307,6 +309,8 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
 extern bool VacuumFailsafeActive;
+extern int     VacuumCostLimit;
+extern double VacuumCostDelay;

and

@@ -266,8 +266,6 @@ extern PGDLLIMPORT int max_parallel_maintenance_workers;
 extern PGDLLIMPORT int VacuumCostPageHit;
 extern PGDLLIMPORT int VacuumCostPageMiss;
 extern PGDLLIMPORT int VacuumCostPageDirty;
-extern PGDLLIMPORT int VacuumCostLimit;
-extern PGDLLIMPORT double VacuumCostDelay;

Do we need PGDLLIMPORT too?

---
@@ -1773,20 +1773,33 @@ FreeWorkerInfo(int code, Datum arg)
        }
 }
+
 /*
- * Update the cost-based delay parameters, so that multiple workers consume
- * each a fraction of the total available I/O.
+ * Update vacuum cost-based delay-related parameters for autovacuum workers and
+ * backends executing VACUUM or ANALYZE using the value of relevant gucs and
+ * global state. This must be called during setup for vacuum and after every
+ * config reload to ensure up-to-date values.
  */
 void
-AutoVacuumUpdateDelay(void)
+VacuumUpdateCosts(void)
 {

Isn't it better to define VacuumUpdateCosts() in vacuum.c rather than
autovacuum.c as this is now a common code for both vacuum and
autovacuum?

0003:

@@ -501,9 +502,9 @@ vacuum(List *relations, VacuumParams *params,
{
ListCell *cur;

-                VacuumUpdateCosts();
                 in_vacuum = true;
-                VacuumCostActive = (VacuumCostDelay > 0);
+                VacuumFailsafeActive = false;
+                VacuumUpdateCosts();

Hmm, if we initialize VacuumFailsafeActive here, should it be included
in 0001 patch?

---
+        if (VacuumCostDelay > 0)
+                VacuumCostActive = true;
+        else
+        {
+                VacuumCostActive = false;
+                VacuumCostBalance = 0;
+        }

I agree to update VacuumCostActive in VacuumUpdateCosts(). But if we
do that I think this change should be included in 0002 patch.

---
+        if (ConfigReloadPending && !analyze_in_outer_xact)
+        {
+                ConfigReloadPending = false;
+                ProcessConfigFile(PGC_SIGHUP);
+                VacuumUpdateCosts();
+        }

Since analyze_in_outer_xact is false by default, we reload the config
file in vacuum_delay_point() by default. We need to note that
vacuum_delay_point() could be called via other paths, for example
gin_cleanup_pending_list() and ambulkdelete() called by
validate_index(). So it seems to me that we should do the opposite; we
have another global variable, say vacuum_can_reload_config, which is
false by default, and is set to true only when vacuum() allows it. In
vacuum_delay_point(), we reload the config file iff
(ConfigReloadPending && vacuum_can_reload_config).
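
As a rough sketch of the opt-in approach described above (an illustration under assumptions, not the patch): reloads are off by default, vacuum() enables them only when it is not inside an outer transaction, and it disables them again on the way out. The globals mirror names from this thread; reload_config_and_update_costs(), delay_point(), vacuum_sketch() and reload_count are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for PostgreSQL globals; only the names mirror the thread. */
static bool ConfigReloadPending = false;
static bool vacuum_can_reload_config = false;	/* off unless vacuum() opts in */
static int	reload_count = 0;	/* hypothetical counter, for illustration */

/* Hypothetical stub for ProcessConfigFile(PGC_SIGHUP) + VacuumUpdateCosts(). */
static void
reload_config_and_update_costs(void)
{
	reload_count++;
}

/* Simplified vacuum_delay_point(): reload only if the caller opted in. */
static void
delay_point(void)
{
	if (ConfigReloadPending && vacuum_can_reload_config)
	{
		ConfigReloadPending = false;
		reload_config_and_update_costs();
	}
}

/* Simplified vacuum(): allow reloads only outside an outer transaction. */
static void
vacuum_sketch(bool in_outer_xact)
{
	/* A previous command must not have left the permission set. */
	assert(!vacuum_can_reload_config);

	vacuum_can_reload_config = !in_outer_xact;
	delay_point();

	/* Analogous to resetting the flag in PG_FINALLY(). */
	vacuum_can_reload_config = false;
}
```

With this shape, a caller that reaches delay_point() without going through vacuum_sketch(), as validate_index() would, never triggers a reload, because the flag keeps its default of false.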

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#44Melanie Plageman
melanieplageman@gmail.com
In reply to: Masahiko Sawada (#43)
4 attachment(s)
Re: Should vacuum process config file reload more often

On Sun, Apr 2, 2023 at 10:28 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Thank you for updating the patches. Here are comments for 0001, 0002,
and 0003 patches:

Thanks for the review!

v13 attached with requested updates.

0001:

@@ -391,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
params->truncate != VACOPTVALUE_AUTO);
-        vacrel->failsafe_active = false;
+        VacuumFailsafeActive = false;

If we go with the idea of using VacuumCostActive +
VacuumFailsafeActive, we need to make sure that both are cleared at
the end of the vacuum per table. Since the patch clears it only here,
it remains true even after vacuum() if we trigger the failsafe mode
for the last table in the table list.

In addition, to ensure it is cleared in an error case as well, I think we
need to clear it in the PG_FINALLY() block in vacuum() too.

So, in 0001, I tried to keep it exactly the same as
LVRelState->failsafe_active except for it being a global. We don't
actually use VacuumFailsafeActive in this commit except in vacuumlazy.c,
which does its own management of the value (it resets it to false at the
top of heap_vacuum_rel()).

In the later commit which references VacuumFailsafeActive outside of
vacuumlazy.c, I had reset it in PG_FINALLY(). I hadn't reset it in the
relation list loop in vacuum(). Autovacuum calls vacuum() for each
relation. However, you are right that for VACUUM with a list of
relations for a table access method other than heap, once set to true,
if the table AM forgets to reset the value to false at the end of
vacuuming the relation, it would stay true.

I've set it to false now at the bottom of the loop through relations in
vacuum().

---
@@ -306,6 +306,7 @@ extern PGDLLIMPORT pg_atomic_uint32
*VacuumSharedCostBalance;
extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
extern PGDLLIMPORT int VacuumCostBalanceLocal;

+extern bool VacuumFailsafeActive;

Do we need PGDLLIMPORT for VacuumFailsafeActive?

I didn't add one because I thought extensions and other code probably
shouldn't access this variable. I thought PGDLLIMPORT was only needed
for extensions built on Windows to access variables.

0002:

@@ -2388,6 +2398,7 @@ vac_max_items_to_alloc_size(int max_items)
return offsetof(VacDeadItems, items) +
sizeof(ItemPointerData) * max_items;
}

+
/*
* vac_tid_reaped() -- is a particular tid deletable?
*

Unnecessary new line. There are some other unnecessary new lines in this patch.

Thanks! I think I got them all.

---
@@ -307,6 +309,8 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
extern PGDLLIMPORT int VacuumCostBalanceLocal;
extern bool VacuumFailsafeActive;
+extern int     VacuumCostLimit;
+extern double VacuumCostDelay;

and

@@ -266,8 +266,6 @@ extern PGDLLIMPORT int max_parallel_maintenance_workers;
extern PGDLLIMPORT int VacuumCostPageHit;
extern PGDLLIMPORT int VacuumCostPageMiss;
extern PGDLLIMPORT int VacuumCostPageDirty;
-extern PGDLLIMPORT int VacuumCostLimit;
-extern PGDLLIMPORT double VacuumCostDelay;

Do we need PGDLLIMPORT too?

I was on the fence about this. I annotated the new guc variables
vacuum_cost_delay and vacuum_cost_limit with PGDLLIMPORT, but I did not
annotate the variables used in vacuum code (VacuumCostLimit/Delay). I
think whether or not this is the right choice depends on two things:
whether or not my understanding of PGDLLIMPORT is correct and, if it is,
whether or not we want extensions to be able to access
VacuumCostLimit/Delay or if just access to the guc variables is
sufficient/desirable.

---
@@ -1773,20 +1773,33 @@ FreeWorkerInfo(int code, Datum arg)
}
}
+
/*
- * Update the cost-based delay parameters, so that multiple workers consume
- * each a fraction of the total available I/O.
+ * Update vacuum cost-based delay-related parameters for autovacuum workers and
+ * backends executing VACUUM or ANALYZE using the value of relevant gucs and
+ * global state. This must be called during setup for vacuum and after every
+ * config reload to ensure up-to-date values.
*/
void
-AutoVacuumUpdateDelay(void)
+VacuumUpdateCosts(void)

Isn't it better to define VacuumUpdateCosts() in vacuum.c rather than
autovacuum.c as this is now a common code for both vacuum and
autovacuum?

We can't access members of WorkerInfoData from inside vacuum.c.

0003:

@@ -501,9 +502,9 @@ vacuum(List *relations, VacuumParams *params,
{
ListCell *cur;

-                VacuumUpdateCosts();
in_vacuum = true;
-                VacuumCostActive = (VacuumCostDelay > 0);
+                VacuumFailsafeActive = false;
+                VacuumUpdateCosts();

Hmm, if we initialize VacuumFailsafeActive here, should it be included
in 0001 patch?

See comment above. This is the first patch where we use or reference it
outside of vacuumlazy.c.

---
+        if (VacuumCostDelay > 0)
+                VacuumCostActive = true;
+        else
+        {
+                VacuumCostActive = false;
+                VacuumCostBalance = 0;
+        }

I agree to update VacuumCostActive in VacuumUpdateCosts(). But if we
do that I think this change should be included in 0002 patch.

I'm a bit hesitant to do this because in 0002 VacuumCostActive cannot
change status while vacuuming a table or even between tables for VACUUM
when a list of relations is specified (except for being disabled by
failsafe mode). Adding it to VacuumUpdateCosts() in 0003 makes it clear
that it could change while vacuuming a table, so we must update it.

I previously had 0002 introduce AutoVacuumUpdateLimit(), which only
updated VacuumCostLimit with wi_cost_limit for autovacuum workers and
then called that in vacuum_delay_point() (instead of
AutoVacuumUpdateDelay() or VacuumUpdateCosts()). I abandoned that idea
in favor of the simplicity of having VacuumUpdateCosts() just update
those variables for everyone, since it could be reused in 0003.

Now, I'm thinking the previous method might be more clear?
Or is what I have okay?
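
For reference, a minimal sketch of the "update everything in one place" variant being weighed here, covering only the explicit VACUUM/ANALYZE path. It is an illustration under assumptions rather than the patch: update_costs() is a hypothetical name, autovacuum workers and failsafe mode are not modeled, and the initial values follow the defaults visible in the 0002 diff.

```c
#include <assert.h>
#include <stdbool.h>

/* GUC-backed values, i.e. what a config reload changes. */
static double vacuum_cost_delay = 0;
static int	vacuum_cost_limit = 200;

/* Working values that vacuum code actually consults. */
static double VacuumCostDelay = -1;
static int	VacuumCostLimit = 0;
static bool VacuumCostActive = false;
static int	VacuumCostBalance = 0;

/*
 * Refresh the working values from the GUCs and keep VacuumCostActive in
 * step, so that a reload in the middle of a table can switch cost-based
 * delays on or off.
 */
static void
update_costs(void)
{
	VacuumCostDelay = vacuum_cost_delay;
	VacuumCostLimit = vacuum_cost_limit;

	if (VacuumCostDelay > 0)
		VacuumCostActive = true;
	else
	{
		VacuumCostActive = false;
		/* No stale balance should carry over once delays are disabled. */
		VacuumCostBalance = 0;
	}

	assert(VacuumCostActive || VacuumCostBalance == 0);
}
```

Calling update_costs() once during setup and again after every reload is what lets the active flag change while a table is being vacuumed, which is the reason given above for keeping this logic out of 0002.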

---
+        if (ConfigReloadPending && !analyze_in_outer_xact)
+        {
+                ConfigReloadPending = false;
+                ProcessConfigFile(PGC_SIGHUP);
+                VacuumUpdateCosts();
+        }

Since analyze_in_outer_xact is false by default, we reload the config
file in vacuum_delay_point() by default. We need to note that
vacuum_delay_point() could be called via other paths, for example
gin_cleanup_pending_list() and ambulkdelete() called by
validate_index(). So it seems to me that we should do the opposite; we
have another global variable, say vacuum_can_reload_config, which is
false by default, and is set to true only when vacuum() allows it. In
vacuum_delay_point(), we reload the config file iff
(ConfigReloadPending && vacuum_can_reload_config).

Wow, great point. Thanks for catching this. I've made the update you
suggested. I also set vacuum_can_reload_config to false in PG_FINALLY()
in vacuum().

- Melanie

Attachments:

v13-0002-Separate-vacuum-cost-variables-from-gucs.patch (text/x-patch; charset=US-ASCII)
From 6f40f4be3ae8bab5551a298ee6163f67c06861e9 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Mon, 3 Apr 2023 11:22:18 -0400
Subject: [PATCH v13 2/4] Separate vacuum cost variables from gucs

Vacuum code run both by autovacuum workers and a backend doing
VACUUM/ANALYZE previously used VacuumCostLimit and VacuumCostDelay which
were the global variables for the gucs vacuum_cost_limit and
vacuum_cost_delay. Autovacuum workers needed to override these variables
with their own values, derived from autovacuum_vacuum_cost_limit and
autovacuum_vacuum_cost_delay and worker cost limit balancing logic. This
led to confusing code which, in some cases, both derived and set a new
value of VacuumCostLimit from VacuumCostLimit.

In preparation for refreshing these guc values more often, separate
these variables from the gucs themselves and add a function to update
the global variables using the gucs and existing logic.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/commands/vacuum.c         | 16 ++++++++--
 src/backend/commands/vacuumparallel.c |  1 +
 src/backend/postmaster/autovacuum.c   | 45 +++++++++++++--------------
 src/backend/utils/init/globals.c      |  2 --
 src/backend/utils/misc/guc_tables.c   |  4 +--
 src/include/commands/vacuum.h         |  7 +++++
 src/include/miscadmin.h               |  2 --
 src/include/postmaster/autovacuum.h   |  3 --
 8 files changed, 45 insertions(+), 35 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 9724fbce46..96df5e2920 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -71,6 +71,18 @@ int			vacuum_multixact_freeze_min_age;
 int			vacuum_multixact_freeze_table_age;
 int			vacuum_failsafe_age;
 int			vacuum_multixact_failsafe_age;
+double		vacuum_cost_delay;
+int			vacuum_cost_limit;
+
+/*
+ * Variables for cost-based vacuum delay. The defaults differ between
+ * autovacuum and vacuum. These should be overridden with the appropriate GUC
+ * value in vacuum code.
+ * TODO: should VacuumCostLimit and VacuumCostDelay be initialized to valid or
+ * invalid values?
+ */
+int			VacuumCostLimit = 0;
+double		VacuumCostDelay = -1;
 
 /*
  * VacuumFailsafeActive is a defined as a global so that we can determine
@@ -498,6 +510,7 @@ vacuum(List *relations, VacuumParams *params,
 	{
 		ListCell   *cur;
 
+		VacuumUpdateCosts();
 		in_vacuum = true;
 		VacuumCostActive = (VacuumCostDelay > 0);
 		VacuumCostBalance = 0;
@@ -2258,8 +2271,7 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		/* update balance values for workers */
-		AutoVacuumUpdateDelay();
+		VacuumUpdateCosts();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index ff7ed0f561..cf8cf89927 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -996,6 +996,7 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 
 	/* Set cost-based vacuum delay */
 	VacuumFailsafeActive = false;
+	VacuumUpdateCosts();
 	VacuumCostActive = (VacuumCostDelay > 0);
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 585d28148c..27d0d5f9e2 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1774,16 +1774,27 @@ FreeWorkerInfo(int code, Datum arg)
 }
 
 /*
- * Update the cost-based delay parameters, so that multiple workers consume
- * each a fraction of the total available I/O.
+ * Update vacuum cost-based delay-related parameters for autovacuum workers and
+ * backends executing VACUUM or ANALYZE using the value of relevant gucs and
+ * global state. This must be called during setup for vacuum and after every
+ * config reload to ensure up-to-date values.
  */
 void
-AutoVacuumUpdateDelay(void)
+VacuumUpdateCosts(void)
 {
-	if (MyWorkerInfo)
+	if (am_autovacuum_launcher)
+		return;
+
+	if (am_autovacuum_worker)
 	{
-		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
 		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
+	}
+	else
+	{
+		/* Must be explicit VACUUM or ANALYZE */
+		VacuumCostLimit = vacuum_cost_limit;
+		VacuumCostDelay = vacuum_cost_delay;
 	}
 }
 
@@ -1805,9 +1816,9 @@ autovac_balance_cost(void)
 	 * zero is not a valid value.
 	 */
 	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : VacuumCostLimit);
+								  autovacuum_vac_cost_limit : vacuum_cost_limit);
 	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : VacuumCostDelay);
+								  autovacuum_vac_cost_delay : vacuum_cost_delay);
 	double		cost_total;
 	double		cost_avail;
 	dlist_iter	iter;
@@ -2312,8 +2323,6 @@ do_autovacuum(void)
 		autovac_table *tab;
 		bool		isshared;
 		bool		skipit;
-		double		stdVacuumCostDelay;
-		int			stdVacuumCostLimit;
 		dlist_iter	iter;
 
 		CHECK_FOR_INTERRUPTS();
@@ -2416,14 +2425,6 @@ do_autovacuum(void)
 			continue;
 		}
 
-		/*
-		 * Remember the prevailing values of the vacuum cost GUCs.  We have to
-		 * restore these at the bottom of the loop, else we'll compute wrong
-		 * values in the next iteration of autovac_balance_cost().
-		 */
-		stdVacuumCostDelay = VacuumCostDelay;
-		stdVacuumCostLimit = VacuumCostLimit;
-
 		/* Must hold AutovacuumLock while mucking with cost balance info */
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 
@@ -2437,7 +2438,7 @@ do_autovacuum(void)
 		autovac_balance_cost();
 
 		/* set the active cost parameters from the result of that */
-		AutoVacuumUpdateDelay();
+		VacuumUpdateCosts();
 
 		/* done */
 		LWLockRelease(AutovacuumLock);
@@ -2534,10 +2535,6 @@ deleted:
 		MyWorkerInfo->wi_tableoid = InvalidOid;
 		MyWorkerInfo->wi_sharedrel = false;
 		LWLockRelease(AutovacuumScheduleLock);
-
-		/* restore vacuum cost GUCs for the next iteration */
-		VacuumCostDelay = stdVacuumCostDelay;
-		VacuumCostLimit = stdVacuumCostLimit;
 	}
 
 	/*
@@ -2820,14 +2817,14 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			? avopts->vacuum_cost_delay
 			: (autovacuum_vac_cost_delay >= 0)
 			? autovacuum_vac_cost_delay
-			: VacuumCostDelay;
+			: vacuum_cost_delay;
 
 		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
 		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
 			? avopts->vacuum_cost_limit
 			: (autovacuum_vac_cost_limit > 0)
 			? autovacuum_vac_cost_limit
-			: VacuumCostLimit;
+			: vacuum_cost_limit;
 
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 1b1d814254..8e5b065e8f 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -142,8 +142,6 @@ int			MaxBackends = 0;
 int			VacuumCostPageHit = 1;	/* GUC parameters for vacuum */
 int			VacuumCostPageMiss = 2;
 int			VacuumCostPageDirty = 20;
-int			VacuumCostLimit = 200;
-double		VacuumCostDelay = 0;
 
 int64		VacuumPageHit = 0;
 int64		VacuumPageMiss = 0;
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8062589efd..77db1a146c 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2409,7 +2409,7 @@ struct config_int ConfigureNamesInt[] =
 			gettext_noop("Vacuum cost amount available before napping."),
 			NULL
 		},
-		&VacuumCostLimit,
+		&vacuum_cost_limit,
 		200, 1, 10000,
 		NULL, NULL, NULL
 	},
@@ -3701,7 +3701,7 @@ struct config_real ConfigureNamesReal[] =
 			NULL,
 			GUC_UNIT_MS
 		},
-		&VacuumCostDelay,
+		&vacuum_cost_delay,
 		0, 0, 100,
 		NULL, NULL, NULL
 	},
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 7b8ee21788..a62dd2e781 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,8 @@ extern PGDLLIMPORT int vacuum_multixact_freeze_min_age;
 extern PGDLLIMPORT int vacuum_multixact_freeze_table_age;
 extern PGDLLIMPORT int vacuum_failsafe_age;
 extern PGDLLIMPORT int vacuum_multixact_failsafe_age;
+extern PGDLLIMPORT double vacuum_cost_delay;
+extern PGDLLIMPORT int vacuum_cost_limit;
 
 /* Variables for cost-based parallel vacuum */
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
@@ -307,6 +309,8 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
 
 extern bool VacuumFailsafeActive;
+extern int	VacuumCostLimit;
+extern double VacuumCostDelay;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
@@ -346,6 +350,9 @@ extern IndexBulkDeleteResult *vac_cleanup_one_index(IndexVacuumInfo *ivinfo,
 													IndexBulkDeleteResult *istat);
 extern Size vac_max_items_to_alloc_size(int max_items);
 
+/* In postmaster/autovacuum.c */
+extern void VacuumUpdateCosts(void);
+
 /* in commands/vacuumparallel.c */
 extern ParallelVacuumState *parallel_vacuum_init(Relation rel, Relation *indrels,
 												 int nindexes, int nrequested_workers,
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 06a86f9ac1..66db1b2c69 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -266,8 +266,6 @@ extern PGDLLIMPORT int max_parallel_maintenance_workers;
 extern PGDLLIMPORT int VacuumCostPageHit;
 extern PGDLLIMPORT int VacuumCostPageMiss;
 extern PGDLLIMPORT int VacuumCostPageDirty;
-extern PGDLLIMPORT int VacuumCostLimit;
-extern PGDLLIMPORT double VacuumCostDelay;
 
 extern PGDLLIMPORT int64 VacuumPageHit;
 extern PGDLLIMPORT int64 VacuumPageMiss;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index c140371b51..65afd1ea1e 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -63,9 +63,6 @@ extern int	StartAutoVacWorker(void);
 /* called from postmaster when a worker could not be forked */
 extern void AutoVacWorkerFailed(void);
 
-/* autovacuum cost-delay balancer */
-extern void AutoVacuumUpdateDelay(void);
-
 #ifdef EXEC_BACKEND
 extern void AutoVacLauncherMain(int argc, char *argv[]) pg_attribute_noreturn();
 extern void AutoVacWorkerMain(int argc, char *argv[]) pg_attribute_noreturn();
-- 
2.37.2

v13-0004-Autovacuum-refreshes-cost-based-delay-params-mor.patch (text/x-patch; charset=US-ASCII)
From e757a4ae08269bdd8fdcf77b6eeff4a292d1a8ec Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 25 Mar 2023 14:14:55 -0400
Subject: [PATCH v13 4/4] Autovacuum refreshes cost-based delay params more
 often

The previous commit allowed VACUUM to reload the config file more often
so that cost-based delay parameters could take effect while VACUUMing a
relation. Autovacuum, however, did not benefit from this change.

In order for autovacuum workers to safely update their own cost delay
and cost limit parameters without impacting performance, we had to
rethink when and how these values were accessed.

Previously, an autovacuum worker's wi_cost_limit was set only at the
beginning of vacuuming a table, after reloading the config file.
Therefore, at the time that autovac_balance_cost() is called, workers
vacuuming tables with no table options could still have different values
for their wi_cost_limit_base and wi_cost_delay.

Now that the cost parameters can be updated while vacuuming a table,
workers will (within some margin of error) have no reason to have
different values for cost limit and cost delay (in the absence of table
options). This removes the rationale for keeping cost limit and cost
delay in shared memory. Balancing the cost limit requires only the
number of active autovacuum workers vacuuming a table with no cost-based
table options.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/commands/vacuum.c       |  17 +-
 src/backend/postmaster/autovacuum.c | 235 ++++++++++++++--------------
 src/include/commands/vacuum.h       |   1 +
 3 files changed, 134 insertions(+), 119 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 888eb022cc..3e34324645 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2257,10 +2257,10 @@ vacuum_delay_point(void)
 
 	/*
 	 * Reload the configuration file if requested. This allows changes to
-	 * vacuum_cost_limit and vacuum_cost_delay to take effect while a table is
-	 * being vacuumed or analyzed. Analyze should not reload configuration
-	 * file if it is in an outer transaction, as we currently only allow
-	 * configuration reload when in top-level statements.
+	 * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay to
+	 * take effect while a table is being vacuumed or analyzed. Analyze should
+	 * not reload configuration file if it is in an outer transaction, as we
+	 * currently only allow configuration reload when in top-level statements.
 	 */
 	if (ConfigReloadPending && vacuum_can_reload_config)
 	{
@@ -2306,7 +2306,14 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		VacuumUpdateCosts();
+		/*
+		 * Balance and update limit values for autovacuum workers. We must
+		 * always do this in case the autovacuum launcher or another
+		 * autovacuum worker has recalculated the number of workers across
+		 * which we must balance the limit. This is done by the launcher when
+		 * launching a new worker and by workers before vacuuming each table.
+		 */
+		AutoVacuumUpdateLimit();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 7a9738202f..8cb9bee4eb 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -139,6 +139,9 @@ int			Log_autovacuum_min_duration = 600000;
 static bool am_autovacuum_launcher = false;
 static bool am_autovacuum_worker = false;
 
+static double av_relopt_cost_delay = -1;
+static int	av_relopt_cost_limit = 0;
+
 /* Flags set by signal handlers */
 static volatile sig_atomic_t got_SIGUSR2 = false;
 
@@ -189,8 +192,8 @@ typedef struct autovac_table
 {
 	Oid			at_relid;
 	VacuumParams at_params;
-	double		at_vacuum_cost_delay;
-	int			at_vacuum_cost_limit;
+	double		at_relopt_vac_cost_delay;
+	int			at_relopt_vac_cost_limit;
 	bool		at_dobalance;
 	bool		at_sharedrel;
 	char	   *at_relname;
@@ -209,7 +212,7 @@ typedef struct autovac_table
  * wi_sharedrel flag indicating whether table is marked relisshared
  * wi_proc		pointer to PGPROC of the running worker, NULL if not started
  * wi_launchtime Time at which this worker was launched
- * wi_cost_*	Vacuum cost-based delay parameters current in this worker
+ * wi_dobalance Whether this worker should be included in balance calculations
  *
  * All fields are protected by AutovacuumLock, except for wi_tableoid and
  * wi_sharedrel which are protected by AutovacuumScheduleLock (note these
@@ -223,11 +226,8 @@ typedef struct WorkerInfoData
 	Oid			wi_tableoid;
 	PGPROC	   *wi_proc;
 	TimestampTz wi_launchtime;
-	bool		wi_dobalance;
+	pg_atomic_flag wi_dobalance;
 	bool		wi_sharedrel;
-	double		wi_cost_delay;
-	int			wi_cost_limit;
-	int			wi_cost_limit_base;
 } WorkerInfoData;
 
 typedef struct WorkerInfoData *WorkerInfo;
@@ -273,6 +273,8 @@ typedef struct AutoVacuumWorkItem
  * av_startingWorker pointer to WorkerInfo currently being started (cleared by
  *					the worker itself as soon as it's up and running)
  * av_workItems		work item array
+ * av_nworkersForBalance the number of autovacuum workers to use when
+ * 					calculating the per worker cost limit
  *
  * This struct is protected by AutovacuumLock, except for av_signal and parts
  * of the worker list (see above).
@@ -286,6 +288,7 @@ typedef struct
 	dlist_head	av_runningWorkers;
 	WorkerInfo	av_startingWorker;
 	AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+	pg_atomic_uint32 av_nworkersForBalance;
 } AutoVacuumShmemStruct;
 
 static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -319,7 +322,7 @@ static void launch_worker(TimestampTz now);
 static List *get_database_list(void);
 static void rebuild_database_list(Oid newdb);
 static int	db_comparator(const void *a, const void *b);
-static void autovac_balance_cost(void);
+static void autovac_recalculate_workers_for_balance(void);
 
 static void do_autovacuum(void);
 static void FreeWorkerInfo(int code, Datum arg);
@@ -670,7 +673,7 @@ AutoVacLauncherMain(int argc, char *argv[])
 			{
 				LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 				AutoVacuumShmem->av_signal[AutoVacRebalance] = false;
-				autovac_balance_cost();
+				autovac_recalculate_workers_for_balance();
 				LWLockRelease(AutovacuumLock);
 			}
 
@@ -820,8 +823,8 @@ HandleAutoVacLauncherInterrupts(void)
 			AutoVacLauncherShutdown();
 
 		/* rebalance in case the default cost parameters changed */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-		autovac_balance_cost();
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
+		autovac_recalculate_workers_for_balance();
 		LWLockRelease(AutovacuumLock);
 
 		/* rebuild the list in case the naptime changed */
@@ -1755,10 +1758,7 @@ FreeWorkerInfo(int code, Datum arg)
 		MyWorkerInfo->wi_sharedrel = false;
 		MyWorkerInfo->wi_proc = NULL;
 		MyWorkerInfo->wi_launchtime = 0;
-		MyWorkerInfo->wi_dobalance = false;
-		MyWorkerInfo->wi_cost_delay = 0;
-		MyWorkerInfo->wi_cost_limit = 0;
-		MyWorkerInfo->wi_cost_limit_base = 0;
+		pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
 		dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 						&MyWorkerInfo->wi_links);
 		/* not mine anymore */
@@ -1787,14 +1787,21 @@ VacuumUpdateCosts(void)
 
 	if (am_autovacuum_worker)
 	{
-		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
-		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
+		if (av_relopt_cost_delay >= 0)
+			VacuumCostDelay = av_relopt_cost_delay;
+		else if (autovacuum_vac_cost_delay >= 0)
+			VacuumCostDelay = autovacuum_vac_cost_delay;
+		else
+			/* fall back to vacuum_cost_delay */
+			VacuumCostDelay = vacuum_cost_delay;
+
+		AutoVacuumUpdateLimit();
 	}
 	else
 	{
 		/* Must be explicit VACUUM or ANALYZE */
-		VacuumCostLimit = vacuum_cost_limit;
 		VacuumCostDelay = vacuum_cost_delay;
+		VacuumCostLimit = vacuum_cost_limit;
 	}
 
 	/*
@@ -1817,85 +1824,82 @@ VacuumUpdateCosts(void)
 }
 
 /*
- * autovac_balance_cost
- *		Recalculate the cost limit setting for each active worker.
- *
- * Caller must hold the AutovacuumLock in exclusive mode.
- */
-static void
-autovac_balance_cost(void)
+ * Update VacuumCostLimit with the correct value for an autovacuum worker, given
+ * the value of other relevant cost limit parameters and the number of workers
+ * across which the limit must be balanced. Autovacuum workers must call this
+ * regularly in case av_nworkersForBalance has been updated by another worker
+ * or by the autovacuum launcher. They must also call it after a config reload.
+ */
+void
+AutoVacuumUpdateLimit(void)
 {
+	if (!am_autovacuum_worker)
+		return;
+
 	/*
-	 * The idea here is that we ration out I/O equally.  The amount of I/O
-	 * that a worker can consume is determined by cost_limit/cost_delay, so we
-	 * try to equalize those ratios rather than the raw limit settings.
-	 *
 	 * note: in cost_limit, zero also means use value from elsewhere, because
 	 * zero is not a valid value.
 	 */
-	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : vacuum_cost_limit);
-	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : vacuum_cost_delay);
-	double		cost_total;
-	double		cost_avail;
-	dlist_iter	iter;
-
-	/* not set? nothing to do */
-	if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
-		return;
 
-	/* calculate the total base cost limit of participating active workers */
-	cost_total = 0.0;
-	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
+	if (av_relopt_cost_limit > 0)
+		VacuumCostLimit = av_relopt_cost_limit;
+	else
 	{
-		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
+		int			nworkers_for_balance;
+
+		if (autovacuum_vac_cost_limit > 0)
+			VacuumCostLimit = autovacuum_vac_cost_limit;
+		else
+			VacuumCostLimit = vacuum_cost_limit;
+
+		/* Only balance limit if no table options specified */
+		if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
+			return;
+
+		Assert(VacuumCostLimit > 0);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-			cost_total +=
-				(double) worker->wi_cost_limit_base / worker->wi_cost_delay;
+		nworkers_for_balance = pg_atomic_read_u32(
+								&AutoVacuumShmem->av_nworkersForBalance);
+
+		/* There is at least 1 autovac worker (this worker). */
+		Assert(nworkers_for_balance > 0);
+
+		VacuumCostLimit = Max(VacuumCostLimit / nworkers_for_balance, 1);
 	}
+}
 
-	/* there are no cost limits -- nothing to do */
-	if (cost_total <= 0)
-		return;
+/*
+ * autovac_recalculate_workers_for_balance
+ *		Recalculate the number of workers to consider, given table options and
+ *		the current number of active workers.
+ *
+ * Caller must hold the AutovacuumLock in at least shared mode to access
+ * worker->wi_proc.
+ */
+static void
+autovac_recalculate_workers_for_balance(void)
+{
+	dlist_iter	iter;
+	int			orig_nworkers_for_balance;
+	int			nworkers_for_balance = 0;
+
+	orig_nworkers_for_balance =
+		pg_atomic_read_u32(&AutoVacuumShmem->av_nworkersForBalance);
 
-	/*
-	 * Adjust cost limit of each active worker to balance the total of cost
-	 * limit to autovacuum_vacuum_cost_limit.
-	 */
-	cost_avail = (double) vac_cost_limit / vac_cost_delay;
 	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
 	{
 		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-		{
-			int			limit = (int)
-			(cost_avail * worker->wi_cost_limit_base / cost_total);
-
-			/*
-			 * We put a lower bound of 1 on the cost_limit, to avoid division-
-			 * by-zero in the vacuum code.  Also, in case of roundoff trouble
-			 * in these calculations, let's be sure we don't ever set
-			 * cost_limit to more than the base value.
-			 */
-			worker->wi_cost_limit = Max(Min(limit,
-											worker->wi_cost_limit_base),
-										1);
-		}
+		if (worker->wi_proc == NULL ||
+			pg_atomic_unlocked_test_flag(&worker->wi_dobalance))
+			continue;
 
-		if (worker->wi_proc != NULL)
-			elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
-				 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
-				 worker->wi_dobalance ? "yes" : "no",
-				 worker->wi_cost_limit, worker->wi_cost_limit_base,
-				 worker->wi_cost_delay);
+		nworkers_for_balance++;
 	}
+
+	if (nworkers_for_balance != orig_nworkers_for_balance)
+		pg_atomic_write_u32(&AutoVacuumShmem->av_nworkersForBalance,
+							nworkers_for_balance);
 }
 
 /*
@@ -2443,23 +2447,31 @@ do_autovacuum(void)
 			continue;
 		}
 
-		/* Must hold AutovacuumLock while mucking with cost balance info */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 
-		/* advertise my cost delay parameters for the balancing algorithm */
-		MyWorkerInfo->wi_dobalance = tab->at_dobalance;
-		MyWorkerInfo->wi_cost_delay = tab->at_vacuum_cost_delay;
-		MyWorkerInfo->wi_cost_limit = tab->at_vacuum_cost_limit;
-		MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
+		/*
+		 * Save the cost-related table options in global variables for
+		 * reference when updating VacuumCostLimit and VacuumCostDelay during
+		 * vacuuming this table.
+		 */
+		av_relopt_cost_limit = tab->at_relopt_vac_cost_limit;
+		av_relopt_cost_delay = tab->at_relopt_vac_cost_delay;
 
-		/* do a balance */
-		autovac_balance_cost();
+		if (tab->at_dobalance)
+			pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
+		else
+			pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
 
-		/* set the active cost parameters from the result of that */
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
+		autovac_recalculate_workers_for_balance();
+		LWLockRelease(AutovacuumLock);
+
+		/*
+		 * We wait until this point to update cost delay and cost limit
+		 * values, even though we reloaded the configuration file above, so
+		 * that we can take into account the cost-related table options.
+		 */
 		VacuumUpdateCosts();
 
-		/* done */
-		LWLockRelease(AutovacuumLock);
 
 		/* clean up memory before each iteration */
 		MemoryContextResetAndDeleteChildren(PortalContext);
@@ -2544,10 +2556,10 @@ deleted:
 
 		/*
 		 * Remove my info from shared memory.  We could, but intentionally
-		 * don't, clear wi_cost_limit and friends --- this is on the
-		 * assumption that we probably have more to do with similar cost
-		 * settings, so we don't want to give up our share of I/O for a very
-		 * short interval and thereby thrash the global balance.
+		 * don't, unset wi_dobalance on the assumption that we are more likely
+		 * than not to vacuum a table with no table options next, so we don't
+		 * want to give up our share of I/O for a very short interval and
+		 * thereby thrash the global balance.
 		 */
 		LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
 		MyWorkerInfo->wi_tableoid = InvalidOid;
@@ -2584,6 +2596,7 @@ deleted:
 		{
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			VacuumUpdateCosts();
 		}
 
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
@@ -2819,8 +2832,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		int			freeze_table_age;
 		int			multixact_freeze_min_age;
 		int			multixact_freeze_table_age;
-		int			vac_cost_limit;
-		double		vac_cost_delay;
 		int			log_min_duration;
 
 		/*
@@ -2830,20 +2841,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 * defaults, autovacuum's own first and plain vacuum second.
 		 */
 
-		/* -1 in autovac setting means use plain vacuum_cost_delay */
-		vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
-			? avopts->vacuum_cost_delay
-			: (autovacuum_vac_cost_delay >= 0)
-			? autovacuum_vac_cost_delay
-			: vacuum_cost_delay;
-
-		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
-		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
-			? avopts->vacuum_cost_limit
-			: (autovacuum_vac_cost_limit > 0)
-			? autovacuum_vac_cost_limit
-			: vacuum_cost_limit;
-
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
 			? avopts->log_min_duration
@@ -2899,8 +2896,10 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age;
 		tab->at_params.is_wraparound = wraparound;
 		tab->at_params.log_min_duration = log_min_duration;
-		tab->at_vacuum_cost_limit = vac_cost_limit;
-		tab->at_vacuum_cost_delay = vac_cost_delay;
+		tab->at_relopt_vac_cost_limit = avopts ?
+			avopts->vacuum_cost_limit : 0;
+		tab->at_relopt_vac_cost_delay = avopts ?
+			avopts->vacuum_cost_delay : -1;
 		tab->at_relname = NULL;
 		tab->at_nspname = NULL;
 		tab->at_datname = NULL;
@@ -3392,10 +3391,18 @@ AutoVacuumShmemInit(void)
 		worker = (WorkerInfo) ((char *) AutoVacuumShmem +
 							   MAXALIGN(sizeof(AutoVacuumShmemStruct)));
 
+
 		/* initialize the WorkerInfo free list */
 		for (i = 0; i < autovacuum_max_workers; i++)
+		{
 			dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 							&worker[i].wi_links);
+
+			pg_atomic_init_flag(&worker[i].wi_dobalance);
+		}
+
+		pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
+
 	}
 	else
 		Assert(found);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index a62dd2e781..6b286037ca 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -351,6 +351,7 @@ extern IndexBulkDeleteResult *vac_cleanup_one_index(IndexVacuumInfo *ivinfo,
 extern Size vac_max_items_to_alloc_size(int max_items);
 
 /* In postmaster/autovacuum.c */
+extern void AutoVacuumUpdateLimit(void);
 extern void VacuumUpdateCosts(void);
 
 /* in commands/vacuumparallel.c */
-- 
2.37.2
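
A minimal sketch of the limit calculation that AutoVacuumUpdateLimit() performs in the patch above, written as a stand-alone function so it compiles outside the server. The name resolve_cost_limit and its parameters are hypothetical stand-ins for the table option, the two GUCs, the wi_dobalance flag and av_nworkersForBalance; this is an illustration, not the patch's code.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-alone model of the per-worker cost limit.
 * All names here are illustrative and do not exist in PostgreSQL.
 */
int
resolve_cost_limit(int relopt_limit, int autovac_limit, int vacuum_limit,
				   bool dobalance, int nworkers_for_balance)
{
	int			limit;

	/* A table-level cost limit wins and is never balanced. */
	if (relopt_limit > 0)
		return relopt_limit;

	/* 0 or -1 in the autovacuum setting means "use vacuum_cost_limit". */
	limit = (autovac_limit > 0) ? autovac_limit : vacuum_limit;

	/* Workers excluded from balancing keep the full limit. */
	if (!dobalance)
		return limit;

	/* Guard the division; the patch uses an Assert here instead. */
	if (nworkers_for_balance < 1)
		nworkers_for_balance = 1;

	/* Split evenly, but never drop below 1. */
	limit /= nworkers_for_balance;
	return (limit < 1) ? 1 : limit;
}
```

With vacuum_cost_limit = 200, autovacuum_vacuum_cost_limit = -1 and three balanced workers, each worker ends up with a limit of 66.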

v13-0003-VACUUM-reloads-config-file-more-often.patch (text/x-patch; charset=US-ASCII)
From fabe03a1e3cca701e35e6394c82e218369cd63a1 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Mon, 3 Apr 2023 12:36:35 -0400
Subject: [PATCH v13 3/4] VACUUM reloads config file more often

Previously, VACUUM would not reload the configuration file, so changes
to cost-based delay parameters could only take effect on the next
invocation of VACUUM.

Now, check if a reload is pending roughly once per block, when checking
if we need to delay.

Note that autovacuum is unaffected by this change. Autovacuum workers
overwrite the value of VacuumCostLimit and VacuumCostDelay with their
own WorkerInfo->wi_cost_limit and wi_cost_delay instead of using
potentially refreshed values of autovacuum_vacuum_cost_limit and
autovacuum_vacuum_cost_delay. Locking considerations and needed updates
to the worker balancing logic make enabling this feature for autovacuum
worthy of an independent commit.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/commands/vacuum.c         | 41 +++++++++++++++++++++++++--
 src/backend/commands/vacuumparallel.c |  5 ++--
 src/backend/postmaster/autovacuum.c   | 18 ++++++++++++
 3 files changed, 58 insertions(+), 6 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 96df5e2920..888eb022cc 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -48,6 +48,7 @@
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/interrupt.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/pmsignal.h"
@@ -83,6 +84,7 @@ int			vacuum_cost_limit;
  */
 int			VacuumCostLimit = 0;
 double		VacuumCostDelay = -1;
+static bool vacuum_can_reload_config = false;
 
 /*
  * VacuumFailsafeActive is defined as a global so that we can determine
@@ -354,6 +356,8 @@ vacuum(List *relations, VacuumParams *params,
 	else
 		in_outer_xact = IsInTransactionBlock(isTopLevel);
 
+	vacuum_can_reload_config = !in_outer_xact;
+
 	/*
 	 * Check for and disallow recursive calls.  This could happen when VACUUM
 	 * FULL or ANALYZE calls a hostile index expression that itself calls
@@ -510,9 +514,9 @@ vacuum(List *relations, VacuumParams *params,
 	{
 		ListCell   *cur;
 
-		VacuumUpdateCosts();
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
+		VacuumFailsafeActive = false;
+		VacuumUpdateCosts();
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
@@ -566,12 +570,21 @@ vacuum(List *relations, VacuumParams *params,
 					CommandCounterIncrement();
 				}
 			}
+
+			/*
+			 * Ensure VacuumFailsafeActive has been reset before vacuuming the
+			 * next relation.
+			 */
+			VacuumFailsafeActive = false;
 		}
 	}
 	PG_FINALLY();
 	{
 		in_vacuum = false;
 		VacuumCostActive = false;
+		VacuumFailsafeActive = false;
+		VacuumCostBalance = 0;
+		vacuum_can_reload_config = false;
 	}
 	PG_END_TRY();
 
@@ -2238,7 +2251,29 @@ vacuum_delay_point(void)
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	if (!VacuumCostActive || InterruptPending)
+	if (InterruptPending ||
+		(!VacuumCostActive && !ConfigReloadPending))
+		return;
+
+	/*
+	 * Reload the configuration file if requested. This allows changes to
+	 * vacuum_cost_limit and vacuum_cost_delay to take effect while a table is
+	 * being vacuumed or analyzed. Analyze should not reload configuration
+	 * file if it is in an outer transaction, as we currently only allow
+	 * configuration reload when in top-level statements.
+	 */
+	if (ConfigReloadPending && vacuum_can_reload_config)
+	{
+		ConfigReloadPending = false;
+		ProcessConfigFile(PGC_SIGHUP);
+		VacuumUpdateCosts();
+	}
+
+	/*
+	 * If we disabled cost-based delays after reloading the config file,
+	 * return.
+	 */
+	if (!VacuumCostActive)
 		return;
 
 	/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index cf8cf89927..4bc1c14dff 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -995,10 +995,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												 false);
 
 	/* Set cost-based vacuum delay */
-	VacuumFailsafeActive = false;
-	VacuumUpdateCosts();
-	VacuumCostActive = (VacuumCostDelay > 0);
+	Assert(!VacuumFailsafeActive);
 	VacuumCostBalance = 0;
+	VacuumUpdateCosts();
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
 	VacuumPageDirty = 0;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 27d0d5f9e2..7a9738202f 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1796,6 +1796,24 @@ VacuumUpdateCosts(void)
 		VacuumCostLimit = vacuum_cost_limit;
 		VacuumCostDelay = vacuum_cost_delay;
 	}
+
+	/*
+	 * If configuration changes are allowed to impact VacuumCostActive, make
+	 * sure it is updated.
+	 */
+	if (VacuumFailsafeActive)
+	{
+		Assert(!VacuumCostActive);
+		return;
+	}
+
+	if (VacuumCostDelay > 0)
+		VacuumCostActive = true;
+	else
+	{
+		VacuumCostActive = false;
+		VacuumCostBalance = 0;
+	}
 }
 
 /*
-- 
2.37.2
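
The control flow that this patch adds at the top of vacuum_delay_point() can be modelled in isolation. The sketch below is an assumption-laden illustration: the DelayPointState struct and delay_point_should_throttle() are hypothetical, the config reload is reduced to "pick up a new cost delay", and the failsafe check is left out.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the globals the patch consults. */
typedef struct DelayPointState
{
	bool		interrupt_pending;	/* models InterruptPending */
	bool		reload_pending;		/* models ConfigReloadPending */
	bool		can_reload;			/* models vacuum_can_reload_config */
	bool		cost_active;		/* models VacuumCostActive */
	double		file_cost_delay;	/* delay a reload would pick up */
} DelayPointState;

/* Returns true when the caller should go on to cost accounting and sleep. */
bool
delay_point_should_throttle(DelayPointState *s)
{
	/* Nothing to do if interrupted, or if idle with no reload requested. */
	if (s->interrupt_pending ||
		(!s->cost_active && !s->reload_pending))
		return false;

	/* Reload only where it is allowed; otherwise leave the request pending. */
	if (s->reload_pending && s->can_reload)
	{
		s->reload_pending = false;
		s->cost_active = (s->file_cost_delay > 0);
	}

	/* The reload may have switched cost-based delay off (or on). */
	return s->cost_active;
}
```

Note that when can_reload is false the pending flag is left set, so the reload still happens later at the usual place between commands.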

v13-0001-Make-vacuum-s-failsafe_active-a-global.patch (text/x-patch; charset=US-ASCII)
From b864281b99ecc58d53e0828a3ed1823609f5a7bc Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Fri, 31 Mar 2023 10:38:39 -0400
Subject: [PATCH v13 1/4] Make vacuum's failsafe_active a global

While vacuuming a table in failsafe mode, VacuumCostActive should not be
re-enabled. This currently isn't a problem because vacuum cost
parameters are only refreshed in between vacuuming tables and failsafe
status is reset for every table. In preparation for allowing vacuum cost
parameters to be updated more frequently, elevate
LVRelState->failsafe_active to a global, VacuumFailsafeActive, which
will be checked when determining whether or not to re-enable vacuum
cost-related delays.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c  | 16 +++++++---------
 src/backend/commands/vacuum.c         |  9 +++++++++
 src/backend/commands/vacuumparallel.c |  1 +
 src/include/commands/vacuum.h         |  1 +
 4 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 3e5d3982c7..6d761e6d0e 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -153,8 +153,6 @@ typedef struct LVRelState
 	bool		aggressive;
 	/* Use visibility map to skip? (disabled by DISABLE_PAGE_SKIPPING) */
 	bool		skipwithvm;
-	/* Wraparound failsafe has been triggered? */
-	bool		failsafe_active;
 	/* Consider index vacuuming bypass optimization? */
 	bool		consider_bypass_optimization;
 
@@ -391,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
 	Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
 		   params->truncate != VACOPTVALUE_AUTO);
-	vacrel->failsafe_active = false;
+	VacuumFailsafeActive = false;
 	vacrel->consider_bypass_optimization = true;
 	vacrel->do_index_vacuuming = true;
 	vacrel->do_index_cleanup = true;
@@ -709,7 +707,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 			}
 			else
 			{
-				if (!vacrel->failsafe_active)
+				if (!VacuumFailsafeActive)
 					appendStringInfoString(&buf, _("index scan bypassed: "));
 				else
 					appendStringInfoString(&buf, _("index scan bypassed by failsafe: "));
@@ -2293,7 +2291,7 @@ lazy_vacuum(LVRelState *vacrel)
 		 * vacuuming or heap vacuuming.  This VACUUM operation won't end up
 		 * back here again.
 		 */
-		Assert(vacrel->failsafe_active);
+		Assert(VacuumFailsafeActive);
 	}
 
 	/*
@@ -2374,7 +2372,7 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	 */
 	Assert(vacrel->num_index_scans > 0 ||
 		   vacrel->dead_items->num_items == vacrel->lpdead_items);
-	Assert(allindexes || vacrel->failsafe_active);
+	Assert(allindexes || VacuumFailsafeActive);
 
 	/*
 	 * Increase and report the number of index scans.
@@ -2616,12 +2614,12 @@ static bool
 lazy_check_wraparound_failsafe(LVRelState *vacrel)
 {
 	/* Don't warn more than once per VACUUM */
-	if (vacrel->failsafe_active)
+	if (VacuumFailsafeActive)
 		return true;
 
 	if (unlikely(vacuum_xid_failsafe_check(&vacrel->cutoffs)))
 	{
-		vacrel->failsafe_active = true;
+		VacuumFailsafeActive = true;
 
 		/*
 		 * Abandon use of a buffer access strategy to allow use of all of
@@ -2820,7 +2818,7 @@ should_attempt_truncation(LVRelState *vacrel)
 {
 	BlockNumber possibly_freeable;
 
-	if (!vacrel->do_rel_truncate || vacrel->failsafe_active ||
+	if (!vacrel->do_rel_truncate || VacuumFailsafeActive ||
 		old_snapshot_threshold >= 0)
 		return false;
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index da85330ef4..9724fbce46 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -72,6 +72,15 @@ int			vacuum_multixact_freeze_table_age;
 int			vacuum_failsafe_age;
 int			vacuum_multixact_failsafe_age;
 
+/*
+ * VacuumFailsafeActive is defined as a global so that we can determine
+ * whether or not to re-enable cost-based vacuum delay when vacuuming a table.
+ * If failsafe mode has been engaged, we will not re-enable cost-based delay
+ * for the table until after vacuuming has completed, regardless of other
+ * settings.
+ */
+bool		VacuumFailsafeActive = false;
+
 /*
  * Variables for cost-based parallel vacuum.  See comments atop
  * compute_parallel_delay to understand how it works.
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 2cdbd182b6..ff7ed0f561 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -995,6 +995,7 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												 false);
 
 	/* Set cost-based vacuum delay */
+	VacuumFailsafeActive = false;
 	VacuumCostActive = (VacuumCostDelay > 0);
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index bdfd96cfec..7b8ee21788 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -306,6 +306,7 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
 
+extern bool VacuumFailsafeActive;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
-- 
2.37.2
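
This patch only introduces the global; the check it prepares for appears in the VacuumUpdateCosts() hunk of the 0003 patch. A minimal sketch of that check follows, assuming a hypothetical CostState struct and refresh_cost_active() function in place of the real globals.

```c
#include <stdbool.h>

/* Hypothetical bundle of the globals involved. */
typedef struct CostState
{
	bool		failsafe_active;	/* models VacuumFailsafeActive */
	bool		cost_active;		/* models VacuumCostActive */
	int			cost_balance;		/* models VacuumCostBalance */
} CostState;

/* Recompute whether cost-based delay is on after the delay value changed. */
void
refresh_cost_active(CostState *s, double cost_delay)
{
	/* Failsafe already turned throttling off; a reload must not undo that. */
	if (s->failsafe_active)
		return;

	if (cost_delay > 0)
		s->cost_active = true;
	else
	{
		/* Throttling disabled: also drop any accumulated balance. */
		s->cost_active = false;
		s->cost_balance = 0;
	}
}
```

The early return is the reason the flag has to be visible outside LVRelState: the code that refreshes cost parameters has no access to the per-relation vacuum state.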

#45 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Melanie Plageman (#44)
Re: Should vacuum process config file reload more often

Melanie Plageman <melanieplageman@gmail.com> writes:

> v13 attached with requested updates.

I'm afraid I'd not been paying any attention to this discussion,
but better late than never. I'm okay with letting autovacuum
processes reload config files more often than now. However,
I object to allowing ProcessConfigFile to be called from within
commands in a normal user backend. The existing semantics are
that user backends respond to SIGHUP only at the start of processing
a user command, and I'm uncomfortable with suddenly deciding that
that can work differently if the command happens to be VACUUM.
It seems unprincipled and perhaps actively unsafe.

regards, tom lane
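
The existing semantics described in this message can be illustrated with a small model: the signal handler only records the request, and the flag is consulted once, before a command starts. The names request_reload, run_command and config_generation are hypothetical and do not correspond to PostgreSQL functions.

```c
#include <signal.h>

static volatile sig_atomic_t reload_pending = 0;	/* set by the handler */
static int	config_generation = 0;	/* bumped on each reload */

/* Signal-handler-style function: only records that a reload was requested. */
void
request_reload(int signo)
{
	(void) signo;
	reload_pending = 1;
}

/*
 * Run one command and return the config generation it executed with.
 * The pending flag is consulted only here, before the command starts.
 */
int
run_command(int reload_arrives_mid_command)
{
	int			generation_seen;

	if (reload_pending)
	{
		reload_pending = 0;
		config_generation++;
	}
	generation_seen = config_generation;

	/* Stands in for a SIGHUP delivered while the command is running. */
	if (reload_arrives_mid_command)
		request_reload(0);

	return generation_seen;
}
```

A reload requested while a command is running is seen only by the next command, which is the behaviour the proposed patch would change for VACUUM.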

#46 Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#45)
Re: Should vacuum process config file reload more often

Hi,

On 2023-04-03 14:43:14 -0400, Tom Lane wrote:

> Melanie Plageman <melanieplageman@gmail.com> writes:
> > v13 attached with requested updates.
>
> I'm afraid I'd not been paying any attention to this discussion,
> but better late than never. I'm okay with letting autovacuum
> processes reload config files more often than now. However,
> I object to allowing ProcessConfigFile to be called from within
> commands in a normal user backend. The existing semantics are
> that user backends respond to SIGHUP only at the start of processing
> a user command, and I'm uncomfortable with suddenly deciding that
> that can work differently if the command happens to be VACUUM.
> It seems unprincipled and perhaps actively unsafe.

I think it should be ok in commands like VACUUM that already internally start
their own transactions, and thus require to be run outside of a transaction
and at the toplevel. I share your concerns about allowing config reload in
arbitrary places. While we might want to go there, it would require a lot more
analysis.

Greetings,

Andres Freund
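
The v13 0003 patch expresses this condition as vacuum_can_reload_config = !in_outer_xact. A minimal sketch of how that flag could be derived is shown below. The function names are hypothetical, and the VACUUM branch is an assumption, since the hunk in the patch shows only the else branch that calls IsInTransactionBlock().

```c
#include <stdbool.h>

/*
 * Models in_outer_xact as computed in vacuum().  Assumption: a command that
 * includes VACUUM has already been rejected if it was issued inside a
 * transaction block, so it is never nested at this point.
 */
bool
compute_in_outer_xact(bool is_vacuum, bool in_transaction_block)
{
	if (is_vacuum)
		return false;

	/* ANALYZE alone may run inside BEGIN ... COMMIT. */
	return in_transaction_block;
}

/* Models vacuum_can_reload_config = !in_outer_xact from the patch. */
bool
can_reload_config(bool is_vacuum, bool in_transaction_block)
{
	return !compute_in_outer_xact(is_vacuum, in_transaction_block);
}
```

Under this model only an ANALYZE issued inside a user transaction keeps the old reload cadence.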

#47 Melanie Plageman
melanieplageman@gmail.com
In reply to: Andres Freund (#46)
4 attachment(s)
Re: Should vacuum process config file reload more often

On Mon, Apr 3, 2023 at 3:08 PM Andres Freund <andres@anarazel.de> wrote:

> On 2023-04-03 14:43:14 -0400, Tom Lane wrote:
> > Melanie Plageman <melanieplageman@gmail.com> writes:
> > > v13 attached with requested updates.
> >
> > I'm afraid I'd not been paying any attention to this discussion,
> > but better late than never. I'm okay with letting autovacuum
> > processes reload config files more often than now. However,
> > I object to allowing ProcessConfigFile to be called from within
> > commands in a normal user backend. The existing semantics are
> > that user backends respond to SIGHUP only at the start of processing
> > a user command, and I'm uncomfortable with suddenly deciding that
> > that can work differently if the command happens to be VACUUM.
> > It seems unprincipled and perhaps actively unsafe.
>
> I think it should be ok in commands like VACUUM that already internally start
> their own transactions, and thus require to be run outside of a transaction
> and at the toplevel. I share your concerns about allowing config reload in
> arbitrary places. While we might want to go there, it would require a lot more
> analysis.

As an alternative for your consideration, the attached v14 set implements
the config file reload for autovacuum only (in 0003) and then enables it
for VACUUM and ANALYZE when they are not run inside an outer transaction
(in 0004).

Previously I had the commits in the reverse order for ease of review (to
separate changes to worker limit balancing logic from config reload
code).

- Melanie

Attachments:

v14-0003-Autovacuum-refreshes-cost-based-delay-params-mor.patch (text/x-patch; charset=US-ASCII)
From 1781dd7174d5d6eaaeb4bd02029212f3c23d4dbe Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 25 Mar 2023 14:14:55 -0400
Subject: [PATCH v14 3/4] Autovacuum refreshes cost-based delay params more
 often

Allow autovacuum to reload the config file more often so that cost-based
delay parameters can take effect while VACUUMing a relation. Previously
autovacuum workers only reloaded the config file once per relation
vacuumed, so config changes could not take effect until beginning to
vacuum the next table.

Now, check if a reload is pending roughly once per block, when checking
if we need to delay.

In order for autovacuum workers to safely update their own cost delay
and cost limit parameters without impacting performance, we had to
rethink when and how these values were accessed.

Previously, an autovacuum worker's wi_cost_limit was set only at the
beginning of vacuuming a table, after reloading the config file.
Therefore, at the time that autovac_balance_cost() is called, workers
vacuuming tables with no table options could still have different values
for their wi_cost_limit_base and wi_cost_delay.

Now that the cost parameters can be updated while vacuuming a table,
workers will (within some margin of error) have no reason to have
different values for cost limit and cost delay (in the absence of table
options). This removes the rationale for keeping cost limit and cost
delay in shared memory. Balancing the cost limit requires only the
number of active autovacuum workers vacuuming a table with no cost-based
table options.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/commands/vacuum.c       |  44 ++++-
 src/backend/postmaster/autovacuum.c | 253 +++++++++++++++-------------
 src/include/commands/vacuum.h       |   1 +
 3 files changed, 180 insertions(+), 118 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 96df5e2920..a51a3f78a0 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -48,6 +48,7 @@
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/interrupt.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/pmsignal.h"
@@ -510,9 +511,9 @@ vacuum(List *relations, VacuumParams *params,
 	{
 		ListCell   *cur;
 
-		VacuumUpdateCosts();
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
+		VacuumFailsafeActive = false;
+		VacuumUpdateCosts();
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
@@ -566,12 +567,20 @@ vacuum(List *relations, VacuumParams *params,
 					CommandCounterIncrement();
 				}
 			}
+
+			/*
+			 * Ensure VacuumFailsafeActive has been reset before vacuuming the
+			 * next relation.
+			 */
+			VacuumFailsafeActive = false;
 		}
 	}
 	PG_FINALLY();
 	{
 		in_vacuum = false;
 		VacuumCostActive = false;
+		VacuumFailsafeActive = false;
+		VacuumCostBalance = 0;
 	}
 	PG_END_TRY();
 
@@ -2238,7 +2247,27 @@ vacuum_delay_point(void)
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	if (!VacuumCostActive || InterruptPending)
+	if (InterruptPending ||
+		(!VacuumCostActive && !ConfigReloadPending))
+		return;
+
+	/*
+	 * Reload the configuration file if requested. This allows changes to
+	 * autovacuum_vacuum_cost_limit and autovacuum_vacuum_cost_delay to take
+	 * effect while a table is being vacuumed or analyzed.
+	 */
+	if (ConfigReloadPending && IsAutoVacuumWorkerProcess())
+	{
+		ConfigReloadPending = false;
+		ProcessConfigFile(PGC_SIGHUP);
+		VacuumUpdateCosts();
+	}
+
+	/*
+	 * If we disabled cost-based delays after reloading the config file,
+	 * return.
+	 */
+	if (!VacuumCostActive)
 		return;
 
 	/*
@@ -2271,7 +2300,14 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		VacuumUpdateCosts();
+		/*
+		 * Balance and update limit values for autovacuum workers. We must
+		 * always do this in case the autovacuum launcher or another
+		 * autovacuum worker has recalculated the number of workers across
+		 * which we must balance the limit. This is done by the launcher when
+		 * launching a new worker and by workers before vacuuming each table.
+		 */
+		AutoVacuumUpdateLimit();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index e0c568fdaf..8cb9bee4eb 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -139,6 +139,9 @@ int			Log_autovacuum_min_duration = 600000;
 static bool am_autovacuum_launcher = false;
 static bool am_autovacuum_worker = false;
 
+static double av_relopt_cost_delay = -1;
+static int	av_relopt_cost_limit = 0;
+
 /* Flags set by signal handlers */
 static volatile sig_atomic_t got_SIGUSR2 = false;
 
@@ -189,8 +192,8 @@ typedef struct autovac_table
 {
 	Oid			at_relid;
 	VacuumParams at_params;
-	double		at_vacuum_cost_delay;
-	int			at_vacuum_cost_limit;
+	double		at_relopt_vac_cost_delay;
+	int			at_relopt_vac_cost_limit;
 	bool		at_dobalance;
 	bool		at_sharedrel;
 	char	   *at_relname;
@@ -209,7 +212,7 @@ typedef struct autovac_table
  * wi_sharedrel flag indicating whether table is marked relisshared
  * wi_proc		pointer to PGPROC of the running worker, NULL if not started
  * wi_launchtime Time at which this worker was launched
- * wi_cost_*	Vacuum cost-based delay parameters current in this worker
+ * wi_dobalance Whether this worker should be included in balance calculations
  *
  * All fields are protected by AutovacuumLock, except for wi_tableoid and
  * wi_sharedrel which are protected by AutovacuumScheduleLock (note these
@@ -223,11 +226,8 @@ typedef struct WorkerInfoData
 	Oid			wi_tableoid;
 	PGPROC	   *wi_proc;
 	TimestampTz wi_launchtime;
-	bool		wi_dobalance;
+	pg_atomic_flag wi_dobalance;
 	bool		wi_sharedrel;
-	double		wi_cost_delay;
-	int			wi_cost_limit;
-	int			wi_cost_limit_base;
 } WorkerInfoData;
 
 typedef struct WorkerInfoData *WorkerInfo;
@@ -273,6 +273,8 @@ typedef struct AutoVacuumWorkItem
  * av_startingWorker pointer to WorkerInfo currently being started (cleared by
  *					the worker itself as soon as it's up and running)
  * av_workItems		work item array
+ * av_nworkersForBalance the number of autovacuum workers to use when
+ * 					calculating the per worker cost limit
  *
  * This struct is protected by AutovacuumLock, except for av_signal and parts
  * of the worker list (see above).
@@ -286,6 +288,7 @@ typedef struct
 	dlist_head	av_runningWorkers;
 	WorkerInfo	av_startingWorker;
 	AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+	pg_atomic_uint32 av_nworkersForBalance;
 } AutoVacuumShmemStruct;
 
 static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -319,7 +322,7 @@ static void launch_worker(TimestampTz now);
 static List *get_database_list(void);
 static void rebuild_database_list(Oid newdb);
 static int	db_comparator(const void *a, const void *b);
-static void autovac_balance_cost(void);
+static void autovac_recalculate_workers_for_balance(void);
 
 static void do_autovacuum(void);
 static void FreeWorkerInfo(int code, Datum arg);
@@ -670,7 +673,7 @@ AutoVacLauncherMain(int argc, char *argv[])
 			{
 				LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 				AutoVacuumShmem->av_signal[AutoVacRebalance] = false;
-				autovac_balance_cost();
+				autovac_recalculate_workers_for_balance();
 				LWLockRelease(AutovacuumLock);
 			}
 
@@ -820,8 +823,8 @@ HandleAutoVacLauncherInterrupts(void)
 			AutoVacLauncherShutdown();
 
 		/* rebalance in case the default cost parameters changed */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-		autovac_balance_cost();
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
+		autovac_recalculate_workers_for_balance();
 		LWLockRelease(AutovacuumLock);
 
 		/* rebuild the list in case the naptime changed */
@@ -1755,10 +1758,7 @@ FreeWorkerInfo(int code, Datum arg)
 		MyWorkerInfo->wi_sharedrel = false;
 		MyWorkerInfo->wi_proc = NULL;
 		MyWorkerInfo->wi_launchtime = 0;
-		MyWorkerInfo->wi_dobalance = false;
-		MyWorkerInfo->wi_cost_delay = 0;
-		MyWorkerInfo->wi_cost_limit = 0;
-		MyWorkerInfo->wi_cost_limit_base = 0;
+		pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
 		dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 						&MyWorkerInfo->wi_links);
 		/* not mine anymore */
@@ -1787,97 +1787,119 @@ VacuumUpdateCosts(void)
 
 	if (am_autovacuum_worker)
 	{
-		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
-		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+		if (av_relopt_cost_delay >= 0)
+			VacuumCostDelay = av_relopt_cost_delay;
+		else if (autovacuum_vac_cost_delay >= 0)
+			VacuumCostDelay = autovacuum_vac_cost_delay;
+		else
+			/* fall back to vacuum_cost_delay */
+			VacuumCostDelay = vacuum_cost_delay;
+
+		AutoVacuumUpdateLimit();
 	}
 	else
 	{
 		/* Must be explicit VACUUM or ANALYZE */
-		VacuumCostLimit = vacuum_cost_limit;
 		VacuumCostDelay = vacuum_cost_delay;
+		VacuumCostLimit = vacuum_cost_limit;
+	}
+
+	/*
+	 * If configuration changes are allowed to impact VacuumCostActive, make
+	 * sure it is updated.
+	 */
+	if (VacuumFailsafeActive)
+	{
+		Assert(!VacuumCostActive);
+		return;
+	}
+
+	if (VacuumCostDelay > 0)
+		VacuumCostActive = true;
+	else
+	{
+		VacuumCostActive = false;
+		VacuumCostBalance = 0;
 	}
 }
 
 /*
- * autovac_balance_cost
- *		Recalculate the cost limit setting for each active worker.
- *
- * Caller must hold the AutovacuumLock in exclusive mode.
- */
-static void
-autovac_balance_cost(void)
+ * Update VacuumCostLimit with the correct value for an autovacuum worker, given
+ * the value of other relevant cost limit parameters and the number of workers
+ * across which the limit must be balanced. Autovacuum workers must call this
+ * regularly in case av_nworkersForBalance has been updated by another worker
+ * or by the autovacuum launcher. They must also call it after a config reload.
+ */
+void
+AutoVacuumUpdateLimit(void)
 {
+	if (!am_autovacuum_worker)
+		return;
+
 	/*
-	 * The idea here is that we ration out I/O equally.  The amount of I/O
-	 * that a worker can consume is determined by cost_limit/cost_delay, so we
-	 * try to equalize those ratios rather than the raw limit settings.
-	 *
 	 * note: in cost_limit, zero also means use value from elsewhere, because
 	 * zero is not a valid value.
 	 */
-	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : vacuum_cost_limit);
-	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : vacuum_cost_delay);
-	double		cost_total;
-	double		cost_avail;
-	dlist_iter	iter;
-
-	/* not set? nothing to do */
-	if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
-		return;
 
-	/* calculate the total base cost limit of participating active workers */
-	cost_total = 0.0;
-	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
+	if (av_relopt_cost_limit > 0)
+		VacuumCostLimit = av_relopt_cost_limit;
+	else
 	{
-		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
+		int			nworkers_for_balance;
+
+		if (autovacuum_vac_cost_limit > 0)
+			VacuumCostLimit = autovacuum_vac_cost_limit;
+		else
+			VacuumCostLimit = vacuum_cost_limit;
+
+		/* Only balance limit if no table options specified */
+		if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
+			return;
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-			cost_total +=
-				(double) worker->wi_cost_limit_base / worker->wi_cost_delay;
+		Assert(VacuumCostLimit > 0);
+
+		nworkers_for_balance = pg_atomic_read_u32(
+								&AutoVacuumShmem->av_nworkersForBalance);
+
+		/* There is at least 1 autovac worker (this worker). */
+		Assert(nworkers_for_balance > 0);
+
+		VacuumCostLimit = Max(VacuumCostLimit / nworkers_for_balance, 1);
 	}
+}
 
-	/* there are no cost limits -- nothing to do */
-	if (cost_total <= 0)
-		return;
+/*
+ * autovac_recalculate_workers_for_balance
+ *		Recalculate the number of workers to consider, given table options and
+ *		the current number of active workers.
+ *
+ * Caller must hold the AutovacuumLock in at least shared mode to access
+ * worker->wi_proc.
+ */
+static void
+autovac_recalculate_workers_for_balance(void)
+{
+	dlist_iter	iter;
+	int			orig_nworkers_for_balance;
+	int			nworkers_for_balance = 0;
+
+	orig_nworkers_for_balance =
+		pg_atomic_read_u32(&AutoVacuumShmem->av_nworkersForBalance);
 
-	/*
-	 * Adjust cost limit of each active worker to balance the total of cost
-	 * limit to autovacuum_vacuum_cost_limit.
-	 */
-	cost_avail = (double) vac_cost_limit / vac_cost_delay;
 	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
 	{
 		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-		{
-			int			limit = (int)
-			(cost_avail * worker->wi_cost_limit_base / cost_total);
-
-			/*
-			 * We put a lower bound of 1 on the cost_limit, to avoid division-
-			 * by-zero in the vacuum code.  Also, in case of roundoff trouble
-			 * in these calculations, let's be sure we don't ever set
-			 * cost_limit to more than the base value.
-			 */
-			worker->wi_cost_limit = Max(Min(limit,
-											worker->wi_cost_limit_base),
-										1);
-		}
+		if (worker->wi_proc == NULL ||
+			pg_atomic_unlocked_test_flag(&worker->wi_dobalance))
+			continue;
 
-		if (worker->wi_proc != NULL)
-			elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
-				 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
-				 worker->wi_dobalance ? "yes" : "no",
-				 worker->wi_cost_limit, worker->wi_cost_limit_base,
-				 worker->wi_cost_delay);
+		nworkers_for_balance++;
 	}
+
+	if (nworkers_for_balance != orig_nworkers_for_balance)
+		pg_atomic_write_u32(&AutoVacuumShmem->av_nworkersForBalance,
+							nworkers_for_balance);
 }
 
 /*
@@ -2425,23 +2447,31 @@ do_autovacuum(void)
 			continue;
 		}
 
-		/* Must hold AutovacuumLock while mucking with cost balance info */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 
-		/* advertise my cost delay parameters for the balancing algorithm */
-		MyWorkerInfo->wi_dobalance = tab->at_dobalance;
-		MyWorkerInfo->wi_cost_delay = tab->at_vacuum_cost_delay;
-		MyWorkerInfo->wi_cost_limit = tab->at_vacuum_cost_limit;
-		MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
+		/*
+		 * Save the cost-related table options in global variables for
+		 * reference when updating VacuumCostLimit and VacuumCostDelay during
+		 * vacuuming this table.
+		 */
+		av_relopt_cost_limit = tab->at_relopt_vac_cost_limit;
+		av_relopt_cost_delay = tab->at_relopt_vac_cost_delay;
 
-		/* do a balance */
-		autovac_balance_cost();
+		if (tab->at_dobalance)
+			pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
+		else
+			pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
 
-		/* set the active cost parameters from the result of that */
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
+		autovac_recalculate_workers_for_balance();
+		LWLockRelease(AutovacuumLock);
+
+		/*
+		 * We wait until this point to update cost delay and cost limit
+		 * values, even though we reloaded the configuration file above, so
+		 * that we can take into account the cost-related table options.
+		 */
 		VacuumUpdateCosts();
 
-		/* done */
-		LWLockRelease(AutovacuumLock);
 
 		/* clean up memory before each iteration */
 		MemoryContextResetAndDeleteChildren(PortalContext);
@@ -2526,10 +2556,10 @@ deleted:
 
 		/*
 		 * Remove my info from shared memory.  We could, but intentionally
-		 * don't, clear wi_cost_limit and friends --- this is on the
-		 * assumption that we probably have more to do with similar cost
-		 * settings, so we don't want to give up our share of I/O for a very
-		 * short interval and thereby thrash the global balance.
+		 * don't, unset wi_dobalance on the assumption that we are more likely
+		 * than not to vacuum a table with no table options next, so we don't
+		 * want to give up our share of I/O for a very short interval and
+		 * thereby thrash the global balance.
 		 */
 		LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
 		MyWorkerInfo->wi_tableoid = InvalidOid;
@@ -2566,6 +2596,7 @@ deleted:
 		{
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			VacuumUpdateCosts();
 		}
 
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
@@ -2801,8 +2832,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		int			freeze_table_age;
 		int			multixact_freeze_min_age;
 		int			multixact_freeze_table_age;
-		int			vac_cost_limit;
-		double		vac_cost_delay;
 		int			log_min_duration;
 
 		/*
@@ -2812,20 +2841,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 * defaults, autovacuum's own first and plain vacuum second.
 		 */
 
-		/* -1 in autovac setting means use plain vacuum_cost_delay */
-		vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
-			? avopts->vacuum_cost_delay
-			: (autovacuum_vac_cost_delay >= 0)
-			? autovacuum_vac_cost_delay
-			: vacuum_cost_delay;
-
-		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
-		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
-			? avopts->vacuum_cost_limit
-			: (autovacuum_vac_cost_limit > 0)
-			? autovacuum_vac_cost_limit
-			: vacuum_cost_limit;
-
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
 			? avopts->log_min_duration
@@ -2881,8 +2896,10 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age;
 		tab->at_params.is_wraparound = wraparound;
 		tab->at_params.log_min_duration = log_min_duration;
-		tab->at_vacuum_cost_limit = vac_cost_limit;
-		tab->at_vacuum_cost_delay = vac_cost_delay;
+		tab->at_relopt_vac_cost_limit = avopts ?
+			avopts->vacuum_cost_limit : 0;
+		tab->at_relopt_vac_cost_delay = avopts ?
+			avopts->vacuum_cost_delay : -1;
 		tab->at_relname = NULL;
 		tab->at_nspname = NULL;
 		tab->at_datname = NULL;
@@ -3374,10 +3391,18 @@ AutoVacuumShmemInit(void)
 		worker = (WorkerInfo) ((char *) AutoVacuumShmem +
 							   MAXALIGN(sizeof(AutoVacuumShmemStruct)));
 
+
 		/* initialize the WorkerInfo free list */
 		for (i = 0; i < autovacuum_max_workers; i++)
+		{
 			dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 							&worker[i].wi_links);
+
+			pg_atomic_init_flag(&worker[i].wi_dobalance);
+		}
+
+		pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
+
 	}
 	else
 		Assert(found);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index a62dd2e781..6b286037ca 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -351,6 +351,7 @@ extern IndexBulkDeleteResult *vac_cleanup_one_index(IndexVacuumInfo *ivinfo,
 extern Size vac_max_items_to_alloc_size(int max_items);
 
 /* In postmaster/autovacuum.c */
+extern void AutoVacuumUpdateLimit(void);
 extern void VacuumUpdateCosts(void);
 
 /* in commands/vacuumparallel.c */
-- 
2.37.2
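
As a reading aid for the patch above, here is a minimal standalone sketch of how an autovacuum worker's effective cost delay and cost limit are resolved. It mirrors the branches of VacuumUpdateCosts() and AutoVacuumUpdateLimit() in the diff, but it is not the patch's code: the CostSettings struct and the resolve_* function names are hypothetical, and shared memory, atomics, and locking are left out.

```c
#include <assert.h>

/* Hypothetical stand-in for the GUCs and per-table options used by the patch. */
typedef struct CostSettings
{
	double		relopt_cost_delay;		/* table option; < 0 means "not set" */
	int			relopt_cost_limit;		/* table option; <= 0 means "not set" */
	double		autovacuum_cost_delay;	/* < 0 means "use vacuum_cost_delay" */
	int			autovacuum_cost_limit;	/* <= 0 means "use vacuum_cost_limit" */
	double		vacuum_cost_delay;
	int			vacuum_cost_limit;
} CostSettings;

/* Delay precedence: table option, then autovacuum GUC, then plain vacuum GUC. */
double
resolve_cost_delay(const CostSettings *s)
{
	if (s->relopt_cost_delay >= 0)
		return s->relopt_cost_delay;
	if (s->autovacuum_cost_delay >= 0)
		return s->autovacuum_cost_delay;
	return s->vacuum_cost_delay;
}

/*
 * Limit precedence is the same, except that a limit taken from the GUCs is
 * divided among the workers that participate in balancing, with a floor of 1.
 */
int
resolve_cost_limit(const CostSettings *s, int dobalance, int nworkers_for_balance)
{
	int			limit;

	/* A table-level limit is used as is and is never balanced. */
	if (s->relopt_cost_limit > 0)
		return s->relopt_cost_limit;

	limit = (s->autovacuum_cost_limit > 0) ?
		s->autovacuum_cost_limit : s->vacuum_cost_limit;

	/* Workers excluded from balancing keep the full limit. */
	if (!dobalance)
		return limit;

	assert(limit > 0);
	assert(nworkers_for_balance > 0);

	limit /= nworkers_for_balance;
	return (limit > 1) ? limit : 1;
}
```

With vacuum_cost_limit = 200, no autovacuum or table-level override, and three balanced workers, each worker in this sketch ends up with a limit of 66. The floor of 1 is what keeps the limit nonzero when there are more workers than cost units.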

v14-0004-VACUUM-reloads-config-file-more-often.patch (text/x-patch; charset=US-ASCII)
From 4cd9317cdec9c30e97bee30b194c75455c0afc99 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Mon, 3 Apr 2023 17:48:22 -0400
Subject: [PATCH v14 4/4] VACUUM reloads config file more often

A previous commit allowed autovacuum workers to reload the configuration
file more often. Now, enable this behavior for VACUUM and ANALYZE, when
not nested in a transaction command.

Note that, for now, this only applies to the primary VACUUM stage and
not to parallel vacuum workers vacuuming indexes.
---
 src/backend/commands/vacuum.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index a51a3f78a0..3c426ed501 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -84,6 +84,7 @@ int			vacuum_cost_limit;
  */
 int			VacuumCostLimit = 0;
 double		VacuumCostDelay = -1;
+static bool vacuum_can_reload_config = false;
 
 /*
  * VacuumFailsafeActive is defined as a global so that we can determine
@@ -355,6 +356,8 @@ vacuum(List *relations, VacuumParams *params,
 	else
 		in_outer_xact = IsInTransactionBlock(isTopLevel);
 
+	vacuum_can_reload_config = !in_outer_xact;
+
 	/*
 	 * Check for and disallow recursive calls.  This could happen when VACUUM
 	 * FULL or ANALYZE calls a hostile index expression that itself calls
@@ -581,6 +584,7 @@ vacuum(List *relations, VacuumParams *params,
 		VacuumCostActive = false;
 		VacuumFailsafeActive = false;
 		VacuumCostBalance = 0;
+		vacuum_can_reload_config = false;
 	}
 	PG_END_TRY();
 
@@ -2253,10 +2257,12 @@ vacuum_delay_point(void)
 
 	/*
 	 * Reload the configuration file if requested. This allows changes to
-	 * autovacuum_vacuum_cost_limit and autovacuum_vacuum_cost_delay to take
-	 * effect while a table is being vacuumed or analyzed.
+	 * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay to
+	 * take effect while a table is being vacuumed or analyzed. Analyze should
+	 * not reload the configuration file if it is in an outer transaction, as we
+	 * currently only allow configuration reload when in top-level statements.
 	 */
-	if (ConfigReloadPending && IsAutoVacuumWorkerProcess())
+	if (ConfigReloadPending && vacuum_can_reload_config)
 	{
 		ConfigReloadPending = false;
 		ProcessConfigFile(PGC_SIGHUP);
-- 
2.37.2
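
The gating that this patch adds to vacuum_delay_point() can be sketched in isolation as follows. This is a hedged, standalone illustration rather than backend code: every name below is a hypothetical stand-in (config_reload_pending for ConfigReloadPending, can_reload_config for vacuum_can_reload_config), and the real ProcessConfigFile() call is replaced by a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for backend state; not the real PostgreSQL globals. */
static bool config_reload_pending = false;	/* set when a reload is requested */
static bool can_reload_config = false;		/* set by the vacuum entry point */
static int	reload_count = 0;				/* counts simulated reloads */

/* Entry point: reloading is allowed only outside a user transaction block. */
void
begin_vacuum(bool in_outer_xact)
{
	assert(!can_reload_config);		/* nested calls are not modeled here */
	can_reload_config = !in_outer_xact;
}

/* Cleanup path: always clear the permission, as the patch does on exit. */
void
end_vacuum(void)
{
	can_reload_config = false;
}

/* Simulates the arrival of a reload request (SIGHUP in the real system). */
void
request_reload(void)
{
	config_reload_pending = true;
}

/* Returns true if a (simulated) configuration reload happened at this point. */
bool
delay_point(void)
{
	if (config_reload_pending && can_reload_config)
	{
		config_reload_pending = false;
		reload_count++;			/* stands in for the actual file reload */
		return true;
	}
	return false;
}
```

In this sketch a request that arrives while can_reload_config is false stays pending and is not lost. That matches the intent stated in the patch comment: ANALYZE inside a user transaction keeps the existing reload cadence.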

v14-0002-Separate-vacuum-cost-variables-from-gucs.patch (text/x-patch; charset=US-ASCII)
From 7ee209bee5c096601ed3ad76ee4ab9ced340e3d3 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Mon, 3 Apr 2023 11:22:18 -0400
Subject: [PATCH v14 2/4] Separate vacuum cost variables from gucs

Vacuum code run both by autovacuum workers and a backend doing
VACUUM/ANALYZE previously used VacuumCostLimit and VacuumCostDelay which
were the global variables for the gucs vacuum_cost_limit and
vacuum_cost_delay. Autovacuum workers needed to override these variables
with their own values, derived from autovacuum_vacuum_cost_limit and
autovacuum_vacuum_cost_delay and worker cost limit balancing logic. This
led to confusing code which, in some cases, both derived and set a new
value of VacuumCostLimit from VacuumCostLimit.

In preparation for refreshing these guc values more often, separate
these variables from the gucs themselves and add a function to update
the global variables using the gucs and existing logic.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/commands/vacuum.c         | 16 ++++++++--
 src/backend/commands/vacuumparallel.c |  3 +-
 src/backend/postmaster/autovacuum.c   | 43 +++++++++++++--------------
 src/backend/utils/init/globals.c      |  2 --
 src/backend/utils/misc/guc_tables.c   |  4 +--
 src/include/commands/vacuum.h         |  7 +++++
 src/include/miscadmin.h               |  2 --
 src/include/postmaster/autovacuum.h   |  3 --
 8 files changed, 45 insertions(+), 35 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 9724fbce46..96df5e2920 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -71,6 +71,18 @@ int			vacuum_multixact_freeze_min_age;
 int			vacuum_multixact_freeze_table_age;
 int			vacuum_failsafe_age;
 int			vacuum_multixact_failsafe_age;
+double		vacuum_cost_delay;
+int			vacuum_cost_limit;
+
+/*
+ * Variables for cost-based vacuum delay. The defaults differ between
+ * autovacuum and vacuum. These should be overridden with the appropriate GUC
+ * value in vacuum code.
+ * TODO: should VacuumCostLimit and VacuumCostDelay be initialized to valid or
+ * invalid values?
+ */
+int			VacuumCostLimit = 0;
+double		VacuumCostDelay = -1;
 
 /*
  * VacuumFailsafeActive is defined as a global so that we can determine
@@ -498,6 +510,7 @@ vacuum(List *relations, VacuumParams *params,
 	{
 		ListCell   *cur;
 
+		VacuumUpdateCosts();
 		in_vacuum = true;
 		VacuumCostActive = (VacuumCostDelay > 0);
 		VacuumCostBalance = 0;
@@ -2258,8 +2271,7 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		/* update balance values for workers */
-		AutoVacuumUpdateDelay();
+		VacuumUpdateCosts();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 563117a8f6..d346838cfc 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -995,8 +995,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												 false);
 
 	/* Set cost-based vacuum delay */
-	VacuumCostActive = (VacuumCostDelay > 0);
 	VacuumCostBalance = 0;
+	VacuumUpdateCosts();
+	VacuumCostActive = (VacuumCostDelay > 0);
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
 	VacuumPageDirty = 0;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 585d28148c..e0c568fdaf 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1774,17 +1774,28 @@ FreeWorkerInfo(int code, Datum arg)
 }
 
 /*
- * Update the cost-based delay parameters, so that multiple workers consume
- * each a fraction of the total available I/O.
+ * Update vacuum cost-based delay-related parameters for autovacuum workers and
+ * backends executing VACUUM or ANALYZE using the value of relevant gucs and
+ * global state. This must be called during setup for vacuum and after every
+ * config reload to ensure up-to-date values.
  */
 void
-AutoVacuumUpdateDelay(void)
+VacuumUpdateCosts(void)
 {
-	if (MyWorkerInfo)
+	if (am_autovacuum_launcher)
+		return;
+
+	if (am_autovacuum_worker)
 	{
 		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
 		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
 	}
+	else
+	{
+		/* Must be explicit VACUUM or ANALYZE */
+		VacuumCostLimit = vacuum_cost_limit;
+		VacuumCostDelay = vacuum_cost_delay;
+	}
 }
 
 /*
@@ -1805,9 +1816,9 @@ autovac_balance_cost(void)
 	 * zero is not a valid value.
 	 */
 	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : VacuumCostLimit);
+								  autovacuum_vac_cost_limit : vacuum_cost_limit);
 	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : VacuumCostDelay);
+								  autovacuum_vac_cost_delay : vacuum_cost_delay);
 	double		cost_total;
 	double		cost_avail;
 	dlist_iter	iter;
@@ -2312,8 +2323,6 @@ do_autovacuum(void)
 		autovac_table *tab;
 		bool		isshared;
 		bool		skipit;
-		double		stdVacuumCostDelay;
-		int			stdVacuumCostLimit;
 		dlist_iter	iter;
 
 		CHECK_FOR_INTERRUPTS();
@@ -2416,14 +2425,6 @@ do_autovacuum(void)
 			continue;
 		}
 
-		/*
-		 * Remember the prevailing values of the vacuum cost GUCs.  We have to
-		 * restore these at the bottom of the loop, else we'll compute wrong
-		 * values in the next iteration of autovac_balance_cost().
-		 */
-		stdVacuumCostDelay = VacuumCostDelay;
-		stdVacuumCostLimit = VacuumCostLimit;
-
 		/* Must hold AutovacuumLock while mucking with cost balance info */
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 
@@ -2437,7 +2438,7 @@ do_autovacuum(void)
 		autovac_balance_cost();
 
 		/* set the active cost parameters from the result of that */
-		AutoVacuumUpdateDelay();
+		VacuumUpdateCosts();
 
 		/* done */
 		LWLockRelease(AutovacuumLock);
@@ -2534,10 +2535,6 @@ deleted:
 		MyWorkerInfo->wi_tableoid = InvalidOid;
 		MyWorkerInfo->wi_sharedrel = false;
 		LWLockRelease(AutovacuumScheduleLock);
-
-		/* restore vacuum cost GUCs for the next iteration */
-		VacuumCostDelay = stdVacuumCostDelay;
-		VacuumCostLimit = stdVacuumCostLimit;
 	}
 
 	/*
@@ -2820,14 +2817,14 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			? avopts->vacuum_cost_delay
 			: (autovacuum_vac_cost_delay >= 0)
 			? autovacuum_vac_cost_delay
-			: VacuumCostDelay;
+			: vacuum_cost_delay;
 
 		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
 		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
 			? avopts->vacuum_cost_limit
 			: (autovacuum_vac_cost_limit > 0)
 			? autovacuum_vac_cost_limit
-			: VacuumCostLimit;
+			: vacuum_cost_limit;
 
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 1b1d814254..8e5b065e8f 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -142,8 +142,6 @@ int			MaxBackends = 0;
 int			VacuumCostPageHit = 1;	/* GUC parameters for vacuum */
 int			VacuumCostPageMiss = 2;
 int			VacuumCostPageDirty = 20;
-int			VacuumCostLimit = 200;
-double		VacuumCostDelay = 0;
 
 int64		VacuumPageHit = 0;
 int64		VacuumPageMiss = 0;
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8062589efd..77db1a146c 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2409,7 +2409,7 @@ struct config_int ConfigureNamesInt[] =
 			gettext_noop("Vacuum cost amount available before napping."),
 			NULL
 		},
-		&VacuumCostLimit,
+		&vacuum_cost_limit,
 		200, 1, 10000,
 		NULL, NULL, NULL
 	},
@@ -3701,7 +3701,7 @@ struct config_real ConfigureNamesReal[] =
 			NULL,
 			GUC_UNIT_MS
 		},
-		&VacuumCostDelay,
+		&vacuum_cost_delay,
 		0, 0, 100,
 		NULL, NULL, NULL
 	},
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 7b8ee21788..a62dd2e781 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,8 @@ extern PGDLLIMPORT int vacuum_multixact_freeze_min_age;
 extern PGDLLIMPORT int vacuum_multixact_freeze_table_age;
 extern PGDLLIMPORT int vacuum_failsafe_age;
 extern PGDLLIMPORT int vacuum_multixact_failsafe_age;
+extern PGDLLIMPORT double vacuum_cost_delay;
+extern PGDLLIMPORT int vacuum_cost_limit;
 
 /* Variables for cost-based parallel vacuum */
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
@@ -307,6 +309,8 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
 
 extern bool VacuumFailsafeActive;
+extern int	VacuumCostLimit;
+extern double VacuumCostDelay;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
@@ -346,6 +350,9 @@ extern IndexBulkDeleteResult *vac_cleanup_one_index(IndexVacuumInfo *ivinfo,
 													IndexBulkDeleteResult *istat);
 extern Size vac_max_items_to_alloc_size(int max_items);
 
+/* In postmaster/autovacuum.c */
+extern void VacuumUpdateCosts(void);
+
 /* in commands/vacuumparallel.c */
 extern ParallelVacuumState *parallel_vacuum_init(Relation rel, Relation *indrels,
 												 int nindexes, int nrequested_workers,
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 06a86f9ac1..66db1b2c69 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -266,8 +266,6 @@ extern PGDLLIMPORT int max_parallel_maintenance_workers;
 extern PGDLLIMPORT int VacuumCostPageHit;
 extern PGDLLIMPORT int VacuumCostPageMiss;
 extern PGDLLIMPORT int VacuumCostPageDirty;
-extern PGDLLIMPORT int VacuumCostLimit;
-extern PGDLLIMPORT double VacuumCostDelay;
 
 extern PGDLLIMPORT int64 VacuumPageHit;
 extern PGDLLIMPORT int64 VacuumPageMiss;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index c140371b51..65afd1ea1e 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -63,9 +63,6 @@ extern int	StartAutoVacWorker(void);
 /* called from postmaster when a worker could not be forked */
 extern void AutoVacWorkerFailed(void);
 
-/* autovacuum cost-delay balancer */
-extern void AutoVacuumUpdateDelay(void);
-
 #ifdef EXEC_BACKEND
 extern void AutoVacLauncherMain(int argc, char *argv[]) pg_attribute_noreturn();
 extern void AutoVacWorkerMain(int argc, char *argv[]) pg_attribute_noreturn();
-- 
2.37.2

v14-0001-Make-vacuum-s-failsafe_active-a-global.patch (text/x-patch; charset=US-ASCII)
From 4792ff15f6ab2ce52763dd61a2f374dee948a9cd Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Fri, 31 Mar 2023 10:38:39 -0400
Subject: [PATCH v14 1/4] Make vacuum's failsafe_active a global

While vacuuming a table in failsafe mode, VacuumCostActive should not be
re-enabled. This currently isn't a problem because vacuum cost
parameters are only refreshed in between vacuuming tables and failsafe
status is reset for every table. In preparation for allowing vacuum cost
parameters to be updated more frequently, elevate
LVRelState->failsafe_active to a global, VacuumFailsafeActive, which
will be checked when determining whether or not to re-enable vacuum
cost-related delays.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 16 +++++++---------
 src/backend/commands/vacuum.c        |  9 +++++++++
 src/include/commands/vacuum.h        |  1 +
 3 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 639179aa46..2ba85bd3d6 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -153,8 +153,6 @@ typedef struct LVRelState
 	bool		aggressive;
 	/* Use visibility map to skip? (disabled by DISABLE_PAGE_SKIPPING) */
 	bool		skipwithvm;
-	/* Wraparound failsafe has been triggered? */
-	bool		failsafe_active;
 	/* Consider index vacuuming bypass optimization? */
 	bool		consider_bypass_optimization;
 
@@ -391,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
 	Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
 		   params->truncate != VACOPTVALUE_AUTO);
-	vacrel->failsafe_active = false;
+	VacuumFailsafeActive = false;
 	vacrel->consider_bypass_optimization = true;
 	vacrel->do_index_vacuuming = true;
 	vacrel->do_index_cleanup = true;
@@ -709,7 +707,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 			}
 			else
 			{
-				if (!vacrel->failsafe_active)
+				if (!VacuumFailsafeActive)
 					appendStringInfoString(&buf, _("index scan bypassed: "));
 				else
 					appendStringInfoString(&buf, _("index scan bypassed by failsafe: "));
@@ -2293,7 +2291,7 @@ lazy_vacuum(LVRelState *vacrel)
 		 * vacuuming or heap vacuuming.  This VACUUM operation won't end up
 		 * back here again.
 		 */
-		Assert(vacrel->failsafe_active);
+		Assert(VacuumFailsafeActive);
 	}
 
 	/*
@@ -2374,7 +2372,7 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	 */
 	Assert(vacrel->num_index_scans > 0 ||
 		   vacrel->dead_items->num_items == vacrel->lpdead_items);
-	Assert(allindexes || vacrel->failsafe_active);
+	Assert(allindexes || VacuumFailsafeActive);
 
 	/*
 	 * Increase and report the number of index scans.
@@ -2616,12 +2614,12 @@ static bool
 lazy_check_wraparound_failsafe(LVRelState *vacrel)
 {
 	/* Don't warn more than once per VACUUM */
-	if (vacrel->failsafe_active)
+	if (VacuumFailsafeActive)
 		return true;
 
 	if (unlikely(vacuum_xid_failsafe_check(&vacrel->cutoffs)))
 	{
-		vacrel->failsafe_active = true;
+		VacuumFailsafeActive = true;
 
 		/*
 		 * Abandon use of a buffer access strategy to allow use of all of
@@ -2820,7 +2818,7 @@ should_attempt_truncation(LVRelState *vacrel)
 {
 	BlockNumber possibly_freeable;
 
-	if (!vacrel->do_rel_truncate || vacrel->failsafe_active ||
+	if (!vacrel->do_rel_truncate || VacuumFailsafeActive ||
 		old_snapshot_threshold >= 0)
 		return false;
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index da85330ef4..9724fbce46 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -72,6 +72,15 @@ int			vacuum_multixact_freeze_table_age;
 int			vacuum_failsafe_age;
 int			vacuum_multixact_failsafe_age;
 
+/*
+ * VacuumFailsafeActive is defined as a global so that we can determine
+ * whether or not to re-enable cost-based vacuum delay when vacuuming a table.
+ * If failsafe mode has been engaged, we will not re-enable cost-based delay
+ * for the table until after vacuuming has completed, regardless of other
+ * settings.
+ */
+bool		VacuumFailsafeActive = false;
+
 /*
  * Variables for cost-based parallel vacuum.  See comments atop
  * compute_parallel_delay to understand how it works.
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index bdfd96cfec..7b8ee21788 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -306,6 +306,7 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
 
+extern bool VacuumFailsafeActive;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
-- 
2.37.2
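
The reason for making the failsafe flag global can be illustrated with a small standalone sketch. It assumes that triggering the failsafe also turns the cost-based delay off, which is what the Assert in the later VacuumUpdateCosts() change implies. The names here are hypothetical stand-ins, not the backend's variables.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for VacuumFailsafeActive, VacuumCostActive, etc. */
static bool failsafe_active = false;
static bool cost_active = false;
static int	cost_balance = 0;

/* Called when the wraparound failsafe triggers: stop throttling. */
void
trigger_failsafe(void)
{
	failsafe_active = true;
	cost_active = false;
	cost_balance = 0;
}

/* Called before the next table is vacuumed. */
void
reset_failsafe(void)
{
	failsafe_active = false;
}

/*
 * Called after the cost delay may have changed (for example after a config
 * reload). While the failsafe is active, throttling must stay off.
 */
void
refresh_cost_active(double cost_delay)
{
	if (failsafe_active)
	{
		assert(!cost_active);
		return;
	}

	if (cost_delay > 0)
		cost_active = true;
	else
	{
		cost_active = false;
		cost_balance = 0;
	}
}

bool
is_cost_active(void)
{
	return cost_active;
}
```

Without the early return, a config reload in the middle of a failsafe vacuum would switch throttling back on whenever the delay setting is positive.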

#48Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Melanie Plageman (#44)
Re: Should vacuum process config file reload more often

On Tue, Apr 4, 2023 at 1:41 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Sun, Apr 2, 2023 at 10:28 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Thank you for updating the patches. Here are comments for the 0001,
0002, and 0003 patches:

Thanks for the review!

v13 attached with requested updates.

0001:

@@ -391,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
params->truncate != VACOPTVALUE_AUTO);
-        vacrel->failsafe_active = false;
+        VacuumFailsafeActive = false;

If we go with the idea of using VacuumCostActive +
VacuumFailsafeActive, we need to make sure that both are cleared at
the end of the vacuum per table. Since the patch clears it only here,
it remains true even after vacuum() if we trigger the failsafe mode
for the last table in the table list.

In addition, to ensure this also holds in an error case, I think we
also need to clear it in the PG_FINALLY() block in vacuum().

So, in 0001, I tried to keep it exactly the same as
LVRelState->failsafe_active except for it being a global. We don't
actually use VacuumFailsafeActive in this commit except in vacuumlazy.c,
which does its own management of the value (it resets it to false at the
top of heap_vacuum_rel()).

In the later commit which references VacuumFailsafeActive outside of
vacuumlazy.c, I had reset it in PG_FINALLY(). I hadn't reset it in the
relation list loop in vacuum(). Autovacuum calls vacuum() for each
relation. However, you are right that for VACUUM with a list of
relations for a table access method other than heap, once set to true,
if the table AM forgets to reset the value to false at the end of
vacuuming the relation, it would stay true.

I've set it to false now at the bottom of the loop through relations in
vacuum().

Agreed. Probably we can merge 0001 into 0003 but I leave it to
committers. The 0001 patch mostly looks good to me except for one
point:

@@ -391,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
         Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
         Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
                    params->truncate != VACOPTVALUE_AUTO);
-        vacrel->failsafe_active = false;
+        VacuumFailsafeActive = false;
         vacrel->consider_bypass_optimization = true;
         vacrel->do_index_vacuuming = true;

Looking at the 0003 patch, we set VacuumFailsafeActive to false per table:

+                        /*
+                         * Ensure VacuumFailsafeActive has been reset
before vacuuming the
+                         * next relation relation.
+                         */
+                        VacuumFailsafeActive = false;

Given that we ensure it's reset before vacuuming the next table, do we
need to reset it in heap_vacuum_rel?

(there is a typo; s/relation relation/relation/)

---
@@ -306,6 +306,7 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
extern PGDLLIMPORT int VacuumCostBalanceLocal;

+extern bool VacuumFailsafeActive;

Do we need PGDLLIMPORT for VacuumFailsafeActive?

I didn't add one because I thought extensions and other code probably
shouldn't access this variable. I thought PGDLLIMPORT was only needed
for extensions built on windows to access variables.

Agreed.

0002:

@@ -2388,6 +2398,7 @@ vac_max_items_to_alloc_size(int max_items)
return offsetof(VacDeadItems, items) +
sizeof(ItemPointerData) * max_items;
}

+
/*
* vac_tid_reaped() -- is a particular tid deletable?
*

Unnecessary new line. There are some other unnecessary new lines in this patch.

Thanks! I think I got them all.

---
@@ -307,6 +309,8 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
extern PGDLLIMPORT int VacuumCostBalanceLocal;
extern bool VacuumFailsafeActive;
+extern int     VacuumCostLimit;
+extern double VacuumCostDelay;

and

@@ -266,8 +266,6 @@ extern PGDLLIMPORT int max_parallel_maintenance_workers;
extern PGDLLIMPORT int VacuumCostPageHit;
extern PGDLLIMPORT int VacuumCostPageMiss;
extern PGDLLIMPORT int VacuumCostPageDirty;
-extern PGDLLIMPORT int VacuumCostLimit;
-extern PGDLLIMPORT double VacuumCostDelay;

Do we need PGDLLIMPORT too?

I was on the fence about this. I annotated the new guc variables
vacuum_cost_delay and vacuum_cost_limit with PGDLLIMPORT, but I did not
annotate the variables used in vacuum code (VacuumCostLimit/Delay). I
think whether or not this is the right choice depends on two things:
whether or not my understanding of PGDLLIMPORT is correct and, if it is,
whether or not we want extensions to be able to access
VacuumCostLimit/Delay or if just access to the guc variables is
sufficient/desirable.

I guess it would be better to keep both accessible for backward
compatibility. Extensions are able to access both GUC values and
values that are actually used for vacuum delays (as we used to use the
same variables).

---
@@ -1773,20 +1773,33 @@ FreeWorkerInfo(int code, Datum arg)
}
}
+
/*
- * Update the cost-based delay parameters, so that multiple workers consume
- * each a fraction of the total available I/O.
+ * Update vacuum cost-based delay-related parameters for autovacuum workers and
+ * backends executing VACUUM or ANALYZE using the value of relevant gucs and
+ * global state. This must be called during setup for vacuum and after every
+ * config reload to ensure up-to-date values.
*/
void
-AutoVacuumUpdateDelay(void)
+VacuumUpdateCosts(void

Isn't it better to define VacuumUpdateCosts() in vacuum.c rather than
autovacuum.c as this is now a common code for both vacuum and
autovacuum?

We can't access members of WorkerInfoData from inside vacuum.c.

Oops, you're right.

0003:

@@ -501,9 +502,9 @@ vacuum(List *relations, VacuumParams *params,
{
ListCell *cur;

-                VacuumUpdateCosts();
in_vacuum = true;
-                VacuumCostActive = (VacuumCostDelay > 0);
+                VacuumFailsafeActive = false;
+                VacuumUpdateCosts();

Hmm, if we initialize VacuumFailsafeActive here, should it be included
in 0001 patch?

See comment above. This is the first patch where we use or reference it
outside of vacuumlazy.c.

---
+        if (VacuumCostDelay > 0)
+                VacuumCostActive = true;
+        else
+        {
+                VacuumCostActive = false;
+                VacuumCostBalance = 0;
+        }

I agree to update VacuumCostActive in VacuumUpdateCosts(). But if we
do that I think this change should be included in 0002 patch.

I'm a bit hesitant to do this because in 0002 VacuumCostActive cannot
change status while vacuuming a table, or even between tables for VACUUM
when a list of relations is specified (except for being disabled by
failsafe mode). Adding it to VacuumUpdateCosts() in 0003 makes it clear
that it could change while vacuuming a table, so we must update it.

Agreed.
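
For readers following along, the interaction between the delay setting and failsafe mode can be sketched roughly as below. This is a standalone illustration, not code from the patch set: the globals are local stand-ins for the backend variables of the same name, `update_cost_active_sketch()` is a hypothetical helper, and the way the failsafe check is combined with the delay check is an assumption based on the stated goal that failsafe mode must not have cost-based delay re-enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the backend globals discussed in this thread. */
static bool VacuumFailsafeActive = false;
static bool VacuumCostActive = false;
static double VacuumCostDelay = 0;
static int	VacuumCostBalance = 0;

/*
 * Hypothetical helper: decide whether cost-based delay is active.
 * Assumption: failsafe mode always wins over a positive delay setting.
 */
static void
update_cost_active_sketch(void)
{
	if (VacuumCostDelay > 0 && !VacuumFailsafeActive)
		VacuumCostActive = true;
	else
	{
		/* Delay is off: drop any accumulated balance as well. */
		VacuumCostActive = false;
		VacuumCostBalance = 0;
	}
}
```

Calling such a helper after every config reload is what makes it possible for the active status to flip in the middle of a table, which is the reason given above for introducing it together with the reload logic.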

I previously had 0002 introduce AutoVacuumUpdateLimit(), which only
updated VacuumCostLimit with wi_cost_limit for autovacuum workers and
then called that in vacuum_delay_point() (instead of
AutoVacuumUpdateDelay() or VacuumUpdateCosts()). I abandoned that idea
in favor of the simplicity of having VacuumUpdateCosts() just update
those variables for everyone, since it could be reused in 0003.

Now, I'm thinking the previous method might be more clear?
Or is what I have okay?

I'm fine with the current one.

---
+        if (ConfigReloadPending && !analyze_in_outer_xact)
+        {
+                ConfigReloadPending = false;
+                ProcessConfigFile(PGC_SIGHUP);
+                VacuumUpdateCosts();
+        }

Since analyze_in_outer_xact is false by default, we reload the config
file in vacuum_delay_point() by default. We need to note that
vacuum_delay_point() could be called via other paths, for example
gin_cleanup_pending_list() and ambulkdelete() called by
validate_index(). So it seems to me that we should do the opposite; we
have another global variable, say vacuum_can_reload_config, which is
false by default, and is set to true only when vacuum() allows it. In
vacuum_delay_point(), we reload the config file iff
(ConfigReloadPending && vacuum_can_reload_config).

Wow, great point. Thanks for catching this. I've made the update you
suggested. I also set vacuum_can_reload_config to false in PG_FINALLY()
in vacuum().
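
The opt-in gate described above might look roughly like the following standalone sketch. It is only an illustration under stated assumptions: `reload_config_stub()` stands in for the `ProcessConfigFile(PGC_SIGHUP)` plus `VacuumUpdateCosts()` pair, the `*_sketch` functions are hypothetical simplifications of `vacuum_delay_point()` and `vacuum()`, and the plain assignment at the end of `vacuum_sketch()` stands in for the reset done in PG_FINALLY().

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the backend globals discussed above. */
static volatile bool ConfigReloadPending = false;
static bool VacuumCanReloadConfig = false;	/* false unless vacuum() opts in */
static int	reload_count = 0;	/* counts simulated config reloads */

/* Stub standing in for ProcessConfigFile(PGC_SIGHUP) + VacuumUpdateCosts(). */
static void
reload_config_stub(void)
{
	reload_count++;
}

/* Simplified delay point: reload only when the caller has opted in. */
static void
delay_point_sketch(void)
{
	if (ConfigReloadPending && VacuumCanReloadConfig)
	{
		ConfigReloadPending = false;
		reload_config_stub();
	}
}

/* Simplified vacuum(): opt in for the duration, always opt out at the end. */
static void
vacuum_sketch(void)
{
	assert(!VacuumCanReloadConfig);	/* not expected to be re-entered */
	VacuumCanReloadConfig = true;
	delay_point_sketch();
	VacuumCanReloadConfig = false;	/* the patch resets this in PG_FINALLY() */
}
```

With the flag defaulting to false, a caller that reaches the delay point through some other path (for example index cleanup outside of vacuum()) leaves the pending reload for the normal reload cadence.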

Here are some review comments for 0002-0004 patches:

0002:
-        if (MyWorkerInfo)
+        if (am_autovacuum_launcher)
+                return;
+
+        if (am_autovacuum_worker)
         {
-                VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
                 VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+                VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
+        }

Isn't it a bit safer to check MyWorkerInfo instead of
am_autovacuum_worker? Also, I don't think there is any reason why we
want to exclude only the autovacuum launcher.

---
+ * TODO: should VacuumCostLimit and VacuumCostDelay be initialized to valid or
+ * invalid values?

How about using the default value of normal backends, 200 and 0?

0003:

@@ -83,6 +84,7 @@ int                   vacuum_cost_limit;
  */
 int                    VacuumCostLimit = 0;
 double         VacuumCostDelay = -1;
+static bool vacuum_can_reload_config = false;

In vacuum.c, we use snake case for GUC parameters and camel case for
other global variables, so it seems better to rename it
VacuumCanReloadConfig. Sorry, that's my fault.

0004:

+                if (tab->at_dobalance)
+                        pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
+                else

The comment of pg_atomic_test_set_flag() says that it returns false if
the flag has not successfully been set:

* pg_atomic_test_set_flag - TAS()
*
* Returns true if the flag has successfully been set, false otherwise.
*
* Acquire (including read barrier) semantics.

But IIUC we don't need to worry about that as only one process updates
the flag, right? It might be a good idea to add some comments why we
don't need to check the return value.

---
- if (worker->wi_proc != NULL)
- elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
- worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
- worker->wi_dobalance ? "yes" : "no",
- worker->wi_cost_limit, worker->wi_cost_limit_base,
- worker->wi_cost_delay);

I think it's better to keep this kind of log in some form for
debugging. For example, we can show these values of autovacuum workers
in VacuumUpdateCosts().

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#49Daniel Gustafsson
daniel@yesql.se
In reply to: Melanie Plageman (#47)
Re: Should vacuum process config file reload more often

On 4 Apr 2023, at 00:35, Melanie Plageman <melanieplageman@gmail.com> wrote:

On Mon, Apr 3, 2023 at 3:08 PM Andres Freund <andres@anarazel.de> wrote:

On 2023-04-03 14:43:14 -0400, Tom Lane wrote:

Melanie Plageman <melanieplageman@gmail.com> writes:

v13 attached with requested updates.

I'm afraid I'd not been paying any attention to this discussion,
but better late than never. I'm okay with letting autovacuum
processes reload config files more often than now. However,
I object to allowing ProcessConfigFile to be called from within
commands in a normal user backend. The existing semantics are
that user backends respond to SIGHUP only at the start of processing
a user command, and I'm uncomfortable with suddenly deciding that
that can work differently if the command happens to be VACUUM.
It seems unprincipled and perhaps actively unsafe.

I think it should be ok in commands like VACUUM that already internally start
their own transactions, and thus require to be run outside of a transaction
and at the toplevel. I share your concerns about allowing config reload in
arbitrary places. While we might want to go there, it would require a lot more
analysis.

Thinking more on this I'm leaning towards going with allowing more frequent
reloads in autovacuum, and saving the same for VACUUM for more careful study.
The general case is probably fine but I'm not convinced that there aren't error
cases which can present unpleasant scenarios.

Regarding the autovacuum part of this patch I think we are down to the final
details and I think it's doable to finish this in time for 16.

As an alternative for your consideration, attached v14 set implements
the config file reload for autovacuum only (in 0003) and then enables it
for VACUUM and ANALYZE not in a nested transaction command (in 0004).

Previously I had the commits in the reverse order for ease of review (to
separate changes to worker limit balancing logic from config reload
code).

A few comments on top of already submitted reviews, will do another pass over
this later today.

+ * VacuumFailsafeActive is a defined as a global so that we can determine
+ * whether or not to re-enable cost-based vacuum delay when vacuuming a table.

This comment should be expanded to document who we expect to inspect this
variable in order to decide on cost-based vacuum.

Moving the failsafe switch into a global context means we face the risk of an
extension changing it independently of the GUCs that control it (or the code
relying on it) such that these are out of sync. External code messing up
internal state is not new and of course outside of our control, but it's worth
at least considering. There isn't too much we can do here, but perhaps expand
this comment to include a "do not change this" note?

+extern bool VacuumFailsafeActive;

While I agree with upthread review comments that extensions shouldn't poke at
this, not decorating it with PGDLLIMPORT adds little protection and only causes
inconsistencies in symbol exports across platforms. We only explicitly hide
symbols in shared libraries IIRC.

+extern int VacuumCostLimit;
+extern double VacuumCostDelay;
 ...
-extern PGDLLIMPORT int VacuumCostLimit;
-extern PGDLLIMPORT double VacuumCostDelay;

Same with these, I don't think this is according to our default visibility.
Moreover, I'm not sure it's a good idea to perform this rename. This will keep
VacuumCostLimit and VacuumCostDelay exported, but change their meaning. Any
external code referring to these thinking they are backing the GUCs will still
compile, but may be broken in subtle ways. Is there a reason for not keeping
the current GUC variables and instead adding net new ones?

+ * TODO: should VacuumCostLimit and VacuumCostDelay be initialized to valid or
+ * invalid values?
+ */
+int                    VacuumCostLimit = 0;
+double         VacuumCostDelay = -1;

I think the important part is to make sure they are never accessed without
VacuumUpdateCosts having been called first. I think that's the case here, but
it's not entirely clear. Do you see a codepath where that could happen? If
they are initialized to a sentinel value we also need to check for that, so
initializing to the defaults from the corresponding GUCs seems better.

+* Update VacuumCostLimit with the correct value for an autovacuum worker, given

Trivial whitespace error in function comment.

+static double av_relopt_cost_delay = -1;
+static int av_relopt_cost_limit = 0;

These need a comment IMO, ideally one that explains why they are initialized to
those values.

+       /* There is at least 1 autovac worker (this worker). */
+       Assert(nworkers_for_balance > 0);

Is there a scenario where this is expected to fail? If so I think this should
be handled and not just an Assert.

--
Daniel Gustafsson

#50Melanie Plageman
melanieplageman@gmail.com
In reply to: Daniel Gustafsson (#49)
3 attachment(s)
Re: Should vacuum process config file reload more often

On Tue, Apr 4, 2023 at 4:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

The 0001 patch mostly looks good to me except for one
point:

@@ -391,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
params->truncate != VACOPTVALUE_AUTO);
-        vacrel->failsafe_active = false;
+        VacuumFailsafeActive = false;
vacrel->consider_bypass_optimization = true;
vacrel->do_index_vacuuming = true;

Looking at the 0003 patch, we set VacuumFailsafeActive to false per table:

+                        /*
+                         * Ensure VacuumFailsafeActive has been reset
before vacuuming the
+                         * next relation relation.
+                         */
+                        VacuumFailsafeActive = false;

Given that we ensure it's reset before vacuuming the next table, do we
need to reset it in heap_vacuum_rel?

I've changed the one in heap_vacuum_rel() to an assert.

(there is a typo; s/relation relation/relation/)

Thanks! fixed.

0002:

@@ -2388,6 +2398,7 @@ vac_max_items_to_alloc_size(int max_items)
return offsetof(VacDeadItems, items) +
sizeof(ItemPointerData) * max_items;
}

+
/*
* vac_tid_reaped() -- is a particular tid deletable?
*

Unnecessary new line. There are some other unnecessary new lines in this patch.

Thanks! I think I got them all.

---
@@ -307,6 +309,8 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
extern PGDLLIMPORT int VacuumCostBalanceLocal;
extern bool VacuumFailsafeActive;
+extern int     VacuumCostLimit;
+extern double VacuumCostDelay;

and

@@ -266,8 +266,6 @@ extern PGDLLIMPORT int max_parallel_maintenance_workers;
extern PGDLLIMPORT int VacuumCostPageHit;
extern PGDLLIMPORT int VacuumCostPageMiss;
extern PGDLLIMPORT int VacuumCostPageDirty;
-extern PGDLLIMPORT int VacuumCostLimit;
-extern PGDLLIMPORT double VacuumCostDelay;

Do we need PGDLLIMPORT too?

I was on the fence about this. I annotated the new guc variables
vacuum_cost_delay and vacuum_cost_limit with PGDLLIMPORT, but I did not
annotate the variables used in vacuum code (VacuumCostLimit/Delay). I
think whether or not this is the right choice depends on two things:
whether or not my understanding of PGDLLIMPORT is correct and, if it is,
whether or not we want extensions to be able to access
VacuumCostLimit/Delay or if just access to the guc variables is
sufficient/desirable.

I guess it would be better to keep both accessible for backward
compatibility. Extensions are able to access both GUC values and
values that are actually used for vacuum delays (as we used to use the
same variables).

Here are some review comments for 0002-0004 patches:

0002:
-        if (MyWorkerInfo)
+        if (am_autovacuum_launcher)
+                return;
+
+        if (am_autovacuum_worker)
{
-                VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+                VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
+        }

Isn't it a bit safer to check MyWorkerInfo instead of
am_autovacuum_worker?

Ah, since we access it. I've made the change.

Also, I don't think there is any reason why we want to exclude only
the autovacuum launcher.

My rationale is that the launcher is the only other process type which
might reasonably be executing this code besides autovac workers, client
backends doing VACUUM/ANALYZE, and parallel vacuum workers. Is it
confusing to have the launcher have VacuumCostLimit and VacuumCostDelay
set to the guc values for explicit VACUUM and ANALYZE -- even if the
launcher doesn't use these variables?

I've removed the check, because I do agree with you that it may be
unnecessarily confusing in the code.

---
+ * TODO: should VacuumCostLimit and VacuumCostDelay be initialized to valid or
+ * invalid values?

How about using the default value of normal backends, 200 and 0?

I've gone with this suggestion.

0003:

@@ -83,6 +84,7 @@ int                   vacuum_cost_limit;
*/
int                    VacuumCostLimit = 0;
double         VacuumCostDelay = -1;
+static bool vacuum_can_reload_config = false;

In vacuum.c, we use snake case for GUC parameters and camel case for
other global variables, so it seems better to rename it
VacuumCanReloadConfig. Sorry, that's my fault.

I have renamed it.

0004:

+                if (tab->at_dobalance)
+                        pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
+                else

The comment of pg_atomic_test_set_flag() says that it returns false if
the flag has not successfully been set:

* pg_atomic_test_set_flag - TAS()
*
* Returns true if the flag has successfully been set, false otherwise.
*
* Acquire (including read barrier) semantics.

But IIUC we don't need to worry about that as only one process updates
the flag, right? It might be a good idea to add some comments why we
don't need to check the return value.

I have added this comment.
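
As an aside, the return-value question can be illustrated outside the backend with C11 atomics. This is only an analogy under stated assumptions: `test_set_flag_sketch()` is a hypothetical wrapper that mimics the documented "true if the flag has successfully been set" contract, and `wi_dobalance_sketch` is a local stand-in for the worker's flag. Note that C11's `atomic_flag_test_and_set()` returns the previous value, so the wrapper inverts it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Local stand-in for the worker's wi_dobalance flag. */
static atomic_flag wi_dobalance_sketch = ATOMIC_FLAG_INIT;

/* Mimics the documented contract: true only if this call set the flag. */
static bool
test_set_flag_sketch(atomic_flag *flag)
{
	/* C11 returns the previous value, so invert it. */
	return !atomic_flag_test_and_set(flag);
}

/*
 * Hypothetical setter used by the single owning process.  Whether the flag
 * was already set or not, it is set afterwards, so with only one writer the
 * return value carries no information we need to act on.
 */
static void
set_dobalance_sketch(bool dobalance)
{
	if (dobalance)
		(void) test_set_flag_sketch(&wi_dobalance_sketch);
	else
		atomic_flag_clear(&wi_dobalance_sketch);
}
```

The point of the sketch is that a "false" result only means the flag was already set, which is harmless when a single process is the only writer.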

---
- if (worker->wi_proc != NULL)
- elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
- worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
- worker->wi_dobalance ? "yes" : "no",
- worker->wi_cost_limit, worker->wi_cost_limit_base,
- worker->wi_cost_delay);

I think it's better to keep this kind of log in some form for
debugging. For example, we can show these values of autovacuum workers
in VacuumUpdateCosts().

I added a message to do_autovacuum() after calling VacuumUpdateCosts()
in the loop vacuuming each table. That means it will happen once per
table. It's not ideal that I had to move the call to VacuumUpdateCosts()
behind the shared lock in that loop so that we could access the pid and
such in the logging message after updating the cost and delay, but it is
probably okay. Though no one is going to be changing those at this
point, it still seemed better to access them under the lock.

This does mean we won't log anything when we do change the values of
VacuumCostDelay and VacuumCostLimit while vacuuming a table. Is it worth
adding some code to do that in VacuumUpdateCosts() (only when the value
has changed not on every call to VacuumUpdateCosts())? Or perhaps we
could add it in the config reload branch that is already in
vacuum_delay_point()?

On Tue, Apr 4, 2023 at 9:36 AM Daniel Gustafsson <daniel@yesql.se> wrote:

Thanks for the review!

On 4 Apr 2023, at 00:35, Melanie Plageman <melanieplageman@gmail.com> wrote:

On Mon, Apr 3, 2023 at 3:08 PM Andres Freund <andres@anarazel.de> wrote:

On 2023-04-03 14:43:14 -0400, Tom Lane wrote:

Melanie Plageman <melanieplageman@gmail.com> writes:

v13 attached with requested updates.

I'm afraid I'd not been paying any attention to this discussion,
but better late than never. I'm okay with letting autovacuum
processes reload config files more often than now. However,
I object to allowing ProcessConfigFile to be called from within
commands in a normal user backend. The existing semantics are
that user backends respond to SIGHUP only at the start of processing
a user command, and I'm uncomfortable with suddenly deciding that
that can work differently if the command happens to be VACUUM.
It seems unprincipled and perhaps actively unsafe.

I think it should be ok in commands like VACUUM that already internally start
their own transactions, and thus require to be run outside of a transaction
and at the toplevel. I share your concerns about allowing config reload in
arbitrary places. While we might want to go there, it would require a lot more
analysis.

Thinking more on this I'm leaning towards going with allowing more frequent
reloads in autovacuum, and saving the same for VACUUM for more careful study.
The general case is probably fine but I'm not convinced that there aren't error
cases which can present unpleasant scenarios.

In attached v15, I've dropped support for VACUUM and non-nested ANALYZE.
It is like a 5 line change and could be added back at any time.

As an alternative for your consideration, attached v14 set implements
the config file reload for autovacuum only (in 0003) and then enables it
for VACUUM and ANALYZE not in a nested transaction command (in 0004).

Previously I had the commits in the reverse order for ease of review (to
separate changes to worker limit balancing logic from config reload
code).

A few comments on top of already submitted reviews, will do another pass over
this later today.

+ * VacuumFailsafeActive is a defined as a global so that we can determine
+ * whether or not to re-enable cost-based vacuum delay when vacuuming a table.

This comment should be expanded to document who we expect to inspect this
variable in order to decide on cost-based vacuum.

Moving the failsafe switch into a global context means we face the risk of an
extension changing it independently of the GUCs that control it (or the code
relying on it) such that these are out of sync. External code messing up
internal state is not new and of course outside of our control, but it's worth
at least considering. There isn't too much we can do here, but perhaps expand
this comment to include a "do not change this" note?

I've updated the comment to mention how table AM-agnostic VACUUM code
uses it and to say that table AMs can set it if they want that behavior.

+extern bool VacuumFailsafeActive;

While I agree with upthread review comments that extensions shouldn't poke at
this, not decorating it with PGDLLIMPORT adds little protection and only causes
inconsistencies in symbol exports across platforms. We only explicitly hide
symbols in shared libraries IIRC.

I've updated this.

+extern int VacuumCostLimit;
+extern double VacuumCostDelay;
...
-extern PGDLLIMPORT int VacuumCostLimit;
-extern PGDLLIMPORT double VacuumCostDelay;

Same with these, I don't think this is according to our default visibility.
Moreover, I'm not sure it's a good idea to perform this rename. This will keep
VacuumCostLimit and VacuumCostDelay exported, but change their meaning. Any
external code referring to these thinking they are backing the GUCs will still
compile, but may be broken in subtle ways. Is there a reason for not keeping
the current GUC variables and instead adding net new ones?

When VacuumCostLimit was the same variable in the code and for the GUC
vacuum_cost_limit, every time we reload the config file, VacuumCostLimit
is overwritten. Autovacuum workers have to overwrite this value with the
appropriate one for themselves given the balancing logic and the value
of autovacuum_vacuum_cost_limit. However, the problem is, because you
can specify -1 for autovacuum_vacuum_cost_limit to indicate it should
fall back to vacuum_cost_limit, we have to reference the value of
VacuumCostLimit when calculating the new autovacuum worker's cost limit
after a config reload.

But, you have to be sure you *only* do this after a config reload when
the value of VacuumCostLimit is fresh and unmodified or you risk
dividing the value of VacuumCostLimit over and over. That means it is
unsafe to call functions updating the cost limit more than once.

This orchestration wasn't as difficult when we only reloaded the config
file once per table. We were careful about it and also kept the
original "base" cost limit around from table_recheck_autovac(). However,
once we start reloading the config file more often, this no longer
works.

By separating the variables modified when the gucs are set and the ones
used in the code, we can make sure we always have the original value the
guc was set to in vacuum_cost_limit and autovacuum_vacuum_cost_limit,
whenever we need to reference it.
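
To make the hazard concrete, here is a small standalone sketch of the separated-variables approach. It is a simplification under stated assumptions: the globals are local stand-ins, `update_cost_limit_sketch()` is a hypothetical helper, and the even split across workers stands in for the real balancing logic. Because the helper reads only the GUC-backed values and writes only the working value, calling it repeatedly cannot divide the limit over and over.

```c
#include <assert.h>

/* GUC-backed values: only a (simulated) config reload would write these. */
static int	vacuum_cost_limit = 200;
static int	autovacuum_vacuum_cost_limit = -1;	/* -1: use vacuum_cost_limit */

/* Working value consumed by the vacuum code; recomputed, never fed back. */
static int	VacuumCostLimit = 200;

/*
 * Hypothetical helper: derive this worker's limit from the untouched GUC
 * values.  Assumption: an even split across balanced workers.
 */
static void
update_cost_limit_sketch(int nworkers_for_balance)
{
	int			base;

	assert(nworkers_for_balance > 0);

	base = (autovacuum_vacuum_cost_limit > 0) ?
		autovacuum_vacuum_cost_limit : vacuum_cost_limit;

	VacuumCostLimit = base / nworkers_for_balance;
	if (VacuumCostLimit < 1)
		VacuumCostLimit = 1;	/* never let the limit reach zero */
}
```

Had the helper instead read its base from VacuumCostLimit itself, a second call with four workers would yield 12 rather than 50, which is the repeated-division problem described above.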

That being said, perhaps we should document what extensions should do?
Do you think they will want to use the variables backing the gucs or to
be able to overwrite the variables being used in the code?

Oh, also I've annotated these with PGDLLIMPORT too.

+ * TODO: should VacuumCostLimit and VacuumCostDelay be initialized to valid or
+ * invalid values?
+ */
+int                    VacuumCostLimit = 0;
+double         VacuumCostDelay = -1;

I think the important part is to make sure they are never accessed without
VacuumUpdateCosts having been called first. I think that's the case here, but
it's not entirely clear. Do you see a codepath where that could happen? If
they are initialized to a sentinel value we also need to check for that, so
initializing to the defaults from the corresponding GUCs seems better.

I don't see a case where autovacuum could access these without calling
VacuumUpdateCosts() first. I think the other callers of
vacuum_delay_point() are the issue (gist/gin/hash/etc).

It might need a bit more thought.

My concern was that these variables each correspond to multiple GUCs
depending on the backend type, and those backends have different
defaults (e.g. an autovac worker's default cost delay is different from
that of a client backend doing vacuum).

However, what I have done in this version is initialize them to the
defaults for a client backend executing VACUUM or ANALYZE, since I am
fairly confident that autovacuum will not use them without calling
VacuumUpdateCosts().

+* Update VacuumCostLimit with the correct value for an autovacuum worker, given

Trivial whitespace error in function comment.

Fixed.

+static double av_relopt_cost_delay = -1;
+static int av_relopt_cost_limit = 0;

These need a comment IMO, ideally one that explains why they are initialized to
those values.

I've added a comment.

+       /* There is at least 1 autovac worker (this worker). */
+       Assert(nworkers_for_balance > 0);

Is there a scenario where this is expected to fail? If so I think this should
be handled and not just an Assert.

No, this isn't expected to happen because an autovacuum worker would
have called autovac_recalculate_workers_for_balance() before calling
VacuumUpdateCosts() (which calls AutoVacuumUpdateLimit()) in
do_autovacuum(). But, if someone were to move around or add a call to
VacuumUpdateCosts() there is a chance it could happen.

- Melanie

Attachments:

v15-0001-Make-vacuum-s-failsafe_active-a-global.patchtext/x-patch; charset=US-ASCII; name=v15-0001-Make-vacuum-s-failsafe_active-a-global.patchDownload
From 5b11ac2a9b1ddba24b95c5b2a33d791d99e615ba Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Fri, 31 Mar 2023 10:38:39 -0400
Subject: [PATCH v15 1/3] Make vacuum's failsafe_active a global

While vacuuming a table in failsafe mode, VacuumCostActive should not be
re-enabled. This currently isn't a problem because vacuum cost
parameters are only refreshed in between vacuuming tables and failsafe
status is reset for every table. In preparation for allowing vacuum cost
parameters to be updated more frequently, elevate
LVRelState->failsafe_active to a global, VacuumFailsafeActive, which
will be checked when determining whether or not to re-enable vacuum
cost-related delays.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 16 +++++++---------
 src/backend/commands/vacuum.c        | 12 ++++++++++++
 src/include/commands/vacuum.h        |  1 +
 3 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 639179aa46..2ba85bd3d6 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -153,8 +153,6 @@ typedef struct LVRelState
 	bool		aggressive;
 	/* Use visibility map to skip? (disabled by DISABLE_PAGE_SKIPPING) */
 	bool		skipwithvm;
-	/* Wraparound failsafe has been triggered? */
-	bool		failsafe_active;
 	/* Consider index vacuuming bypass optimization? */
 	bool		consider_bypass_optimization;
 
@@ -391,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
 	Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
 		   params->truncate != VACOPTVALUE_AUTO);
-	vacrel->failsafe_active = false;
+	VacuumFailsafeActive = false;
 	vacrel->consider_bypass_optimization = true;
 	vacrel->do_index_vacuuming = true;
 	vacrel->do_index_cleanup = true;
@@ -709,7 +707,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 			}
 			else
 			{
-				if (!vacrel->failsafe_active)
+				if (!VacuumFailsafeActive)
 					appendStringInfoString(&buf, _("index scan bypassed: "));
 				else
 					appendStringInfoString(&buf, _("index scan bypassed by failsafe: "));
@@ -2293,7 +2291,7 @@ lazy_vacuum(LVRelState *vacrel)
 		 * vacuuming or heap vacuuming.  This VACUUM operation won't end up
 		 * back here again.
 		 */
-		Assert(vacrel->failsafe_active);
+		Assert(VacuumFailsafeActive);
 	}
 
 	/*
@@ -2374,7 +2372,7 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	 */
 	Assert(vacrel->num_index_scans > 0 ||
 		   vacrel->dead_items->num_items == vacrel->lpdead_items);
-	Assert(allindexes || vacrel->failsafe_active);
+	Assert(allindexes || VacuumFailsafeActive);
 
 	/*
 	 * Increase and report the number of index scans.
@@ -2616,12 +2614,12 @@ static bool
 lazy_check_wraparound_failsafe(LVRelState *vacrel)
 {
 	/* Don't warn more than once per VACUUM */
-	if (vacrel->failsafe_active)
+	if (VacuumFailsafeActive)
 		return true;
 
 	if (unlikely(vacuum_xid_failsafe_check(&vacrel->cutoffs)))
 	{
-		vacrel->failsafe_active = true;
+		VacuumFailsafeActive = true;
 
 		/*
 		 * Abandon use of a buffer access strategy to allow use of all of
@@ -2820,7 +2818,7 @@ should_attempt_truncation(LVRelState *vacrel)
 {
 	BlockNumber possibly_freeable;
 
-	if (!vacrel->do_rel_truncate || vacrel->failsafe_active ||
+	if (!vacrel->do_rel_truncate || VacuumFailsafeActive ||
 		old_snapshot_threshold >= 0)
 		return false;
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index da85330ef4..cf3abb072c 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -72,6 +72,18 @@ int			vacuum_multixact_freeze_table_age;
 int			vacuum_failsafe_age;
 int			vacuum_multixact_failsafe_age;
 
+/*
+ * VacuumFailsafeActive is defined as a global so that we can determine
+ * whether or not to re-enable cost-based vacuum delay when vacuuming a table.
+ * If failsafe mode has been engaged, we will not re-enable cost-based delay
+ * for the table until after vacuuming has completed, regardless of other
+ * settings. Only VACUUM code should inspect this variable and only table
+ * access methods should set it. In Table AM-agnostic VACUUM code, this
+ * variable controls whether or not to allow cost-based delays. Table AMs are
+ * free to use it if they desire this behavior.
+ */
+bool		VacuumFailsafeActive = false;
+
 /*
  * Variables for cost-based parallel vacuum.  See comments atop
  * compute_parallel_delay to understand how it works.
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index bdfd96cfec..7219c6ba9c 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -306,6 +306,7 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
 
+extern PGDLLIMPORT bool VacuumFailsafeActive;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
-- 
2.37.2
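
The comment added for VacuumFailsafeActive in the patch above says that once failsafe mode is engaged, cost-based delay is not re-enabled for that table, whatever the settings say. A minimal standalone sketch of that gating follows. It is not PostgreSQL code: the CostState struct and the functions engage_failsafe() and update_cost_active() are hypothetical names used only for illustration, and the sketch assumes that engaging the failsafe also switches cost-based delay off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the backend-local vacuum cost state. */
typedef struct CostState
{
	bool		failsafe_active;	/* plays the role of VacuumFailsafeActive */
	bool		cost_active;		/* plays the role of VacuumCostActive */
	double		cost_delay;			/* plays the role of VacuumCostDelay */
	int			cost_balance;		/* plays the role of VacuumCostBalance */
} CostState;

/* Assumption: engaging the failsafe turns cost-based delay off. */
void
engage_failsafe(CostState *s)
{
	assert(s != NULL);
	s->failsafe_active = true;
	s->cost_active = false;
	s->cost_balance = 0;
}

/*
 * Apply a (possibly reloaded) delay setting. While the failsafe is active
 * the new value is recorded but delays stay disabled.
 */
void
update_cost_active(CostState *s, double new_delay)
{
	assert(s != NULL);
	s->cost_delay = new_delay;

	if (s->failsafe_active)
		return;					/* never re-enable during failsafe */

	if (new_delay > 0)
		s->cost_active = true;
	else
	{
		s->cost_active = false;
		s->cost_balance = 0;
	}
}
```

In this sketch a config reload that raises the delay after engage_failsafe() leaves cost_active false, which is the behavior the comment describes.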

v15-0003-Autovacuum-refreshes-cost-based-delay-params-mor.patch (text/x-patch; charset=US-ASCII)
From 010136b5ad848590e6dfc139b45a0008e1d2afb8 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 25 Mar 2023 14:14:55 -0400
Subject: [PATCH v15 3/3] Autovacuum refreshes cost-based delay params more
 often

Allow autovacuum to reload the config file more often so that cost-based
delay parameters can take effect while VACUUMing a relation. Previously,
autovacuum workers reloaded the config file only once per relation
vacuumed, so config changes could not take effect until the worker began
vacuuming the next table.

Now, check if a reload is pending roughly once per block, when checking
if we need to delay.

In order for autovacuum workers to safely update their own cost delay
and cost limit parameters without impacting performance, we had to
rethink when and how these values were accessed.

Previously, an autovacuum worker's wi_cost_limit was set only at the
beginning of vacuuming a table, after reloading the config file.
Therefore, at the time that autovac_balance_cost() is called, workers
vacuuming tables with no table options could still have different values
for their wi_cost_limit_base and wi_cost_delay.

Now that the cost parameters can be updated while vacuuming a table,
workers will (within some margin of error) have no reason to have
different values for cost limit and cost delay (in the absence of table
options). This removes the rationale for keeping cost limit and cost
delay in shared memory. Balancing the cost limit requires only the
number of active autovacuum workers vacuuming a table with no cost-based
table options.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c  |   2 +-
 src/backend/commands/vacuum.c         |  44 ++++-
 src/backend/commands/vacuumparallel.c |   1 -
 src/backend/postmaster/autovacuum.c   | 266 +++++++++++++++-----------
 src/include/commands/vacuum.h         |   1 +
 5 files changed, 196 insertions(+), 118 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 2ba85bd3d6..0a9ebd22bd 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -389,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
 	Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
 		   params->truncate != VACOPTVALUE_AUTO);
-	VacuumFailsafeActive = false;
+	Assert(!VacuumFailsafeActive);
 	vacrel->consider_bypass_optimization = true;
 	vacrel->do_index_vacuuming = true;
 	vacrel->do_index_cleanup = true;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index b1fc7a0efc..9964c9ea4d 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -48,6 +48,7 @@
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/interrupt.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/pmsignal.h"
@@ -512,9 +513,9 @@ vacuum(List *relations, VacuumParams *params,
 	{
 		ListCell   *cur;
 
-		VacuumUpdateCosts();
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
+		VacuumFailsafeActive = false;
+		VacuumUpdateCosts();
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
@@ -568,12 +569,20 @@ vacuum(List *relations, VacuumParams *params,
 					CommandCounterIncrement();
 				}
 			}
+
+			/*
+			 * Ensure VacuumFailsafeActive has been reset before vacuuming the
+			 * next relation.
+			 */
+			VacuumFailsafeActive = false;
 		}
 	}
 	PG_FINALLY();
 	{
 		in_vacuum = false;
 		VacuumCostActive = false;
+		VacuumFailsafeActive = false;
+		VacuumCostBalance = 0;
 	}
 	PG_END_TRY();
 
@@ -2240,7 +2249,27 @@ vacuum_delay_point(void)
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	if (!VacuumCostActive || InterruptPending)
+	if (InterruptPending ||
+		(!VacuumCostActive && !ConfigReloadPending))
+		return;
+
+	/*
+	 * Reload the configuration file if requested. This allows changes to
+	 * autovacuum_vacuum_cost_limit and autovacuum_vacuum_cost_delay to take
+	 * effect while a table is being vacuumed or analyzed.
+	 */
+	if (ConfigReloadPending && IsAutoVacuumWorkerProcess())
+	{
+		ConfigReloadPending = false;
+		ProcessConfigFile(PGC_SIGHUP);
+		VacuumUpdateCosts();
+	}
+
+	/*
+	 * If we disabled cost-based delays after reloading the config file,
+	 * return.
+	 */
+	if (!VacuumCostActive)
 		return;
 
 	/*
@@ -2273,7 +2302,14 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		VacuumUpdateCosts();
+		/*
+		 * Balance and update limit values for autovacuum workers. We must
+		 * always do this in case the autovacuum launcher or another
+		 * autovacuum worker has recalculated the number of workers across
+		 * which we must balance the limit. This is done by the launcher when
+		 * launching a new worker and by workers before vacuuming each table.
+		 */
+		AutoVacuumUpdateLimit();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0b59c922e4..e200d5caf8 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -995,7 +995,6 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												 false);
 
 	/* Set cost-based vacuum delay */
-	VacuumCostActive = (VacuumCostDelay > 0);
 	VacuumUpdateCosts();
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index ce7e009576..0c1e4652fb 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -139,6 +139,17 @@ int			Log_autovacuum_min_duration = 600000;
 static bool am_autovacuum_launcher = false;
 static bool am_autovacuum_worker = false;
 
+/*
+ * Variables to save the cost-related table options for the current relation
+ * being vacuumed by this autovacuum worker. Using these, we can ensure we
+ * don't overwrite the values of VacuumCostDelay and VacuumCostLimit after
+ * reloading the configuration file. They are initialized to "invalid" values
+ * to indicate no table options were specified and will be set in
+ * do_autovacuum() after checking the table options in table_recheck_autovac().
+ */
+static double av_relopt_cost_delay = -1;
+static int	av_relopt_cost_limit = 0;
+
 /* Flags set by signal handlers */
 static volatile sig_atomic_t got_SIGUSR2 = false;
 
@@ -189,8 +200,8 @@ typedef struct autovac_table
 {
 	Oid			at_relid;
 	VacuumParams at_params;
-	double		at_vacuum_cost_delay;
-	int			at_vacuum_cost_limit;
+	double		at_relopt_vac_cost_delay;
+	int			at_relopt_vac_cost_limit;
 	bool		at_dobalance;
 	bool		at_sharedrel;
 	char	   *at_relname;
@@ -209,7 +220,7 @@ typedef struct autovac_table
  * wi_sharedrel flag indicating whether table is marked relisshared
  * wi_proc		pointer to PGPROC of the running worker, NULL if not started
  * wi_launchtime Time at which this worker was launched
- * wi_cost_*	Vacuum cost-based delay parameters current in this worker
+ * wi_dobalance Whether this worker should be included in balance calculations
  *
  * All fields are protected by AutovacuumLock, except for wi_tableoid and
  * wi_sharedrel which are protected by AutovacuumScheduleLock (note these
@@ -223,11 +234,8 @@ typedef struct WorkerInfoData
 	Oid			wi_tableoid;
 	PGPROC	   *wi_proc;
 	TimestampTz wi_launchtime;
-	bool		wi_dobalance;
+	pg_atomic_flag wi_dobalance;
 	bool		wi_sharedrel;
-	double		wi_cost_delay;
-	int			wi_cost_limit;
-	int			wi_cost_limit_base;
 } WorkerInfoData;
 
 typedef struct WorkerInfoData *WorkerInfo;
@@ -273,6 +281,8 @@ typedef struct AutoVacuumWorkItem
  * av_startingWorker pointer to WorkerInfo currently being started (cleared by
  *					the worker itself as soon as it's up and running)
  * av_workItems		work item array
+ * av_nworkersForBalance the number of autovacuum workers to use when
+ * 					calculating the per worker cost limit
  *
  * This struct is protected by AutovacuumLock, except for av_signal and parts
  * of the worker list (see above).
@@ -286,6 +296,7 @@ typedef struct
 	dlist_head	av_runningWorkers;
 	WorkerInfo	av_startingWorker;
 	AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+	pg_atomic_uint32 av_nworkersForBalance;
 } AutoVacuumShmemStruct;
 
 static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -319,7 +330,7 @@ static void launch_worker(TimestampTz now);
 static List *get_database_list(void);
 static void rebuild_database_list(Oid newdb);
 static int	db_comparator(const void *a, const void *b);
-static void autovac_balance_cost(void);
+static void autovac_recalculate_workers_for_balance(void);
 
 static void do_autovacuum(void);
 static void FreeWorkerInfo(int code, Datum arg);
@@ -670,7 +681,7 @@ AutoVacLauncherMain(int argc, char *argv[])
 			{
 				LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 				AutoVacuumShmem->av_signal[AutoVacRebalance] = false;
-				autovac_balance_cost();
+				autovac_recalculate_workers_for_balance();
 				LWLockRelease(AutovacuumLock);
 			}
 
@@ -820,8 +831,8 @@ HandleAutoVacLauncherInterrupts(void)
 			AutoVacLauncherShutdown();
 
 		/* rebalance in case the default cost parameters changed */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-		autovac_balance_cost();
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
+		autovac_recalculate_workers_for_balance();
 		LWLockRelease(AutovacuumLock);
 
 		/* rebuild the list in case the naptime changed */
@@ -1755,10 +1766,7 @@ FreeWorkerInfo(int code, Datum arg)
 		MyWorkerInfo->wi_sharedrel = false;
 		MyWorkerInfo->wi_proc = NULL;
 		MyWorkerInfo->wi_launchtime = 0;
-		MyWorkerInfo->wi_dobalance = false;
-		MyWorkerInfo->wi_cost_delay = 0;
-		MyWorkerInfo->wi_cost_limit = 0;
-		MyWorkerInfo->wi_cost_limit_base = 0;
+		pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
 		dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 						&MyWorkerInfo->wi_links);
 		/* not mine anymore */
@@ -1784,97 +1792,119 @@ VacuumUpdateCosts(void)
 {
 	if (MyWorkerInfo)
 	{
-		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
-		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+		if (av_relopt_cost_delay >= 0)
+			VacuumCostDelay = av_relopt_cost_delay;
+		else if (autovacuum_vac_cost_delay >= 0)
+			VacuumCostDelay = autovacuum_vac_cost_delay;
+		else
+			/* fall back to vacuum_cost_delay */
+			VacuumCostDelay = vacuum_cost_delay;
+
+		AutoVacuumUpdateLimit();
 	}
 	else
 	{
 		/* Must be explicit VACUUM or ANALYZE */
-		VacuumCostLimit = vacuum_cost_limit;
 		VacuumCostDelay = vacuum_cost_delay;
+		VacuumCostLimit = vacuum_cost_limit;
+	}
+
+	/*
+	 * If configuration changes are allowed to impact VacuumCostActive, make
+	 * sure it is updated.
+	 */
+	if (VacuumFailsafeActive)
+	{
+		Assert(!VacuumCostActive);
+		return;
+	}
+
+	if (VacuumCostDelay > 0)
+		VacuumCostActive = true;
+	else
+	{
+		VacuumCostActive = false;
+		VacuumCostBalance = 0;
 	}
 }
 
 /*
- * autovac_balance_cost
- *		Recalculate the cost limit setting for each active worker.
- *
- * Caller must hold the AutovacuumLock in exclusive mode.
+ * Update VacuumCostLimit with the correct value for an autovacuum worker, given
+ * the value of other relevant cost limit parameters and the number of workers
+ * across which the limit must be balanced. Autovacuum workers must call this
+ * regularly in case av_nworkersForBalance has been updated by another worker
+ * or by the autovacuum launcher. They must also call it after a config reload.
  */
-static void
-autovac_balance_cost(void)
+void
+AutoVacuumUpdateLimit(void)
 {
+	if (!MyWorkerInfo)
+		return;
+
 	/*
-	 * The idea here is that we ration out I/O equally.  The amount of I/O
-	 * that a worker can consume is determined by cost_limit/cost_delay, so we
-	 * try to equalize those ratios rather than the raw limit settings.
-	 *
 	 * note: in cost_limit, zero also means use value from elsewhere, because
 	 * zero is not a valid value.
 	 */
-	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : vacuum_cost_limit);
-	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : vacuum_cost_delay);
-	double		cost_total;
-	double		cost_avail;
-	dlist_iter	iter;
 
-	/* not set? nothing to do */
-	if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
-		return;
-
-	/* calculate the total base cost limit of participating active workers */
-	cost_total = 0.0;
-	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
+	if (av_relopt_cost_limit > 0)
+		VacuumCostLimit = av_relopt_cost_limit;
+	else
 	{
-		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
+		int			nworkers_for_balance;
+
+		if (autovacuum_vac_cost_limit > 0)
+			VacuumCostLimit = autovacuum_vac_cost_limit;
+		else
+			VacuumCostLimit = vacuum_cost_limit;
+
+		/* Only balance limit if no table options specified */
+		if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
+			return;
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-			cost_total +=
-				(double) worker->wi_cost_limit_base / worker->wi_cost_delay;
+		Assert(VacuumCostLimit > 0);
+
+		nworkers_for_balance = pg_atomic_read_u32(
+								&AutoVacuumShmem->av_nworkersForBalance);
+
+		/* There is at least 1 autovac worker (this worker). */
+		Assert(nworkers_for_balance > 0);
+
+		VacuumCostLimit = Max(VacuumCostLimit / nworkers_for_balance, 1);
 	}
+}
 
-	/* there are no cost limits -- nothing to do */
-	if (cost_total <= 0)
-		return;
+/*
+ * autovac_recalculate_workers_for_balance
+ *		Recalculate the number of workers to consider, given table options and
+ *		the current number of active workers.
+ *
+ * Caller must hold the AutovacuumLock in at least shared mode to access
+ * worker->wi_proc.
+ */
+static void
+autovac_recalculate_workers_for_balance(void)
+{
+	dlist_iter	iter;
+	int			orig_nworkers_for_balance;
+	int			nworkers_for_balance = 0;
+
+	orig_nworkers_for_balance =
+		pg_atomic_read_u32(&AutoVacuumShmem->av_nworkersForBalance);
 
-	/*
-	 * Adjust cost limit of each active worker to balance the total of cost
-	 * limit to autovacuum_vacuum_cost_limit.
-	 */
-	cost_avail = (double) vac_cost_limit / vac_cost_delay;
 	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
 	{
 		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-		{
-			int			limit = (int)
-			(cost_avail * worker->wi_cost_limit_base / cost_total);
-
-			/*
-			 * We put a lower bound of 1 on the cost_limit, to avoid division-
-			 * by-zero in the vacuum code.  Also, in case of roundoff trouble
-			 * in these calculations, let's be sure we don't ever set
-			 * cost_limit to more than the base value.
-			 */
-			worker->wi_cost_limit = Max(Min(limit,
-											worker->wi_cost_limit_base),
-										1);
-		}
+		if (worker->wi_proc == NULL ||
+			pg_atomic_unlocked_test_flag(&worker->wi_dobalance))
+			continue;
 
-		if (worker->wi_proc != NULL)
-			elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
-				 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
-				 worker->wi_dobalance ? "yes" : "no",
-				 worker->wi_cost_limit, worker->wi_cost_limit_base,
-				 worker->wi_cost_delay);
+		nworkers_for_balance++;
 	}
+
+	if (nworkers_for_balance != orig_nworkers_for_balance)
+		pg_atomic_write_u32(&AutoVacuumShmem->av_nworkersForBalance,
+							nworkers_for_balance);
 }
 
 /*
@@ -2422,22 +2452,39 @@ do_autovacuum(void)
 			continue;
 		}
 
-		/* Must hold AutovacuumLock while mucking with cost balance info */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+		/*
+		 * Save the cost-related table options in global variables for
+		 * reference when updating VacuumCostLimit and VacuumCostDelay during
+		 * vacuuming this table.
+		 */
+		av_relopt_cost_limit = tab->at_relopt_vac_cost_limit;
+		av_relopt_cost_delay = tab->at_relopt_vac_cost_delay;
 
-		/* advertise my cost delay parameters for the balancing algorithm */
-		MyWorkerInfo->wi_dobalance = tab->at_dobalance;
-		MyWorkerInfo->wi_cost_delay = tab->at_vacuum_cost_delay;
-		MyWorkerInfo->wi_cost_limit = tab->at_vacuum_cost_limit;
-		MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
+		/*
+		 * We only expect this worker to ever set the flag, so don't bother
+		 * checking the return value. We shouldn't have to retry.
+		 */
+		if (tab->at_dobalance)
+			pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
+		else
+			pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
 
-		/* do a balance */
-		autovac_balance_cost();
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
 
-		/* set the active cost parameters from the result of that */
+		autovac_recalculate_workers_for_balance();
+
+		/*
+		 * We wait until this point to update cost delay and cost limit
+		 * values, even though we reloaded the configuration file above, so
+		 * that we can take into account the cost-related table options.
+		 */
 		VacuumUpdateCosts();
 
-		/* done */
+		elog(DEBUG2, "VacuumUpdateCosts(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_delay=%g)",
+			 MyWorkerInfo->wi_proc->pid, MyWorkerInfo->wi_dboid, MyWorkerInfo->wi_tableoid,
+			 pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance) ? "no" : "yes",
+			 VacuumCostLimit, VacuumCostDelay);
+
 		LWLockRelease(AutovacuumLock);
 
 		/* clean up memory before each iteration */
@@ -2523,10 +2570,10 @@ deleted:
 
 		/*
 		 * Remove my info from shared memory.  We could, but intentionally
-		 * don't, clear wi_cost_limit and friends --- this is on the
-		 * assumption that we probably have more to do with similar cost
-		 * settings, so we don't want to give up our share of I/O for a very
-		 * short interval and thereby thrash the global balance.
+		 * don't, unset wi_dobalance on the assumption that we are more likely
+		 * than not to vacuum a table with no table options next, so we don't
+		 * want to give up our share of I/O for a very short interval and
+		 * thereby thrash the global balance.
 		 */
 		LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
 		MyWorkerInfo->wi_tableoid = InvalidOid;
@@ -2563,6 +2610,7 @@ deleted:
 		{
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			VacuumUpdateCosts();
 		}
 
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
@@ -2798,8 +2846,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		int			freeze_table_age;
 		int			multixact_freeze_min_age;
 		int			multixact_freeze_table_age;
-		int			vac_cost_limit;
-		double		vac_cost_delay;
 		int			log_min_duration;
 
 		/*
@@ -2809,20 +2855,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 * defaults, autovacuum's own first and plain vacuum second.
 		 */
 
-		/* -1 in autovac setting means use plain vacuum_cost_delay */
-		vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
-			? avopts->vacuum_cost_delay
-			: (autovacuum_vac_cost_delay >= 0)
-			? autovacuum_vac_cost_delay
-			: vacuum_cost_delay;
-
-		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
-		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
-			? avopts->vacuum_cost_limit
-			: (autovacuum_vac_cost_limit > 0)
-			? autovacuum_vac_cost_limit
-			: vacuum_cost_limit;
-
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
 			? avopts->log_min_duration
@@ -2878,8 +2910,10 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age;
 		tab->at_params.is_wraparound = wraparound;
 		tab->at_params.log_min_duration = log_min_duration;
-		tab->at_vacuum_cost_limit = vac_cost_limit;
-		tab->at_vacuum_cost_delay = vac_cost_delay;
+		tab->at_relopt_vac_cost_limit = avopts ?
+			avopts->vacuum_cost_limit : 0;
+		tab->at_relopt_vac_cost_delay = avopts ?
+			avopts->vacuum_cost_delay : -1;
 		tab->at_relname = NULL;
 		tab->at_nspname = NULL;
 		tab->at_datname = NULL;
@@ -3371,10 +3405,18 @@ AutoVacuumShmemInit(void)
 		worker = (WorkerInfo) ((char *) AutoVacuumShmem +
 							   MAXALIGN(sizeof(AutoVacuumShmemStruct)));
 
+
 		/* initialize the WorkerInfo free list */
 		for (i = 0; i < autovacuum_max_workers; i++)
+		{
 			dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 							&worker[i].wi_links);
+
+			pg_atomic_init_flag(&worker[i].wi_dobalance);
+		}
+
+		pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
+
 	}
 	else
 		Assert(found);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 17cf58255f..ef938fb692 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -351,6 +351,7 @@ extern IndexBulkDeleteResult *vac_cleanup_one_index(IndexVacuumInfo *ivinfo,
 extern Size vac_max_items_to_alloc_size(int max_items);
 
 /* In postmaster/autovacuum.c */
+extern void AutoVacuumUpdateLimit(void);
 extern void VacuumUpdateCosts(void);
 
 /* in commands/vacuumparallel.c */
-- 
2.37.2
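
The precedence rules that the patch above implements in VacuumUpdateCosts() and AutoVacuumUpdateLimit() can be sketched as two pure functions. This is an illustrative standalone sketch, not the patch itself: the CostInputs struct and the functions effective_cost_delay() and effective_cost_limit() are hypothetical names, and the shared-memory atomics of the real code are replaced by plain fields.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bundle of the settings the patch consults. */
typedef struct CostInputs
{
	double		relopt_cost_delay;		/* table option; < 0 means not set */
	int			relopt_cost_limit;		/* table option; <= 0 means not set */
	double		autovacuum_cost_delay;	/* < 0 means use vacuum_cost_delay */
	int			autovacuum_cost_limit;	/* <= 0 means use vacuum_cost_limit */
	double		vacuum_cost_delay;
	int			vacuum_cost_limit;
	bool		do_balance;				/* worker takes part in balancing */
	int			nworkers_for_balance;	/* workers sharing the limit */
} CostInputs;

/* Delay: table option first, then the autovacuum GUC, then the plain GUC. */
double
effective_cost_delay(const CostInputs *in)
{
	if (in->relopt_cost_delay >= 0)
		return in->relopt_cost_delay;
	if (in->autovacuum_cost_delay >= 0)
		return in->autovacuum_cost_delay;
	return in->vacuum_cost_delay;
}

/*
 * Limit: a table option wins outright. Otherwise take the GUC value and,
 * if this worker is balanced, divide it evenly among the balanced workers
 * with a lower bound of 1.
 */
int
effective_cost_limit(const CostInputs *in)
{
	int			limit;

	if (in->relopt_cost_limit > 0)
		return in->relopt_cost_limit;

	limit = (in->autovacuum_cost_limit > 0) ?
		in->autovacuum_cost_limit : in->vacuum_cost_limit;

	if (!in->do_balance)
		return limit;

	assert(in->nworkers_for_balance > 0);
	limit /= in->nworkers_for_balance;
	return (limit < 1) ? 1 : limit;
}
```

With vacuum_cost_limit = 200, no autovacuum-specific limit, and three balanced workers, the sketch gives each worker a limit of 66. Setting a table-level limit bypasses the division entirely.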

v15-0002-Separate-vacuum-cost-variables-from-gucs.patch (text/x-patch; charset=US-ASCII)
From 00ddceeb3e565bc88f9691da29e01903aa69cf22 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Mon, 3 Apr 2023 11:22:18 -0400
Subject: [PATCH v15 2/3] Separate vacuum cost variables from gucs

Vacuum code run by both autovacuum workers and backends executing
VACUUM/ANALYZE previously used VacuumCostLimit and VacuumCostDelay, which
were the global variables backing the GUCs vacuum_cost_limit and
vacuum_cost_delay. Autovacuum workers needed to override these variables
with their own values, derived from autovacuum_vacuum_cost_limit,
autovacuum_vacuum_cost_delay, and the worker cost limit balancing logic.
This led to confusing code which, in some cases, both derived and set a
new value of VacuumCostLimit from VacuumCostLimit.

In preparation for refreshing these GUC values more often, separate
these variables from the GUCs themselves and add a function to update
the global variables using the GUCs and existing logic.

Per suggestion by Kyotaro Horiguchi

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/commands/vacuum.c         | 15 +++++++++--
 src/backend/commands/vacuumparallel.c |  1 +
 src/backend/postmaster/autovacuum.c   | 38 +++++++++++----------------
 src/backend/utils/init/globals.c      |  2 --
 src/backend/utils/misc/guc_tables.c   |  4 +--
 src/include/commands/vacuum.h         |  7 +++++
 src/include/miscadmin.h               |  2 --
 src/include/postmaster/autovacuum.h   |  3 ---
 8 files changed, 39 insertions(+), 33 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index cf3abb072c..b1fc7a0efc 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -71,6 +71,17 @@ int			vacuum_multixact_freeze_min_age;
 int			vacuum_multixact_freeze_table_age;
 int			vacuum_failsafe_age;
 int			vacuum_multixact_failsafe_age;
+double		vacuum_cost_delay;
+int			vacuum_cost_limit;
+
+/*
+ * Variables for cost-based vacuum delay. The defaults differ between
+ * autovacuum and vacuum. These should be overridden with the appropriate GUC
+ * value in vacuum code. These are initialized here to the defaults for client
+ * backends executing VACUUM or ANALYZE.
+ */
+int			VacuumCostLimit = 200;
+double		VacuumCostDelay = 0;
 
 /*
  * VacuumFailsafeActive is defined as a global so that we can determine
@@ -501,6 +512,7 @@ vacuum(List *relations, VacuumParams *params,
 	{
 		ListCell   *cur;
 
+		VacuumUpdateCosts();
 		in_vacuum = true;
 		VacuumCostActive = (VacuumCostDelay > 0);
 		VacuumCostBalance = 0;
@@ -2261,8 +2273,7 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		/* update balance values for workers */
-		AutoVacuumUpdateDelay();
+		VacuumUpdateCosts();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 563117a8f6..0b59c922e4 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -996,6 +996,7 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 
 	/* Set cost-based vacuum delay */
 	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumUpdateCosts();
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 585d28148c..ce7e009576 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1774,17 +1774,25 @@ FreeWorkerInfo(int code, Datum arg)
 }
 
 /*
- * Update the cost-based delay parameters, so that multiple workers consume
- * each a fraction of the total available I/O.
+ * Update vacuum cost-based delay-related parameters for autovacuum workers and
+ * backends executing VACUUM or ANALYZE using the value of relevant gucs and
+ * global state. This must be called during setup for vacuum and after every
+ * config reload to ensure up-to-date values.
  */
 void
-AutoVacuumUpdateDelay(void)
+VacuumUpdateCosts(void)
 {
 	if (MyWorkerInfo)
 	{
 		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
 		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
 	}
+	else
+	{
+		/* Must be explicit VACUUM or ANALYZE */
+		VacuumCostLimit = vacuum_cost_limit;
+		VacuumCostDelay = vacuum_cost_delay;
+	}
 }
 
 /*
@@ -1805,9 +1813,9 @@ autovac_balance_cost(void)
 	 * zero is not a valid value.
 	 */
 	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : VacuumCostLimit);
+								  autovacuum_vac_cost_limit : vacuum_cost_limit);
 	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : VacuumCostDelay);
+								  autovacuum_vac_cost_delay : vacuum_cost_delay);
 	double		cost_total;
 	double		cost_avail;
 	dlist_iter	iter;
@@ -2312,8 +2320,6 @@ do_autovacuum(void)
 		autovac_table *tab;
 		bool		isshared;
 		bool		skipit;
-		double		stdVacuumCostDelay;
-		int			stdVacuumCostLimit;
 		dlist_iter	iter;
 
 		CHECK_FOR_INTERRUPTS();
@@ -2416,14 +2422,6 @@ do_autovacuum(void)
 			continue;
 		}
 
-		/*
-		 * Remember the prevailing values of the vacuum cost GUCs.  We have to
-		 * restore these at the bottom of the loop, else we'll compute wrong
-		 * values in the next iteration of autovac_balance_cost().
-		 */
-		stdVacuumCostDelay = VacuumCostDelay;
-		stdVacuumCostLimit = VacuumCostLimit;
-
 		/* Must hold AutovacuumLock while mucking with cost balance info */
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 
@@ -2437,7 +2435,7 @@ do_autovacuum(void)
 		autovac_balance_cost();
 
 		/* set the active cost parameters from the result of that */
-		AutoVacuumUpdateDelay();
+		VacuumUpdateCosts();
 
 		/* done */
 		LWLockRelease(AutovacuumLock);
@@ -2534,10 +2532,6 @@ deleted:
 		MyWorkerInfo->wi_tableoid = InvalidOid;
 		MyWorkerInfo->wi_sharedrel = false;
 		LWLockRelease(AutovacuumScheduleLock);
-
-		/* restore vacuum cost GUCs for the next iteration */
-		VacuumCostDelay = stdVacuumCostDelay;
-		VacuumCostLimit = stdVacuumCostLimit;
 	}
 
 	/*
@@ -2820,14 +2814,14 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			? avopts->vacuum_cost_delay
 			: (autovacuum_vac_cost_delay >= 0)
 			? autovacuum_vac_cost_delay
-			: VacuumCostDelay;
+			: vacuum_cost_delay;
 
 		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
 		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
 			? avopts->vacuum_cost_limit
 			: (autovacuum_vac_cost_limit > 0)
 			? autovacuum_vac_cost_limit
-			: VacuumCostLimit;
+			: vacuum_cost_limit;
 
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 1b1d814254..8e5b065e8f 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -142,8 +142,6 @@ int			MaxBackends = 0;
 int			VacuumCostPageHit = 1;	/* GUC parameters for vacuum */
 int			VacuumCostPageMiss = 2;
 int			VacuumCostPageDirty = 20;
-int			VacuumCostLimit = 200;
-double		VacuumCostDelay = 0;
 
 int64		VacuumPageHit = 0;
 int64		VacuumPageMiss = 0;
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8062589efd..77db1a146c 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2409,7 +2409,7 @@ struct config_int ConfigureNamesInt[] =
 			gettext_noop("Vacuum cost amount available before napping."),
 			NULL
 		},
-		&VacuumCostLimit,
+		&vacuum_cost_limit,
 		200, 1, 10000,
 		NULL, NULL, NULL
 	},
@@ -3701,7 +3701,7 @@ struct config_real ConfigureNamesReal[] =
 			NULL,
 			GUC_UNIT_MS
 		},
-		&VacuumCostDelay,
+		&vacuum_cost_delay,
 		0, 0, 100,
 		NULL, NULL, NULL
 	},
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 7219c6ba9c..17cf58255f 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,8 @@ extern PGDLLIMPORT int vacuum_multixact_freeze_min_age;
 extern PGDLLIMPORT int vacuum_multixact_freeze_table_age;
 extern PGDLLIMPORT int vacuum_failsafe_age;
 extern PGDLLIMPORT int vacuum_multixact_failsafe_age;
+extern PGDLLIMPORT double vacuum_cost_delay;
+extern PGDLLIMPORT int vacuum_cost_limit;
 
 /* Variables for cost-based parallel vacuum */
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
@@ -307,6 +309,8 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
 
 extern PGDLLIMPORT bool VacuumFailsafeActive;
+extern PGDLLIMPORT int	VacuumCostLimit;
+extern PGDLLIMPORT double VacuumCostDelay;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
@@ -346,6 +350,9 @@ extern IndexBulkDeleteResult *vac_cleanup_one_index(IndexVacuumInfo *ivinfo,
 													IndexBulkDeleteResult *istat);
 extern Size vac_max_items_to_alloc_size(int max_items);
 
+/* In postmaster/autovacuum.c */
+extern void VacuumUpdateCosts(void);
+
 /* in commands/vacuumparallel.c */
 extern ParallelVacuumState *parallel_vacuum_init(Relation rel, Relation *indrels,
 												 int nindexes, int nrequested_workers,
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 06a86f9ac1..66db1b2c69 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -266,8 +266,6 @@ extern PGDLLIMPORT int max_parallel_maintenance_workers;
 extern PGDLLIMPORT int VacuumCostPageHit;
 extern PGDLLIMPORT int VacuumCostPageMiss;
 extern PGDLLIMPORT int VacuumCostPageDirty;
-extern PGDLLIMPORT int VacuumCostLimit;
-extern PGDLLIMPORT double VacuumCostDelay;
 
 extern PGDLLIMPORT int64 VacuumPageHit;
 extern PGDLLIMPORT int64 VacuumPageMiss;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index c140371b51..65afd1ea1e 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -63,9 +63,6 @@ extern int	StartAutoVacWorker(void);
 /* called from postmaster when a worker could not be forked */
 extern void AutoVacWorkerFailed(void);
 
-/* autovacuum cost-delay balancer */
-extern void AutoVacuumUpdateDelay(void);
-
 #ifdef EXEC_BACKEND
 extern void AutoVacLauncherMain(int argc, char *argv[]) pg_attribute_noreturn();
 extern void AutoVacWorkerMain(int argc, char *argv[]) pg_attribute_noreturn();
-- 
2.37.2

#51Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Melanie Plageman (#50)
Re: Should vacuum process config file reload more often

On Wed, Apr 5, 2023 at 5:05 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Tue, Apr 4, 2023 at 4:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

---
- if (worker->wi_proc != NULL)
- elog(DEBUG2, "autovac_balance_cost(pid=%d
db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d,
cost_delay=%g)",
- worker->wi_proc->pid,
worker->wi_dboid, worker->wi_tableoid,
- worker->wi_dobalance ? "yes" : "no",
- worker->wi_cost_limit,
worker->wi_cost_limit_base,
- worker->wi_cost_delay);

I think it's better to keep this kind of log in some form for
debugging. For example, we can show these values of autovacuum workers
in VacuumUpdateCosts().

I added a message to do_autovacuum() after calling VacuumUpdateCosts()
in the loop vacuuming each table. That means it will happen once per
table. It's not ideal that I had to move the call to VacuumUpdateCosts()
behind the shared lock in that loop so that we could access the pid and
such in the logging message after updating the cost and delay, but it is
probably okay. Though no one is going to be changing those at this
point, it still seemed better to access them under the lock.

This does mean we won't log anything when we do change the values of
VacuumCostDelay and VacuumCostLimit while vacuuming a table. Is it worth
adding some code to do that in VacuumUpdateCosts() (only when the value
has changed, not on every call to VacuumUpdateCosts())? Or perhaps we
could add it in the config reload branch that is already in
vacuum_delay_point()?

Previously, we used to show the pid in the log since a worker/launcher
set other workers' delay costs. But now that the worker sets its delay
costs, we don't need to show the pid in the log. Also, I think it's
useful for debugging and investigating the system if we log it when
changing the values. The log message I had in mind was something like:

@@ -1801,6 +1801,13 @@ VacuumUpdateCosts(void)
VacuumCostDelay = vacuum_cost_delay;

        AutoVacuumUpdateLimit();
+
+       elog(DEBUG2, "autovacuum update costs (db=%u, rel=%u, dobalance=%s, cost_limit=%d, cost_delay=%g active=%s failsafe=%s)",
+            MyWorkerInfo->wi_dboid, MyWorkerInfo->wi_tableoid,
+            pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance) ? "no" : "yes",
+            VacuumCostLimit, VacuumCostDelay,
+            VacuumCostDelay > 0 ? "yes" : "no",
+            VacuumFailsafeActive ? "yes" : "no");
    }
    else
    {

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#52Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Melanie Plageman (#50)
Re: Should vacuum process config file reload more often

Hi.

About 0001:

+ * VacuumFailsafeActive is defined as a global so that we can determine
+ * whether or not to re-enable cost-based vacuum delay when vacuuming a table.
+ * If failsafe mode has been engaged, we will not re-enable cost-based delay
+ * for the table until after vacuuming has completed, regardless of other
+ * settings. Only VACUUM code should inspect this variable and only table
+ * access methods should set it. In Table AM-agnostic VACUUM code, this
+ * variable controls whether or not to allow cost-based delays. Table AMs are
+ * free to use it if they desire this behavior.
+ */
+bool		VacuumFailsafeActive = false;

If I understand this correctly, there seems to be an issue. The
AM-agnostic VACUUM code is setting it and no table AMs actually do
that.

0003:
+
+			/*
+			 * Ensure VacuumFailsafeActive has been reset before vacuuming the
+			 * next relation.
+			 */
+			VacuumFailsafeActive = false;
 		}
 	}
 	PG_FINALLY();
 	{
 		in_vacuum = false;
 		VacuumCostActive = false;
+		VacuumFailsafeActive = false;
+		VacuumCostBalance = 0;

There is no need to reset VacuumFailsafeActive in the PG_TRY() block.

+	/*
+	 * Reload the configuration file if requested. This allows changes to
+	 * autovacuum_vacuum_cost_limit and autovacuum_vacuum_cost_delay to take
+	 * effect while a table is being vacuumed or analyzed.
+	 */
+	if (ConfigReloadPending && IsAutoVacuumWorkerProcess())
+	{
+		ConfigReloadPending = false;
+		ProcessConfigFile(PGC_SIGHUP);
+		VacuumUpdateCosts();
+	}

I believe we should prevent unnecessary reloading when
VacuumFailsafeActive is true.

+ AutoVacuumUpdateLimit();

I'm not entirely sure, but it might be better to name this
AutoVacuumUpdateCostLimit().

+	pg_atomic_flag wi_dobalance;
...
+		/*
+		 * We only expect this worker to ever set the flag, so don't bother
+		 * checking the return value. We shouldn't have to retry.
+		 */
+		if (tab->at_dobalance)
+			pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
+		else
+			pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);

LWLockAcquire(AutovacuumLock, LW_SHARED);

autovac_recalculate_workers_for_balance();

I don't see the need for using an atomic here. The code is executed
infrequently and we already take a lock while counting do_balance
workers. So sticking with the old locking method (taking LW_EXCLUSIVE,
then setting wi_dobalance, then doing the balance) should be fine.

+void
+AutoVacuumUpdateLimit(void)
...
+	if (av_relopt_cost_limit > 0)
+		VacuumCostLimit = av_relopt_cost_limit;
+	else

I think we should use wi_dobalance to decide whether we need to balance
or not. We don't need to take a lock to do that since only this process
updates it.

/*
 		 * Remove my info from shared memory.  We could, but intentionally
-		 * don't, clear wi_cost_limit and friends --- this is on the
-		 * assumption that we probably have more to do with similar cost
-		 * settings, so we don't want to give up our share of I/O for a very
-		 * short interval and thereby thrash the global balance.
+		 * don't, unset wi_dobalance on the assumption that we are more likely
+		 * than not to vacuum a table with no table options next, so we don't
+		 * want to give up our share of I/O for a very short interval and
+		 * thereby thrash the global balance.
 		 */
 		LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
 		MyWorkerInfo->wi_tableoid = InvalidOid;

The comment mentions wi_dobalance, but it doesn't appear here.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#53Daniel Gustafsson
daniel@yesql.se
In reply to: Melanie Plageman (#50)
Re: Should vacuum process config file reload more often

On 4 Apr 2023, at 22:04, Melanie Plageman <melanieplageman@gmail.com> wrote:

On Tue, Apr 4, 2023 at 4:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Also, I don't think there is any reason why we want to exclude only
the autovacuum launcher.

My rationale is that the launcher is the only other process type which
might reasonably be executing this code besides autovac workers, client
backends doing VACUUM/ANALYZE, and parallel vacuum workers. Is it
confusing to have the launcher have VacuumCostLimit and VacuumCostDelay
set to the guc values for explicit VACUUM and ANALYZE -- even if the
launcher doesn't use these variables?

I've removed the check, because I do agree with you that it may be
unnecessarily confusing in the code.

+1

On Tue, Apr 4, 2023 at 9:36 AM Daniel Gustafsson <daniel@yesql.se> wrote:

On 4 Apr 2023, at 00:35, Melanie Plageman <melanieplageman@gmail.com> wrote:

Thinking more on this I'm leaning towards going with allowing more frequent
reloads in autovacuum, and saving the same for VACUUM for more careful study.
The general case is probably fine but I'm not convinced that there aren't error
cases which can present unpleasant scenarios.

In attached v15, I've dropped support for VACUUM and non-nested ANALYZE.
It is like a 5 line change and could be added back at any time.

I think that's the best option for now.

+extern int VacuumCostLimit;
+extern double VacuumCostDelay;
...
-extern PGDLLIMPORT int VacuumCostLimit;
-extern PGDLLIMPORT double VacuumCostDelay;

Same with these, I don't think this is according to our default visibility.
Moreover, I'm not sure it's a good idea to perform this rename. This will keep
VacuumCostLimit and VacuumCostDelay exported, but change their meaning. Any
external code referring to these thinking they are backing the GUCs will still
compile, but may be broken in subtle ways. Is there a reason for not keeping
the current GUC variables and instead add net new ones?

When VacuumCostLimit was the same variable in the code and for the GUC
vacuum_cost_limit, every time we reload the config file, VacuumCostLimit
is overwritten. Autovacuum workers have to overwrite this value with the
appropriate one for themselves given the balancing logic and the value
of autovacuum_vacuum_cost_limit. However, the problem is, because you
can specify -1 for autovacuum_vacuum_cost_limit to indicate it should
fall back to vacuum_cost_limit, we have to reference the value of
VacuumCostLimit when calculating the new autovacuum worker's cost limit
after a config reload.

But, you have to be sure you *only* do this after a config reload when
the value of VacuumCostLimit is fresh and unmodified or you risk
dividing the value of VacuumCostLimit over and over. That means it is
unsafe to call functions updating the cost limit more than once.

This orchestration wasn't as difficult when we only reloaded the config
file once every table. We were careful about it and also kept the
original "base" cost limit around from table_recheck_autovac(). However,
once we started reloading the config file more often, this no longer
works.

By separating the variables modified when the gucs are set and the ones
used in the code, we can make sure we always have the original value the
guc was set to in vacuum_cost_limit and autovacuum_vacuum_cost_limit,
whenever we need to reference it.
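
To make the repeated-division hazard described above concrete, here is a
minimal standalone sketch. It is not code from the patch: the names
guc_vacuum_cost_limit, guc_autovacuum_cost_limit, working_cost_limit and
both helper functions are hypothetical stand-ins, and clamping the result
to at least 1 is an assumption made only for the illustration.

```c
#include <assert.h>

/* Hypothetical stand-ins for the GUC-backing variables. Only a config
 * reload would write these; the balancing code never modifies them. */
static int guc_vacuum_cost_limit = 200;
static int guc_autovacuum_cost_limit = -1;	/* -1: fall back to guc_vacuum_cost_limit */

/* Hypothetical working value consulted by the delay code. */
static int working_cost_limit = 200;

/* Unsafe variant: balances the working value in place, so every extra
 * call divides an already-divided value again. */
static void
balance_in_place(int nworkers)
{
	assert(nworkers > 0);
	working_cost_limit = working_cost_limit / nworkers;
	if (working_cost_limit < 1)
		working_cost_limit = 1;	/* assumed lower bound for this sketch */
}

/* Safe variant: always recomputes from the untouched GUC values, so it
 * can be called any number of times with the same result. */
static void
balance_from_guc(int nworkers)
{
	int			base;

	assert(nworkers > 0);
	base = (guc_autovacuum_cost_limit > 0)
		? guc_autovacuum_cost_limit
		: guc_vacuum_cost_limit;
	working_cost_limit = base / nworkers;
	if (working_cost_limit < 1)
		working_cost_limit = 1;	/* assumed lower bound for this sketch */
}
```

With four workers, calling balance_in_place() twice shrinks the limit a
second time, while balance_from_guc() gives the same answer on every call.
That idempotence is what makes it safe to call from a frequently executed
path.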

That being said, perhaps we should document what extensions should do?
Do you think they will want to use the variables backing the gucs or to
be able to overwrite the variables being used in the code?

I think I wasn't clear in my comment, sorry. I don't have a problem with
introducing a new variable to split the balanced value from the GUC value.
What I don't think we should do is repurpose an exported symbol into doing a
new thing. In the case at hand I think VacuumCostLimit and VacuumCostDelay
should remain the backing variables for the GUCs, with vacuum_cost_limit and
vacuum_cost_delay carrying the balanced values. So the inverse of what is in
the patch now.

The risk of these symbols being used in extensions might be very low but on
principle it seems unwise to alter a symbol and risk subtle breakage.

Oh, also I've annotated these with PGDLLIMPORT too.

+ * TODO: should VacuumCostLimit and VacuumCostDelay be initialized to valid or
+ * invalid values?
+ */
+int                    VacuumCostLimit = 0;
+double         VacuumCostDelay = -1;

I think the important part is to make sure they are never accessed without
VacuumUpdateCosts having been called first. I think that's the case here, but
it's not entirely clear. Do you see a codepath where that could happen? If
they are initialized to a sentinel value we also need to check for that, so
initializing to the defaults from the corresponding GUCs seems better.

I don't see a case where autovacuum could access these without calling
VacuumUpdateCosts() first. I think the other callers of
vacuum_delay_point() are the issue (gist/gin/hash/etc).

It might need a bit more thought.

My concern was that these variables correspond to multiple GUCs each
depending on the backend type, and those backends have different
defaults (e.g. the default cost delay for autovac workers differs from
that of a client backend doing vacuum).

However, what I have done in this version is initialize them to the
defaults for a client backend executing VACUUM or ANALYZE, since I am
fairly confident that autovacuum will not use them without calling
VacuumUpdateCosts().

Another question along these lines, we only call AutoVacuumUpdateLimit() in
case there is a sleep in vacuum_delay_point():

+       /*
+        * Balance and update limit values for autovacuum workers. We must
+        * always do this in case the autovacuum launcher or another
+        * autovacuum worker has recalculated the number of workers across
+        * which we must balance the limit. This is done by the launcher when
+        * launching a new worker and by workers before vacuuming each table.
+        */
+       AutoVacuumUpdateLimit();

Shouldn't we always call that in case we had a config reload, or am I being
thick?

+static double av_relopt_cost_delay = -1;
+static int av_relopt_cost_limit = 0;

Sorry, I didn't catch this earlier, shouldn't this be -1 to match the default
value of autovacuum_vacuum_cost_limit?

These need a comment IMO, ideally one that explains why they are initialized to
those values.

I've added a comment.

+ * Variables to save the cost-related table options for the current relation

The "table options" nomenclature is right now only used for FDW foreign table
options, I think we should use "storage parameters" or "relation options" here.

+       /* There is at least 1 autovac worker (this worker). */
+       Assert(nworkers_for_balance > 0);

Is there a scenario where this is expected to fail? If so I think this should
be handled and not just an Assert.

No, this isn't expected to happen because an autovacuum worker would
have called autovac_recalculate_workers_for_balance() before calling
VacuumUpdateCosts() (which calls AutoVacuumUpdateLimit()) in
do_autovacuum(). But, if someone were to move around or add a call to
VacuumUpdateCosts() there is a chance it could happen.

Thinking more on this I'm tempted to recommend that we promote this to an
elog(), mainly due to the latter. An accidental call to VacuumUpdateCosts()
doesn't seem entirely unlikely to happen.

--
Daniel Gustafsson

#54Melanie Plageman
melanieplageman@gmail.com
In reply to: Daniel Gustafsson (#53)
3 attachment(s)
Re: Should vacuum process config file reload more often

Thanks all for the reviews.

v16 attached. I put it together rather quickly, so there might be a few
spurious whitespaces or similar. There is one rather annoying pgindent
outlier that I have to figure out what to do about as well.

The remaining functional TODOs that I know of are:

- Resolve what to do about names of GUC and vacuum variables for cost
limit and cost delay (since it may affect extensions)

- Figure out what to do about the logging message which accesses dboid
and tableoid (lock/no lock, where to put it, etc)

- I see several places in docs which reference the balancing algorithm
for autovac workers. I did not read them in great detail, but we may
want to review them to see if any require updates.

- Consider whether or not the initial two commits should just be
squashed with the third commit

- Anything else reviewers are still unhappy with

On Wed, Apr 5, 2023 at 1:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Wed, Apr 5, 2023 at 5:05 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Tue, Apr 4, 2023 at 4:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

---
- if (worker->wi_proc != NULL)
- elog(DEBUG2, "autovac_balance_cost(pid=%d
db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d,
cost_delay=%g)",
- worker->wi_proc->pid,
worker->wi_dboid, worker->wi_tableoid,
- worker->wi_dobalance ? "yes" : "no",
- worker->wi_cost_limit,
worker->wi_cost_limit_base,
- worker->wi_cost_delay);

I think it's better to keep this kind of log in some form for
debugging. For example, we can show these values of autovacuum workers
in VacuumUpdateCosts().

I added a message to do_autovacuum() after calling VacuumUpdateCosts()
in the loop vacuuming each table. That means it will happen once per
table. It's not ideal that I had to move the call to VacuumUpdateCosts()
behind the shared lock in that loop so that we could access the pid and
such in the logging message after updating the cost and delay, but it is
probably okay. Though no one is going to be changing those at this
point, it still seemed better to access them under the lock.

This does mean we won't log anything when we do change the values of
VacuumCostDelay and VacuumCostLimit while vacuuming a table. Is it worth
adding some code to do that in VacuumUpdateCosts() (only when the value
has changed, not on every call to VacuumUpdateCosts())? Or perhaps we
could add it in the config reload branch that is already in
vacuum_delay_point()?

Previously, we used to show the pid in the log since a worker/launcher
set other workers' delay costs. But now that the worker sets its delay
costs, we don't need to show the pid in the log. Also, I think it's
useful for debugging and investigating the system if we log it when
changing the values. The log message I had in mind was something like:

@@ -1801,6 +1801,13 @@ VacuumUpdateCosts(void)
VacuumCostDelay = vacuum_cost_delay;

AutoVacuumUpdateLimit();
+
+       elog(DEBUG2, "autovacuum update costs (db=%u, rel=%u, dobalance=%s, cost_limit=%d, cost_delay=%g active=%s failsafe=%s)",
+            MyWorkerInfo->wi_dboid, MyWorkerInfo->wi_tableoid,
+            pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance) ? "no" : "yes",
+            VacuumCostLimit, VacuumCostDelay,
+            VacuumCostDelay > 0 ? "yes" : "no",
+            VacuumFailsafeActive ? "yes" : "no");
}
else
{

Makes sense. I've updated the log message to roughly what you suggested.
I also realized I think it does make sense to call it in
VacuumUpdateCosts() -- only for autovacuum workers of course. I've done
this. I haven't taken the lock though and can't decide if I must since
they access dboid and tableoid -- those are not going to change at this
point, but I still don't know if I can access them lock-free...
Perhaps there is a way to condition it on the log level?

If I have to take a lock, then I don't know if we should put these in
VacuumUpdateCosts()...

On Wed, Apr 5, 2023 at 3:16 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

About 0001:

+ * VacuumFailsafeActive is defined as a global so that we can determine
+ * whether or not to re-enable cost-based vacuum delay when vacuuming a table.
+ * If failsafe mode has been engaged, we will not re-enable cost-based delay
+ * for the table until after vacuuming has completed, regardless of other
+ * settings. Only VACUUM code should inspect this variable and only table
+ * access methods should set it. In Table AM-agnostic VACUUM code, this
+ * variable controls whether or not to allow cost-based delays. Table AMs are
+ * free to use it if they desire this behavior.
+ */
+bool           VacuumFailsafeActive = false;

If I understand this correctly, there seems to be an issue. The
AM-agnostic VACUUM code is setting it and no table AMs actually do
that.

No, it is not set in table AM-agnostic VACUUM code. I meant it is
used/read from/inspected in table AM-agnostic VACUUM code. Table AMs can
set it if they want to avoid cost-based delays being re-enabled. It is
only set to true in heap-specific code and is initialized to false and
reset in table AM-agnostic code back to false in between each relation
being vacuumed. I updated the comment to reflect this. Let me know if
you think it is clear.

0003:
+
+                       /*
+                        * Ensure VacuumFailsafeActive has been reset before vacuuming the
+                        * next relation.
+                        */
+                       VacuumFailsafeActive = false;
}
}
PG_FINALLY();
{
in_vacuum = false;
VacuumCostActive = false;
+               VacuumFailsafeActive = false;
+               VacuumCostBalance = 0;

There is no need to reset VacuumFailsafeActive in the PG_TRY() block.

I think that is true -- since it is initialized to false and reset to
false after vacuuming every relation. However, I am leaning toward
keeping it because I haven't thought through every codepath and
determined if there is ever a way where it could be true here.

+       /*
+        * Reload the configuration file if requested. This allows changes to
+        * autovacuum_vacuum_cost_limit and autovacuum_vacuum_cost_delay to take
+        * effect while a table is being vacuumed or analyzed.
+        */
+       if (ConfigReloadPending && IsAutoVacuumWorkerProcess())
+       {
+               ConfigReloadPending = false;
+               ProcessConfigFile(PGC_SIGHUP);
+               VacuumUpdateCosts();
+       }

I believe we should prevent unnecessary reloading when
VacuumFailsafeActive is true.

This is in conflict with two of the other reviewers' feedback:
Sawada-san:

+         * Reload the configuration file if requested. This allows changes to
+         * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay to
+         * take effect while a table is being vacuumed or analyzed.
+         */
+        if (ConfigReloadPending && !analyze_in_outer_xact)
+        {
+                ConfigReloadPending = false;
+                ProcessConfigFile(PGC_SIGHUP);
+                AutoVacuumUpdateDelay();
+                AutoVacuumUpdateLimit();
+        }

It makes sense to me that we need to reload the config file even when
vacuum-delay is disabled. But I think it's not convenient for users
that we don't reload the configuration file once the failsafe is
triggered. I think users might want to change some GUCs such as
log_autovacuum_min_duration.

and Daniel in response to this:

It makes sense to me that we need to reload the config file even when
vacuum-delay is disabled. But I think it's not convenient for users
that we don't reload the configuration file once the failsafe is
triggered. I think users might want to change some GUCs such as
log_autovacuum_min_duration.

I agree with this.
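
The behavior the reviewers are asking for (reload the config when
requested, even after the failsafe has switched off cost-based delays) can
be sketched roughly as follows. This is a simplified standalone
illustration, not the patch: reload_pending, failsafe_active,
reload_count, request_reload(), reload_config() and delay_point() are all
hypothetical stand-ins for ConfigReloadPending, VacuumFailsafeActive,
ProcessConfigFile() and vacuum_delay_point().

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical stand-in for ConfigReloadPending; set from a signal handler. */
static volatile sig_atomic_t reload_pending = 0;

/* Hypothetical stand-in for VacuumFailsafeActive. */
static bool failsafe_active = false;

/* Counts reloads so the effect is observable in this sketch. */
static int	reload_count = 0;

/* Signal-handler-style function: only records that a reload was requested. */
static void
request_reload(int signo)
{
	(void) signo;
	reload_pending = 1;
}

/* Stand-in for re-reading the config file and recomputing cost values. */
static void
reload_config(void)
{
	reload_count++;
}

/* Called frequently from the vacuum loop. */
static void
delay_point(bool is_autovac_worker)
{
	/* Honor a pending reload first, regardless of failsafe state. */
	if (reload_pending && is_autovac_worker)
	{
		reload_pending = 0;
		reload_config();
	}

	/* Once the failsafe has triggered, skip cost-based sleeping. */
	if (failsafe_active)
		return;

	/* The cost-based sleep would go here. */
}
```

Because the reload check comes before the failsafe check, a change to an
unrelated setting such as log_autovacuum_min_duration would still be
picked up while the failsafe is active.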

+ AutoVacuumUpdateLimit();

I'm not entirely sure, but it might be better to name this
AutoVacuumUpdateCostLimit().

I have made this change.

+       pg_atomic_flag wi_dobalance;
...
+               /*
+                * We only expect this worker to ever set the flag, so don't bother
+                * checking the return value. We shouldn't have to retry.
+                */
+               if (tab->at_dobalance)
+                       pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
+               else
+                       pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);

LWLockAcquire(AutovacuumLock, LW_SHARED);

autovac_recalculate_workers_for_balance();

I don't see the need for using an atomic here. The code is executed
infrequently and we already take a lock while counting do_balance
workers. So sticking with the old locking method (taking LW_EXCLUSIVE,
then setting wi_dobalance, then doing the balance) should be fine.

We access wi_dobalance on every call to AutoVacuumUpdateLimit() which is
executed in vacuum_delay_point(). I do not think we can justify taking a
shared lock in a function that is called so frequently.
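
As a rough illustration of the trade-off, the single-writer flag pattern
looks like this in portable C11. This sketch uses atomic_bool from
<stdatomic.h> rather than PostgreSQL's pg_atomic_flag, and WorkerSlot,
worker_set_dobalance() and worker_wants_balance() are hypothetical names,
so treat it as an analogy rather than the patch's code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-worker slot living in shared memory. */
typedef struct WorkerSlot
{
	atomic_bool dobalance;		/* written only by the owning worker */
} WorkerSlot;

/* Owning worker: publish whether it takes part in balancing.
 * Executed rarely (once per table in the scenario discussed). */
static void
worker_set_dobalance(WorkerSlot *slot, bool dobalance)
{
	assert(slot != NULL);
	atomic_store(&slot->dobalance, dobalance);
}

/* Hot path: read the flag without taking any lock. */
static bool
worker_wants_balance(WorkerSlot *slot)
{
	assert(slot != NULL);
	return atomic_load(&slot->dobalance);
}
```

One difference worth keeping in mind: pg_atomic_unlocked_test_flag()
returns true when the flag is not set, which appears to be why the
proposed log message maps a true result to "no".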

+void
+AutoVacuumUpdateLimit(void)
...
+       if (av_relopt_cost_limit > 0)
+               VacuumCostLimit = av_relopt_cost_limit;
+       else

I think we should use wi_dobalance to decide whether we need to balance
or not. We don't need to take a lock to do that since only this process
updates it.

We do do that below in the "else" before balancing. But we for sure
don't need to balance if relopt for cost limit is set. We can save an
access to an atomic variable this way. I think the atomic is a
relatively cheap way of avoiding this whole locking question.

/*
* Remove my info from shared memory.  We could, but intentionally
-                * don't, clear wi_cost_limit and friends --- this is on the
-                * assumption that we probably have more to do with similar cost
-                * settings, so we don't want to give up our share of I/O for a very
-                * short interval and thereby thrash the global balance.
+                * don't, unset wi_dobalance on the assumption that we are more likely
+                * than not to vacuum a table with no table options next, so we don't
+                * want to give up our share of I/O for a very short interval and
+                * thereby thrash the global balance.
*/
LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
MyWorkerInfo->wi_tableoid = InvalidOid;

The comment mentions wi_dobalance, but it doesn't appear here.

The point of the comment is that we don't do anything with wi_dobalance
here. It is explaining why it doesn't appear. The previous comment
mentioned not doing anything with wi_cost_delay and wi_cost_limit which
also didn't appear here.

On Wed, Apr 5, 2023 at 9:10 AM Daniel Gustafsson <daniel@yesql.se> wrote:

On 4 Apr 2023, at 22:04, Melanie Plageman <melanieplageman@gmail.com> wrote:

+extern int VacuumCostLimit;
+extern double VacuumCostDelay;
...
-extern PGDLLIMPORT int VacuumCostLimit;
-extern PGDLLIMPORT double VacuumCostDelay;

Same with these, I don't think this is according to our default visibility.
Moreover, I'm not sure it's a good idea to perform this rename. This will keep
VacuumCostLimit and VacuumCostDelay exported, but change their meaning. Any
external code referring to these thinking they are backing the GUCs will still
compile, but may be broken in subtle ways. Is there a reason for not keeping
the current GUC variables and instead add net new ones?

When VacuumCostLimit was the same variable in the code and for the GUC
vacuum_cost_limit, every time we reload the config file, VacuumCostLimit
is overwritten. Autovacuum workers have to overwrite this value with the
appropriate one for themselves given the balancing logic and the value
of autovacuum_vacuum_cost_limit. However, the problem is, because you
can specify -1 for autovacuum_vacuum_cost_limit to indicate it should
fall back to vacuum_cost_limit, we have to reference the value of
VacuumCostLimit when calculating the new autovacuum worker's cost limit
after a config reload.

But, you have to be sure you *only* do this after a config reload when
the value of VacuumCostLimit is fresh and unmodified or you risk
dividing the value of VacuumCostLimit over and over. That means it is
unsafe to call functions updating the cost limit more than once.

This orchestration wasn't as difficult when we only reloaded the config
file once every table. We were careful about it and also kept the
original "base" cost limit around from table_recheck_autovac(). However,
once we started reloading the config file more often, this no longer
works.

By separating the variables modified when the gucs are set and the ones
used in the code, we can make sure we always have the original value the
guc was set to in vacuum_cost_limit and autovacuum_vacuum_cost_limit,
whenever we need to reference it.

That being said, perhaps we should document what extensions should do?
Do you think they will want to use the variables backing the gucs or to
be able to overwrite the variables being used in the code?

I think I wasn't clear in my comment, sorry. I don't have a problem with
introducing a new variable to split the balanced value from the GUC value.
What I don't think we should do is repurpose an exported symbol into doing a
new thing. In the case at hand I think VacuumCostLimit and VacuumCostDelay
should remain the backing variables for the GUCs, with vacuum_cost_limit and
vacuum_cost_delay carrying the balanced values. So the inverse of what is in
the patch now.

The risk of these symbols being used in extensions might be very low but on
principle it seems unwise to alter a symbol and risk subtle breakage.

I totally see what you are saying. The only complication is that all of
the other variables used in vacuum code are camel case and the gucs
follow snake case -- as pointed out in a previous review comment by
Sawada-san:

@@ -83,6 +84,7 @@ int                   vacuum_cost_limit;
*/
int                    VacuumCostLimit = 0;
double         VacuumCostDelay = -1;
+static bool vacuum_can_reload_config = false;

In vacuum.c, we use snake case for GUC parameters and camel case for
other global variables, so it seems better to rename it
VacuumCanReloadConfig. Sorry, that's my fault.

This is less of a compelling argument than subtle breakage for extension
code, though.

I am, however, wondering if extensions expect to have access to the guc
variable or the global variable -- or both?

Left it as is in this version until we resolve the question.

Oh, also I've annotated these with PGDLLIMPORT too.

+ * TODO: should VacuumCostLimit and VacuumCostDelay be initialized to valid or
+ * invalid values?
+ */
+int                    VacuumCostLimit = 0;
+double         VacuumCostDelay = -1;

I think the important part is to make sure they are never accessed without
VacuumUpdateCosts having been called first. I think that's the case here, but
it's not entirely clear. Do you see a codepath where that could happen? If
they are initialized to a sentinel value we also need to check for that, so
initializing to the defaults from the corresponding GUCs seems better.

I don't see a case where autovacuum could access these without calling
VacuumUpdateCosts() first. I think the other callers of
vacuum_delay_point() are the issue (gist/gin/hash/etc).

It might need a bit more thought.

My concern was that these variables correspond to multiple GUCs each
depending on the backend type, and those backends have different
defaults (e.g. the default cost delay for autovac workers differs from
that of a client backend doing vacuum).

However, what I have done in this version is initialize them to the
defaults for a client backend executing VACUUM or ANALYZE, since I am
fairly confident that autovacuum will not use them without calling
VacuumUpdateCosts().
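
A minimal sketch of that initialization choice, not the patch itself: the
effective values start at the client-backend defaults and are overwritten by
an update step before use. The sketch_update_costs() name and the
is_autovac_worker flag are hypothetical stand-ins, and cost limit balancing is
left out. The default values (200 and 0 for vacuum_cost_limit and
vacuum_cost_delay, 2 ms for autovacuum_vacuum_cost_delay) are the ones
mentioned in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* GUC-backed settings (snake case), set to the defaults cited in this thread. */
static int	vacuum_cost_limit = 200;
static double vacuum_cost_delay = 0;
static double autovacuum_vac_cost_delay = 2;	/* ms */

/* Effective values (camel case), initialized to the client-backend defaults. */
static int	VacuumCostLimit = 200;
static double VacuumCostDelay = 0;

/*
 * Hypothetical stand-in for VacuumUpdateCosts(): refresh the effective values
 * from whichever settings apply to this backend type. Limit balancing across
 * workers is deliberately omitted here.
 */
static void
sketch_update_costs(bool is_autovac_worker)
{
	if (is_autovac_worker && autovacuum_vac_cost_delay >= 0)
		VacuumCostDelay = autovacuum_vac_cost_delay;
	else
		VacuumCostDelay = vacuum_cost_delay;

	VacuumCostLimit = vacuum_cost_limit;
}
```

With this shape, a code path that reads the effective values before any update
still sees valid client-backend settings rather than a sentinel it would have
to check for.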

Another question along these lines: we only call AutoVacuumUpdateLimit() in
case there is a sleep in vacuum_delay_point():

+       /*
+        * Balance and update limit values for autovacuum workers. We must
+        * always do this in case the autovacuum launcher or another
+        * autovacuum worker has recalculated the number of workers across
+        * which we must balance the limit. This is done by the launcher when
+        * launching a new worker and by workers before vacuuming each table.
+        */
+       AutoVacuumUpdateLimit();

Shouldn't we always call that in case we had a config reload, or am I being
thick?

We actually also call it from inside VacuumUpdateCosts(), which is
always called in the case of a config reload.
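
To make the two call paths concrete, here is a rough sketch under simplifying
assumptions. All flow_* names are hypothetical stand-ins for the real
functions, the counter exists only so the behavior can be observed, and
will_sleep models "the cost balance reached the limit on this call".

```c
#include <assert.h>
#include <stdbool.h>

static bool flow_reload_pending = false;	/* stand-in for ConfigReloadPending */
static int	flow_limit_updates = 0;			/* counts limit recalculations */

/* Stand-in for AutoVacuumUpdateLimit(): just record that it ran. */
static void
flow_update_limit(void)
{
	flow_limit_updates++;
}

/* Stand-in for VacuumUpdateCosts(): it recalculates the limit itself. */
static void
flow_update_costs(void)
{
	flow_update_limit();
}

/* Stand-in for the relevant part of vacuum_delay_point(). */
static void
flow_delay_point(bool will_sleep)
{
	if (flow_reload_pending)
	{
		flow_reload_pending = false;
		flow_update_costs();	/* reload path: limit is refreshed */
	}

	if (will_sleep)
		flow_update_limit();	/* sleep path: limit is refreshed again */
}
```

In this model a reload refreshes the limit even when no sleep happens, which
is the property the question above is asking about.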

+static double av_relopt_cost_delay = -1;
+static int av_relopt_cost_limit = 0;

Sorry, I didn't catch this earlier, shouldn't this be -1 to match the default
value of autovacuum_vacuum_cost_limit?

Yea, this is a bit tricky. Initial values of -1 and 0 have the same
effect when we are referencing av_relopt_cost_limit in
AutoVacuumUpdateCostLimit(). However, I was trying to initialize both
av_relopt_cost_limit and av_relopt_cost_delay to "invalid"
values which were not the default for the associated autovacuum gucs,
since initializing av_relopt_cost_delay to the default for
autovacuum_vacuum_cost_delay (2 ms) would cause it to be used even if
storage params were not set for the relation.

I have updated the initial value to -1, as you suggested -- but I don't
know if it is more or less confusing to explain what I just explained
in the comment above it.
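
The fallback order behind this can be sketched as follows. This is an
illustration only: the two sketch_resolve_* helpers are hypothetical and take
the three candidate values as plain arguments instead of reading the real
variables.

```c
#include <assert.h>

/* Hypothetical helper: pick the effective cost delay for an autovacuum worker. */
static double
sketch_resolve_cost_delay(double storage_param_delay,
						  double autovac_delay,
						  double vacuum_delay)
{
	if (storage_param_delay >= 0)	/* storage parameter set for this relation */
		return storage_param_delay;
	if (autovac_delay >= 0)			/* autovacuum_vacuum_cost_delay is set */
		return autovac_delay;
	return vacuum_delay;			/* fall back to vacuum_cost_delay */
}

/* Hypothetical helper: same idea for the cost limit, where 0 is not valid. */
static int
sketch_resolve_cost_limit(int storage_param_limit,
						  int autovac_limit,
						  int vacuum_limit)
{
	if (storage_param_limit > 0)
		return storage_param_limit;
	if (autovac_limit > 0)
		return autovac_limit;
	return vacuum_limit;
}
```

Because the limit check is "> 0", an initial value of -1 or 0 falls through
identically. The delay check is ">= 0", so an initial value of 2 would be
taken as a storage parameter and would shadow whatever the GUC says.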

These need a comment IMO, ideally one that explains why they are initialized to
those values.

I've added a comment.

+ * Variables to save the cost-related table options for the current relation

The "table options" nomenclature is right now only used for FDW foreign table
options, I think we should use "storage parameters" or "relation options" here.

I've updated these to "storage parameters" to match the docs. I poked
around looking for other places I referred to them as table options and
tried to fix those as well. I've also changed all relevant variable
names.

+       /* There is at least 1 autovac worker (this worker). */
+       Assert(nworkers_for_balance > 0);

Is there a scenario where this is expected to fail? If so I think this should
be handled and not just an Assert.

No, this isn't expected to happen because an autovacuum worker would
have called autovac_recalculate_workers_for_balance() before calling
VacuumUpdateCosts() (which calls AutoVacuumUpdateLimit()) in
do_autovacuum(). But, if someone were to move around or add a call to
VacuumUpdateCosts() there is a chance it could happen.

Thinking more on this I'm tempted to recommend that we promote this to an
elog(), mainly due to the latter. An accidental call to VacuumUpdateCosts()
doesn't seem entirely unlikely to happen.

Makes sense. I've added a trivial elog ERROR, but I didn't spend quite
enough time thinking about what (if any) other context to include in it.
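
For illustration, the difference between an assertion and a runtime check can
be sketched like this. The helper name, the -1 return value and the message
text are hypothetical, and the even-split formula is an assumption, since the
balancing arithmetic is not quoted in this message. The real code would raise
an error through elog() rather than return a code.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: split a base cost limit across workers. Returns -1 and
 * reports the problem when the worker count is not positive, instead of
 * relying on an assertion that is compiled out of non-assert builds.
 */
static int
sketch_balance_cost_limit(int base_limit, int nworkers_for_balance)
{
	int			balanced;

	if (nworkers_for_balance <= 0)
	{
		fprintf(stderr, "unexpected number of workers for balance: %d\n",
				nworkers_for_balance);
		return -1;
	}

	/* Assumed formula: even split, never below 1. */
	balanced = base_limit / nworkers_for_balance;
	return balanced > 0 ? balanced : 1;
}
```

The runtime check also guards the division, so a misplaced call in a
production build reports the problem instead of dividing by zero.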

- Melanie

Attachments:

v16-0001-Make-vacuum-s-failsafe_active-a-global.patchtext/x-patch; charset=US-ASCII; name=v16-0001-Make-vacuum-s-failsafe_active-a-global.patchDownload
From 8e87c95522d54fdacd77acf1c5b314968d3c7e68 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Fri, 31 Mar 2023 10:38:39 -0400
Subject: [PATCH v16 1/3] Make vacuum's failsafe_active a global

While vacuuming a table in failsafe mode, VacuumCostActive should not be
re-enabled. This currently isn't a problem because vacuum cost
parameters are only refreshed in between vacuuming tables and failsafe
status is reset for every table. In preparation for allowing vacuum cost
parameters to be updated more frequently, elevate
LVRelState->failsafe_active to a global, VacuumFailsafeActive, which
will be checked when determining whether or not to re-enable vacuum
cost-related delays.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 16 +++++++---------
 src/backend/commands/vacuum.c        | 15 +++++++++++++++
 src/include/commands/vacuum.h        |  1 +
 3 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 639179aa46..2ba85bd3d6 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -153,8 +153,6 @@ typedef struct LVRelState
 	bool		aggressive;
 	/* Use visibility map to skip? (disabled by DISABLE_PAGE_SKIPPING) */
 	bool		skipwithvm;
-	/* Wraparound failsafe has been triggered? */
-	bool		failsafe_active;
 	/* Consider index vacuuming bypass optimization? */
 	bool		consider_bypass_optimization;
 
@@ -391,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
 	Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
 		   params->truncate != VACOPTVALUE_AUTO);
-	vacrel->failsafe_active = false;
+	VacuumFailsafeActive = false;
 	vacrel->consider_bypass_optimization = true;
 	vacrel->do_index_vacuuming = true;
 	vacrel->do_index_cleanup = true;
@@ -709,7 +707,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 			}
 			else
 			{
-				if (!vacrel->failsafe_active)
+				if (!VacuumFailsafeActive)
 					appendStringInfoString(&buf, _("index scan bypassed: "));
 				else
 					appendStringInfoString(&buf, _("index scan bypassed by failsafe: "));
@@ -2293,7 +2291,7 @@ lazy_vacuum(LVRelState *vacrel)
 		 * vacuuming or heap vacuuming.  This VACUUM operation won't end up
 		 * back here again.
 		 */
-		Assert(vacrel->failsafe_active);
+		Assert(VacuumFailsafeActive);
 	}
 
 	/*
@@ -2374,7 +2372,7 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	 */
 	Assert(vacrel->num_index_scans > 0 ||
 		   vacrel->dead_items->num_items == vacrel->lpdead_items);
-	Assert(allindexes || vacrel->failsafe_active);
+	Assert(allindexes || VacuumFailsafeActive);
 
 	/*
 	 * Increase and report the number of index scans.
@@ -2616,12 +2614,12 @@ static bool
 lazy_check_wraparound_failsafe(LVRelState *vacrel)
 {
 	/* Don't warn more than once per VACUUM */
-	if (vacrel->failsafe_active)
+	if (VacuumFailsafeActive)
 		return true;
 
 	if (unlikely(vacuum_xid_failsafe_check(&vacrel->cutoffs)))
 	{
-		vacrel->failsafe_active = true;
+		VacuumFailsafeActive = true;
 
 		/*
 		 * Abandon use of a buffer access strategy to allow use of all of
@@ -2820,7 +2818,7 @@ should_attempt_truncation(LVRelState *vacrel)
 {
 	BlockNumber possibly_freeable;
 
-	if (!vacrel->do_rel_truncate || vacrel->failsafe_active ||
+	if (!vacrel->do_rel_truncate || VacuumFailsafeActive ||
 		old_snapshot_threshold >= 0)
 		return false;
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index da85330ef4..c74b21fce9 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -72,6 +72,21 @@ int			vacuum_multixact_freeze_table_age;
 int			vacuum_failsafe_age;
 int			vacuum_multixact_failsafe_age;
 
+/*
+ * VacuumFailsafeActive is a defined as a global so that we can determine
+ * whether or not to re-enable cost-based vacuum delay when vacuuming a table.
+ * If failsafe mode has been engaged, we will not re-enable cost-based delay
+ * for the table until after vacuuming has completed, regardless of other
+ * settings.
+ *
+ * Only VACUUM code should inspect this variable and only table access methods
+ * should set it to true. In Table AM-agnostic VACUUM code, this variable is
+ * inspected to determine whether or not to allow cost-based delays. Table AMs
+ * are free to set it if they desire this behavior, but it is false by default
+ * and reset to false in between vacuuming each relation.
+ */
+bool		VacuumFailsafeActive = false;
+
 /*
  * Variables for cost-based parallel vacuum.  See comments atop
  * compute_parallel_delay to understand how it works.
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index bdfd96cfec..7219c6ba9c 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -306,6 +306,7 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
 
+extern PGDLLIMPORT bool VacuumFailsafeActive;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
-- 
2.37.2

v16-0002-Separate-vacuum-cost-variables-from-gucs.patchtext/x-patch; charset=US-ASCII; name=v16-0002-Separate-vacuum-cost-variables-from-gucs.patchDownload
From e6f7ceb95600149268ba87c3f62c9f549d9e2ca1 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Mon, 3 Apr 2023 11:22:18 -0400
Subject: [PATCH v16 2/3] Separate vacuum cost variables from gucs

Vacuum code run both by autovacuum workers and a backend doing
VACUUM/ANALYZE previously used VacuumCostLimit and VacuumCostDelay which
were the global variables for the gucs vacuum_cost_limit and
vacuum_cost_delay. Autovacuum workers needed to override these variables
with their own values, derived from autovacuum_vacuum_cost_limit and
autovacuum_vacuum_cost_delay and worker cost limit balancing logic. This
led to confusing code which, in some cases, both derived and set a new
value of VacuumCostLimit from VacuumCostLimit.

In preparation for refreshing these guc values more often, separate
these variables from the gucs themselves and add a function to update
the global variables using the gucs and existing logic.

Per suggestion by Kyotaro Horiguchi

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/commands/vacuum.c         | 15 +++++++++--
 src/backend/commands/vacuumparallel.c |  1 +
 src/backend/postmaster/autovacuum.c   | 38 +++++++++++----------------
 src/backend/utils/init/globals.c      |  2 --
 src/backend/utils/misc/guc_tables.c   |  4 +--
 src/include/commands/vacuum.h         |  7 +++++
 src/include/miscadmin.h               |  2 --
 src/include/postmaster/autovacuum.h   |  3 ---
 8 files changed, 39 insertions(+), 33 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index c74b21fce9..c842d8f1e9 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -71,6 +71,17 @@ int			vacuum_multixact_freeze_min_age;
 int			vacuum_multixact_freeze_table_age;
 int			vacuum_failsafe_age;
 int			vacuum_multixact_failsafe_age;
+double		vacuum_cost_delay;
+int			vacuum_cost_limit;
+
+/*
+ * Variables for cost-based vacuum delay. The defaults differ between
+ * autovacuum and vacuum. These should be overridden with the appropriate GUC
+ * value in vacuum code. These are initialized here to the defaults for client
+ * backends executing VACUUM or ANALYZE.
+ */
+int			VacuumCostLimit = 200;
+double		VacuumCostDelay = 0;
 
 /*
  * VacuumFailsafeActive is a defined as a global so that we can determine
@@ -504,6 +515,7 @@ vacuum(List *relations, VacuumParams *params,
 	{
 		ListCell   *cur;
 
+		VacuumUpdateCosts();
 		in_vacuum = true;
 		VacuumCostActive = (VacuumCostDelay > 0);
 		VacuumCostBalance = 0;
@@ -2264,8 +2276,7 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		/* update balance values for workers */
-		AutoVacuumUpdateDelay();
+		VacuumUpdateCosts();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 563117a8f6..0b59c922e4 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -996,6 +996,7 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 
 	/* Set cost-based vacuum delay */
 	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumUpdateCosts();
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 585d28148c..ce7e009576 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1774,17 +1774,25 @@ FreeWorkerInfo(int code, Datum arg)
 }
 
 /*
- * Update the cost-based delay parameters, so that multiple workers consume
- * each a fraction of the total available I/O.
+ * Update vacuum cost-based delay-related parameters for autovacuum workers and
+ * backends executing VACUUM or ANALYZE using the value of relevant gucs and
+ * global state. This must be called during setup for vacuum and after every
+ * config reload to ensure up-to-date values.
  */
 void
-AutoVacuumUpdateDelay(void)
+VacuumUpdateCosts(void)
 {
 	if (MyWorkerInfo)
 	{
 		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
 		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
 	}
+	else
+	{
+		/* Must be explicit VACUUM or ANALYZE */
+		VacuumCostLimit = vacuum_cost_limit;
+		VacuumCostDelay = vacuum_cost_delay;
+	}
 }
 
 /*
@@ -1805,9 +1813,9 @@ autovac_balance_cost(void)
 	 * zero is not a valid value.
 	 */
 	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : VacuumCostLimit);
+								  autovacuum_vac_cost_limit : vacuum_cost_limit);
 	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : VacuumCostDelay);
+								  autovacuum_vac_cost_delay : vacuum_cost_delay);
 	double		cost_total;
 	double		cost_avail;
 	dlist_iter	iter;
@@ -2312,8 +2320,6 @@ do_autovacuum(void)
 		autovac_table *tab;
 		bool		isshared;
 		bool		skipit;
-		double		stdVacuumCostDelay;
-		int			stdVacuumCostLimit;
 		dlist_iter	iter;
 
 		CHECK_FOR_INTERRUPTS();
@@ -2416,14 +2422,6 @@ do_autovacuum(void)
 			continue;
 		}
 
-		/*
-		 * Remember the prevailing values of the vacuum cost GUCs.  We have to
-		 * restore these at the bottom of the loop, else we'll compute wrong
-		 * values in the next iteration of autovac_balance_cost().
-		 */
-		stdVacuumCostDelay = VacuumCostDelay;
-		stdVacuumCostLimit = VacuumCostLimit;
-
 		/* Must hold AutovacuumLock while mucking with cost balance info */
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 
@@ -2437,7 +2435,7 @@ do_autovacuum(void)
 		autovac_balance_cost();
 
 		/* set the active cost parameters from the result of that */
-		AutoVacuumUpdateDelay();
+		VacuumUpdateCosts();
 
 		/* done */
 		LWLockRelease(AutovacuumLock);
@@ -2534,10 +2532,6 @@ deleted:
 		MyWorkerInfo->wi_tableoid = InvalidOid;
 		MyWorkerInfo->wi_sharedrel = false;
 		LWLockRelease(AutovacuumScheduleLock);
-
-		/* restore vacuum cost GUCs for the next iteration */
-		VacuumCostDelay = stdVacuumCostDelay;
-		VacuumCostLimit = stdVacuumCostLimit;
 	}
 
 	/*
@@ -2820,14 +2814,14 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			? avopts->vacuum_cost_delay
 			: (autovacuum_vac_cost_delay >= 0)
 			? autovacuum_vac_cost_delay
-			: VacuumCostDelay;
+			: vacuum_cost_delay;
 
 		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
 		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
 			? avopts->vacuum_cost_limit
 			: (autovacuum_vac_cost_limit > 0)
 			? autovacuum_vac_cost_limit
-			: VacuumCostLimit;
+			: vacuum_cost_limit;
 
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 1b1d814254..8e5b065e8f 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -142,8 +142,6 @@ int			MaxBackends = 0;
 int			VacuumCostPageHit = 1;	/* GUC parameters for vacuum */
 int			VacuumCostPageMiss = 2;
 int			VacuumCostPageDirty = 20;
-int			VacuumCostLimit = 200;
-double		VacuumCostDelay = 0;
 
 int64		VacuumPageHit = 0;
 int64		VacuumPageMiss = 0;
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8062589efd..77db1a146c 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2409,7 +2409,7 @@ struct config_int ConfigureNamesInt[] =
 			gettext_noop("Vacuum cost amount available before napping."),
 			NULL
 		},
-		&VacuumCostLimit,
+		&vacuum_cost_limit,
 		200, 1, 10000,
 		NULL, NULL, NULL
 	},
@@ -3701,7 +3701,7 @@ struct config_real ConfigureNamesReal[] =
 			NULL,
 			GUC_UNIT_MS
 		},
-		&VacuumCostDelay,
+		&vacuum_cost_delay,
 		0, 0, 100,
 		NULL, NULL, NULL
 	},
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 7219c6ba9c..d048bb6e0d 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,8 @@ extern PGDLLIMPORT int vacuum_multixact_freeze_min_age;
 extern PGDLLIMPORT int vacuum_multixact_freeze_table_age;
 extern PGDLLIMPORT int vacuum_failsafe_age;
 extern PGDLLIMPORT int vacuum_multixact_failsafe_age;
+extern PGDLLIMPORT double vacuum_cost_delay;
+extern PGDLLIMPORT int vacuum_cost_limit;
 
 /* Variables for cost-based parallel vacuum */
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
@@ -307,6 +309,8 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
 
 extern PGDLLIMPORT bool VacuumFailsafeActive;
+extern PGDLLIMPORT int VacuumCostLimit;
+extern PGDLLIMPORT double VacuumCostDelay;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
@@ -346,6 +350,9 @@ extern IndexBulkDeleteResult *vac_cleanup_one_index(IndexVacuumInfo *ivinfo,
 													IndexBulkDeleteResult *istat);
 extern Size vac_max_items_to_alloc_size(int max_items);
 
+/* In postmaster/autovacuum.c */
+extern void VacuumUpdateCosts(void);
+
 /* in commands/vacuumparallel.c */
 extern ParallelVacuumState *parallel_vacuum_init(Relation rel, Relation *indrels,
 												 int nindexes, int nrequested_workers,
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 06a86f9ac1..66db1b2c69 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -266,8 +266,6 @@ extern PGDLLIMPORT int max_parallel_maintenance_workers;
 extern PGDLLIMPORT int VacuumCostPageHit;
 extern PGDLLIMPORT int VacuumCostPageMiss;
 extern PGDLLIMPORT int VacuumCostPageDirty;
-extern PGDLLIMPORT int VacuumCostLimit;
-extern PGDLLIMPORT double VacuumCostDelay;
 
 extern PGDLLIMPORT int64 VacuumPageHit;
 extern PGDLLIMPORT int64 VacuumPageMiss;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index c140371b51..65afd1ea1e 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -63,9 +63,6 @@ extern int	StartAutoVacWorker(void);
 /* called from postmaster when a worker could not be forked */
 extern void AutoVacWorkerFailed(void);
 
-/* autovacuum cost-delay balancer */
-extern void AutoVacuumUpdateDelay(void);
-
 #ifdef EXEC_BACKEND
 extern void AutoVacLauncherMain(int argc, char *argv[]) pg_attribute_noreturn();
 extern void AutoVacWorkerMain(int argc, char *argv[]) pg_attribute_noreturn();
-- 
2.37.2

v16-0003-Autovacuum-refreshes-cost-based-delay-params-mor.patchtext/x-patch; charset=US-ASCII; name=v16-0003-Autovacuum-refreshes-cost-based-delay-params-mor.patchDownload
From e2a52318a35fbe7236b675f5a7c210d02568aadf Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 25 Mar 2023 14:14:55 -0400
Subject: [PATCH v16 3/3] Autovacuum refreshes cost-based delay params more
 often

Allow autovacuum to reload the config file more often so that cost-based
delay parameters can take effect while VACUUMing a relation. Previously
autovacuum workers only reloaded the config file once per relation
vacuumed, so config changes could not take effect until beginning to
vacuum the next table.

Now, check if a reload is pending roughly once per block, when checking
if we need to delay.

In order for autovacuum workers to safely update their own cost delay
and cost limit parameters without impacting performance, we had to
rethink when and how these values were accessed.

Previously, an autovacuum worker's wi_cost_limit was set only at the
beginning of vacuuming a table, after reloading the config file.
Therefore, at the time that autovac_balance_cost() is called, workers
vacuuming tables with no cost-related storage parameters could still
have different values for their wi_cost_limit_base and wi_cost_delay.

Now that the cost parameters can be updated while vacuuming a table,
workers will (within some margin of error) have no reason to have
different values for cost limit and cost delay (in the absence of
cost-related storage parameters). This removes the rationale for keeping
cost limit and cost delay in shared memory. Balancing the cost limit
requires only the number of active autovacuum workers vacuuming a table
with no cost-based storage parameters.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c  |   2 +-
 src/backend/commands/vacuum.c         |  44 ++++-
 src/backend/commands/vacuumparallel.c |   1 -
 src/backend/postmaster/autovacuum.c   | 271 +++++++++++++++-----------
 src/include/commands/vacuum.h         |   1 +
 5 files changed, 200 insertions(+), 119 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 2ba85bd3d6..0a9ebd22bd 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -389,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
 	Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
 		   params->truncate != VACOPTVALUE_AUTO);
-	VacuumFailsafeActive = false;
+	Assert(!VacuumFailsafeActive);
 	vacrel->consider_bypass_optimization = true;
 	vacrel->do_index_vacuuming = true;
 	vacrel->do_index_cleanup = true;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index c842d8f1e9..37fbbe008c 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -48,6 +48,7 @@
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/interrupt.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/pmsignal.h"
@@ -515,9 +516,9 @@ vacuum(List *relations, VacuumParams *params,
 	{
 		ListCell   *cur;
 
-		VacuumUpdateCosts();
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
+		VacuumFailsafeActive = false;
+		VacuumUpdateCosts();
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
@@ -571,12 +572,20 @@ vacuum(List *relations, VacuumParams *params,
 					CommandCounterIncrement();
 				}
 			}
+
+			/*
+			 * Ensure VacuumFailsafeActive has been reset before vacuuming the
+			 * next relation.
+			 */
+			VacuumFailsafeActive = false;
 		}
 	}
 	PG_FINALLY();
 	{
 		in_vacuum = false;
 		VacuumCostActive = false;
+		VacuumFailsafeActive = false;
+		VacuumCostBalance = 0;
 	}
 	PG_END_TRY();
 
@@ -2243,7 +2252,27 @@ vacuum_delay_point(void)
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	if (!VacuumCostActive || InterruptPending)
+	if (InterruptPending ||
+		(!VacuumCostActive && !ConfigReloadPending))
+		return;
+
+	/*
+	 * Reload the configuration file if requested. This allows changes to
+	 * autovacuum_vacuum_cost_limit and autovacuum_vacuum_cost_delay to take
+	 * effect while a table is being vacuumed or analyzed.
+	 */
+	if (ConfigReloadPending && IsAutoVacuumWorkerProcess())
+	{
+		ConfigReloadPending = false;
+		ProcessConfigFile(PGC_SIGHUP);
+		VacuumUpdateCosts();
+	}
+
+	/*
+	 * If we disabled cost-based delays after reloading the config file,
+	 * return.
+	 */
+	if (!VacuumCostActive)
 		return;
 
 	/*
@@ -2276,7 +2305,14 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		VacuumUpdateCosts();
+		/*
+		 * Balance and update limit values for autovacuum workers. We must
+		 * always do this in case the autovacuum launcher or another
+		 * autovacuum worker has recalculated the number of workers across
+		 * which we must balance the limit. This is done by the launcher when
+		 * launching a new worker and by workers before vacuuming each table.
+		 */
+		AutoVacuumUpdateCostLimit();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0b59c922e4..e200d5caf8 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -995,7 +995,6 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												 false);
 
 	/* Set cost-based vacuum delay */
-	VacuumCostActive = (VacuumCostDelay > 0);
 	VacuumUpdateCosts();
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index ce7e009576..15ad2f3df7 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -139,6 +139,18 @@ int			Log_autovacuum_min_duration = 600000;
 static bool am_autovacuum_launcher = false;
 static bool am_autovacuum_worker = false;
 
+/*
+ * Variables to save the cost-related storage parameters for the current
+ * relation being vacuumed by this autovacuum worker. Using these, we can
+ * ensure we don't overwrite the values of VacuumCostDelay and VacuumCostLimit
+ * after reloading the configuration file. They are initialized to "invalid"
+ * values to indicate no cost-related storage parameters were specified and
+ * will be set in do_autovacuum() after checking the storage parameters in
+ * table_recheck_autovac().
+ */
+static double av_storage_param_cost_delay = -1;
+static int	av_storage_param_cost_limit = -1;
+
 /* Flags set by signal handlers */
 static volatile sig_atomic_t got_SIGUSR2 = false;
 
@@ -189,8 +201,8 @@ typedef struct autovac_table
 {
 	Oid			at_relid;
 	VacuumParams at_params;
-	double		at_vacuum_cost_delay;
-	int			at_vacuum_cost_limit;
+	double		at_storage_param_vac_cost_delay;
+	int			at_storage_param_vac_cost_limit;
 	bool		at_dobalance;
 	bool		at_sharedrel;
 	char	   *at_relname;
@@ -209,7 +221,7 @@ typedef struct autovac_table
  * wi_sharedrel flag indicating whether table is marked relisshared
  * wi_proc		pointer to PGPROC of the running worker, NULL if not started
  * wi_launchtime Time at which this worker was launched
- * wi_cost_*	Vacuum cost-based delay parameters current in this worker
+ * wi_dobalance Whether this worker should be included in balance calculations
  *
  * All fields are protected by AutovacuumLock, except for wi_tableoid and
  * wi_sharedrel which are protected by AutovacuumScheduleLock (note these
@@ -223,11 +235,8 @@ typedef struct WorkerInfoData
 	Oid			wi_tableoid;
 	PGPROC	   *wi_proc;
 	TimestampTz wi_launchtime;
-	bool		wi_dobalance;
+	pg_atomic_flag wi_dobalance;
 	bool		wi_sharedrel;
-	double		wi_cost_delay;
-	int			wi_cost_limit;
-	int			wi_cost_limit_base;
 } WorkerInfoData;
 
 typedef struct WorkerInfoData *WorkerInfo;
@@ -273,6 +282,8 @@ typedef struct AutoVacuumWorkItem
  * av_startingWorker pointer to WorkerInfo currently being started (cleared by
  *					the worker itself as soon as it's up and running)
  * av_workItems		work item array
+ * av_nworkersForBalance the number of autovacuum workers to use when
+ * 					calculating the per worker cost limit
  *
  * This struct is protected by AutovacuumLock, except for av_signal and parts
  * of the worker list (see above).
@@ -286,6 +297,7 @@ typedef struct
 	dlist_head	av_runningWorkers;
 	WorkerInfo	av_startingWorker;
 	AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+	pg_atomic_uint32 av_nworkersForBalance;
 } AutoVacuumShmemStruct;
 
 static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -319,7 +331,7 @@ static void launch_worker(TimestampTz now);
 static List *get_database_list(void);
 static void rebuild_database_list(Oid newdb);
 static int	db_comparator(const void *a, const void *b);
-static void autovac_balance_cost(void);
+static void autovac_recalculate_workers_for_balance(void);
 
 static void do_autovacuum(void);
 static void FreeWorkerInfo(int code, Datum arg);
@@ -670,7 +682,7 @@ AutoVacLauncherMain(int argc, char *argv[])
 			{
 				LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 				AutoVacuumShmem->av_signal[AutoVacRebalance] = false;
-				autovac_balance_cost();
+				autovac_recalculate_workers_for_balance();
 				LWLockRelease(AutovacuumLock);
 			}
 
@@ -820,8 +832,8 @@ HandleAutoVacLauncherInterrupts(void)
 			AutoVacLauncherShutdown();
 
 		/* rebalance in case the default cost parameters changed */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-		autovac_balance_cost();
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
+		autovac_recalculate_workers_for_balance();
 		LWLockRelease(AutovacuumLock);
 
 		/* rebuild the list in case the naptime changed */
@@ -1755,10 +1767,7 @@ FreeWorkerInfo(int code, Datum arg)
 		MyWorkerInfo->wi_sharedrel = false;
 		MyWorkerInfo->wi_proc = NULL;
 		MyWorkerInfo->wi_launchtime = 0;
-		MyWorkerInfo->wi_dobalance = false;
-		MyWorkerInfo->wi_cost_delay = 0;
-		MyWorkerInfo->wi_cost_limit = 0;
-		MyWorkerInfo->wi_cost_limit_base = 0;
+		pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
 		dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 						&MyWorkerInfo->wi_links);
 		/* not mine anymore */
@@ -1784,97 +1793,127 @@ VacuumUpdateCosts(void)
 {
 	if (MyWorkerInfo)
 	{
-		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
-		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+		if (av_storage_param_cost_delay >= 0)
+			VacuumCostDelay = av_storage_param_cost_delay;
+		else if (autovacuum_vac_cost_delay >= 0)
+			VacuumCostDelay = autovacuum_vac_cost_delay;
+		else
+			/* fall back to vacuum_cost_delay */
+			VacuumCostDelay = vacuum_cost_delay;
+
+		AutoVacuumUpdateCostLimit();
 	}
 	else
 	{
 		/* Must be explicit VACUUM or ANALYZE */
-		VacuumCostLimit = vacuum_cost_limit;
 		VacuumCostDelay = vacuum_cost_delay;
+		VacuumCostLimit = vacuum_cost_limit;
+	}
+
+	/*
+	 * If configuration changes are allowed to impact VacuumCostActive, make
+	 * sure it is updated.
+	 */
+	if (VacuumFailsafeActive)
+		Assert(!VacuumCostActive);
+	else if (VacuumCostDelay > 0)
+		VacuumCostActive = true;
+	else
+	{
+		VacuumCostActive = false;
+		VacuumCostBalance = 0;
+	}
+
+	if (MyWorkerInfo)
+	{
+		elog(DEBUG2,
+			 "Autovacuum VacuumUpdateCosts(db=%u, rel=%u, dobalance=%s, cost_limit=%d, cost_delay=%g active=%s failsafe=%s)",
+			 MyWorkerInfo->wi_dboid, MyWorkerInfo->wi_tableoid,
+			 pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance) ? "no" : "yes",
+			 VacuumCostLimit, VacuumCostDelay,
+			 VacuumCostDelay > 0 ? "yes" : "no",
+			 VacuumFailsafeActive ? "yes" : "no");
 	}
 }
 
 /*
- * autovac_balance_cost
- *		Recalculate the cost limit setting for each active worker.
- *
- * Caller must hold the AutovacuumLock in exclusive mode.
+ * Update VacuumCostLimit with the correct value for an autovacuum worker, given
+ * the value of other relevant cost limit parameters and the number of workers
+ * across which the limit must be balanced. Autovacuum workers must call this
+ * regularly in case av_nworkers_for_balance has been updated by another worker
+ * or by the autovacuum launcher. They must also call it after a config reload.
  */
-static void
-autovac_balance_cost(void)
+void
+AutoVacuumUpdateCostLimit(void)
 {
+	if (!MyWorkerInfo)
+		return;
+
 	/*
-	 * The idea here is that we ration out I/O equally.  The amount of I/O
-	 * that a worker can consume is determined by cost_limit/cost_delay, so we
-	 * try to equalize those ratios rather than the raw limit settings.
-	 *
 	 * note: in cost_limit, zero also means use value from elsewhere, because
 	 * zero is not a valid value.
 	 */
-	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : vacuum_cost_limit);
-	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : vacuum_cost_delay);
-	double		cost_total;
-	double		cost_avail;
-	dlist_iter	iter;
 
-	/* not set? nothing to do */
-	if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
-		return;
-
-	/* calculate the total base cost limit of participating active workers */
-	cost_total = 0.0;
-	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
+	if (av_storage_param_cost_limit > 0)
+		VacuumCostLimit = av_storage_param_cost_limit;
+	else
 	{
-		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
+		int			nworkers_for_balance;
+
+		if (autovacuum_vac_cost_limit > 0)
+			VacuumCostLimit = autovacuum_vac_cost_limit;
+		else
+			VacuumCostLimit = vacuum_cost_limit;
+
+		/* Only balance limit if no cost-related storage parameters specified */
+		if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
+			return;
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-			cost_total +=
-				(double) worker->wi_cost_limit_base / worker->wi_cost_delay;
+		Assert(VacuumCostLimit > 0);
+
+		nworkers_for_balance = pg_atomic_read_u32(
+								&AutoVacuumShmem->av_nworkersForBalance);
+
+		/* There is at least 1 autovac worker (this worker). */
+		if (nworkers_for_balance <= 0)
+			elog(ERROR, "nworkers_for_balance must be > 0");
+
+		VacuumCostLimit = Max(VacuumCostLimit / nworkers_for_balance, 1);
 	}
+}
 
-	/* there are no cost limits -- nothing to do */
-	if (cost_total <= 0)
-		return;
+/*
+ * autovac_recalculate_workers_for_balance
+ *		Recalculate the number of workers to consider, given cost-related
+ *		storage parameters and the current number of active workers.
+ *
+ * Caller must hold the AutovacuumLock in at least shared mode to access
+ * worker->wi_proc.
+ */
+static void
+autovac_recalculate_workers_for_balance(void)
+{
+	dlist_iter	iter;
+	int			orig_nworkers_for_balance;
+	int			nworkers_for_balance = 0;
+
+	orig_nworkers_for_balance =
+		pg_atomic_read_u32(&AutoVacuumShmem->av_nworkersForBalance);
 
-	/*
-	 * Adjust cost limit of each active worker to balance the total of cost
-	 * limit to autovacuum_vacuum_cost_limit.
-	 */
-	cost_avail = (double) vac_cost_limit / vac_cost_delay;
 	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
 	{
 		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-		{
-			int			limit = (int)
-			(cost_avail * worker->wi_cost_limit_base / cost_total);
-
-			/*
-			 * We put a lower bound of 1 on the cost_limit, to avoid division-
-			 * by-zero in the vacuum code.  Also, in case of roundoff trouble
-			 * in these calculations, let's be sure we don't ever set
-			 * cost_limit to more than the base value.
-			 */
-			worker->wi_cost_limit = Max(Min(limit,
-											worker->wi_cost_limit_base),
-										1);
-		}
+		if (worker->wi_proc == NULL ||
+			pg_atomic_unlocked_test_flag(&worker->wi_dobalance))
+			continue;
 
-		if (worker->wi_proc != NULL)
-			elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
-				 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
-				 worker->wi_dobalance ? "yes" : "no",
-				 worker->wi_cost_limit, worker->wi_cost_limit_base,
-				 worker->wi_cost_delay);
+		nworkers_for_balance++;
 	}
+
+	if (nworkers_for_balance != orig_nworkers_for_balance)
+		pg_atomic_write_u32(&AutoVacuumShmem->av_nworkersForBalance,
+							nworkers_for_balance);
 }
 
 /*
@@ -2422,23 +2461,34 @@ do_autovacuum(void)
 			continue;
 		}
 
-		/* Must hold AutovacuumLock while mucking with cost balance info */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+		/*
+		 * Save the cost-related storage parameter values in global variables
+		 * for reference when updating VacuumCostLimit and VacuumCostDelay
+		 * during vacuuming this table.
+		 */
+		av_storage_param_cost_limit = tab->at_storage_param_vac_cost_limit;
+		av_storage_param_cost_delay = tab->at_storage_param_vac_cost_delay;
 
-		/* advertise my cost delay parameters for the balancing algorithm */
-		MyWorkerInfo->wi_dobalance = tab->at_dobalance;
-		MyWorkerInfo->wi_cost_delay = tab->at_vacuum_cost_delay;
-		MyWorkerInfo->wi_cost_limit = tab->at_vacuum_cost_limit;
-		MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
+		/*
+		 * We only expect this worker to ever set the flag, so don't bother
+		 * checking the return value. We shouldn't have to retry.
+		 */
+		if (tab->at_dobalance)
+			pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
+		else
+			pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
 
-		/* do a balance */
-		autovac_balance_cost();
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
+		autovac_recalculate_workers_for_balance();
+		LWLockRelease(AutovacuumLock);
 
-		/* set the active cost parameters from the result of that */
+		/*
+		 * We wait until this point to update cost delay and cost limit
+		 * values, even though we reloaded the configuration file above, so
+		 * that we can take into account the cost-related storage parameters.
+		 */
 		VacuumUpdateCosts();
 
-		/* done */
-		LWLockRelease(AutovacuumLock);
 
 		/* clean up memory before each iteration */
 		MemoryContextResetAndDeleteChildren(PortalContext);
@@ -2523,10 +2573,10 @@ deleted:
 
 		/*
 		 * Remove my info from shared memory.  We could, but intentionally
-		 * don't, clear wi_cost_limit and friends --- this is on the
-		 * assumption that we probably have more to do with similar cost
-		 * settings, so we don't want to give up our share of I/O for a very
-		 * short interval and thereby thrash the global balance.
+		 * don't, unset wi_dobalance on the assumption that we are more likely
+		 * than not to vacuum a table with no cost-related storage parameters
+		 * next, so we don't want to give up our share of I/O for a very short
+		 * interval and thereby thrash the global balance.
 		 */
 		LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
 		MyWorkerInfo->wi_tableoid = InvalidOid;
@@ -2563,6 +2613,7 @@ deleted:
 		{
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			VacuumUpdateCosts();
 		}
 
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
@@ -2798,8 +2849,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		int			freeze_table_age;
 		int			multixact_freeze_min_age;
 		int			multixact_freeze_table_age;
-		int			vac_cost_limit;
-		double		vac_cost_delay;
 		int			log_min_duration;
 
 		/*
@@ -2809,20 +2858,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 * defaults, autovacuum's own first and plain vacuum second.
 		 */
 
-		/* -1 in autovac setting means use plain vacuum_cost_delay */
-		vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
-			? avopts->vacuum_cost_delay
-			: (autovacuum_vac_cost_delay >= 0)
-			? autovacuum_vac_cost_delay
-			: vacuum_cost_delay;
-
-		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
-		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
-			? avopts->vacuum_cost_limit
-			: (autovacuum_vac_cost_limit > 0)
-			? autovacuum_vac_cost_limit
-			: vacuum_cost_limit;
-
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
 			? avopts->log_min_duration
@@ -2878,8 +2913,10 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age;
 		tab->at_params.is_wraparound = wraparound;
 		tab->at_params.log_min_duration = log_min_duration;
-		tab->at_vacuum_cost_limit = vac_cost_limit;
-		tab->at_vacuum_cost_delay = vac_cost_delay;
+		tab->at_storage_param_vac_cost_limit = avopts ?
+			avopts->vacuum_cost_limit : 0;
+		tab->at_storage_param_vac_cost_delay = avopts ?
+			avopts->vacuum_cost_delay : -1;
 		tab->at_relname = NULL;
 		tab->at_nspname = NULL;
 		tab->at_datname = NULL;
@@ -3371,10 +3408,18 @@ AutoVacuumShmemInit(void)
 		worker = (WorkerInfo) ((char *) AutoVacuumShmem +
 							   MAXALIGN(sizeof(AutoVacuumShmemStruct)));
 
+
 		/* initialize the WorkerInfo free list */
 		for (i = 0; i < autovacuum_max_workers; i++)
+		{
 			dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 							&worker[i].wi_links);
+
+			pg_atomic_init_flag(&worker[i].wi_dobalance);
+		}
+
+		pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
+
 	}
 	else
 		Assert(found);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index d048bb6e0d..38c8bdf0fc 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -351,6 +351,7 @@ extern IndexBulkDeleteResult *vac_cleanup_one_index(IndexVacuumInfo *ivinfo,
 extern Size vac_max_items_to_alloc_size(int max_items);
 
 /* In postmaster/autovacuum.c */
+extern void AutoVacuumUpdateCostLimit(void);
 extern void VacuumUpdateCosts(void);
 
 /* in commands/vacuumparallel.c */
-- 
2.37.2

#55Daniel Gustafsson
daniel@yesql.se
In reply to: Melanie Plageman (#54)
Re: Should vacuum process config file reload more often

On 5 Apr 2023, at 17:29, Melanie Plageman <melanieplageman@gmail.com> wrote:

I think I wasn't clear in my comment, sorry. I don't have a problem with
introducing a new variable to split the balanced value from the GUC value.
What I don't think we should do is repurpose an exported symbol into doing a
new thing. In the case at hand I think VacuumCostLimit and VacuumCostDelay
should remain the backing variables for the GUCs, with vacuum_cost_limit and
vacuum_cost_delay carrying the balanced values. So the inverse of what is in
the patch now.

The risk of these symbols being used in extensions might be very low but on
principle it seems unwise to alter a symbol and risk subtle breakage.

I totally see what you are saying. The only complication is that all of
the other variables used in vacuum code are camel case and the GUCs
follow snake case -- as pointed out in a previous review comment by
Sawada-san:

Fair point.

@@ -83,6 +84,7 @@ int                   vacuum_cost_limit;
*/
int                    VacuumCostLimit = 0;
double         VacuumCostDelay = -1;
+static bool vacuum_can_reload_config = false;

In vacuum.c, we use snake case for GUC parameters and camel case for
other global variables, so it seems better to rename it
VacuumCanReloadConfig. Sorry, that's my fault.

This is less of a compelling argument than subtle breakage for extension
code, though.

How about if we rename the variable into something which also acts a bit as
self documenting why there are two in the first place? Perhaps
BalancedVacuumCostLimit or something similar (I'm terrible with names)?

I am, however, wondering if extensions expect to have access to the guc
variable or the global variable -- or both?

Extensions have access to all exported symbols, and I think it's not uncommon
for extension authors to expect to have access to at least read GUC variables.

--
Daniel Gustafsson

#56Robert Haas
robertmhaas@gmail.com
In reply to: Melanie Plageman (#54)
Re: Should vacuum process config file reload more often

On Wed, Apr 5, 2023 at 11:29 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Thanks all for the reviews.

v16 attached. I put it together rather quickly, so there might be a few
spurious whitespaces or similar. There is one rather annoying pgindent
outlier that I have to figure out what to do about as well.

The remaining functional TODOs that I know of are:

- Resolve what to do about names of GUC and vacuum variables for cost
limit and cost delay (since it may affect extensions)

- Figure out what to do about the logging message which accesses dboid
and tableoid (lock/no lock, where to put it, etc)

- I see several places in docs which reference the balancing algorithm
for autovac workers. I did not read them in great detail, but we may
want to review them to see if any require updates.

- Consider whether or not the initial two commits should just be
squashed with the third commit

- Anything else reviewers are still unhappy with

I really like having the first couple of patches split out -- it makes
them super-easy to understand. A committer can always choose to squash
at commit time if they want. I kind of wish the patch set were split
up more, for even easier understanding. I don't think that's a thing
to get hung up on, but it's an opinion that I have.

I strongly agree with the goals of the patch set, as I understand
them. Being able to change the config file and SIGHUP the server and
have the new values affect running autovacuum workers seems pretty
huge. It would make it possible to solve problems that currently can
only be solved by using gdb on a production instance, which is not a
fun thing to be doing.

+ /*
+ * Balance and update limit values for autovacuum workers. We must
+ * always do this in case the autovacuum launcher or another
+ * autovacuum worker has recalculated the number of workers across
+ * which we must balance the limit. This is done by the launcher when
+ * launching a new worker and by workers before vacuuming each table.
+ */

I don't quite understand what's going on here. A big reason that I'm
worried about this whole issue in the first place is that sometimes
there's a vacuum going on a giant table and you can't get it to go
fast. You want it to absorb new settings, and to do so quickly. I
realize that this is about the number of workers, not the actual cost
limit, so that makes what I'm about to say less important. But ... is
this often enough? Like, the time before we move onto the next table
could be super long. The time before a new worker is launched should
be ~autovacuum_naptime/autovacuum_max_workers or ~20s with default
settings, so that's not horrible, but I'm kind of struggling to
understand the rationale for this particular choice. Maybe it's fine.
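
The ~20s figure above follows from the stock settings (autovacuum_naptime = 1min, autovacuum_max_workers = 3). A minimal sketch of that arithmetic; the function name is hypothetical, and the formula is only the rough estimate quoted above, not how the launcher actually schedules workers:

```c
#include <assert.h>

/*
 * Rough estimate of the time between worker launches, in seconds:
 * the naptime spread evenly over the maximum number of workers.
 * Hypothetical helper for illustration only.
 */
double
approx_launch_interval_secs(double naptime_secs, int max_workers)
{
	assert(max_workers > 0);
	return naptime_secs / max_workers;
}
```

With the defaults this is approx_launch_interval_secs(60.0, 3), i.e. 20 seconds.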

To be honest, I think that the whole system where we divide the cost
limit across the workers is the wrong idea. Does anyone actually like
that behavior? This patch probably shouldn't touch that, just in the
interest of getting something done that is an improvement over where
we are now, but I think this behavior is really counterintuitive.
People expect that they can increase autovacuum_max_workers to get
more vacuuming done, and actually in most cases that does not work.
And if that behavior didn't exist, this patch would also be a whole
lot simpler. Again, I don't think this is something we should try to
address right now under time pressure, but in the future, I think we
should consider ripping this behavior out.

+               if (autovacuum_vac_cost_limit > 0)
+                       VacuumCostLimit = autovacuum_vac_cost_limit;
+               else
+                       VacuumCostLimit = vacuum_cost_limit;
+
+               /* Only balance limit if no cost-related storage parameters specified */
+               if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
+                       return;
+               Assert(VacuumCostLimit > 0);
+
+               nworkers_for_balance = pg_atomic_read_u32(
+                                               &AutoVacuumShmem->av_nworkersForBalance);
+
+               /* There is at least 1 autovac worker (this worker). */
+               if (nworkers_for_balance <= 0)
+                       elog(ERROR, "nworkers_for_balance must be > 0");
+
+               VacuumCostLimit = Max(VacuumCostLimit / nworkers_for_balance, 1);

I think it would be better stylistically to use a temporary variable
here and only assign the final value to VacuumCostLimit.
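
A minimal sketch of that stylistic suggestion, written as stand-alone C rather than as a change to the patch. The global is a stand-in for the backend variable of the same name, and balanced_cost_limit() and update_cost_limit() are hypothetical names:

```c
#include <assert.h>

/* Stand-in for the backend global of the same name; not the real definition. */
static int	VacuumCostLimit = 0;

/*
 * Hypothetical helper: compute this worker's share of the base limit in a
 * local variable, never letting the share drop below 1.
 */
int
balanced_cost_limit(int base_limit, int nworkers_for_balance)
{
	int			balanced;

	assert(base_limit > 0);
	assert(nworkers_for_balance > 0);

	balanced = base_limit / nworkers_for_balance;
	return balanced > 1 ? balanced : 1;
}

/* Assign the global exactly once, after the final value is known. */
void
update_cost_limit(int base_limit, int nworkers_for_balance, int dobalance)
{
	int			new_limit = base_limit;

	if (dobalance)
		new_limit = balanced_cost_limit(base_limit, nworkers_for_balance);

	VacuumCostLimit = new_limit;
}
```

The point of the local is that no reader of VacuumCostLimit can observe the intermediate, unbalanced value.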

Daniel: Are you intending to commit this?

--
Robert Haas
EDB: http://www.enterprisedb.com

#57Daniel Gustafsson
daniel@yesql.se
In reply to: Robert Haas (#56)
Re: Should vacuum process config file reload more often

On 5 Apr 2023, at 20:55, Robert Haas <robertmhaas@gmail.com> wrote:

Again, I don't think this is something we should try to
address right now under time pressure, but in the future, I think we
should consider ripping this behavior out.

I would not be opposed to that, but I wholeheartedly agree that it's not the
job of this patch (or any patch at this point in the cycle).

+               if (autovacuum_vac_cost_limit > 0)
+                       VacuumCostLimit = autovacuum_vac_cost_limit;
+               else
+                       VacuumCostLimit = vacuum_cost_limit;
+
+               /* Only balance limit if no cost-related storage parameters specified */
+               if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
+                       return;
+               Assert(VacuumCostLimit > 0);
+
+               nworkers_for_balance = pg_atomic_read_u32(
+                                               &AutoVacuumShmem->av_nworkersForBalance);
+
+               /* There is at least 1 autovac worker (this worker). */
+               if (nworkers_for_balance <= 0)
+                       elog(ERROR, "nworkers_for_balance must be > 0");
+
+               VacuumCostLimit = Max(VacuumCostLimit / nworkers_for_balance, 1);

I think it would be better stylistically to use a temporary variable
here and only assign the final value to VacuumCostLimit.

I can agree with that. Another supertiny nitpick on the above is to not end a
single-line comment with a period.

Daniel: Are you intending to commit this?

Yes, my plan is to get it in before feature freeze. I notice now that I had
missed setting myself as committer in the CF to signal this intent, sorry about
that.

--
Daniel Gustafsson

#58Robert Haas
robertmhaas@gmail.com
In reply to: Daniel Gustafsson (#57)
Re: Should vacuum process config file reload more often

On Wed, Apr 5, 2023 at 3:04 PM Daniel Gustafsson <daniel@yesql.se> wrote:

Daniel: Are you intending to commit this?

Yes, my plan is to get it in before feature freeze.

All right, let's make it happen! I think this is pretty close to ready
to ship, and it would solve a problem that is real, annoying, and
serious.

--
Robert Haas
EDB: http://www.enterprisedb.com

#59Melanie Plageman
melanieplageman@gmail.com
In reply to: Robert Haas (#56)
Re: Should vacuum process config file reload more often

On Wed, Apr 5, 2023 at 2:56 PM Robert Haas <robertmhaas@gmail.com> wrote:

+ /*
+ * Balance and update limit values for autovacuum workers. We must
+ * always do this in case the autovacuum launcher or another
+ * autovacuum worker has recalculated the number of workers across
+ * which we must balance the limit. This is done by the launcher when
+ * launching a new worker and by workers before vacuuming each table.
+ */

I don't quite understand what's going on here. A big reason that I'm
worried about this whole issue in the first place is that sometimes
there's a vacuum going on a giant table and you can't get it to go
fast. You want it to absorb new settings, and to do so quickly. I
realize that this is about the number of workers, not the actual cost
limit, so that makes what I'm about to say less important. But ... is
this often enough? Like, the time before we move onto the next table
could be super long. The time before a new worker is launched should
be ~autovacuum_naptime/autovacuum_max_workers or ~20s with default
settings, so that's not horrible, but I'm kind of struggling to
understand the rationale for this particular choice. Maybe it's fine.

VacuumUpdateCosts() also calls AutoVacuumUpdateCostLimit(), so this will
happen if a config reload is pending the next time vacuum_delay_point()
is called (which is pretty often -- roughly once per block vacuumed but
definitely more than once per table).

Relevant code is at the top of vacuum_delay_point():

	if (ConfigReloadPending && IsAutoVacuumWorkerProcess())
	{
		ConfigReloadPending = false;
		ProcessConfigFile(PGC_SIGHUP);
		VacuumUpdateCosts();
	}
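
The pattern in that snippet (a signal handler only records the request, and the frequently called delay point consumes it) can be sketched outside the backend roughly as follows. This is a simplified stand-alone illustration: the variables are stand-ins, and request_config_reload(), delay_point_sketch(), is_autovacuum_worker and reload_count are hypothetical names, with the counter standing in for ProcessConfigFile() plus VacuumUpdateCosts():

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Stand-ins; the names mirror the backend but these are not its definitions. */
static volatile sig_atomic_t ConfigReloadPending = 0;
static bool is_autovacuum_worker = true;	/* hypothetical stub */
static int	reload_count = 0;	/* hypothetical, for observation only */

/* What a SIGHUP handler would do: only record that a reload was requested. */
void
request_config_reload(void)
{
	ConfigReloadPending = 1;
}

/* Called frequently during vacuum; reloads at most once per request. */
void
delay_point_sketch(void)
{
	if (ConfigReloadPending && is_autovacuum_worker)
	{
		ConfigReloadPending = 0;
		/* stands in for ProcessConfigFile() and VacuumUpdateCosts() */
		reload_count++;
	}
}
```

Calling delay_point_sketch() repeatedly after a single request performs the reload step once; later calls are a cheap flag check.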

- Melanie

#60Robert Haas
robertmhaas@gmail.com
In reply to: Melanie Plageman (#59)
Re: Should vacuum process config file reload more often

On Wed, Apr 5, 2023 at 3:44 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

VacuumUpdateCosts() also calls AutoVacuumUpdateCostLimit(), so this will
happen if a config reload is pending the next time vacuum_delay_point()
is called (which is pretty often -- roughly once per block vacuumed but
definitely more than once per table).

Relevant code is at the top of vacuum_delay_point():

	if (ConfigReloadPending && IsAutoVacuumWorkerProcess())
	{
		ConfigReloadPending = false;
		ProcessConfigFile(PGC_SIGHUP);
		VacuumUpdateCosts();
	}

Yeah, that all makes sense, and I did see that logic, but I'm
struggling to reconcile it with what that comment says.

Maybe I'm just confused about what that comment is trying to explain.

--
Robert Haas
EDB: http://www.enterprisedb.com

#61Peter Geoghegan
pg@bowt.ie
In reply to: Robert Haas (#56)
Re: Should vacuum process config file reload more often

On Wed, Apr 5, 2023 at 11:56 AM Robert Haas <robertmhaas@gmail.com> wrote:

To be honest, I think that the whole system where we divide the cost
limit across the workers is the wrong idea. Does anyone actually like
that behavior? This patch probably shouldn't touch that, just in the
interest of getting something done that is an improvement over where
we are now, but I think this behavior is really counterintuitive.
People expect that they can increase autovacuum_max_workers to get
more vacuuming done, and actually in most cases that does not work.

I disagree. Increasing autovacuum_max_workers as a method of
increasing the overall aggressiveness of autovacuum seems like the
wrong idea. I'm sure that users do that at times, but they really
ought to have a better way of getting the same result.

ISTM that autovacuum_max_workers confuses the question of what the
maximum possible number of workers should ever be (in extreme cases)
with the question of how many workers might be a good idea given
present conditions.

And if that behavior didn't exist, this patch would also be a whole
lot simpler.

Probably, but the fact remains that the system level view of things is
mostly what matters. The competition between the amount of vacuuming
that we can afford to do right now and the amount of vacuuming that
we'd ideally be able to do really matters. In fact, I'd argue that the
amount of vacuuming that we'd ideally be able to do isn't a
particularly meaningful concept on its own. It's just too hard to
model what we need to do accurately -- emphasizing what we can afford
to do seems much more promising.

Again, I don't think this is something we should try to
address right now under time pressure, but in the future, I think we
should consider ripping this behavior out.

-1. The delay stuff might not work as well as it should, but it at
least seems like roughly the right idea. The bigger problem seems to
be everything else -- the way that tuning autovacuum_max_workers kinda
makes sense (it shouldn't be an interesting tunable), and the problems
with the autovacuum.c scheduling being so primitive.

--
Peter Geoghegan

#62Daniel Gustafsson
daniel@yesql.se
In reply to: Peter Geoghegan (#61)
Re: Should vacuum process config file reload more often

On 5 Apr 2023, at 22:19, Peter Geoghegan <pg@bowt.ie> wrote:

The bigger problem seems to
be everything else -- the way that tuning autovacuum_max_workers kinda
makes sense (it shouldn't be an interesting tunable)

Not to derail this thread, and pre-empt a thread where this can be discussed in
its own context, but isn't that kind of the main problem? Tuning autovacuum is
really complicated and one of the parameters that I think universally seem to
make sense to users is just autovacuum_max_workers. I agree that it doesn't do
what most think it should, but a quick skim of the name and docs can probably
lead to a lot of folks trying to use it as a hammer.

--
Daniel Gustafsson

#63Peter Geoghegan
pg@bowt.ie
In reply to: Daniel Gustafsson (#62)
Re: Should vacuum process config file reload more often

On Wed, Apr 5, 2023 at 1:38 PM Daniel Gustafsson <daniel@yesql.se> wrote:

Not to derail this thread, and pre-empt a thread where this can be discussed in
its own context, but isn't that kind of the main problem? Tuning autovacuum is
really complicated and one of the parameters that I think universally seem to
make sense to users is just autovacuum_max_workers. I agree that it doesn't do
what most think it should, but a quick skim of the name and docs can probably
lead to a lot of folks trying to use it as a hammer.

I think that I agree. I think that the difficulty of tuning autovacuum
is the actual real problem. (Or maybe it's just very closely related
to the real problem -- the precise definition doesn't seem important.)

There seems to be a kind of physics envy to some of these things.
False precision. The way that the mechanisms actually work (the
autovacuum scheduling, freeze_min_age, and quite a few other things)
*are* simple. But so are the rules of Conway's game of life, yet
people seem to have a great deal of difficulty predicting how it will
behave in any given situation. Any design that focuses on the
immediate consequences of any particular policy while ignoring second
order effects isn't going to work particularly well. Users ought to be
able to constrain the behavior of autovacuum using settings that
express what they want in high level terms. And VACUUM ought to have
much more freedom around finding the best way to meet those high level
goals over time (e.g., very loose rules about how much we need to
advance relfrozenxid by during any individual VACUUM).

--
Peter Geoghegan

#64Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#59)
Re: Should vacuum process config file reload more often

On Wed, Apr 5, 2023 at 3:43 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Wed, Apr 5, 2023 at 2:56 PM Robert Haas <robertmhaas@gmail.com> wrote:

+ /*
+ * Balance and update limit values for autovacuum workers. We must
+ * always do this in case the autovacuum launcher or another
+ * autovacuum worker has recalculated the number of workers across
+ * which we must balance the limit. This is done by the launcher when
+ * launching a new worker and by workers before vacuuming each table.
+ */

I don't quite understand what's going on here. A big reason that I'm
worried about this whole issue in the first place is that sometimes
there's a vacuum going on a giant table and you can't get it to go
fast. You want it to absorb new settings, and to do so quickly. I
realize that this is about the number of workers, not the actual cost
limit, so that makes what I'm about to say less important. But ... is
this often enough? Like, the time before we move onto the next table
could be super long. The time before a new worker is launched should
be ~autovacuum_naptime/autovacuum_max_workers or ~20s with default
settings, so that's not horrible, but I'm kind of struggling to
understand the rationale for this particular choice. Maybe it's fine.

VacuumUpdateCosts() also calls AutoVacuumUpdateCostLimit(), so this will
happen if a config reload is pending the next time vacuum_delay_point()
is called (which is pretty often -- roughly once per block vacuumed but
definitely more than once per table).

Relevant code is at the top of vacuum_delay_point():

	if (ConfigReloadPending && IsAutoVacuumWorkerProcess())
	{
		ConfigReloadPending = false;
		ProcessConfigFile(PGC_SIGHUP);
		VacuumUpdateCosts();
	}

Gah, I think I misunderstood you. You are saying that only calling
AutoVacuumUpdateCostLimit() after napping while vacuuming a table may
not be enough. The frequency at which the number of workers changes will
likely be different. This is a good point.
It's kind of weird to call AutoVacuumUpdateCostLimit() only after napping...

Hmm. Well, I don't think we want to call AutoVacuumUpdateCostLimit() on
every call to vacuum_delay_point(), though, do we? It includes two
atomic operations. Maybe that pales in comparison to what we are doing
on each page we are vacuuming. I haven't properly thought about it.
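
For reference, the two atomic operations in question are the wi_dobalance flag test and the av_nworkersForBalance read. A minimal stand-alone sketch of that path, assuming C11 <stdatomic.h> in place of PostgreSQL's pg_atomic_* API so that it compiles outside the backend; the variables are hypothetical stand-ins for the shared-memory fields, and worker_cost_limit() is a hypothetical name:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the shared-memory fields named in the patch. */
static atomic_bool wi_dobalance;	/* true: take part in balancing */
static atomic_uint av_nworkers_for_balance;	/* workers sharing the limit */

/* Return the limit this worker should use, given a GUC-derived base limit. */
int
worker_cost_limit(int base_limit)
{
	unsigned int nworkers;
	int			balanced;

	assert(base_limit > 0);

	/* Atomic operation 1: is this worker part of the balance at all? */
	if (!atomic_load(&wi_dobalance))
		return base_limit;

	/* Atomic operation 2: how many workers share the limit right now? */
	nworkers = atomic_load(&av_nworkers_for_balance);
	if (nworkers == 0)
		return base_limit;		/* sketch only; the patch raises an ERROR here */

	balanced = base_limit / (int) nworkers;
	return balanced > 1 ? balanced : 1;
}
```

Both operations are plain loads with no lock taken, which is the cost being weighed against the per-page work.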

Is there some other relevant condition we can use to determine whether
or not to call AutoVacuumUpdateCostLimit() on a given invocation of
vacuum_delay_point()? Maybe something with naptime/max workers?

I'm not sure if there is a more reliable place than vacuum_delay_point()
for us to do this. I poked around heap_vacuum_rel(), but I think we
would want this cost limit update to happen table AM-agnostically.

Thank you for bringing this up!

- Melanie

#65Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Melanie Plageman (#54)
Re: Should vacuum process config file reload more often

On Thu, Apr 6, 2023 at 12:29 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Thanks all for the reviews.

v16 attached. I put it together rather quickly, so there might be a few
spurious whitespaces or similar. There is one rather annoying pgindent
outlier that I have to figure out what to do about as well.

The remaining functional TODOs that I know of are:

- Resolve what to do about names of GUC and vacuum variables for cost
limit and cost delay (since it may affect extensions)

- Figure out what to do about the logging message which accesses dboid
and tableoid (lock/no lock, where to put it, etc)

- I see several places in docs which reference the balancing algorithm
for autovac workers. I did not read them in great detail, but we may
want to review them to see if any require updates.

- Consider whether or not the initial two commits should just be
squashed with the third commit

- Anything else reviewers are still unhappy with

On Wed, Apr 5, 2023 at 1:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Wed, Apr 5, 2023 at 5:05 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Tue, Apr 4, 2023 at 4:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

---
- if (worker->wi_proc != NULL)
- elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
- worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
- worker->wi_dobalance ? "yes" : "no",
- worker->wi_cost_limit, worker->wi_cost_limit_base,
- worker->wi_cost_delay);

I think it's better to keep this kind of log in some form for
debugging. For example, we can show these values of autovacuum workers
in VacuumUpdateCosts().

I added a message to do_autovacuum() after calling VacuumUpdateCosts()
in the loop vacuuming each table. That means it will happen once per
table. It's not ideal that I had to move the call to VacuumUpdateCosts()
behind the shared lock in that loop so that we could access the pid and
such in the logging message after updating the cost and delay, but it is
probably okay. Though no one is going to be changing those at this
point, it still seemed better to access them under the lock.

This does mean we won't log anything when we do change the values of
VacuumCostDelay and VacuumCostLimit while vacuuming a table. Is it worth
adding some code to do that in VacuumUpdateCosts() (only when the value
has changed not on every call to VacuumUpdateCosts())? Or perhaps we
could add it in the config reload branch that is already in
vacuum_delay_point()?

Previously, we used to show the pid in the log since a worker/launcher
set other workers' delay costs. But now that the worker sets its delay
costs, we don't need to show the pid in the log. Also, I think it's
useful for debugging and investigating the system if we log it when
changing the values. The log I imagined to add was like:

@@ -1801,6 +1801,13 @@ VacuumUpdateCosts(void)
VacuumCostDelay = vacuum_cost_delay;

AutoVacuumUpdateLimit();
+
+       elog(DEBUG2, "autovacuum update costs (db=%u, rel=%u, dobalance=%s, cost_limit=%d, cost_delay=%g active=%s failsafe=%s)",
+            MyWorkerInfo->wi_dboid, MyWorkerInfo->wi_tableoid,
+            pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance) ? "no" : "yes",
+            VacuumCostLimit, VacuumCostDelay,
+            VacuumCostDelay > 0 ? "yes" : "no",
+            VacuumFailsafeActive ? "yes" : "no");
}
else
{

Makes sense. I've updated the log message to roughly what you suggested.
I also realized I think it does make sense to call it in
VacuumUpdateCosts() -- only for autovacuum workers of course. I've done
this. I haven't taken the lock though and can't decide if I must since
they access dboid and tableoid -- those are not going to change at this
point, but I still don't know if I can access them lock-free...
Perhaps there is a way to condition it on the log level?

If I have to take a lock, then I don't know if we should put these in
VacuumUpdateCosts()...

I think we don't need to acquire a lock there as both values are
updated only by workers reporting this message. Also I agree with
where to put the log but I think the log message should start with
a lowercase letter:

+                elog(DEBUG2,
+                         "Autovacuum VacuumUpdateCosts(db=%u, rel=%u, dobalance=%s, cost_limit=%d, cost_delay=%g active=%s failsafe=%s)",
+                         MyWorkerInfo->wi_dboid, MyWorkerInfo->wi_tableoid,
+                         pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance) ? "no" : "yes",
+                         VacuumCostLimit, VacuumCostDelay,
+                         VacuumCostDelay > 0 ? "yes" : "no",
+                         VacuumFailsafeActive ? "yes" : "no");

Some minor comments on 0003:

+/*
+ * autovac_recalculate_workers_for_balance
+ *             Recalculate the number of workers to consider, given cost-related
+ *             storage parameters and the current number of active workers.
+ *
+ * Caller must hold the AutovacuumLock in at least shared mode to access
+ * worker->wi_proc.
+ */

Does it make sense to add Assert(LWLockHeldByMe(AutovacuumLock)) at
the beginning of this function?

---
                 /* rebalance in case the default cost parameters changed */
-                LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-                autovac_balance_cost();
+                LWLockAcquire(AutovacuumLock, LW_SHARED);
+                autovac_recalculate_workers_for_balance();
                 LWLockRelease(AutovacuumLock);

Do we really need to have the autovacuum launcher recalculate
av_nworkersForBalance after reloading the config file? Since the cost
parameters are not used in autovac_recalculate_workers_for_balance()
the comment also needs to be updated.

---
+                /*
+                 * Balance and update limit values for autovacuum workers. We must
+                 * always do this in case the autovacuum launcher or another
+                 * autovacuum worker has recalculated the number of workers across
+                 * which we must balance the limit. This is done by the launcher when
+                 * launching a new worker and by workers before vacuuming each table.
+                 */
+                AutoVacuumUpdateCostLimit();

I think the last sentence is not correct. IIUC recalculation of
av_nworkersForBalance is done by the launcher after a worker finished
and by workers before vacuuming each table.

---
It's not a problem of this patch, but IIUC since we don't reset
wi_dobalance after vacuuming each table we use the last value of
wi_dobalance for performing autovacuum items. At end of the loop for
tables in do_autovacuum() we have the following code that explains why
we don't reset wi_dobalance:

/*
* Remove my info from shared memory. We could, but intentionally
* don't, unset wi_dobalance on the assumption that we are more likely
* than not to vacuum a table with no cost-related storage parameters
* next, so we don't want to give up our share of I/O for a very short
* interval and thereby thrash the global balance.
*/
LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
MyWorkerInfo->wi_tableoid = InvalidOid;
MyWorkerInfo->wi_sharedrel = false;
LWLockRelease(AutovacuumScheduleLock);

Assuming we agree with that, probably we need to reset it to true
after vacuuming all tables?

0001 and 0002 patches look good to me except for the renaming GUCs
stuff as the discussion is ongoing.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#66Daniel Gustafsson
daniel@yesql.se
In reply to: Masahiko Sawada (#65)
Re: Should vacuum process config file reload more often

On 6 Apr 2023, at 08:39, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Also I agree with
where to put the log but I think the log message should start with
a lowercase letter:

+                elog(DEBUG2,
+                         "Autovacuum VacuumUpdateCosts(db=%u, rel=%u,

In principle I agree, but in this case Autovacuum is a name and should IMO
start with a capital A in user-facing messages.

+/*
+ * autovac_recalculate_workers_for_balance
+ *             Recalculate the number of workers to consider, given cost-related
+ *             storage parameters and the current number of active workers.
+ *
+ * Caller must hold the AutovacuumLock in at least shared mode to access
+ * worker->wi_proc.
+ */

Does it make sense to add Assert(LWLockHeldByMe(AutovacuumLock)) at
the beginning of this function?

That's probably not a bad idea.

---
/* rebalance in case the default cost parameters changed */
-                LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-                autovac_balance_cost();
+                LWLockAcquire(AutovacuumLock, LW_SHARED);
+                autovac_recalculate_workers_for_balance();
LWLockRelease(AutovacuumLock);

Do we really need to have the autovacuum launcher recalculate
av_nworkersForBalance after reloading the config file? Since the cost
parameters are not used in autovac_recalculate_workers_for_balance()
the comment also needs to be updated.

If I understand this comment right, there was a discussion upthread that simply
doing it in both launcher and worker simplifies the code with little overhead.
A comment can reflect that choice though.

--
Daniel Gustafsson

#67Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#64)
Re: Should vacuum process config file reload more often

On Wed, Apr 5, 2023 at 11:10 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Wed, Apr 5, 2023 at 3:43 PM Melanie Plageman <melanieplageman@gmail.com> wrote:

On Wed, Apr 5, 2023 at 2:56 PM Robert Haas <robertmhaas@gmail.com> wrote:

+ /*
+ * Balance and update limit values for autovacuum workers. We must
+ * always do this in case the autovacuum launcher or another
+ * autovacuum worker has recalculated the number of workers across
+ * which we must balance the limit. This is done by the launcher when
+ * launching a new worker and by workers before vacuuming each table.
+ */

I don't quite understand what's going on here. A big reason that I'm
worried about this whole issue in the first place is that sometimes
there's a vacuum going on a giant table and you can't get it to go
fast. You want it to absorb new settings, and to do so quickly. I
realize that this is about the number of workers, not the actual cost
limit, so that makes what I'm about to say less important. But ... is
this often enough? Like, the time before we move onto the next table
could be super long. The time before a new worker is launched should
be ~autovacuum_naptime/autovacuum_max_workers or ~20s with default
settings, so that's not horrible, but I'm kind of struggling to
understand the rationale for this particular choice. Maybe it's fine.

VacuumUpdateCosts() also calls AutoVacuumUpdateCostLimit(), so this will
happen if a config reload is pending the next time vacuum_delay_point()
is called (which is pretty often -- roughly once per block vacuumed but
definitely more than once per table).

Relevant code is at the top of vacuum_delay_point():

if (ConfigReloadPending && IsAutoVacuumWorkerProcess())
{
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
VacuumUpdateCosts();
}
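
The control flow in that snippet can be modelled outside PostgreSQL as a pending-flag check. This is only a minimal sketch: `config_reload_pending`, `reload_count`, `request_reload()` and `delay_point_check_reload()` are hypothetical stand-ins for ConfigReloadPending, the ProcessConfigFile()/VacuumUpdateCosts() pair, the SIGHUP handler and the top of vacuum_delay_point(); it is not the patch's code.

```c
#include <signal.h>
#include <stdbool.h>

/* Hypothetical stand-in for ConfigReloadPending; set from a signal handler. */
static volatile sig_atomic_t config_reload_pending = 0;

/* Hypothetical counter standing in for ProcessConfigFile() + VacuumUpdateCosts(). */
static int reload_count = 0;

/* Signal-handler-safe: only sets the flag, the real work happens later. */
static void
request_reload(int signo)
{
    (void) signo;
    config_reload_pending = 1;
}

/*
 * Model of the check at the top of vacuum_delay_point(): the reload is
 * consumed only by autovacuum workers, and at most once per request.
 * Returns true if a reload was processed on this call.
 */
static bool
delay_point_check_reload(bool is_autovacuum_worker)
{
    if (config_reload_pending && is_autovacuum_worker)
    {
        config_reload_pending = 0;
        reload_count++;
        return true;
    }
    return false;
}
```

Because the check is a single flag test, calling it roughly once per block stays cheap when no reload has been requested.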

Gah, I think I misunderstood you. You are saying that only calling
AutoVacuumUpdateCostLimit() after napping while vacuuming a table may
not be enough. The frequency at which the number of workers changes will
likely be different. This is a good point.
It's kind of weird to call AutoVacuumUpdateCostLimit() only after napping...

A not fully baked idea for a solution:

Why not keep the balanced limit in the atomic instead of the number of
workers for balance. If we expect all of the workers to have the same
value for cost limit, then why would we just count the workers and not
also do the division and store that in the atomic variable. We are
worried about the division not being done often enough, not the number
of workers being out of date. This solves that, right?

- Melanie

#68Robert Haas
robertmhaas@gmail.com
In reply to: Melanie Plageman (#67)
Re: Should vacuum process config file reload more often

On Thu, Apr 6, 2023 at 11:52 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Gah, I think I misunderstood you. You are saying that only calling
AutoVacuumUpdateCostLimit() after napping while vacuuming a table may
not be enough. The frequency at which the number of workers changes will
likely be different. This is a good point.
It's kind of weird to call AutoVacuumUpdateCostLimit() only after napping...

A not fully baked idea for a solution:

Why not keep the balanced limit in the atomic instead of the number of
workers for balance. If we expect all of the workers to have the same
value for cost limit, then why would we just count the workers and not
also do the division and store that in the atomic variable. We are
worried about the division not being done often enough, not the number
of workers being out of date. This solves that, right?

A bird in the hand is worth two in the bush, though. We don't really
have time to redesign the patch before feature freeze, and I can't
convince myself that there's a big enough problem with what you
already did that it would be worth putting off fixing this for another
year. Reading your newer emails, I think that the answer to my
original question is "we don't want to do it at every
vacuum_delay_point because it might be too costly," which is
reasonable.

I don't particularly like this new idea, either, I think. While it may
be true that we expect all the workers to come up with the same
answer, they need not, because rereading the configuration file isn't
synchronized. It would be pretty lame if a worker that had reread an
updated value from the configuration file recomputed the value, and
then another worker that still had an older value recalculated it
again just afterward. Keeping only the number of workers in memory
avoids the possibility of thrashing around in situations like that.
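
The trade-off between the two designs can be sketched with C11 atomics. This is an illustration only: `shared_nworkers`, `shared_balanced_limit` and both helper functions are hypothetical names, not PostgreSQL symbols, and the patch itself uses the pg_atomic_* wrappers rather than <stdatomic.h>. The sketch assumes the worker count is always at least 1.

```c
#include <stdatomic.h>

/* Design A: shared memory holds only the number of workers to balance across. */
static atomic_uint shared_nworkers = 1;

/* Design B: shared memory holds the already-divided limit. */
static atomic_int shared_balanced_limit = 0;

/*
 * Design A: each worker divides its own, locally loaded cost limit by the
 * shared worker count. A worker with a stale config only affects itself.
 */
static int
limit_from_worker_count(int local_cost_limit)
{
    unsigned    n = atomic_load(&shared_nworkers);
    int         limit = local_cost_limit / (int) n;

    return limit > 1 ? limit : 1;
}

/*
 * Design B: whoever recalculates last wins. A worker that has not yet
 * reread the config file overwrites the value for every other worker.
 */
static void
publish_balanced_limit(int local_cost_limit)
{
    unsigned    n = atomic_load(&shared_nworkers);
    int         limit = local_cost_limit / (int) n;

    atomic_store(&shared_balanced_limit, limit > 1 ? limit : 1);
}
```

With two workers, one that has reread a limit of 400 and one still holding 200, design A gives them 200 and 100 respectively, while under design B the second call to publish_balanced_limit() replaces the newer value for both.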

I do kind of wonder if it would be possible to rejigger things so that
we didn't have to keep recalculating av_nworkersForBalance, though.
Perhaps now is not the time due to the impending freeze, but maybe we
should explore maintaining that value in such a way that it is correct
at every instant, instead of recalculating it at intervals.

--
Robert Haas
EDB: http://www.enterprisedb.com

#69Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#63)
Re: Should vacuum process config file reload more often

On Wed, Apr 5, 2023 at 4:59 PM Peter Geoghegan <pg@bowt.ie> wrote:

I think that I agree. I think that the difficulty of tuning autovacuum
is the actual real problem. (Or maybe it's just very closely related
to the real problem -- the precise definition doesn't seem important.)

I agree, and I think that bad choices around what the parameters do
are a big part of the problem. autovacuum_max_workers is one example
of that, but there are a bunch of others. It's not at all intuitive
that if your database gets really big you either need to raise
autovacuum_vacuum_cost_limit or lower autovacuum_vacuum_cost_delay.
And, it's not intuitive either that raising autovacuum_max_workers
doesn't increase the amount of vacuuming that gets done. In my
experience, it's very common for people to observe that autovacuum is
running constantly, and to figure out that the number of running
workers is equal to autovacuum_max_workers at all times, and to then
conclude that they need more workers. So they raise
autovacuum_max_workers and nothing gets any better. In fact, things
might get *worse*, because the time required to complete vacuuming of
a large table can increase if the available bandwidth is potentially
spread across more workers, and it's very often the time to vacuum the
largest tables that determines whether things hold together adequately
or not.

This kind of stuff drives me absolutely batty. It's impossible to make
every database behavior completely intuitive, but here we have a
parameter that seems like it is exactly the right thing to solve the
problem that the user knows they have, and it actually does nothing on
a good day and causes a regression on a bad one. That's incredibly
poor design.

The way it works at the implementation level is pretty kooky, too. The
available resources are split between the workers, but if any of the
relevant vacuum parameters are set for the table currently being
vacuumed, then that worker gets the full resources configured for that
table, and everyone else divides up the amount that's configured
globally. So if you went and set the cost delay and cost limit for all
of your tables to exactly the same values that are configured
globally, you'd vacuum 3 times faster than if you relied on the
identical global defaults (or N times faster, where N is the value
you've picked for autovacuum_max_workers). If you have one really big
table that requires continuous vacuuming, you could slow down
vacuuming on that table through manual configuration settings and
still end up speeding up vacuuming overall, because the remaining
workers would be dividing the budget implied by the default settings
among N-1 workers instead of N workers. As far as I can see, none of
this is documented, which is perhaps for the best, because IMV it
makes no sense.
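
The arithmetic behind that paragraph can be written out as a small model. It is a sketch under stated assumptions: `effective_cost_limit()` and its parameters are hypothetical, and it treats "has cost-related storage parameters" as a single boolean rather than reproducing the real autovacuum code.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the per-worker cost limit described above.
 * global_limit:     limit implied by the global settings
 * has_table_params: table has its own cost-related storage parameters
 * table_limit:      limit from those storage parameters (if any)
 * nbalancing:       number of workers sharing the global budget (>= 1)
 */
static int
effective_cost_limit(int global_limit, bool has_table_params,
                     int table_limit, int nbalancing)
{
    int         share;

    /* A table with its own cost parameters opts out of balancing. */
    if (has_table_params)
        return table_limit;

    share = global_limit / nbalancing;
    return share > 1 ? share : 1;
}
```

With an example global limit of 200 and 3 workers, each balanced worker gets 200 / 3 = 66. If every table instead carries a storage parameter of 200, each worker gets the full 200, so the combined budget is 600 rather than about 200, which is the "3 times faster" effect. Taking one worker out of the balance also raises the remaining workers' share from 66 to 100.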

I think we need to move more toward a model where VACUUM just keeps
up. Emergency mode is a step in that direction, because the definition
of an emergency is that we're definitely not keeping up, but I think
we need something less Boolean. If the database gets bigger or smaller
or more or less active, autovacuum should somehow just adjust to that,
without so much manual fiddling. I think it's good to have the
possibility of some manual fiddling to handle problematic situations,
but you shouldn't have to do it just because you made a table bigger.

--
Robert Haas
EDB: http://www.enterprisedb.com

#70Daniel Gustafsson
daniel@yesql.se
In reply to: Robert Haas (#68)
Re: Should vacuum process config file reload more often

On 6 Apr 2023, at 19:18, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Apr 6, 2023 at 11:52 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Gah, I think I misunderstood you. You are saying that only calling
AutoVacuumUpdateCostLimit() after napping while vacuuming a table may
not be enough. The frequency at which the number of workers changes will
likely be different. This is a good point.
It's kind of weird to call AutoVacuumUpdateCostLimit() only after napping...

A not fully baked idea for a solution:

Why not keep the balanced limit in the atomic instead of the number of
workers for balance. If we expect all of the workers to have the same
value for cost limit, then why would we just count the workers and not
also do the division and store that in the atomic variable. We are
worried about the division not being done often enough, not the number
of workers being out of date. This solves that, right?

A bird in the hand is worth two in the bush, though. We don't really
have time to redesign the patch before feature freeze, and I can't
convince myself that there's a big enough problem with what you
already did that it would be worth putting off fixing this for another
year.

+1, I'd rather we did a conservative version of the feature first and
expanded upon it in the 17 cycle.

Reading your newer emails, I think that the answer to my
original question is "we don't want to do it at every
vacuum_delay_point because it might be too costly," which is
reasonable.

I think we kind of need to get to that granularity eventually, but it's not a
showstopper for this feature, and can probably benefit from being done in the
context of a larger av-worker re-think (the importance of which is discussed
downthread).

--
Daniel Gustafsson

#71Melanie Plageman
melanieplageman@gmail.com
In reply to: Daniel Gustafsson (#70)
3 attachment(s)
Re: Should vacuum process config file reload more often

v17 attached does not yet fix the logging problem or variable naming
problem.

I have not changed where AutoVacuumUpdateCostLimit() is called either.

This is effectively just a round of cleanup. I hope I have managed to
address all other code review feedback so far, though some may have
slipped through the cracks.

On Wed, Apr 5, 2023 at 2:56 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Apr 5, 2023 at 11:29 AM Melanie Plageman <melanieplageman@gmail.com> wrote:
+ /*
+ * Balance and update limit values for autovacuum workers. We must
+ * always do this in case the autovacuum launcher or another
+ * autovacuum worker has recalculated the number of workers across
+ * which we must balance the limit. This is done by the launcher when
+ * launching a new worker and by workers before vacuuming each table.
+ */

I don't quite understand what's going on here. A big reason that I'm
worried about this whole issue in the first place is that sometimes
there's a vacuum going on a giant table and you can't get it to go
fast. You want it to absorb new settings, and to do so quickly. I
realize that this is about the number of workers, not the actual cost
limit, so that makes what I'm about to say less important. But ... is
this often enough? Like, the time before we move onto the next table
could be super long. The time before a new worker is launched should
be ~autovacuum_naptime/autovacuum_max_workers or ~20s with default
settings, so that's not horrible, but I'm kind of struggling to
understand the rationale for this particular choice. Maybe it's fine.

I've at least updated this comment to be more correct/less misleading.

+               if (autovacuum_vac_cost_limit > 0)
+                       VacuumCostLimit = autovacuum_vac_cost_limit;
+               else
+                       VacuumCostLimit = vacuum_cost_limit;
+
+               /* Only balance limit if no cost-related storage parameters specified */
+               if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
+                       return;
+               Assert(VacuumCostLimit > 0);
+
+               nworkers_for_balance = pg_atomic_read_u32(
+                                       &AutoVacuumShmem->av_nworkersForBalance);
+
+               /* There is at least 1 autovac worker (this worker). */
+               if (nworkers_for_balance <= 0)
+                       elog(ERROR, "nworkers_for_balance must be > 0");
+
+               VacuumCostLimit = Max(VacuumCostLimit / nworkers_for_balance, 1);

I think it would be better stylistically to use a temporary variable
here and only assign the final value to VacuumCostLimit.

I tried that and thought it added confusing clutter. If it is a code
cleanliness issue, I am willing to change it, though.
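
For comparison, the temporary-variable style being discussed might look roughly like this outside PostgreSQL. It is a sketch only: `compute_cost_limit()` and its parameters are hypothetical stand-ins for the globals in the quoted hunk, and plain assert() replaces both Assert() and the elog(ERROR) check.

```c
#include <assert.h>

/*
 * Standalone sketch of the limit calculation with temporary variables.
 * The parameters stand in for the globals used in the quoted hunk;
 * dobalance is nonzero when the worker takes part in balancing.
 */
static int
compute_cost_limit(int autovac_limit, int vacuum_limit,
                   int dobalance, unsigned nworkers_for_balance)
{
    int         base_limit;
    int         balanced_limit;

    base_limit = (autovac_limit > 0) ? autovac_limit : vacuum_limit;
    assert(base_limit > 0);

    /* Only balance if no cost-related storage parameters were specified */
    if (!dobalance)
        return base_limit;

    /* There is at least 1 autovac worker (the caller) */
    assert(nworkers_for_balance > 0);

    balanced_limit = base_limit / (int) nworkers_for_balance;
    return balanced_limit > 1 ? balanced_limit : 1;
}
```

In this shape the caller would assign the result to the global exactly once, so the global never briefly holds the unbalanced value.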

On Wed, Apr 5, 2023 at 3:04 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 5 Apr 2023, at 20:55, Robert Haas <robertmhaas@gmail.com> wrote:

Again, I don't think this is something we should try to
address right now under time pressure, but in the future, I think we
should consider ripping this behavior out.

I would not be opposed to that, but I wholeheartedly agree that it's not the
job of this patch (or any patch at this point in the cycle).

+               if (autovacuum_vac_cost_limit > 0)
+                       VacuumCostLimit = autovacuum_vac_cost_limit;
+               else
+                       VacuumCostLimit = vacuum_cost_limit;
+
+               /* Only balance limit if no cost-related storage parameters specified */
+               if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
+                       return;
+               Assert(VacuumCostLimit > 0);
+
+               nworkers_for_balance = pg_atomic_read_u32(
+                                       &AutoVacuumShmem->av_nworkersForBalance);
+
+               /* There is at least 1 autovac worker (this worker). */
+               if (nworkers_for_balance <= 0)
+                       elog(ERROR, "nworkers_for_balance must be > 0");
+
+               VacuumCostLimit = Max(VacuumCostLimit / nworkers_for_balance, 1);

I think it would be better stylistically to use a temporary variable
here and only assign the final value to VacuumCostLimit.

I can agree with that. Another supertiny nitpick on the above is to not end a
single-line comment with a period.

I have fixed this.

On Thu, Apr 6, 2023 at 2:40 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Apr 6, 2023 at 12:29 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Thanks all for the reviews.

v16 attached. I put it together rather quickly, so there might be a few
spurious whitespaces or similar. There is one rather annoying pgindent
outlier that I have to figure out what to do about as well.

The remaining functional TODOs that I know of are:

- Resolve what to do about names of GUC and vacuum variables for cost
limit and cost delay (since it may affect extensions)

- Figure out what to do about the logging message which accesses dboid
and tableoid (lock/no lock, where to put it, etc)

- I see several places in docs which reference the balancing algorithm
for autovac workers. I did not read them in great detail, but we may
want to review them to see if any require updates.

- Consider whether or not the initial two commits should just be
squashed with the third commit

- Anything else reviewers are still unhappy with

On Wed, Apr 5, 2023 at 1:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Wed, Apr 5, 2023 at 5:05 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Tue, Apr 4, 2023 at 4:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

---
- if (worker->wi_proc != NULL)
- elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
- worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
- worker->wi_dobalance ? "yes" : "no",
- worker->wi_cost_limit, worker->wi_cost_limit_base,
- worker->wi_cost_delay);

I think it's better to keep this kind of log in some form for
debugging. For example, we can show these values of autovacuum workers
in VacuumUpdateCosts().

I added a message to do_autovacuum() after calling VacuumUpdateCosts()
in the loop vacuuming each table. That means it will happen once per
table. It's not ideal that I had to move the call to VacuumUpdateCosts()
behind the shared lock in that loop so that we could access the pid and
such in the logging message after updating the cost and delay, but it is
probably okay. Though no one is going to be changing those at this
point, it still seemed better to access them under the lock.

This does mean we won't log anything when we do change the values of
VacuumCostDelay and VacuumCostLimit while vacuuming a table. Is it worth
adding some code to do that in VacuumUpdateCosts() (only when the value
has changed not on every call to VacuumUpdateCosts())? Or perhaps we
could add it in the config reload branch that is already in
vacuum_delay_point()?

Previously, we used to show the pid in the log since a worker/launcher
set other workers' delay costs. But now that the worker sets its delay
costs, we don't need to show the pid in the log. Also, I think it's
useful for debugging and investigating the system if we log it when
changing the values. The log I imagined to add was like:

@@ -1801,6 +1801,13 @@ VacuumUpdateCosts(void)
VacuumCostDelay = vacuum_cost_delay;

AutoVacuumUpdateLimit();
+
+       elog(DEBUG2, "autovacuum update costs (db=%u, rel=%u, dobalance=%s, cost_limit=%d, cost_delay=%g active=%s failsafe=%s)",
+            MyWorkerInfo->wi_dboid, MyWorkerInfo->wi_tableoid,
+            pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance) ? "no" : "yes",
+            VacuumCostLimit, VacuumCostDelay,
+            VacuumCostDelay > 0 ? "yes" : "no",
+            VacuumFailsafeActive ? "yes" : "no");
}
else
{

Makes sense. I've updated the log message to roughly what you suggested.
I also realized I think it does make sense to call it in
VacuumUpdateCosts() -- only for autovacuum workers of course. I've done
this. I haven't taken the lock though and can't decide if I must since
they access dboid and tableoid -- those are not going to change at this
point, but I still don't know if I can access them lock-free...
Perhaps there is a way to condition it on the log level?

If I have to take a lock, then I don't know if we should put these in
VacuumUpdateCosts()...

I think we don't need to acquire a lock there as both values are
updated only by workers reporting this message.

I dunno. I just don't feel that comfortable saying, oh it's okay to
access these without a lock probably. I propose we do one of the
following:

- Take a shared lock inside VacuumUpdateCosts() (it is not called on every
call to vacuum_delay_point()) before reading from these variables.

Pros:
- We will log whenever there is a change to these parameters
Cons:
- This adds overhead in the common case when log level is < DEBUG2.
Is there a way to check the log level before taking the lock?
- Acquiring the lock inside the function is inconsistent with the
pattern that some of the other autovacuum functions requiring
locks use (they assume you have a lock if needed inside of the
function). But, we could assert that the lock is not already held.
- If we later decide we don't like this choice and want to move the
logging elsewhere, it will necessarily log less frequently which
seems like a harder change to make than logging more frequently.

- Move this logging into the loop through relations in do_autovacuum()
and the config reload code and take the shared lock before doing the
logging.

Pros:
- Seems safe and not expensive
- Covers most of the times we would want the logging
Cons:
- duplicates logging in two places
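
The question raised under the first option (checking the log level before taking the lock) can be sketched as a standalone model. Everything here is hypothetical: the level values, `min_level`, `level_is_interesting()` and the counters are stand-ins, and the lock is modelled as a counter rather than a real LWLock. PostgreSQL's elog.h appears to provide message_level_is_interesting() for this kind of test, but whether it fits this call site is an assumption, not something settled in this thread.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical log levels; lower means more verbose. Not PostgreSQL's values. */
enum model_level { MODEL_DEBUG2 = 1, MODEL_LOG = 2 };

static enum model_level min_level = MODEL_LOG;  /* stand-in for log_min_messages */
static int  lock_acquisitions = 0;              /* counts shared-lock acquisitions */
static int  messages_logged = 0;                /* counts emitted messages */

/* True if a message at this level would actually be emitted. */
static bool
level_is_interesting(enum model_level level)
{
    return level >= min_level;
}

static void
log_cost_update(unsigned dboid, unsigned tableoid, int cost_limit)
{
    /* Cheap check first: skip the lock when nobody would see the message. */
    if (!level_is_interesting(MODEL_DEBUG2))
        return;

    lock_acquisitions++;        /* stand-in for acquiring the lock in shared mode */
    fprintf(stderr, "autovacuum update costs (db=%u, rel=%u, cost_limit=%d)\n",
            dboid, tableoid, cost_limit);
    messages_logged++;
}
```

In the common case, where the configured level is above DEBUG2, the function returns before touching the lock, which would address the overhead listed among the cons of the first option.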

Some minor comments on 0003:

+/*
+ * autovac_recalculate_workers_for_balance
+ *             Recalculate the number of workers to consider, given cost-related
+ *             storage parameters and the current number of active workers.
+ *
+ * Caller must hold the AutovacuumLock in at least shared mode to access
+ * worker->wi_proc.
+ */

Does it make sense to add Assert(LWLockHeldByMe(AutovacuumLock)) at
the beginning of this function?

I've added this. It is called infrequently enough to be okay, I think.

/* rebalance in case the default cost parameters changed */
-                LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-                autovac_balance_cost();
+                LWLockAcquire(AutovacuumLock, LW_SHARED);
+                autovac_recalculate_workers_for_balance();
LWLockRelease(AutovacuumLock);

Do we really need to have the autovacuum launcher recalculate
av_nworkersForBalance after reloading the config file? Since the cost
parameters are not used in autovac_recalculate_workers_for_balance()
the comment also needs to be updated.

Yep, almost certainly don't need this. I've removed this call to
autovac_recalculate_workers_for_balance().

+                /*
+                 * Balance and update limit values for autovacuum workers. We must
+                 * always do this in case the autovacuum launcher or another
+                 * autovacuum worker has recalculated the number of workers across
+                 * which we must balance the limit. This is done by the launcher when
+                 * launching a new worker and by workers before vacuuming each table.
+                 */
+                AutoVacuumUpdateCostLimit();

I think the last sentence is not correct. IIUC recalculation of
av_nworkersForBalance is done by the launcher after a worker finished
and by workers before vacuuming each table.

Yes, you are right. However, I think the comment was generally
misleading and I have reworded it.

It's not a problem of this patch, but IIUC since we don't reset
wi_dobalance after vacuuming each table we use the last value of
wi_dobalance for performing autovacuum items. At end of the loop for
tables in do_autovacuum() we have the following code that explains why
we don't reset wi_dobalance:

/*
* Remove my info from shared memory. We could, but intentionally
* don't, unset wi_dobalance on the assumption that we are more likely
* than not to vacuum a table with no cost-related storage parameters
* next, so we don't want to give up our share of I/O for a very short
* interval and thereby thrash the global balance.
*/
LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
MyWorkerInfo->wi_tableoid = InvalidOid;
MyWorkerInfo->wi_sharedrel = false;
LWLockRelease(AutovacuumScheduleLock);

Assuming we agree with that, probably we need to reset it to true
after vacuuming all tables?

Ah, great point. I have done this.

On Thu, Apr 6, 2023 at 8:29 AM Daniel Gustafsson <daniel@yesql.se> wrote:

On 6 Apr 2023, at 08:39, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Also I agree with
where to put the log but I think the log message should start with
a lowercase letter:

+                elog(DEBUG2,
+                         "Autovacuum VacuumUpdateCosts(db=%u, rel=%u,

In principle I agree, but in this case Autovacuum is a name and should IMO
start with a capital A in user-facing messages.

I've left this unchanged while I agonize over what to do with the
placement of the log message in general. But I am happy to keep it
uppercase.

+/*
+ * autovac_recalculate_workers_for_balance
+ *             Recalculate the number of workers to consider, given cost-related
+ *             storage parameters and the current number of active workers.
+ *
+ * Caller must hold the AutovacuumLock in at least shared mode to access
+ * worker->wi_proc.
+ */

Does it make sense to add Assert(LWLockHeldByMe(AutovacuumLock)) at
the beginning of this function?

That's probably not a bad idea.

Done.

---
/* rebalance in case the default cost parameters changed */
-                LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-                autovac_balance_cost();
+                LWLockAcquire(AutovacuumLock, LW_SHARED);
+                autovac_recalculate_workers_for_balance();
LWLockRelease(AutovacuumLock);

Do we really need to have the autovacuum launcher recalculate
av_nworkersForBalance after reloading the config file? Since the cost
parameters are not used in autovac_recalculate_workers_for_balance()
the comment also needs to be updated.

If I understand this comment right, there was a discussion upthread that simply
doing it in both launcher and worker simplifies the code with little overhead.
A comment can reflect that choice though.

Yes, but now that this function no longer deals with the cost limit and
delay values itself, we can remove it.

- Melanie

Attachments:

v17-0003-Autovacuum-refreshes-cost-based-delay-params-mor.patch (text/x-patch; charset=US-ASCII)
From 684f710af4cb0bf7cd7ff70baa0a8a8fdc13d48c Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 25 Mar 2023 14:14:55 -0400
Subject: [PATCH v17 3/3] Autovacuum refreshes cost-based delay params more
 often

Allow autovacuum to reload the config file more often so that cost-based
delay parameters can take effect while VACUUMing a relation. Previously
autovacuum workers only reloaded the config file once per relation
vacuumed, so config changes could not take effect until beginning to
vacuum the next table.

Now, check if a reload is pending roughly once per block, when checking
if we need to delay.

In order for autovacuum workers to safely update their own cost delay
and cost limit parameters without impacting performance, we had to
rethink when and how these values were accessed.

Previously, an autovacuum worker's wi_cost_limit was set only at the
beginning of vacuuming a table, after reloading the config file.
Therefore, at the time that autovac_balance_cost() is called, workers
vacuuming tables with no cost-related storage parameters could still
have different values for their wi_cost_limit_base and wi_cost_delay.

Now that the cost parameters can be updated while vacuuming a table,
workers will (within some margin of error) have no reason to have
different values for cost limit and cost delay (in the absence of
cost-related storage parameters). This removes the rationale for keeping
cost limit and cost delay in shared memory. Balancing the cost limit
requires only the number of active autovacuum workers vacuuming a table
with no cost-based storage parameters.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c  |   2 +-
 src/backend/commands/vacuum.c         |  45 ++++-
 src/backend/commands/vacuumparallel.c |   1 -
 src/backend/postmaster/autovacuum.c   | 276 +++++++++++++++-----------
 src/include/commands/vacuum.h         |   1 +
 5 files changed, 202 insertions(+), 123 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 2ba85bd3d6..0a9ebd22bd 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -389,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
 	Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
 		   params->truncate != VACOPTVALUE_AUTO);
-	VacuumFailsafeActive = false;
+	Assert(!VacuumFailsafeActive);
 	vacrel->consider_bypass_optimization = true;
 	vacrel->do_index_vacuuming = true;
 	vacrel->do_index_cleanup = true;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 5b6f8f5244..977e9c4c7e 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -48,6 +48,7 @@
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/interrupt.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/pmsignal.h"
@@ -525,9 +526,9 @@ vacuum(List *relations, VacuumParams *params, BufferAccessStrategy bstrategy,
 	{
 		ListCell   *cur;
 
-		VacuumUpdateCosts();
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
+		VacuumFailsafeActive = false;
+		VacuumUpdateCosts();
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
@@ -581,12 +582,20 @@ vacuum(List *relations, VacuumParams *params, BufferAccessStrategy bstrategy,
 					CommandCounterIncrement();
 				}
 			}
+
+			/*
+			 * Ensure VacuumFailsafeActive has been reset before vacuuming the
+			 * next relation.
+			 */
+			VacuumFailsafeActive = false;
 		}
 	}
 	PG_FINALLY();
 	{
 		in_vacuum = false;
 		VacuumCostActive = false;
+		VacuumFailsafeActive = false;
+		VacuumCostBalance = 0;
 	}
 	PG_END_TRY();
 
@@ -2247,7 +2256,27 @@ vacuum_delay_point(void)
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	if (!VacuumCostActive || InterruptPending)
+	if (InterruptPending ||
+		(!VacuumCostActive && !ConfigReloadPending))
+		return;
+
+	/*
+	 * Reload the configuration file if requested. This allows changes to
+	 * autovacuum_vacuum_cost_limit and autovacuum_vacuum_cost_delay to take
+	 * effect while a table is being vacuumed or analyzed.
+	 */
+	if (ConfigReloadPending && IsAutoVacuumWorkerProcess())
+	{
+		ConfigReloadPending = false;
+		ProcessConfigFile(PGC_SIGHUP);
+		VacuumUpdateCosts();
+	}
+
+	/*
+	 * If we disabled cost-based delays after reloading the config file,
+	 * return.
+	 */
+	if (!VacuumCostActive)
 		return;
 
 	/*
@@ -2280,7 +2309,15 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		VacuumUpdateCosts();
+		/*
+		 * Balance and update limit values for autovacuum workers. We must do
+		 * this periodically as the number of workers across which we are
+		 * balancing the limit may have changed.
+		 *
+		 * XXX: There may be better criteria for determining when to do this
+		 * besides "check after napping".
+		 */
+		AutoVacuumUpdateCostLimit();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0b59c922e4..e200d5caf8 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -995,7 +995,6 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												 false);
 
 	/* Set cost-based vacuum delay */
-	VacuumCostActive = (VacuumCostDelay > 0);
 	VacuumUpdateCosts();
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 296b1851e3..f7237fa5ea 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -139,6 +139,18 @@ int			Log_autovacuum_min_duration = 600000;
 static bool am_autovacuum_launcher = false;
 static bool am_autovacuum_worker = false;
 
+/*
+ * Variables to save the cost-related storage parameters for the current
+ * relation being vacuumed by this autovacuum worker. Using these, we can
+ * ensure we don't overwrite the values of VacuumCostDelay and VacuumCostLimit
+ * after reloading the configuration file. They are initialized to "invalid"
+ * values to indicate no cost-related storage parameters were specified and
+ * will be set in do_autovacuum() after checking the storage parameters in
+ * table_recheck_autovac().
+ */
+static double av_storage_param_cost_delay = -1;
+static int	av_storage_param_cost_limit = -1;
+
 /* Flags set by signal handlers */
 static volatile sig_atomic_t got_SIGUSR2 = false;
 
@@ -189,8 +201,8 @@ typedef struct autovac_table
 {
 	Oid			at_relid;
 	VacuumParams at_params;
-	double		at_vacuum_cost_delay;
-	int			at_vacuum_cost_limit;
+	double		at_storage_param_vac_cost_delay;
+	int			at_storage_param_vac_cost_limit;
 	bool		at_dobalance;
 	bool		at_sharedrel;
 	char	   *at_relname;
@@ -209,7 +221,7 @@ typedef struct autovac_table
  * wi_sharedrel flag indicating whether table is marked relisshared
  * wi_proc		pointer to PGPROC of the running worker, NULL if not started
  * wi_launchtime Time at which this worker was launched
- * wi_cost_*	Vacuum cost-based delay parameters current in this worker
+ * wi_dobalance Whether this worker should be included in balance calculations
  *
  * All fields are protected by AutovacuumLock, except for wi_tableoid and
  * wi_sharedrel which are protected by AutovacuumScheduleLock (note these
@@ -223,11 +235,8 @@ typedef struct WorkerInfoData
 	Oid			wi_tableoid;
 	PGPROC	   *wi_proc;
 	TimestampTz wi_launchtime;
-	bool		wi_dobalance;
+	pg_atomic_flag wi_dobalance;
 	bool		wi_sharedrel;
-	double		wi_cost_delay;
-	int			wi_cost_limit;
-	int			wi_cost_limit_base;
 } WorkerInfoData;
 
 typedef struct WorkerInfoData *WorkerInfo;
@@ -273,6 +282,8 @@ typedef struct AutoVacuumWorkItem
  * av_startingWorker pointer to WorkerInfo currently being started (cleared by
  *					the worker itself as soon as it's up and running)
  * av_workItems		work item array
+ * av_nworkersForBalance the number of autovacuum workers to use when
+ * 					calculating the per worker cost limit
  *
  * This struct is protected by AutovacuumLock, except for av_signal and parts
  * of the worker list (see above).
@@ -286,6 +297,7 @@ typedef struct
 	dlist_head	av_runningWorkers;
 	WorkerInfo	av_startingWorker;
 	AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+	pg_atomic_uint32 av_nworkersForBalance;
 } AutoVacuumShmemStruct;
 
 static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -319,7 +331,7 @@ static void launch_worker(TimestampTz now);
 static List *get_database_list(void);
 static void rebuild_database_list(Oid newdb);
 static int	db_comparator(const void *a, const void *b);
-static void autovac_balance_cost(void);
+static void autovac_recalculate_workers_for_balance(void);
 
 static void do_autovacuum(void);
 static void FreeWorkerInfo(int code, Datum arg);
@@ -669,7 +681,7 @@ AutoVacLauncherMain(int argc, char *argv[])
 			{
 				LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 				AutoVacuumShmem->av_signal[AutoVacRebalance] = false;
-				autovac_balance_cost();
+				autovac_recalculate_workers_for_balance();
 				LWLockRelease(AutovacuumLock);
 			}
 
@@ -818,11 +830,6 @@ HandleAutoVacLauncherInterrupts(void)
 		if (!AutoVacuumingActive())
 			AutoVacLauncherShutdown();
 
-		/* rebalance in case the default cost parameters changed */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-		autovac_balance_cost();
-		LWLockRelease(AutovacuumLock);
-
 		/* rebuild the list in case the naptime changed */
 		rebuild_database_list(InvalidOid);
 	}
@@ -1754,10 +1761,7 @@ FreeWorkerInfo(int code, Datum arg)
 		MyWorkerInfo->wi_sharedrel = false;
 		MyWorkerInfo->wi_proc = NULL;
 		MyWorkerInfo->wi_launchtime = 0;
-		MyWorkerInfo->wi_dobalance = false;
-		MyWorkerInfo->wi_cost_delay = 0;
-		MyWorkerInfo->wi_cost_limit = 0;
-		MyWorkerInfo->wi_cost_limit_base = 0;
+		pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
 		dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 						&MyWorkerInfo->wi_links);
 		/* not mine anymore */
@@ -1783,97 +1787,128 @@ VacuumUpdateCosts(void)
 {
 	if (MyWorkerInfo)
 	{
-		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
-		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+		if (av_storage_param_cost_delay >= 0)
+			VacuumCostDelay = av_storage_param_cost_delay;
+		else if (autovacuum_vac_cost_delay >= 0)
+			VacuumCostDelay = autovacuum_vac_cost_delay;
+		else
+			/* fall back to vacuum_cost_delay */
+			VacuumCostDelay = vacuum_cost_delay;
+
+		AutoVacuumUpdateCostLimit();
 	}
 	else
 	{
 		/* Must be explicit VACUUM or ANALYZE */
-		VacuumCostLimit = vacuum_cost_limit;
 		VacuumCostDelay = vacuum_cost_delay;
+		VacuumCostLimit = vacuum_cost_limit;
+	}
+
+	/*
+	 * If configuration changes are allowed to impact VacuumCostActive, make
+	 * sure it is updated.
+	 */
+	if (VacuumFailsafeActive)
+		Assert(!VacuumCostActive);
+	else if (VacuumCostDelay > 0)
+		VacuumCostActive = true;
+	else
+	{
+		VacuumCostActive = false;
+		VacuumCostBalance = 0;
+	}
+
+	if (MyWorkerInfo)
+	{
+		elog(DEBUG2,
+			 "Autovacuum VacuumUpdateCosts(db=%u, rel=%u, dobalance=%s, cost_limit=%d, cost_delay=%g active=%s failsafe=%s)",
+			 MyWorkerInfo->wi_dboid, MyWorkerInfo->wi_tableoid,
+			 pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance) ? "no" : "yes",
+			 VacuumCostLimit, VacuumCostDelay,
+			 VacuumCostDelay > 0 ? "yes" : "no",
+			 VacuumFailsafeActive ? "yes" : "no");
 	}
 }
 
 /*
- * autovac_balance_cost
- *		Recalculate the cost limit setting for each active worker.
- *
- * Caller must hold the AutovacuumLock in exclusive mode.
+ * Update VacuumCostLimit with the correct value for an autovacuum worker, given
+ * the value of other relevant cost limit parameters and the number of workers
+ * across which the limit must be balanced. Autovacuum workers must call this
+ * regularly in case av_nworkersForBalance has been updated by another worker
+ * or by the autovacuum launcher. They must also call it after a config reload.
  */
-static void
-autovac_balance_cost(void)
+void
+AutoVacuumUpdateCostLimit(void)
 {
+	if (!MyWorkerInfo)
+		return;
+
 	/*
-	 * The idea here is that we ration out I/O equally.  The amount of I/O
-	 * that a worker can consume is determined by cost_limit/cost_delay, so we
-	 * try to equalize those ratios rather than the raw limit settings.
-	 *
 	 * note: in cost_limit, zero also means use value from elsewhere, because
 	 * zero is not a valid value.
 	 */
-	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : vacuum_cost_limit);
-	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : vacuum_cost_delay);
-	double		cost_total;
-	double		cost_avail;
-	dlist_iter	iter;
 
-	/* not set? nothing to do */
-	if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
-		return;
-
-	/* calculate the total base cost limit of participating active workers */
-	cost_total = 0.0;
-	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
+	if (av_storage_param_cost_limit > 0)
+		VacuumCostLimit = av_storage_param_cost_limit;
+	else
 	{
-		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
+		int			nworkers_for_balance;
+
+		if (autovacuum_vac_cost_limit > 0)
+			VacuumCostLimit = autovacuum_vac_cost_limit;
+		else
+			VacuumCostLimit = vacuum_cost_limit;
+
+		/* Only balance limit if no cost-related storage parameters specified */
+		if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
+			return;
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-			cost_total +=
-				(double) worker->wi_cost_limit_base / worker->wi_cost_delay;
+		Assert(VacuumCostLimit > 0);
+
+		nworkers_for_balance = pg_atomic_read_u32(&AutoVacuumShmem->av_nworkersForBalance);
+
+		/* There is at least 1 autovac worker (this worker) */
+		if (nworkers_for_balance <= 0)
+			elog(ERROR, "nworkers_for_balance must be > 0");
+
+		VacuumCostLimit = Max(VacuumCostLimit / nworkers_for_balance, 1);
 	}
+}
 
-	/* there are no cost limits -- nothing to do */
-	if (cost_total <= 0)
-		return;
+/*
+ * autovac_recalculate_workers_for_balance
+ *		Recalculate the number of workers to consider, given cost-related
+ *		storage parameters and the current number of active workers.
+ *
+ * Caller must hold the AutovacuumLock in at least shared mode to access
+ * worker->wi_proc.
+ */
+static void
+autovac_recalculate_workers_for_balance(void)
+{
+	dlist_iter	iter;
+	int			orig_nworkers_for_balance;
+	int			nworkers_for_balance = 0;
+
+	Assert(LWLockHeldByMe(AutovacuumLock));
+
+	orig_nworkers_for_balance =
+		pg_atomic_read_u32(&AutoVacuumShmem->av_nworkersForBalance);
 
-	/*
-	 * Adjust cost limit of each active worker to balance the total of cost
-	 * limit to autovacuum_vacuum_cost_limit.
-	 */
-	cost_avail = (double) vac_cost_limit / vac_cost_delay;
 	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
 	{
 		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-		{
-			int			limit = (int)
-			(cost_avail * worker->wi_cost_limit_base / cost_total);
-
-			/*
-			 * We put a lower bound of 1 on the cost_limit, to avoid division-
-			 * by-zero in the vacuum code.  Also, in case of roundoff trouble
-			 * in these calculations, let's be sure we don't ever set
-			 * cost_limit to more than the base value.
-			 */
-			worker->wi_cost_limit = Max(Min(limit,
-											worker->wi_cost_limit_base),
-										1);
-		}
+		if (worker->wi_proc == NULL ||
+			pg_atomic_unlocked_test_flag(&worker->wi_dobalance))
+			continue;
 
-		if (worker->wi_proc != NULL)
-			elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
-				 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
-				 worker->wi_dobalance ? "yes" : "no",
-				 worker->wi_cost_limit, worker->wi_cost_limit_base,
-				 worker->wi_cost_delay);
+		nworkers_for_balance++;
 	}
+
+	if (nworkers_for_balance != orig_nworkers_for_balance)
+		pg_atomic_write_u32(&AutoVacuumShmem->av_nworkersForBalance,
+							nworkers_for_balance);
 }
 
 /*
@@ -2421,23 +2456,34 @@ do_autovacuum(void)
 			continue;
 		}
 
-		/* Must hold AutovacuumLock while mucking with cost balance info */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+		/*
+		 * Save the cost-related storage parameter values in global variables
+		 * for reference when updating VacuumCostLimit and VacuumCostDelay
+		 * during vacuuming this table.
+		 */
+		av_storage_param_cost_limit = tab->at_storage_param_vac_cost_limit;
+		av_storage_param_cost_delay = tab->at_storage_param_vac_cost_delay;
 
-		/* advertise my cost delay parameters for the balancing algorithm */
-		MyWorkerInfo->wi_dobalance = tab->at_dobalance;
-		MyWorkerInfo->wi_cost_delay = tab->at_vacuum_cost_delay;
-		MyWorkerInfo->wi_cost_limit = tab->at_vacuum_cost_limit;
-		MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
+		/*
+		 * We only expect this worker to ever set the flag, so don't bother
+		 * checking the return value. We shouldn't have to retry.
+		 */
+		if (tab->at_dobalance)
+			pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
+		else
+			pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
 
-		/* do a balance */
-		autovac_balance_cost();
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
+		autovac_recalculate_workers_for_balance();
+		LWLockRelease(AutovacuumLock);
 
-		/* set the active cost parameters from the result of that */
+		/*
+		 * We wait until this point to update cost delay and cost limit
+		 * values, even though we reloaded the configuration file above, so
+		 * that we can take into account the cost-related storage parameters.
+		 */
 		VacuumUpdateCosts();
 
-		/* done */
-		LWLockRelease(AutovacuumLock);
 
 		/* clean up memory before each iteration */
 		MemoryContextResetAndDeleteChildren(PortalContext);
@@ -2521,16 +2567,17 @@ deleted:
 		pfree(tab);
 
 		/*
-		 * Remove my info from shared memory.  We could, but intentionally
-		 * don't, clear wi_cost_limit and friends --- this is on the
-		 * assumption that we probably have more to do with similar cost
-		 * settings, so we don't want to give up our share of I/O for a very
-		 * short interval and thereby thrash the global balance.
+		 * Remove my info from shared memory.  We set wi_dobalance on the
+		 * assumption that we are more likely than not to vacuum a table with
+		 * no cost-related storage parameters next, so we want to claim our
+		 * share of I/O as soon as possible to avoid thrashing the global
+		 * balance.
 		 */
 		LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
 		MyWorkerInfo->wi_tableoid = InvalidOid;
 		MyWorkerInfo->wi_sharedrel = false;
 		LWLockRelease(AutovacuumScheduleLock);
+		pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
 	}
 
 	/*
@@ -2562,6 +2609,7 @@ deleted:
 		{
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			VacuumUpdateCosts();
 		}
 
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
@@ -2797,8 +2845,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		int			freeze_table_age;
 		int			multixact_freeze_min_age;
 		int			multixact_freeze_table_age;
-		int			vac_cost_limit;
-		double		vac_cost_delay;
 		int			log_min_duration;
 
 		/*
@@ -2808,20 +2854,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 * defaults, autovacuum's own first and plain vacuum second.
 		 */
 
-		/* -1 in autovac setting means use plain vacuum_cost_delay */
-		vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
-			? avopts->vacuum_cost_delay
-			: (autovacuum_vac_cost_delay >= 0)
-			? autovacuum_vac_cost_delay
-			: vacuum_cost_delay;
-
-		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
-		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
-			? avopts->vacuum_cost_limit
-			: (autovacuum_vac_cost_limit > 0)
-			? autovacuum_vac_cost_limit
-			: vacuum_cost_limit;
-
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
 			? avopts->log_min_duration
@@ -2877,8 +2909,10 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age;
 		tab->at_params.is_wraparound = wraparound;
 		tab->at_params.log_min_duration = log_min_duration;
-		tab->at_vacuum_cost_limit = vac_cost_limit;
-		tab->at_vacuum_cost_delay = vac_cost_delay;
+		tab->at_storage_param_vac_cost_limit = avopts ?
+			avopts->vacuum_cost_limit : 0;
+		tab->at_storage_param_vac_cost_delay = avopts ?
+			avopts->vacuum_cost_delay : -1;
 		tab->at_relname = NULL;
 		tab->at_nspname = NULL;
 		tab->at_datname = NULL;
@@ -3380,10 +3414,18 @@ AutoVacuumShmemInit(void)
 		worker = (WorkerInfo) ((char *) AutoVacuumShmem +
 							   MAXALIGN(sizeof(AutoVacuumShmemStruct)));
 
+
 		/* initialize the WorkerInfo free list */
 		for (i = 0; i < autovacuum_max_workers; i++)
+		{
 			dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 							&worker[i].wi_links);
+
+			pg_atomic_init_flag(&worker[i].wi_dobalance);
+		}
+
+		pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
+
 	}
 	else
 		Assert(found);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 440ddd2154..e9705ba51d 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -352,6 +352,7 @@ extern IndexBulkDeleteResult *vac_cleanup_one_index(IndexVacuumInfo *ivinfo,
 extern Size vac_max_items_to_alloc_size(int max_items);
 
 /* In postmaster/autovacuum.c */
+extern void AutoVacuumUpdateCostLimit(void);
 extern void VacuumUpdateCosts(void);
 
 /* in commands/vacuumparallel.c */
-- 
2.37.2

v17-0001-Make-vacuum-s-failsafe_active-a-global.patch (text/x-patch; charset=US-ASCII)
From 5b31ff0af5813d2f7747eaabbbf5e6f4c8284c4a Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Fri, 31 Mar 2023 10:38:39 -0400
Subject: [PATCH v17 1/3] Make vacuum's failsafe_active a global

While vacuuming a table in failsafe mode, VacuumCostActive should not be
re-enabled. This currently isn't a problem because vacuum cost
parameters are only refreshed in between vacuuming tables and failsafe
status is reset for every table. In preparation for allowing vacuum cost
parameters to be updated more frequently, elevate
LVRelState->failsafe_active to a global, VacuumFailsafeActive, which
will be checked when determining whether or not to re-enable vacuum
cost-related delays.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 16 +++++++---------
 src/backend/commands/vacuum.c        | 15 +++++++++++++++
 src/include/commands/vacuum.h        |  1 +
 3 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 639179aa46..2ba85bd3d6 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -153,8 +153,6 @@ typedef struct LVRelState
 	bool		aggressive;
 	/* Use visibility map to skip? (disabled by DISABLE_PAGE_SKIPPING) */
 	bool		skipwithvm;
-	/* Wraparound failsafe has been triggered? */
-	bool		failsafe_active;
 	/* Consider index vacuuming bypass optimization? */
 	bool		consider_bypass_optimization;
 
@@ -391,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
 	Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
 		   params->truncate != VACOPTVALUE_AUTO);
-	vacrel->failsafe_active = false;
+	VacuumFailsafeActive = false;
 	vacrel->consider_bypass_optimization = true;
 	vacrel->do_index_vacuuming = true;
 	vacrel->do_index_cleanup = true;
@@ -709,7 +707,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 			}
 			else
 			{
-				if (!vacrel->failsafe_active)
+				if (!VacuumFailsafeActive)
 					appendStringInfoString(&buf, _("index scan bypassed: "));
 				else
 					appendStringInfoString(&buf, _("index scan bypassed by failsafe: "));
@@ -2293,7 +2291,7 @@ lazy_vacuum(LVRelState *vacrel)
 		 * vacuuming or heap vacuuming.  This VACUUM operation won't end up
 		 * back here again.
 		 */
-		Assert(vacrel->failsafe_active);
+		Assert(VacuumFailsafeActive);
 	}
 
 	/*
@@ -2374,7 +2372,7 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	 */
 	Assert(vacrel->num_index_scans > 0 ||
 		   vacrel->dead_items->num_items == vacrel->lpdead_items);
-	Assert(allindexes || vacrel->failsafe_active);
+	Assert(allindexes || VacuumFailsafeActive);
 
 	/*
 	 * Increase and report the number of index scans.
@@ -2616,12 +2614,12 @@ static bool
 lazy_check_wraparound_failsafe(LVRelState *vacrel)
 {
 	/* Don't warn more than once per VACUUM */
-	if (vacrel->failsafe_active)
+	if (VacuumFailsafeActive)
 		return true;
 
 	if (unlikely(vacuum_xid_failsafe_check(&vacrel->cutoffs)))
 	{
-		vacrel->failsafe_active = true;
+		VacuumFailsafeActive = true;
 
 		/*
 		 * Abandon use of a buffer access strategy to allow use of all of
@@ -2820,7 +2818,7 @@ should_attempt_truncation(LVRelState *vacrel)
 {
 	BlockNumber possibly_freeable;
 
-	if (!vacrel->do_rel_truncate || vacrel->failsafe_active ||
+	if (!vacrel->do_rel_truncate || VacuumFailsafeActive ||
 		old_snapshot_threshold >= 0)
 		return false;
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index ea1d8960f4..7fc5c19e37 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -72,6 +72,21 @@ int			vacuum_multixact_freeze_table_age;
 int			vacuum_failsafe_age;
 int			vacuum_multixact_failsafe_age;
 
+/*
+ * VacuumFailsafeActive is defined as a global so that we can determine
+ * whether or not to re-enable cost-based vacuum delay when vacuuming a table.
+ * If failsafe mode has been engaged, we will not re-enable cost-based delay
+ * for the table until after vacuuming has completed, regardless of other
+ * settings.
+ *
+ * Only VACUUM code should inspect this variable and only table access methods
+ * should set it to true. In Table AM-agnostic VACUUM code, this variable is
+ * inspected to determine whether or not to allow cost-based delays. Table AMs
+ * are free to set it if they desire this behavior, but it is false by default
+ * and reset to false in between vacuuming each relation.
+ */
+bool		VacuumFailsafeActive = false;
+
 /*
  * Variables for cost-based parallel vacuum.  See comments atop
  * compute_parallel_delay to understand how it works.
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 19ca818dc2..1223d15e0d 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -306,6 +306,7 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
 
+extern PGDLLIMPORT bool VacuumFailsafeActive;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
-- 
2.37.2

v17-0002-Separate-vacuum-cost-variables-from-gucs.patch (text/x-patch; charset=US-ASCII)
From c456e0d6949e7422168a3eb5b2dc268b7a243833 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Mon, 3 Apr 2023 11:22:18 -0400
Subject: [PATCH v17 2/3] Separate vacuum cost variables from gucs

Vacuum code run both by autovacuum workers and a backend doing
VACUUM/ANALYZE previously used VacuumCostLimit and VacuumCostDelay which
were the global variables for the gucs vacuum_cost_limit and
vacuum_cost_delay. Autovacuum workers needed to override these variables
with their own values, derived from autovacuum_vacuum_cost_limit and
autovacuum_vacuum_cost_delay and worker cost limit balancing logic. This
led to confusing code which, in some cases, both derived and set a new
value of VacuumCostLimit from VacuumCostLimit.

In preparation for refreshing these guc values more often, separate
these variables from the gucs themselves and add a function to update
the global variables using the gucs and existing logic.

Per suggestion by Kyotaro Horiguchi

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/commands/vacuum.c         | 15 +++++++++--
 src/backend/commands/vacuumparallel.c |  1 +
 src/backend/postmaster/autovacuum.c   | 38 +++++++++++----------------
 src/backend/utils/init/globals.c      |  2 --
 src/backend/utils/misc/guc_tables.c   |  4 +--
 src/include/commands/vacuum.h         |  7 +++++
 src/include/miscadmin.h               |  2 --
 src/include/postmaster/autovacuum.h   |  3 ---
 8 files changed, 39 insertions(+), 33 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7fc5c19e37..5b6f8f5244 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -71,6 +71,17 @@ int			vacuum_multixact_freeze_min_age;
 int			vacuum_multixact_freeze_table_age;
 int			vacuum_failsafe_age;
 int			vacuum_multixact_failsafe_age;
+double		vacuum_cost_delay;
+int			vacuum_cost_limit;
+
+/*
+ * Variables for cost-based vacuum delay. The defaults differ between
+ * autovacuum and vacuum. These should be overridden with the appropriate GUC
+ * value in vacuum code. These are initialized here to the defaults for client
+ * backends executing VACUUM or ANALYZE.
+ */
+int			VacuumCostLimit = 200;
+double		VacuumCostDelay = 0;
 
 /*
  * VacuumFailsafeActive is defined as a global so that we can determine
@@ -514,6 +525,7 @@ vacuum(List *relations, VacuumParams *params, BufferAccessStrategy bstrategy,
 	{
 		ListCell   *cur;
 
+		VacuumUpdateCosts();
 		in_vacuum = true;
 		VacuumCostActive = (VacuumCostDelay > 0);
 		VacuumCostBalance = 0;
@@ -2268,8 +2280,7 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		/* update balance values for workers */
-		AutoVacuumUpdateDelay();
+		VacuumUpdateCosts();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 563117a8f6..0b59c922e4 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -996,6 +996,7 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 
 	/* Set cost-based vacuum delay */
 	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumUpdateCosts();
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c1e911b1b3..296b1851e3 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1773,17 +1773,25 @@ FreeWorkerInfo(int code, Datum arg)
 }
 
 /*
- * Update the cost-based delay parameters, so that multiple workers consume
- * each a fraction of the total available I/O.
+ * Update vacuum cost-based delay-related parameters for autovacuum workers and
+ * backends executing VACUUM or ANALYZE using the value of relevant gucs and
+ * global state. This must be called during setup for vacuum and after every
+ * config reload to ensure up-to-date values.
  */
 void
-AutoVacuumUpdateDelay(void)
+VacuumUpdateCosts(void)
 {
 	if (MyWorkerInfo)
 	{
 		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
 		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
 	}
+	else
+	{
+		/* Must be explicit VACUUM or ANALYZE */
+		VacuumCostLimit = vacuum_cost_limit;
+		VacuumCostDelay = vacuum_cost_delay;
+	}
 }
 
 /*
@@ -1804,9 +1812,9 @@ autovac_balance_cost(void)
 	 * zero is not a valid value.
 	 */
 	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : VacuumCostLimit);
+								  autovacuum_vac_cost_limit : vacuum_cost_limit);
 	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : VacuumCostDelay);
+								  autovacuum_vac_cost_delay : vacuum_cost_delay);
 	double		cost_total;
 	double		cost_avail;
 	dlist_iter	iter;
@@ -2311,8 +2319,6 @@ do_autovacuum(void)
 		autovac_table *tab;
 		bool		isshared;
 		bool		skipit;
-		double		stdVacuumCostDelay;
-		int			stdVacuumCostLimit;
 		dlist_iter	iter;
 
 		CHECK_FOR_INTERRUPTS();
@@ -2415,14 +2421,6 @@ do_autovacuum(void)
 			continue;
 		}
 
-		/*
-		 * Remember the prevailing values of the vacuum cost GUCs.  We have to
-		 * restore these at the bottom of the loop, else we'll compute wrong
-		 * values in the next iteration of autovac_balance_cost().
-		 */
-		stdVacuumCostDelay = VacuumCostDelay;
-		stdVacuumCostLimit = VacuumCostLimit;
-
 		/* Must hold AutovacuumLock while mucking with cost balance info */
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 
@@ -2436,7 +2434,7 @@ do_autovacuum(void)
 		autovac_balance_cost();
 
 		/* set the active cost parameters from the result of that */
-		AutoVacuumUpdateDelay();
+		VacuumUpdateCosts();
 
 		/* done */
 		LWLockRelease(AutovacuumLock);
@@ -2533,10 +2531,6 @@ deleted:
 		MyWorkerInfo->wi_tableoid = InvalidOid;
 		MyWorkerInfo->wi_sharedrel = false;
 		LWLockRelease(AutovacuumScheduleLock);
-
-		/* restore vacuum cost GUCs for the next iteration */
-		VacuumCostDelay = stdVacuumCostDelay;
-		VacuumCostLimit = stdVacuumCostLimit;
 	}
 
 	/*
@@ -2819,14 +2813,14 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			? avopts->vacuum_cost_delay
 			: (autovacuum_vac_cost_delay >= 0)
 			? autovacuum_vac_cost_delay
-			: VacuumCostDelay;
+			: vacuum_cost_delay;
 
 		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
 		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
 			? avopts->vacuum_cost_limit
 			: (autovacuum_vac_cost_limit > 0)
 			? autovacuum_vac_cost_limit
-			: VacuumCostLimit;
+			: vacuum_cost_limit;
 
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 1b1d814254..8e5b065e8f 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -142,8 +142,6 @@ int			MaxBackends = 0;
 int			VacuumCostPageHit = 1;	/* GUC parameters for vacuum */
 int			VacuumCostPageMiss = 2;
 int			VacuumCostPageDirty = 20;
-int			VacuumCostLimit = 200;
-double		VacuumCostDelay = 0;
 
 int64		VacuumPageHit = 0;
 int64		VacuumPageMiss = 0;
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8062589efd..77db1a146c 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2409,7 +2409,7 @@ struct config_int ConfigureNamesInt[] =
 			gettext_noop("Vacuum cost amount available before napping."),
 			NULL
 		},
-		&VacuumCostLimit,
+		&vacuum_cost_limit,
 		200, 1, 10000,
 		NULL, NULL, NULL
 	},
@@ -3701,7 +3701,7 @@ struct config_real ConfigureNamesReal[] =
 			NULL,
 			GUC_UNIT_MS
 		},
-		&VacuumCostDelay,
+		&vacuum_cost_delay,
 		0, 0, 100,
 		NULL, NULL, NULL
 	},
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1223d15e0d..440ddd2154 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,8 @@ extern PGDLLIMPORT int vacuum_multixact_freeze_min_age;
 extern PGDLLIMPORT int vacuum_multixact_freeze_table_age;
 extern PGDLLIMPORT int vacuum_failsafe_age;
 extern PGDLLIMPORT int vacuum_multixact_failsafe_age;
+extern PGDLLIMPORT double vacuum_cost_delay;
+extern PGDLLIMPORT int vacuum_cost_limit;
 
 /* Variables for cost-based parallel vacuum */
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
@@ -307,6 +309,8 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
 
 extern PGDLLIMPORT bool VacuumFailsafeActive;
+extern PGDLLIMPORT int VacuumCostLimit;
+extern PGDLLIMPORT double VacuumCostDelay;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
@@ -347,6 +351,9 @@ extern IndexBulkDeleteResult *vac_cleanup_one_index(IndexVacuumInfo *ivinfo,
 													IndexBulkDeleteResult *istat);
 extern Size vac_max_items_to_alloc_size(int max_items);
 
+/* In postmaster/autovacuum.c */
+extern void VacuumUpdateCosts(void);
+
 /* in commands/vacuumparallel.c */
 extern ParallelVacuumState *parallel_vacuum_init(Relation rel, Relation *indrels,
 												 int nindexes, int nrequested_workers,
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 06a86f9ac1..66db1b2c69 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -266,8 +266,6 @@ extern PGDLLIMPORT int max_parallel_maintenance_workers;
 extern PGDLLIMPORT int VacuumCostPageHit;
 extern PGDLLIMPORT int VacuumCostPageMiss;
 extern PGDLLIMPORT int VacuumCostPageDirty;
-extern PGDLLIMPORT int VacuumCostLimit;
-extern PGDLLIMPORT double VacuumCostDelay;
 
 extern PGDLLIMPORT int64 VacuumPageHit;
 extern PGDLLIMPORT int64 VacuumPageMiss;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index c140371b51..65afd1ea1e 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -63,9 +63,6 @@ extern int	StartAutoVacWorker(void);
 /* called from postmaster when a worker could not be forked */
 extern void AutoVacWorkerFailed(void);
 
-/* autovacuum cost-delay balancer */
-extern void AutoVacuumUpdateDelay(void);
-
 #ifdef EXEC_BACKEND
 extern void AutoVacLauncherMain(int argc, char *argv[]) pg_attribute_noreturn();
 extern void AutoVacWorkerMain(int argc, char *argv[]) pg_attribute_noreturn();
-- 
2.37.2

#72Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#71)
3 attachment(s)
Re: Should vacuum process config file reload more often

I think attached v18 addresses all outstanding issues except a run
through the docs making sure all mentions of the balancing algorithm are
still correct.

On Wed, Apr 5, 2023 at 9:10 AM Daniel Gustafsson <daniel@yesql.se> wrote:

On 4 Apr 2023, at 22:04, Melanie Plageman <melanieplageman@gmail.com> wrote:

+extern int VacuumCostLimit;
+extern double VacuumCostDelay;
...
-extern PGDLLIMPORT int VacuumCostLimit;
-extern PGDLLIMPORT double VacuumCostDelay;

Same with these, I don't think this is according to our default visibility.
Moreover, I'm not sure it's a good idea to perform this rename. This will keep
VacuumCostLimit and VacuumCostDelay exported, but change their meaning. Any
external code referring to these thinking they are backing the GUCs will still
compile, but may be broken in subtle ways. Is there a reason for not keeping
the current GUC variables and instead add net new ones?

When VacuumCostLimit is both the variable used in the code and the
variable backing the GUC vacuum_cost_limit, VacuumCostLimit is
overwritten every time we reload the config file. Autovacuum workers
have to overwrite this value with the appropriate one for themselves
given the balancing logic and the value of autovacuum_vacuum_cost_limit.
However, the problem is, because you can specify -1 for
autovacuum_vacuum_cost_limit to indicate it should fall back to
vacuum_cost_limit, we have to reference the value of VacuumCostLimit
when calculating the new autovacuum worker's cost limit after a config
reload.

But, you have to be sure you *only* do this after a config reload when
the value of VacuumCostLimit is fresh and unmodified or you risk
dividing the value of VacuumCostLimit over and over. That means it is
unsafe to call functions updating the cost limit more than once.

This orchestration wasn't as difficult when we only reloaded the config
file once per table. We were careful about it and also kept the
original "base" cost limit around from table_recheck_autovac(). However,
once we started reloading the config file more often, this no longer
worked.

By separating the variables modified when the gucs are set and the ones
used in the code, we can make sure we always have the original value the
guc was set to in vacuum_cost_limit and autovacuum_vacuum_cost_limit,
whenever we need to reference it.
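
To make the hazard concrete, here is a minimal standalone sketch (not
code from the patch). The names guc_cost_limit, working_cost_limit,
balance_in_place() and balance_from_guc() are hypothetical stand-ins for
the GUC-backed variable and the variable the vacuum code reads. It only
illustrates why balancing in place is unsafe to repeat, while deriving
the working value from an untouched GUC value is idempotent.

```c
#include <assert.h>

/* Hypothetical stand-ins, not the real PostgreSQL globals. */
static int	guc_cost_limit = 200;		/* written only by a config reload */
static int	working_cost_limit = 200;	/* read by the vacuum code */

/*
 * Hazard: the same variable holds both the configured and the balanced
 * value, so each call divides an already-divided limit again.
 */
static int
balance_in_place(int *limit, int nworkers)
{
	int			share;

	assert(nworkers > 0);
	share = *limit / nworkers;
	*limit = (share > 1) ? share : 1;
	return *limit;
}

/*
 * Fix: always start from the untouched GUC value, so the function can be
 * called any number of times with the same result.
 */
static int
balance_from_guc(int nworkers)
{
	int			share;

	assert(nworkers > 0);
	share = guc_cost_limit / nworkers;
	working_cost_limit = (share > 1) ? share : 1;
	return working_cost_limit;
}
```

With four workers, calling balance_in_place() twice on a limit of 200
yields 50 and then 12, whereas balance_from_guc() yields 50 both times.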

That being said, perhaps we should document what extensions should do?
Do you think they will want to use the variables backing the gucs or to
be able to overwrite the variables being used in the code?

I think I wasn't clear in my comment, sorry. I don't have a problem with
introducing a new variable to split the balanced value from the GUC value.
What I don't think we should do is repurpose an exported symbol into doing a
new thing. In the case at hand I think VacuumCostLimit and VacuumCostDelay
should remain the backing variables for the GUCs, with vacuum_cost_limit and
vacuum_cost_delay carrying the balanced values. So the inverse of what is in
the patch now.

The risk of these symbols being used in extensions might be very low but on
principle it seems unwise to alter a symbol and risk subtle breakage.

In attached v18, I have flipped them. The existing (in master) exported
variables backing the GUCs, VacuumCostLimit and VacuumCostDelay, retain
their names, and new globals vacuum_cost_limit and vacuum_cost_delay
have been introduced for use in the code.

Flipping these kind of melted my mind, so I could definitely use another
set of eyes double checking that the correct ones are being used in the
correct places throughout 0002 and 0003.

On Thu, Apr 6, 2023 at 3:09 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

v17 attached does not yet fix the logging problem or variable naming
problem.

I have not changed where AutoVacuumUpdateCostLimit() is called either.

This is effectively just a round of cleanup. I hope I have managed to
address all other code review feedback so far, though some may have
slipped through the cracks.

On Wed, Apr 5, 2023 at 2:56 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Apr 5, 2023 at 11:29 AM Melanie Plageman <melanieplageman@gmail.com> wrote:
+ /*
+ * Balance and update limit values for autovacuum workers. We must
+ * always do this in case the autovacuum launcher or another
+ * autovacuum worker has recalculated the number of workers across
+ * which we must balance the limit. This is done by the launcher when
+ * launching a new worker and by workers before vacuuming each table.
+ */

I don't quite understand what's going on here. A big reason that I'm
worried about this whole issue in the first place is that sometimes
there's a vacuum going on a giant table and you can't get it to go
fast. You want it to absorb new settings, and to do so quickly. I
realize that this is about the number of workers, not the actual cost
limit, so that makes what I'm about to say less important. But ... is
this often enough? Like, the time before we move onto the next table
could be super long. The time before a new worker is launched should
be ~autovacuum_naptime/autovacuum_max_workers or ~20s with default
settings, so that's not horrible, but I'm kind of struggling to
understand the rationale for this particular choice. Maybe it's fine.

I've at least updated this comment to be more correct/less misleading.

+               if (autovacuum_vac_cost_limit > 0)
+                       VacuumCostLimit = autovacuum_vac_cost_limit;
+               else
+                       VacuumCostLimit = vacuum_cost_limit;
+
+               /* Only balance limit if no cost-related storage parameters specified */
+               if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
+                       return;
+               Assert(VacuumCostLimit > 0);
+
+               nworkers_for_balance = pg_atomic_read_u32(
+                       &AutoVacuumShmem->av_nworkersForBalance);
+
+               /* There is at least 1 autovac worker (this worker). */
+               if (nworkers_for_balance <= 0)
+                       elog(ERROR, "nworkers_for_balance must be > 0");
+
+               VacuumCostLimit = Max(VacuumCostLimit / nworkers_for_balance, 1);

I think it would be better stylistically to use a temporary variable
here and only assign the final value to VacuumCostLimit.

I tried that and thought it added confusing clutter. If it is a code
cleanliness issue, I am willing to change it, though.
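
For reference, the temporary-variable style being discussed might look
roughly like the standalone sketch below. This is an illustration only:
compute_balanced_cost_limit() and its parameters are hypothetical names,
and the atomic flag and shared-memory reads from the quoted hunk are
replaced with plain arguments so the example compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper mirroring the quoted hunk: pick the base limit,
 * optionally divide it across workers, and hand back one final value
 * for the caller to assign.
 */
static int
compute_balanced_cost_limit(int autovac_limit, int plain_limit,
							bool do_balance, int nworkers)
{
	int			new_limit;

	/* 0 or -1 in the autovacuum setting means use the plain limit */
	new_limit = (autovac_limit > 0) ? autovac_limit : plain_limit;

	/* Tables with cost-related storage parameters are not balanced */
	if (!do_balance)
		return new_limit;

	/* There is at least one worker (the caller) */
	assert(nworkers > 0);

	new_limit = new_limit / nworkers;
	return (new_limit > 1) ? new_limit : 1;
}
```

The caller would then do a single assignment of the returned value to
the working cost limit, so the global never holds an intermediate,
unbalanced value.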

On Wed, Apr 5, 2023 at 3:04 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 5 Apr 2023, at 20:55, Robert Haas <robertmhaas@gmail.com> wrote:

Again, I don't think this is something we should try to
address right now under time pressure, but in the future, I think we
should consider ripping this behavior out.

I would not be opposed to that, but I wholeheartedly agree that it's not the
job of this patch (or any patch at this point in the cycle).

+               if (autovacuum_vac_cost_limit > 0)
+                       VacuumCostLimit = autovacuum_vac_cost_limit;
+               else
+                       VacuumCostLimit = vacuum_cost_limit;
+
+               /* Only balance limit if no cost-related storage parameters specified */
+               if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
+                       return;
+               Assert(VacuumCostLimit > 0);
+
+               nworkers_for_balance = pg_atomic_read_u32(
+                       &AutoVacuumShmem->av_nworkersForBalance);
+
+               /* There is at least 1 autovac worker (this worker). */
+               if (nworkers_for_balance <= 0)
+                       elog(ERROR, "nworkers_for_balance must be > 0");
+
+               VacuumCostLimit = Max(VacuumCostLimit / nworkers_for_balance, 1);

I think it would be better stylistically to use a temporary variable
here and only assign the final value to VacuumCostLimit.

I can agree with that. Another supertiny nitpick on the above is to not end a
single-line comment with a period.

I have fixed this.

On Thu, Apr 6, 2023 at 2:40 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Apr 6, 2023 at 12:29 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Thanks all for the reviews.

v16 attached. I put it together rather quickly, so there might be a few
spurious whitespaces or similar. There is one rather annoying pgindent
outlier that I have to figure out what to do about as well.

The remaining functional TODOs that I know of are:

- Resolve what to do about names of GUC and vacuum variables for cost
limit and cost delay (since it may affect extensions)

- Figure out what to do about the logging message which accesses dboid
and tableoid (lock/no lock, where to put it, etc)

- I see several places in docs which reference the balancing algorithm
for autovac workers. I did not read them in great detail, but we may
want to review them to see if any require updates.

- Consider whether or not the initial two commits should just be
squashed with the third commit

- Anything else reviewers are still unhappy with

On Wed, Apr 5, 2023 at 1:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Wed, Apr 5, 2023 at 5:05 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Tue, Apr 4, 2023 at 4:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

---
- if (worker->wi_proc != NULL)
-     elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
-          worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
-          worker->wi_dobalance ? "yes" : "no",
-          worker->wi_cost_limit, worker->wi_cost_limit_base,
-          worker->wi_cost_delay);

I think it's better to keep this kind of log in some form for
debugging. For example, we can show these values of autovacuum workers
in VacuumUpdateCosts().

I added a message to do_autovacuum() after calling VacuumUpdateCosts()
in the loop vacuuming each table. That means it will happen once per
table. It's not ideal that I had to move the call to VacuumUpdateCosts()
behind the shared lock in that loop so that we could access the pid and
such in the logging message after updating the cost and delay, but it is
probably okay. Though no one is going to be changing those at this
point, it still seemed better to access them under the lock.

This does mean we won't log anything when we do change the values of
VacuumCostDelay and VacuumCostLimit while vacuuming a table. Is it worth
adding some code to do that in VacuumUpdateCosts() (only when the value
has changed not on every call to VacuumUpdateCosts())? Or perhaps we
could add it in the config reload branch that is already in
vacuum_delay_point()?

Previously, we used to show the pid in the log since a worker/launcher
set other workers' delay costs. But now that the worker sets its delay
costs, we don't need to show the pid in the log. Also, I think it's
useful for debugging and investigating the system if we log it when
changing the values. The log I imagined to add was like:

@@ -1801,6 +1801,13 @@ VacuumUpdateCosts(void)
VacuumCostDelay = vacuum_cost_delay;

AutoVacuumUpdateLimit();
+
+       elog(DEBUG2, "autovacuum update costs (db=%u, rel=%u, dobalance=%s, cost_limit=%d, cost_delay=%g active=%s failsafe=%s)",
+            MyWorkerInfo->wi_dboid, MyWorkerInfo->wi_tableoid,
+            pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance) ? "no" : "yes",
+            VacuumCostLimit, VacuumCostDelay,
+            VacuumCostDelay > 0 ? "yes" : "no",
+            VacuumFailsafeActive ? "yes" : "no");
}
else
{

Makes sense. I've updated the log message to roughly what you suggested.
I also realized I think it does make sense to call it in
VacuumUpdateCosts() -- only for autovacuum workers of course. I've done
this. I haven't taken the lock though and can't decide if I must since
they access dboid and tableoid -- those are not going to change at this
point, but I still don't know if I can access them lock-free...
Perhaps there is a way to condition it on the log level?

If I have to take a lock, then I don't know if we should put these in
VacuumUpdateCosts()...

I think we don't need to acquire a lock there as both values are
updated only by workers reporting this message.

I dunno. I just don't feel that comfortable saying, oh it's okay to
access these without a lock probably. I propose we do one of the
following:

- Take a shared lock inside VacuumUpdateCosts() (it is not called on every
  call to vacuum_delay_point()) before reading from these variables.

  Pros:
  - We will log whenever there is a change to these parameters
  Cons:
  - This adds overhead in the common case when log level is < DEBUG2.
    Is there a way to check the log level before taking the lock?
  - Acquiring the lock inside the function is inconsistent with the
    pattern that some of the other autovacuum functions requiring
    locks use (they assume you have a lock if needed inside of the
    function). But, we could assert that the lock is not already held.
  - If we later decide we don't like this choice and want to move the
    logging elsewhere, it will necessarily log less frequently which
    seems like a harder change to make than logging more frequently.

- Move this logging into the loop through relations in do_autovacuum()
  and the config reload code and take the shared lock before doing the
  logging.

  Pros:
  - Seems safe and not expensive
  - Covers most of the times we would want the logging
  Cons:
  - duplicates logging in two places

Okay, in an attempt to wrap up this saga, I have made the following
change:

Autovacuum workers, at the end of VacuumUpdateCosts(), check if cost
limit or cost delay have been changed. If they have, they assert that
they don't already hold the AutovacuumLock, take it in shared mode, and
do the logging.
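
The shape of that change, reduced to a standalone sketch, is roughly as
follows. CostParams and update_costs_and_log() are hypothetical names
used only for illustration, fprintf() stands in for elog(DEBUG2, ...),
and the locking is indicated by a comment rather than performed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical container for the two working cost parameters. */
typedef struct CostParams
{
	double		delay;
	int			limit;
} CostParams;

/*
 * Apply new values and log only if something actually changed.
 * Returns true when a message was emitted.
 */
static bool
update_costs_and_log(CostParams *cur, double new_delay, int new_limit)
{
	CostParams	orig;

	assert(cur != NULL);
	orig = *cur;				/* remember the values before the update */

	cur->delay = new_delay;
	cur->limit = new_limit;

	/* Nothing changed: skip the lock and the log message entirely */
	if (cur->delay == orig.delay && cur->limit == orig.limit)
		return false;

	/* In the patch, AutovacuumLock is held in shared mode around this */
	fprintf(stderr, "cost_limit=%d, cost_delay=%g\n",
			cur->limit, cur->delay);
	return true;
}
```

Because the comparison happens before the lock is taken, the common case
of an unchanged configuration pays for neither the lock nor the message.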

- Melanie

Attachments:

v18-0003-Autovacuum-refreshes-cost-based-delay-params-mor.patchtext/x-patch; charset=US-ASCII; name=v18-0003-Autovacuum-refreshes-cost-based-delay-params-mor.patchDownload
From 902ef92089d8b073be0d664fa5d5cc23624aa313 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 25 Mar 2023 14:14:55 -0400
Subject: [PATCH v18 3/3] Autovacuum refreshes cost-based delay params more
 often

Allow autovacuum to reload the config file more often so that cost-based
delay parameters can take effect while VACUUMing a relation. Previously,
autovacuum workers only reloaded the config file once per relation
vacuumed, so config changes could not take effect until beginning to
vacuum the next table.

Now, check if a reload is pending roughly once per block, when checking
if we need to delay.

In order for autovacuum workers to safely update their own cost delay
and cost limit parameters without impacting performance, we had to
rethink when and how these values were accessed.

Previously, an autovacuum worker's wi_cost_limit was set only at the
beginning of vacuuming a table, after reloading the config file.
Therefore, at the time that autovac_balance_cost() was called, workers
vacuuming tables with no cost-related storage parameters could still
have different values for their wi_cost_limit_base and wi_cost_delay.

Now that the cost parameters can be updated while vacuuming a table,
workers will (within some margin of error) have no reason to have
different values for cost limit and cost delay (in the absence of
cost-related storage parameters). This removes the rationale for keeping
cost limit and cost delay in shared memory. Balancing the cost limit
requires only the number of active autovacuum workers vacuuming a table
with no cost-based storage parameters.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c  |   2 +-
 src/backend/commands/vacuum.c         |  46 +++-
 src/backend/commands/vacuumparallel.c |   1 -
 src/backend/postmaster/autovacuum.c   | 289 +++++++++++++++-----------
 src/include/commands/vacuum.h         |   1 +
 5 files changed, 217 insertions(+), 122 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 2ba85bd3d6..0a9ebd22bd 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -389,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
 	Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
 		   params->truncate != VACOPTVALUE_AUTO);
-	VacuumFailsafeActive = false;
+	Assert(!VacuumFailsafeActive);
 	vacrel->consider_bypass_optimization = true;
 	vacrel->do_index_vacuuming = true;
 	vacrel->do_index_cleanup = true;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index f2be74cdb5..ca347e0a6d 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -48,6 +48,7 @@
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/interrupt.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/pmsignal.h"
@@ -523,9 +524,9 @@ vacuum(List *relations, VacuumParams *params, BufferAccessStrategy bstrategy,
 	{
 		ListCell   *cur;
 
-		VacuumUpdateCosts();
 		in_vacuum = true;
-		VacuumCostActive = (vacuum_cost_delay > 0);
+		VacuumFailsafeActive = false;
+		VacuumUpdateCosts();
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
@@ -579,12 +580,20 @@ vacuum(List *relations, VacuumParams *params, BufferAccessStrategy bstrategy,
 					CommandCounterIncrement();
 				}
 			}
+
+			/*
+			 * Ensure VacuumFailsafeActive has been reset before vacuuming the
+			 * next relation.
+			 */
+			VacuumFailsafeActive = false;
 		}
 	}
 	PG_FINALLY();
 	{
 		in_vacuum = false;
 		VacuumCostActive = false;
+		VacuumFailsafeActive = false;
+		VacuumCostBalance = 0;
 	}
 	PG_END_TRY();
 
@@ -2245,7 +2254,28 @@ vacuum_delay_point(void)
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	if (!VacuumCostActive || InterruptPending)
+	if (InterruptPending ||
+		(!VacuumCostActive && !ConfigReloadPending))
+		return;
+
+	/*
+	 * Autovacuum workers should reload the configuration file if requested.
+	 * This allows changes to [autovacuum_]vacuum_cost_limit and
+	 * [autovacuum_]vacuum_cost_delay to take effect while a table is being
+	 * vacuumed or analyzed.
+	 */
+	if (ConfigReloadPending && IsAutoVacuumWorkerProcess())
+	{
+		ConfigReloadPending = false;
+		ProcessConfigFile(PGC_SIGHUP);
+		VacuumUpdateCosts();
+	}
+
+	/*
+	 * If we disabled cost-based delays after reloading the config file,
+	 * return.
+	 */
+	if (!VacuumCostActive)
 		return;
 
 	/*
@@ -2278,7 +2308,15 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		VacuumUpdateCosts();
+		/*
+		 * Balance and update limit values for autovacuum workers. We must do
+		 * this periodically, as the number of workers across which we are
+		 * balancing the limit may have changed.
+		 *
+		 * XXX: There may be better criteria for determining when to do this
+		 * besides "check after napping".
+		 */
+		AutoVacuumUpdateCostLimit();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index cc0aff7904..e200d5caf8 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -995,7 +995,6 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												 false);
 
 	/* Set cost-based vacuum delay */
-	VacuumCostActive = (vacuum_cost_delay > 0);
 	VacuumUpdateCosts();
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 3644b86443..8529324ebd 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -139,6 +139,18 @@ int			Log_autovacuum_min_duration = 600000;
 static bool am_autovacuum_launcher = false;
 static bool am_autovacuum_worker = false;
 
+/*
+ * Variables to save the cost-related storage parameters for the current
+ * relation being vacuumed by this autovacuum worker. Using these, we can
+ * ensure we don't overwrite the values of vacuum_cost_delay and
+ * vacuum_cost_limit after reloading the configuration file. They are
+ * initialized to "invalid" values to indicate that no cost-related storage
+ * parameters were specified and will be set in do_autovacuum() after checking
+ * the storage parameters in table_recheck_autovac().
+ */
+static double av_storage_param_cost_delay = -1;
+static int	av_storage_param_cost_limit = -1;
+
 /* Flags set by signal handlers */
 static volatile sig_atomic_t got_SIGUSR2 = false;
 
@@ -189,8 +201,8 @@ typedef struct autovac_table
 {
 	Oid			at_relid;
 	VacuumParams at_params;
-	double		at_vacuum_cost_delay;
-	int			at_vacuum_cost_limit;
+	double		at_storage_param_vac_cost_delay;
+	int			at_storage_param_vac_cost_limit;
 	bool		at_dobalance;
 	bool		at_sharedrel;
 	char	   *at_relname;
@@ -209,7 +221,7 @@ typedef struct autovac_table
  * wi_sharedrel flag indicating whether table is marked relisshared
  * wi_proc		pointer to PGPROC of the running worker, NULL if not started
  * wi_launchtime Time at which this worker was launched
- * wi_cost_*	Vacuum cost-based delay parameters current in this worker
+ * wi_dobalance Whether this worker should be included in balance calculations
  *
  * All fields are protected by AutovacuumLock, except for wi_tableoid and
  * wi_sharedrel which are protected by AutovacuumScheduleLock (note these
@@ -223,11 +235,8 @@ typedef struct WorkerInfoData
 	Oid			wi_tableoid;
 	PGPROC	   *wi_proc;
 	TimestampTz wi_launchtime;
-	bool		wi_dobalance;
+	pg_atomic_flag wi_dobalance;
 	bool		wi_sharedrel;
-	double		wi_cost_delay;
-	int			wi_cost_limit;
-	int			wi_cost_limit_base;
 } WorkerInfoData;
 
 typedef struct WorkerInfoData *WorkerInfo;
@@ -273,6 +282,8 @@ typedef struct AutoVacuumWorkItem
  * av_startingWorker pointer to WorkerInfo currently being started (cleared by
  *					the worker itself as soon as it's up and running)
  * av_workItems		work item array
+ * av_nworkersForBalance the number of autovacuum workers to use when
+ * 					calculating the per worker cost limit
  *
  * This struct is protected by AutovacuumLock, except for av_signal and parts
  * of the worker list (see above).
@@ -286,6 +297,7 @@ typedef struct
 	dlist_head	av_runningWorkers;
 	WorkerInfo	av_startingWorker;
 	AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+	pg_atomic_uint32 av_nworkersForBalance;
 } AutoVacuumShmemStruct;
 
 static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -319,7 +331,7 @@ static void launch_worker(TimestampTz now);
 static List *get_database_list(void);
 static void rebuild_database_list(Oid newdb);
 static int	db_comparator(const void *a, const void *b);
-static void autovac_balance_cost(void);
+static void autovac_recalculate_workers_for_balance(void);
 
 static void do_autovacuum(void);
 static void FreeWorkerInfo(int code, Datum arg);
@@ -669,7 +681,7 @@ AutoVacLauncherMain(int argc, char *argv[])
 			{
 				LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 				AutoVacuumShmem->av_signal[AutoVacRebalance] = false;
-				autovac_balance_cost();
+				autovac_recalculate_workers_for_balance();
 				LWLockRelease(AutovacuumLock);
 			}
 
@@ -818,11 +830,6 @@ HandleAutoVacLauncherInterrupts(void)
 		if (!AutoVacuumingActive())
 			AutoVacLauncherShutdown();
 
-		/* rebalance in case the default cost parameters changed */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-		autovac_balance_cost();
-		LWLockRelease(AutovacuumLock);
-
 		/* rebuild the list in case the naptime changed */
 		rebuild_database_list(InvalidOid);
 	}
@@ -1754,10 +1761,7 @@ FreeWorkerInfo(int code, Datum arg)
 		MyWorkerInfo->wi_sharedrel = false;
 		MyWorkerInfo->wi_proc = NULL;
 		MyWorkerInfo->wi_launchtime = 0;
-		MyWorkerInfo->wi_dobalance = false;
-		MyWorkerInfo->wi_cost_delay = 0;
-		MyWorkerInfo->wi_cost_limit = 0;
-		MyWorkerInfo->wi_cost_limit_base = 0;
+		pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
 		dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 						&MyWorkerInfo->wi_links);
 		/* not mine anymore */
@@ -1781,10 +1785,20 @@ FreeWorkerInfo(int code, Datum arg)
 void
 VacuumUpdateCosts(void)
 {
+	double		original_cost_delay = vacuum_cost_delay;
+	int			original_cost_limit = vacuum_cost_limit;
+
 	if (MyWorkerInfo)
 	{
-		vacuum_cost_delay = MyWorkerInfo->wi_cost_delay;
-		vacuum_cost_limit = MyWorkerInfo->wi_cost_limit;
+		if (av_storage_param_cost_delay >= 0)
+			vacuum_cost_delay = av_storage_param_cost_delay;
+		else if (autovacuum_vac_cost_delay >= 0)
+			vacuum_cost_delay = autovacuum_vac_cost_delay;
+		else
+			/* fall back to VacuumCostDelay */
+			vacuum_cost_delay = VacuumCostDelay;
+
+		AutoVacuumUpdateCostLimit();
 	}
 	else
 	{
@@ -1792,88 +1806,124 @@ VacuumUpdateCosts(void)
 		vacuum_cost_delay = VacuumCostDelay;
 		vacuum_cost_limit = VacuumCostLimit;
 	}
+
+	/*
+	 * If configuration changes are allowed to impact VacuumCostActive, make
+	 * sure it is updated.
+	 */
+	if (VacuumFailsafeActive)
+		Assert(!VacuumCostActive);
+	else if (vacuum_cost_delay > 0)
+		VacuumCostActive = true;
+	else
+	{
+		VacuumCostActive = false;
+		VacuumCostBalance = 0;
+	}
+
+	if (MyWorkerInfo)
+	{
+		/* Only log updates to cost-related variables */
+		if (vacuum_cost_delay == original_cost_delay &&
+			vacuum_cost_limit == original_cost_limit)
+			return;
+
+		Assert(!LWLockHeldByMe(AutovacuumLock));
+
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
+
+		elog(DEBUG2,
+			 "Autovacuum VacuumUpdateCosts(db=%u, rel=%u, dobalance=%s, cost_limit=%d, cost_delay=%g active=%s failsafe=%s)",
+			 MyWorkerInfo->wi_dboid, MyWorkerInfo->wi_tableoid,
+			 pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance) ? "no" : "yes",
+			 vacuum_cost_limit, vacuum_cost_delay,
+			 vacuum_cost_delay > 0 ? "yes" : "no",
+			 VacuumFailsafeActive ? "yes" : "no");
+
+		LWLockRelease(AutovacuumLock);
+	}
 }
 
 /*
- * autovac_balance_cost
- *		Recalculate the cost limit setting for each active worker.
- *
- * Caller must hold the AutovacuumLock in exclusive mode.
+ * Update vacuum_cost_limit with the correct value for an autovacuum worker,
+ * given the value of other relevant cost limit parameters and the number of
+ * workers across which the limit must be balanced. Autovacuum workers must
+ * call this regularly in case av_nworkersForBalance has been updated by
+ * another worker or by the autovacuum launcher. They must also call it after a
+ * config reload.
  */
-static void
-autovac_balance_cost(void)
+void
+AutoVacuumUpdateCostLimit(void)
 {
+	if (!MyWorkerInfo)
+		return;
+
 	/*
-	 * The idea here is that we ration out I/O equally.  The amount of I/O
-	 * that a worker can consume is determined by cost_limit/cost_delay, so we
-	 * try to equalize those ratios rather than the raw limit settings.
-	 *
 	 * note: in cost_limit, zero also means use value from elsewhere, because
 	 * zero is not a valid value.
 	 */
-	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : VacuumCostLimit);
-	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : VacuumCostDelay);
-	double		cost_total;
-	double		cost_avail;
-	dlist_iter	iter;
-
-	/* not set? nothing to do */
-	if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
-		return;
 
-	/* calculate the total base cost limit of participating active workers */
-	cost_total = 0.0;
-	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
+	if (av_storage_param_cost_limit > 0)
+		vacuum_cost_limit = av_storage_param_cost_limit;
+	else
 	{
-		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
+		int			nworkers_for_balance;
+
+		if (autovacuum_vac_cost_limit > 0)
+			vacuum_cost_limit = autovacuum_vac_cost_limit;
+		else
+			vacuum_cost_limit = VacuumCostLimit;
+
+		/* Only balance limit if no cost-related storage parameters specified */
+		if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
+			return;
+
+		Assert(vacuum_cost_limit > 0);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-			cost_total +=
-				(double) worker->wi_cost_limit_base / worker->wi_cost_delay;
+		nworkers_for_balance = pg_atomic_read_u32(&AutoVacuumShmem->av_nworkersForBalance);
+
+		/* There is at least 1 autovac worker (this worker) */
+		if (nworkers_for_balance <= 0)
+			elog(ERROR, "nworkers_for_balance must be > 0");
+
+		vacuum_cost_limit = Max(vacuum_cost_limit / nworkers_for_balance, 1);
 	}
+}
 
-	/* there are no cost limits -- nothing to do */
-	if (cost_total <= 0)
-		return;
+/*
+ * autovac_recalculate_workers_for_balance
+ *		Recalculate the number of workers to consider, given cost-related
+ *		storage parameters and the current number of active workers.
+ *
+ * Caller must hold the AutovacuumLock in at least shared mode to access
+ * worker->wi_proc.
+ */
+static void
+autovac_recalculate_workers_for_balance(void)
+{
+	dlist_iter	iter;
+	int			orig_nworkers_for_balance;
+	int			nworkers_for_balance = 0;
+
+	Assert(LWLockHeldByMe(AutovacuumLock));
+
+	orig_nworkers_for_balance =
+		pg_atomic_read_u32(&AutoVacuumShmem->av_nworkersForBalance);
 
-	/*
-	 * Adjust cost limit of each active worker to balance the total of cost
-	 * limit to autovacuum_vacuum_cost_limit.
-	 */
-	cost_avail = (double) vac_cost_limit / vac_cost_delay;
 	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
 	{
 		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-		{
-			int			limit = (int)
-			(cost_avail * worker->wi_cost_limit_base / cost_total);
-
-			/*
-			 * We put a lower bound of 1 on the cost_limit, to avoid division-
-			 * by-zero in the vacuum code.  Also, in case of roundoff trouble
-			 * in these calculations, let's be sure we don't ever set
-			 * cost_limit to more than the base value.
-			 */
-			worker->wi_cost_limit = Max(Min(limit,
-											worker->wi_cost_limit_base),
-										1);
-		}
+		if (worker->wi_proc == NULL ||
+			pg_atomic_unlocked_test_flag(&worker->wi_dobalance))
+			continue;
 
-		if (worker->wi_proc != NULL)
-			elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
-				 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
-				 worker->wi_dobalance ? "yes" : "no",
-				 worker->wi_cost_limit, worker->wi_cost_limit_base,
-				 worker->wi_cost_delay);
+		nworkers_for_balance++;
 	}
+
+	if (nworkers_for_balance != orig_nworkers_for_balance)
+		pg_atomic_write_u32(&AutoVacuumShmem->av_nworkersForBalance,
+							nworkers_for_balance);
 }
 
 /*
@@ -2421,23 +2471,34 @@ do_autovacuum(void)
 			continue;
 		}
 
-		/* Must hold AutovacuumLock while mucking with cost balance info */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+		/*
+		 * Save the cost-related storage parameter values in global variables
+		 * for reference when updating vacuum_cost_delay and vacuum_cost_limit
+		 * during vacuuming this table.
+		 */
+		av_storage_param_cost_delay = tab->at_storage_param_vac_cost_delay;
+		av_storage_param_cost_limit = tab->at_storage_param_vac_cost_limit;
 
-		/* advertise my cost delay parameters for the balancing algorithm */
-		MyWorkerInfo->wi_dobalance = tab->at_dobalance;
-		MyWorkerInfo->wi_cost_delay = tab->at_vacuum_cost_delay;
-		MyWorkerInfo->wi_cost_limit = tab->at_vacuum_cost_limit;
-		MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
+		/*
+		 * We only expect this worker to ever set the flag, so don't bother
+		 * checking the return value. We shouldn't have to retry.
+		 */
+		if (tab->at_dobalance)
+			pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
+		else
+			pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
 
-		/* do a balance */
-		autovac_balance_cost();
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
+		autovac_recalculate_workers_for_balance();
+		LWLockRelease(AutovacuumLock);
 
-		/* set the active cost parameters from the result of that */
+		/*
+		 * We wait until this point to update cost delay and cost limit
+		 * values, even though we reloaded the configuration file above, so
+		 * that we can take into account the cost-related storage parameters.
+		 */
 		VacuumUpdateCosts();
 
-		/* done */
-		LWLockRelease(AutovacuumLock);
 
 		/* clean up memory before each iteration */
 		MemoryContextResetAndDeleteChildren(PortalContext);
@@ -2521,16 +2582,17 @@ deleted:
 		pfree(tab);
 
 		/*
-		 * Remove my info from shared memory.  We could, but intentionally
-		 * don't, clear wi_cost_limit and friends --- this is on the
-		 * assumption that we probably have more to do with similar cost
-		 * settings, so we don't want to give up our share of I/O for a very
-		 * short interval and thereby thrash the global balance.
+		 * Remove my info from shared memory.  We set wi_dobalance on the
+		 * assumption that we are more likely than not to vacuum a table with
+		 * no cost-related storage parameters next, so we want to claim our
+		 * share of I/O as soon as possible to avoid thrashing the global
+		 * balance.
 		 */
 		LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
 		MyWorkerInfo->wi_tableoid = InvalidOid;
 		MyWorkerInfo->wi_sharedrel = false;
 		LWLockRelease(AutovacuumScheduleLock);
+		pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
 	}
 
 	/*
@@ -2562,6 +2624,7 @@ deleted:
 		{
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			VacuumUpdateCosts();
 		}
 
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
@@ -2797,8 +2860,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		int			freeze_table_age;
 		int			multixact_freeze_min_age;
 		int			multixact_freeze_table_age;
-		int			vac_cost_limit;
-		double		vac_cost_delay;
 		int			log_min_duration;
 
 		/*
@@ -2808,20 +2869,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 * defaults, autovacuum's own first and plain vacuum second.
 		 */
 
-		/* -1 in autovac setting means use plain vacuum_cost_delay */
-		vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
-			? avopts->vacuum_cost_delay
-			: (autovacuum_vac_cost_delay >= 0)
-			? autovacuum_vac_cost_delay
-			: VacuumCostDelay;
-
-		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
-		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
-			? avopts->vacuum_cost_limit
-			: (autovacuum_vac_cost_limit > 0)
-			? autovacuum_vac_cost_limit
-			: VacuumCostLimit;
-
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
 			? avopts->log_min_duration
@@ -2877,8 +2924,10 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age;
 		tab->at_params.is_wraparound = wraparound;
 		tab->at_params.log_min_duration = log_min_duration;
-		tab->at_vacuum_cost_limit = vac_cost_limit;
-		tab->at_vacuum_cost_delay = vac_cost_delay;
+		tab->at_storage_param_vac_cost_limit = avopts ?
+			avopts->vacuum_cost_limit : 0;
+		tab->at_storage_param_vac_cost_delay = avopts ?
+			avopts->vacuum_cost_delay : -1;
 		tab->at_relname = NULL;
 		tab->at_nspname = NULL;
 		tab->at_datname = NULL;
@@ -3380,10 +3429,18 @@ AutoVacuumShmemInit(void)
 		worker = (WorkerInfo) ((char *) AutoVacuumShmem +
 							   MAXALIGN(sizeof(AutoVacuumShmemStruct)));
 
+
 		/* initialize the WorkerInfo free list */
 		for (i = 0; i < autovacuum_max_workers; i++)
+		{
 			dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 							&worker[i].wi_links);
+
+			pg_atomic_init_flag(&worker[i].wi_dobalance);
+		}
+
+		pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
+
 	}
 	else
 		Assert(found);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 50caf1315d..2a856b0e5e 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -350,6 +350,7 @@ extern IndexBulkDeleteResult *vac_cleanup_one_index(IndexVacuumInfo *ivinfo,
 extern Size vac_max_items_to_alloc_size(int max_items);
 
 /* In postmaster/autovacuum.c */
+extern void AutoVacuumUpdateCostLimit(void);
 extern void VacuumUpdateCosts(void);
 
 /* in commands/vacuumparallel.c */
-- 
2.37.2

v18-0002-Separate-vacuum-cost-variables-from-GUCs.patch (text/x-patch; charset=US-ASCII)
From 92fc87969a496c1a1f5fd612d3c7c32251dec6e1 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Thu, 6 Apr 2023 16:02:12 -0400
Subject: [PATCH v18 2/3] Separate vacuum cost variables from GUCs

Vacuum code run both by autovacuum workers and a backend doing
VACUUM/ANALYZE previously inspected VacuumCostLimit and VacuumCostDelay,
which are the global variables backing the GUCs vacuum_cost_limit and
vacuum_cost_delay.

Autovacuum workers needed to override these variables with their own
values, derived from autovacuum_vacuum_cost_limit and
autovacuum_vacuum_cost_delay and worker cost limit balancing logic. This
led to confusing code which, in some cases, both derived and set a new
value of VacuumCostLimit from VacuumCostLimit.

In preparation for refreshing these GUC values more often, introduce
new, independent global variables and add a function to update them
using the GUCs and existing logic.

Per suggestion by Kyotaro Horiguchi

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/commands/vacuum.c         | 29 +++++++++++++++--------
 src/backend/commands/vacuumparallel.c |  3 ++-
 src/backend/postmaster/autovacuum.c   | 34 +++++++++++----------------
 src/include/commands/vacuum.h         |  5 ++++
 src/include/postmaster/autovacuum.h   |  3 ---
 5 files changed, 40 insertions(+), 34 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7fc5c19e37..f2be74cdb5 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -72,6 +72,15 @@ int			vacuum_multixact_freeze_table_age;
 int			vacuum_failsafe_age;
 int			vacuum_multixact_failsafe_age;
 
+/*
+ * Variables for cost-based vacuum delay. The defaults differ between
+ * autovacuum and vacuum. They should be set with the appropriate GUC value in
+ * vacuum code. They are initialized here to the defaults for client backends
+ * executing VACUUM or ANALYZE.
+ */
+double		vacuum_cost_delay = 0;
+int			vacuum_cost_limit = 200;
+
 /*
  * VacuumFailsafeActive is a defined as a global so that we can determine
  * whether or not to re-enable cost-based vacuum delay when vacuuming a table.
@@ -514,8 +523,9 @@ vacuum(List *relations, VacuumParams *params, BufferAccessStrategy bstrategy,
 	{
 		ListCell   *cur;
 
+		VacuumUpdateCosts();
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
+		VacuumCostActive = (vacuum_cost_delay > 0);
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
@@ -2244,14 +2254,14 @@ vacuum_delay_point(void)
 	 */
 	if (VacuumSharedCostBalance != NULL)
 		msec = compute_parallel_delay();
-	else if (VacuumCostBalance >= VacuumCostLimit)
-		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+	else if (VacuumCostBalance >= vacuum_cost_limit)
+		msec = vacuum_cost_delay * VacuumCostBalance / vacuum_cost_limit;
 
 	/* Nap if appropriate */
 	if (msec > 0)
 	{
-		if (msec > VacuumCostDelay * 4)
-			msec = VacuumCostDelay * 4;
+		if (msec > vacuum_cost_delay * 4)
+			msec = vacuum_cost_delay * 4;
 
 		pgstat_report_wait_start(WAIT_EVENT_VACUUM_DELAY);
 		pg_usleep(msec * 1000);
@@ -2268,8 +2278,7 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		/* update balance values for workers */
-		AutoVacuumUpdateDelay();
+		VacuumUpdateCosts();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
@@ -2319,11 +2328,11 @@ compute_parallel_delay(void)
 	/* Compute the total local balance for the current worker */
 	VacuumCostBalanceLocal += VacuumCostBalance;
 
-	if ((shared_balance >= VacuumCostLimit) &&
-		(VacuumCostBalanceLocal > 0.5 * ((double) VacuumCostLimit / nworkers)))
+	if ((shared_balance >= vacuum_cost_limit) &&
+		(VacuumCostBalanceLocal > 0.5 * ((double) vacuum_cost_limit / nworkers)))
 	{
 		/* Compute sleep time based on the local cost balance */
-		msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+		msec = vacuum_cost_delay * VacuumCostBalanceLocal / vacuum_cost_limit;
 		pg_atomic_sub_fetch_u32(VacuumSharedCostBalance, VacuumCostBalanceLocal);
 		VacuumCostBalanceLocal = 0;
 	}
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 563117a8f6..cc0aff7904 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -995,7 +995,8 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												 false);
 
 	/* Set cost-based vacuum delay */
-	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostActive = (vacuum_cost_delay > 0);
+	VacuumUpdateCosts();
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c1e911b1b3..3644b86443 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1773,16 +1773,24 @@ FreeWorkerInfo(int code, Datum arg)
 }
 
 /*
- * Update the cost-based delay parameters, so that multiple workers consume
- * each a fraction of the total available I/O.
+ * Update vacuum cost-based delay-related parameters for autovacuum workers and
+ * backends executing VACUUM or ANALYZE using the value of relevant gucs and
+ * global state. This must be called during setup for vacuum and after every
+ * config reload to ensure up-to-date values.
  */
 void
-AutoVacuumUpdateDelay(void)
+VacuumUpdateCosts(void)
 {
 	if (MyWorkerInfo)
 	{
-		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
-		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+		vacuum_cost_delay = MyWorkerInfo->wi_cost_delay;
+		vacuum_cost_limit = MyWorkerInfo->wi_cost_limit;
+	}
+	else
+	{
+		/* Must be explicit VACUUM or ANALYZE */
+		vacuum_cost_delay = VacuumCostDelay;
+		vacuum_cost_limit = VacuumCostLimit;
 	}
 }
 
@@ -2311,8 +2319,6 @@ do_autovacuum(void)
 		autovac_table *tab;
 		bool		isshared;
 		bool		skipit;
-		double		stdVacuumCostDelay;
-		int			stdVacuumCostLimit;
 		dlist_iter	iter;
 
 		CHECK_FOR_INTERRUPTS();
@@ -2415,14 +2421,6 @@ do_autovacuum(void)
 			continue;
 		}
 
-		/*
-		 * Remember the prevailing values of the vacuum cost GUCs.  We have to
-		 * restore these at the bottom of the loop, else we'll compute wrong
-		 * values in the next iteration of autovac_balance_cost().
-		 */
-		stdVacuumCostDelay = VacuumCostDelay;
-		stdVacuumCostLimit = VacuumCostLimit;
-
 		/* Must hold AutovacuumLock while mucking with cost balance info */
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 
@@ -2436,7 +2434,7 @@ do_autovacuum(void)
 		autovac_balance_cost();
 
 		/* set the active cost parameters from the result of that */
-		AutoVacuumUpdateDelay();
+		VacuumUpdateCosts();
 
 		/* done */
 		LWLockRelease(AutovacuumLock);
@@ -2533,10 +2531,6 @@ deleted:
 		MyWorkerInfo->wi_tableoid = InvalidOid;
 		MyWorkerInfo->wi_sharedrel = false;
 		LWLockRelease(AutovacuumScheduleLock);
-
-		/* restore vacuum cost GUCs for the next iteration */
-		VacuumCostDelay = stdVacuumCostDelay;
-		VacuumCostLimit = stdVacuumCostLimit;
 	}
 
 	/*
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1223d15e0d..50caf1315d 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -307,6 +307,8 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
 
 extern PGDLLIMPORT bool VacuumFailsafeActive;
+extern PGDLLIMPORT double vacuum_cost_delay;
+extern PGDLLIMPORT int vacuum_cost_limit;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
@@ -347,6 +349,9 @@ extern IndexBulkDeleteResult *vac_cleanup_one_index(IndexVacuumInfo *ivinfo,
 													IndexBulkDeleteResult *istat);
 extern Size vac_max_items_to_alloc_size(int max_items);
 
+/* In postmaster/autovacuum.c */
+extern void VacuumUpdateCosts(void);
+
 /* in commands/vacuumparallel.c */
 extern ParallelVacuumState *parallel_vacuum_init(Relation rel, Relation *indrels,
 												 int nindexes, int nrequested_workers,
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index c140371b51..65afd1ea1e 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -63,9 +63,6 @@ extern int	StartAutoVacWorker(void);
 /* called from postmaster when a worker could not be forked */
 extern void AutoVacWorkerFailed(void);
 
-/* autovacuum cost-delay balancer */
-extern void AutoVacuumUpdateDelay(void);
-
 #ifdef EXEC_BACKEND
 extern void AutoVacLauncherMain(int argc, char *argv[]) pg_attribute_noreturn();
 extern void AutoVacWorkerMain(int argc, char *argv[]) pg_attribute_noreturn();
-- 
2.37.2

v18-0001-Make-vacuum-failsafe_active-global.patch (text/x-patch; charset=US-ASCII)
From 0042067ce72a474fe4087245b978847c0b835b72 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Fri, 31 Mar 2023 10:38:39 -0400
Subject: [PATCH v18 1/3] Make vacuum failsafe_active global

While vacuuming a table in failsafe mode, VacuumCostActive should not be
re-enabled. This currently isn't a problem because vacuum cost
parameters are only refreshed in between vacuuming tables and failsafe
status is reset for every table. In preparation for allowing vacuum cost
parameters to be updated more frequently, elevate
LVRelState->failsafe_active to a global, VacuumFailsafeActive, which
will be checked when determining whether or not to re-enable vacuum
cost-related delays.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 16 +++++++---------
 src/backend/commands/vacuum.c        | 15 +++++++++++++++
 src/include/commands/vacuum.h        |  1 +
 3 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 639179aa46..2ba85bd3d6 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -153,8 +153,6 @@ typedef struct LVRelState
 	bool		aggressive;
 	/* Use visibility map to skip? (disabled by DISABLE_PAGE_SKIPPING) */
 	bool		skipwithvm;
-	/* Wraparound failsafe has been triggered? */
-	bool		failsafe_active;
 	/* Consider index vacuuming bypass optimization? */
 	bool		consider_bypass_optimization;
 
@@ -391,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
 	Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
 		   params->truncate != VACOPTVALUE_AUTO);
-	vacrel->failsafe_active = false;
+	VacuumFailsafeActive = false;
 	vacrel->consider_bypass_optimization = true;
 	vacrel->do_index_vacuuming = true;
 	vacrel->do_index_cleanup = true;
@@ -709,7 +707,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 			}
 			else
 			{
-				if (!vacrel->failsafe_active)
+				if (!VacuumFailsafeActive)
 					appendStringInfoString(&buf, _("index scan bypassed: "));
 				else
 					appendStringInfoString(&buf, _("index scan bypassed by failsafe: "));
@@ -2293,7 +2291,7 @@ lazy_vacuum(LVRelState *vacrel)
 		 * vacuuming or heap vacuuming.  This VACUUM operation won't end up
 		 * back here again.
 		 */
-		Assert(vacrel->failsafe_active);
+		Assert(VacuumFailsafeActive);
 	}
 
 	/*
@@ -2374,7 +2372,7 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	 */
 	Assert(vacrel->num_index_scans > 0 ||
 		   vacrel->dead_items->num_items == vacrel->lpdead_items);
-	Assert(allindexes || vacrel->failsafe_active);
+	Assert(allindexes || VacuumFailsafeActive);
 
 	/*
 	 * Increase and report the number of index scans.
@@ -2616,12 +2614,12 @@ static bool
 lazy_check_wraparound_failsafe(LVRelState *vacrel)
 {
 	/* Don't warn more than once per VACUUM */
-	if (vacrel->failsafe_active)
+	if (VacuumFailsafeActive)
 		return true;
 
 	if (unlikely(vacuum_xid_failsafe_check(&vacrel->cutoffs)))
 	{
-		vacrel->failsafe_active = true;
+		VacuumFailsafeActive = true;
 
 		/*
 		 * Abandon use of a buffer access strategy to allow use of all of
@@ -2820,7 +2818,7 @@ should_attempt_truncation(LVRelState *vacrel)
 {
 	BlockNumber possibly_freeable;
 
-	if (!vacrel->do_rel_truncate || vacrel->failsafe_active ||
+	if (!vacrel->do_rel_truncate || VacuumFailsafeActive ||
 		old_snapshot_threshold >= 0)
 		return false;
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index ea1d8960f4..7fc5c19e37 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -72,6 +72,21 @@ int			vacuum_multixact_freeze_table_age;
 int			vacuum_failsafe_age;
 int			vacuum_multixact_failsafe_age;
 
+/*
+ * VacuumFailsafeActive is a defined as a global so that we can determine
+ * whether or not to re-enable cost-based vacuum delay when vacuuming a table.
+ * If failsafe mode has been engaged, we will not re-enable cost-based delay
+ * for the table until after vacuuming has completed, regardless of other
+ * settings.
+ *
+ * Only VACUUM code should inspect this variable and only table access methods
+ * should set it to true. In Table AM-agnostic VACUUM code, this variable is
+ * inspected to determine whether or not to allow cost-based delays. Table AMs
+ * are free to set it if they desire this behavior, but it is false by default
+ * and reset to false in between vacuuming each relation.
+ */
+bool		VacuumFailsafeActive = false;
+
 /*
  * Variables for cost-based parallel vacuum.  See comments atop
  * compute_parallel_delay to understand how it works.
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 19ca818dc2..1223d15e0d 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -306,6 +306,7 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
 
+extern PGDLLIMPORT bool VacuumFailsafeActive;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
-- 
2.37.2

#73Daniel Gustafsson
daniel@yesql.se
In reply to: Melanie Plageman (#72)
Re: Should vacuum process config file reload more often

On 6 Apr 2023, at 23:06, Melanie Plageman <melanieplageman@gmail.com> wrote:

> Autovacuum workers, at the end of VacuumUpdateCosts(), check if cost
> limit or cost delay have been changed. If they have, they assert that
> they don't already hold the AutovacuumLock, take it in shared mode, and
> do the logging.

Another idea would be to copy the values to local temp variables while holding
the lock, and release the lock before calling elog() to avoid holding the lock
over potential IO.
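
A minimal sketch of that pattern, outside of PostgreSQL and under stated
assumptions: a pthread mutex stands in for AutovacuumLock, fprintf() stands in
for elog(), and DemoWorkerInfo, demo_lock, demo_snapshot_worker() and
demo_log_costs() are hypothetical names that do not exist in the patch. The
point is only the ordering: copy under the lock, release, then log.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the shared per-worker state. */
typedef struct DemoWorkerInfo
{
	unsigned int dboid;
	unsigned int tableoid;
	bool		dobalance;
} DemoWorkerInfo;

/* Stands in for AutovacuumLock in this sketch. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static DemoWorkerInfo demo_worker = {5, 16384, true};

/* Copy the shared fields into a local struct while holding the lock. */
static DemoWorkerInfo
demo_snapshot_worker(void)
{
	DemoWorkerInfo copy;

	pthread_mutex_lock(&demo_lock);
	copy = demo_worker;
	pthread_mutex_unlock(&demo_lock);

	return copy;
}

/* Log from the local copy; the lock is no longer held during the I/O. */
static void
demo_log_costs(int cost_limit, double cost_delay)
{
	DemoWorkerInfo w = demo_snapshot_worker();

	assert(cost_limit > 0);

	fprintf(stderr,
			"db=%u, rel=%u, dobalance=%s, cost_limit=%d, cost_delay=%g\n",
			w.dboid, w.tableoid, w.dobalance ? "yes" : "no",
			cost_limit, cost_delay);
}
```

Calling demo_log_costs(200, 2.0) writes one line to stderr; other threads
contending for demo_lock would only wait for the struct copy, not for the
write.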

--
Daniel Gustafsson

#74Melanie Plageman
melanieplageman@gmail.com
In reply to: Daniel Gustafsson (#73)
3 attachment(s)
Re: Should vacuum process config file reload more often

On Thu, Apr 6, 2023 at 5:45 PM Daniel Gustafsson <daniel@yesql.se> wrote:

> On 6 Apr 2023, at 23:06, Melanie Plageman <melanieplageman@gmail.com> wrote:
>
>> Autovacuum workers, at the end of VacuumUpdateCosts(), check if cost
>> limit or cost delay have been changed. If they have, they assert that
>> they don't already hold the AutovacuumLock, take it in shared mode, and
>> do the logging.
>
> Another idea would be to copy the values to local temp variables while holding
> the lock, and release the lock before calling elog() to avoid holding the lock
> over potential IO.

Good idea. I've done this in attached v19.
Also, I looked through the docs, and everything there still looks correct
for the balancing algorithm.
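
For reference, the cost limit precedence and balancing arithmetic from the
AutoVacuumUpdateCostLimit() hunk in the 0003 diff can be sketched as a pure
function. This is only an illustration under assumptions: demo_cost_limit()
and its parameters are hypothetical names, and the real code reads globals
and atomics rather than taking arguments.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper mirroring the order of checks in the diff:
 * 1. a table storage parameter > 0 wins and is never balanced;
 * 2. otherwise use autovacuum_vacuum_cost_limit if > 0, else
 *    vacuum_cost_limit;
 * 3. if this worker participates in balancing, divide by the number of
 *    balancing workers, with a floor of 1.
 */
static int
demo_cost_limit(int storage_param_limit, int autovac_limit, int vacuum_limit,
				bool dobalance, int nworkers_for_balance)
{
	int			limit;

	if (storage_param_limit > 0)
		return storage_param_limit;

	limit = (autovac_limit > 0) ? autovac_limit : vacuum_limit;

	if (!dobalance)
		return limit;

	assert(limit > 0);
	assert(nworkers_for_balance > 0);

	limit = limit / nworkers_for_balance;
	return (limit > 1) ? limit : 1;
}
```

With vacuum_cost_limit = 200, autovacuum_vacuum_cost_limit = -1 and three
balancing workers, each worker ends up with 200 / 3 = 66 under integer
division.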

- Melanie

Attachments:

v19-0001-Make-vacuum-failsafe_active-global.patch (text/x-patch; charset=US-ASCII)
From 0042067ce72a474fe4087245b978847c0b835b72 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Fri, 31 Mar 2023 10:38:39 -0400
Subject: [PATCH v19 1/3] Make vacuum failsafe_active global

While vacuuming a table in failsafe mode, VacuumCostActive should not be
re-enabled. This currently isn't a problem because vacuum cost
parameters are only refreshed in between vacuuming tables and failsafe
status is reset for every table. In preparation for allowing vacuum cost
parameters to be updated more frequently, elevate
LVRelState->failsafe_active to a global, VacuumFailsafeActive, which
will be checked when determining whether or not to re-enable vacuum
cost-related delays.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 16 +++++++---------
 src/backend/commands/vacuum.c        | 15 +++++++++++++++
 src/include/commands/vacuum.h        |  1 +
 3 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 639179aa46..2ba85bd3d6 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -153,8 +153,6 @@ typedef struct LVRelState
 	bool		aggressive;
 	/* Use visibility map to skip? (disabled by DISABLE_PAGE_SKIPPING) */
 	bool		skipwithvm;
-	/* Wraparound failsafe has been triggered? */
-	bool		failsafe_active;
 	/* Consider index vacuuming bypass optimization? */
 	bool		consider_bypass_optimization;
 
@@ -391,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
 	Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
 		   params->truncate != VACOPTVALUE_AUTO);
-	vacrel->failsafe_active = false;
+	VacuumFailsafeActive = false;
 	vacrel->consider_bypass_optimization = true;
 	vacrel->do_index_vacuuming = true;
 	vacrel->do_index_cleanup = true;
@@ -709,7 +707,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 			}
 			else
 			{
-				if (!vacrel->failsafe_active)
+				if (!VacuumFailsafeActive)
 					appendStringInfoString(&buf, _("index scan bypassed: "));
 				else
 					appendStringInfoString(&buf, _("index scan bypassed by failsafe: "));
@@ -2293,7 +2291,7 @@ lazy_vacuum(LVRelState *vacrel)
 		 * vacuuming or heap vacuuming.  This VACUUM operation won't end up
 		 * back here again.
 		 */
-		Assert(vacrel->failsafe_active);
+		Assert(VacuumFailsafeActive);
 	}
 
 	/*
@@ -2374,7 +2372,7 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	 */
 	Assert(vacrel->num_index_scans > 0 ||
 		   vacrel->dead_items->num_items == vacrel->lpdead_items);
-	Assert(allindexes || vacrel->failsafe_active);
+	Assert(allindexes || VacuumFailsafeActive);
 
 	/*
 	 * Increase and report the number of index scans.
@@ -2616,12 +2614,12 @@ static bool
 lazy_check_wraparound_failsafe(LVRelState *vacrel)
 {
 	/* Don't warn more than once per VACUUM */
-	if (vacrel->failsafe_active)
+	if (VacuumFailsafeActive)
 		return true;
 
 	if (unlikely(vacuum_xid_failsafe_check(&vacrel->cutoffs)))
 	{
-		vacrel->failsafe_active = true;
+		VacuumFailsafeActive = true;
 
 		/*
 		 * Abandon use of a buffer access strategy to allow use of all of
@@ -2820,7 +2818,7 @@ should_attempt_truncation(LVRelState *vacrel)
 {
 	BlockNumber possibly_freeable;
 
-	if (!vacrel->do_rel_truncate || vacrel->failsafe_active ||
+	if (!vacrel->do_rel_truncate || VacuumFailsafeActive ||
 		old_snapshot_threshold >= 0)
 		return false;
 
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index ea1d8960f4..7fc5c19e37 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -72,6 +72,21 @@ int			vacuum_multixact_freeze_table_age;
 int			vacuum_failsafe_age;
 int			vacuum_multixact_failsafe_age;
 
+/*
+ * VacuumFailsafeActive is a defined as a global so that we can determine
+ * whether or not to re-enable cost-based vacuum delay when vacuuming a table.
+ * If failsafe mode has been engaged, we will not re-enable cost-based delay
+ * for the table until after vacuuming has completed, regardless of other
+ * settings.
+ *
+ * Only VACUUM code should inspect this variable and only table access methods
+ * should set it to true. In Table AM-agnostic VACUUM code, this variable is
+ * inspected to determine whether or not to allow cost-based delays. Table AMs
+ * are free to set it if they desire this behavior, but it is false by default
+ * and reset to false in between vacuuming each relation.
+ */
+bool		VacuumFailsafeActive = false;
+
 /*
  * Variables for cost-based parallel vacuum.  See comments atop
  * compute_parallel_delay to understand how it works.
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 19ca818dc2..1223d15e0d 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -306,6 +306,7 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
 
+extern PGDLLIMPORT bool VacuumFailsafeActive;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
-- 
2.37.2

v19-0003-Autovacuum-refreshes-cost-based-delay-params-mor.patch (text/x-patch; charset=US-ASCII)
From 3db6b900042f193892857ba7b2301a478ad0ef9a Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 25 Mar 2023 14:14:55 -0400
Subject: [PATCH v19 3/3] Autovacuum refreshes cost-based delay params more
 often

Allow autovacuum to reload the config file more often so that cost-based
delay parameters can take effect while VACUUMing a relation. Previously,
autovacuum workers only reloaded the config file once per relation
vacuumed, so config changes could not take effect until beginning to
vacuum the next table.

Now, check if a reload is pending roughly once per block, when checking
if we need to delay.

In order for autovacuum workers to safely update their own cost delay
and cost limit parameters without impacting performance, we had to
rethink when and how these values were accessed.

Previously, an autovacuum worker's wi_cost_limit was set only at the
beginning of vacuuming a table, after reloading the config file.
Therefore, at the time that autovac_balance_cost() was called, workers
vacuuming tables with no cost-related storage parameters could still
have different values for their wi_cost_limit_base and wi_cost_delay.

Now that the cost parameters can be updated while vacuuming a table,
workers will (within some margin of error) have no reason to have
different values for cost limit and cost delay (in the absence of
cost-related storage parameters). This removes the rationale for keeping
cost limit and cost delay in shared memory. Balancing the cost limit
requires only the number of active autovacuum workers vacuuming a table
with no cost-based storage parameters.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c  |   2 +-
 src/backend/commands/vacuum.c         |  46 +++-
 src/backend/commands/vacuumparallel.c |   1 -
 src/backend/postmaster/autovacuum.c   | 293 ++++++++++++++++----------
 src/include/commands/vacuum.h         |   1 +
 5 files changed, 221 insertions(+), 122 deletions(-)
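
As a reading aid for the balancing rule described in the commit message, here is a minimal standalone sketch of how a worker's effective cost limit could be derived. The function name and its parameters are hypothetical stand-ins for the patch's globals (av_storage_param_cost_limit, autovacuum_vac_cost_limit, VacuumCostLimit, wi_dobalance and av_nworkersForBalance). It is not the patch's code and it omits the atomics and shared memory entirely.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper mirroring the precedence used for the cost limit:
 * table storage parameter first, then the autovacuum GUC, then the plain
 * vacuum GUC. Only workers flagged for balancing divide the limit.
 */
int
effective_cost_limit(int storage_param_limit, int autovac_limit,
					 int vacuum_limit, bool dobalance,
					 int nworkers_for_balance)
{
	int			limit;

	/* A storage parameter wins outright and is never balanced */
	if (storage_param_limit > 0)
		return storage_param_limit;

	/* Zero or -1 in the autovacuum setting means "use the vacuum setting" */
	limit = (autovac_limit > 0) ? autovac_limit : vacuum_limit;

	if (!dobalance)
		return limit;

	assert(limit > 0);
	assert(nworkers_for_balance > 0);

	/* Split evenly, keeping a lower bound of 1 to avoid a zero limit */
	limit = limit / nworkers_for_balance;
	return (limit > 1) ? limit : 1;
}
```

With a limit of 200 shared by three balanced workers, each worker would get 66. A worker vacuuming a table that sets its own cost limit keeps that value unchanged.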

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 2ba85bd3d6..0a9ebd22bd 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -389,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
 	Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
 		   params->truncate != VACOPTVALUE_AUTO);
-	VacuumFailsafeActive = false;
+	Assert(!VacuumFailsafeActive);
 	vacrel->consider_bypass_optimization = true;
 	vacrel->do_index_vacuuming = true;
 	vacrel->do_index_cleanup = true;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index f2be74cdb5..ca347e0a6d 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -48,6 +48,7 @@
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/interrupt.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/pmsignal.h"
@@ -523,9 +524,9 @@ vacuum(List *relations, VacuumParams *params, BufferAccessStrategy bstrategy,
 	{
 		ListCell   *cur;
 
-		VacuumUpdateCosts();
 		in_vacuum = true;
-		VacuumCostActive = (vacuum_cost_delay > 0);
+		VacuumFailsafeActive = false;
+		VacuumUpdateCosts();
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
@@ -579,12 +580,20 @@ vacuum(List *relations, VacuumParams *params, BufferAccessStrategy bstrategy,
 					CommandCounterIncrement();
 				}
 			}
+
+			/*
+			 * Ensure VacuumFailsafeActive has been reset before vacuuming the
+			 * next relation.
+			 */
+			VacuumFailsafeActive = false;
 		}
 	}
 	PG_FINALLY();
 	{
 		in_vacuum = false;
 		VacuumCostActive = false;
+		VacuumFailsafeActive = false;
+		VacuumCostBalance = 0;
 	}
 	PG_END_TRY();
 
@@ -2245,7 +2254,28 @@ vacuum_delay_point(void)
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	if (!VacuumCostActive || InterruptPending)
+	if (InterruptPending ||
+		(!VacuumCostActive && !ConfigReloadPending))
+		return;
+
+	/*
+	 * Autovacuum workers should reload the configuration file if requested.
+	 * This allows changes to [autovacuum_]vacuum_cost_limit and
+	 * [autovacuum_]vacuum_cost_delay to take effect while a table is being
+	 * vacuumed or analyzed.
+	 */
+	if (ConfigReloadPending && IsAutoVacuumWorkerProcess())
+	{
+		ConfigReloadPending = false;
+		ProcessConfigFile(PGC_SIGHUP);
+		VacuumUpdateCosts();
+	}
+
+	/*
+	 * If we disabled cost-based delays after reloading the config file,
+	 * return.
+	 */
+	if (!VacuumCostActive)
 		return;
 
 	/*
@@ -2278,7 +2308,15 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		VacuumUpdateCosts();
+		/*
+		 * Balance and update limit values for autovacuum workers. We must do
+		 * this periodically, as the number of workers across which we are
+		 * balancing the limit may have changed.
+		 *
+		 * XXX: There may be better criteria for determining when to do this
+		 * besides "check after napping".
+		 */
+		AutoVacuumUpdateCostLimit();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index cc0aff7904..e200d5caf8 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -995,7 +995,6 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												 false);
 
 	/* Set cost-based vacuum delay */
-	VacuumCostActive = (vacuum_cost_delay > 0);
 	VacuumUpdateCosts();
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 3644b86443..17177c44c6 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -139,6 +139,18 @@ int			Log_autovacuum_min_duration = 600000;
 static bool am_autovacuum_launcher = false;
 static bool am_autovacuum_worker = false;
 
+/*
+ * Variables to save the cost-related storage parameters for the current
+ * relation being vacuumed by this autovacuum worker. Using these, we can
+ * ensure we don't overwrite the values of vacuum_cost_delay and
+ * vacuum_cost_limit after reloading the configuration file. They are
+ * initialized to "invalid" values to indicate that no cost-related storage
+ * parameters were specified and will be set in do_autovacuum() after checking
+ * the storage parameters in table_recheck_autovac().
+ */
+static double av_storage_param_cost_delay = -1;
+static int	av_storage_param_cost_limit = -1;
+
 /* Flags set by signal handlers */
 static volatile sig_atomic_t got_SIGUSR2 = false;
 
@@ -189,8 +201,8 @@ typedef struct autovac_table
 {
 	Oid			at_relid;
 	VacuumParams at_params;
-	double		at_vacuum_cost_delay;
-	int			at_vacuum_cost_limit;
+	double		at_storage_param_vac_cost_delay;
+	int			at_storage_param_vac_cost_limit;
 	bool		at_dobalance;
 	bool		at_sharedrel;
 	char	   *at_relname;
@@ -209,7 +221,7 @@ typedef struct autovac_table
  * wi_sharedrel flag indicating whether table is marked relisshared
  * wi_proc		pointer to PGPROC of the running worker, NULL if not started
  * wi_launchtime Time at which this worker was launched
- * wi_cost_*	Vacuum cost-based delay parameters current in this worker
+ * wi_dobalance Whether this worker should be included in balance calculations
  *
  * All fields are protected by AutovacuumLock, except for wi_tableoid and
  * wi_sharedrel which are protected by AutovacuumScheduleLock (note these
@@ -223,11 +235,8 @@ typedef struct WorkerInfoData
 	Oid			wi_tableoid;
 	PGPROC	   *wi_proc;
 	TimestampTz wi_launchtime;
-	bool		wi_dobalance;
+	pg_atomic_flag wi_dobalance;
 	bool		wi_sharedrel;
-	double		wi_cost_delay;
-	int			wi_cost_limit;
-	int			wi_cost_limit_base;
 } WorkerInfoData;
 
 typedef struct WorkerInfoData *WorkerInfo;
@@ -273,6 +282,8 @@ typedef struct AutoVacuumWorkItem
  * av_startingWorker pointer to WorkerInfo currently being started (cleared by
  *					the worker itself as soon as it's up and running)
  * av_workItems		work item array
+ * av_nworkersForBalance the number of autovacuum workers to use when
+ * 					calculating the per worker cost limit
  *
  * This struct is protected by AutovacuumLock, except for av_signal and parts
  * of the worker list (see above).
@@ -286,6 +297,7 @@ typedef struct
 	dlist_head	av_runningWorkers;
 	WorkerInfo	av_startingWorker;
 	AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+	pg_atomic_uint32 av_nworkersForBalance;
 } AutoVacuumShmemStruct;
 
 static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -319,7 +331,7 @@ static void launch_worker(TimestampTz now);
 static List *get_database_list(void);
 static void rebuild_database_list(Oid newdb);
 static int	db_comparator(const void *a, const void *b);
-static void autovac_balance_cost(void);
+static void autovac_recalculate_workers_for_balance(void);
 
 static void do_autovacuum(void);
 static void FreeWorkerInfo(int code, Datum arg);
@@ -669,7 +681,7 @@ AutoVacLauncherMain(int argc, char *argv[])
 			{
 				LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 				AutoVacuumShmem->av_signal[AutoVacRebalance] = false;
-				autovac_balance_cost();
+				autovac_recalculate_workers_for_balance();
 				LWLockRelease(AutovacuumLock);
 			}
 
@@ -818,11 +830,6 @@ HandleAutoVacLauncherInterrupts(void)
 		if (!AutoVacuumingActive())
 			AutoVacLauncherShutdown();
 
-		/* rebalance in case the default cost parameters changed */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-		autovac_balance_cost();
-		LWLockRelease(AutovacuumLock);
-
 		/* rebuild the list in case the naptime changed */
 		rebuild_database_list(InvalidOid);
 	}
@@ -1754,10 +1761,7 @@ FreeWorkerInfo(int code, Datum arg)
 		MyWorkerInfo->wi_sharedrel = false;
 		MyWorkerInfo->wi_proc = NULL;
 		MyWorkerInfo->wi_launchtime = 0;
-		MyWorkerInfo->wi_dobalance = false;
-		MyWorkerInfo->wi_cost_delay = 0;
-		MyWorkerInfo->wi_cost_limit = 0;
-		MyWorkerInfo->wi_cost_limit_base = 0;
+		pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
 		dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 						&MyWorkerInfo->wi_links);
 		/* not mine anymore */
@@ -1781,10 +1785,20 @@ FreeWorkerInfo(int code, Datum arg)
 void
 VacuumUpdateCosts(void)
 {
+	double		original_cost_delay = vacuum_cost_delay;
+	int			original_cost_limit = vacuum_cost_limit;
+
 	if (MyWorkerInfo)
 	{
-		vacuum_cost_delay = MyWorkerInfo->wi_cost_delay;
-		vacuum_cost_limit = MyWorkerInfo->wi_cost_limit;
+		if (av_storage_param_cost_delay >= 0)
+			vacuum_cost_delay = av_storage_param_cost_delay;
+		else if (autovacuum_vac_cost_delay >= 0)
+			vacuum_cost_delay = autovacuum_vac_cost_delay;
+		else
+			/* fall back to VacuumCostDelay */
+			vacuum_cost_delay = VacuumCostDelay;
+
+		AutoVacuumUpdateCostLimit();
 	}
 	else
 	{
@@ -1792,88 +1806,128 @@ VacuumUpdateCosts(void)
 		vacuum_cost_delay = VacuumCostDelay;
 		vacuum_cost_limit = VacuumCostLimit;
 	}
+
+	/*
+	 * If configuration changes are allowed to impact VacuumCostActive, make
+	 * sure it is updated.
+	 */
+	if (VacuumFailsafeActive)
+		Assert(!VacuumCostActive);
+	else if (vacuum_cost_delay > 0)
+		VacuumCostActive = true;
+	else
+	{
+		VacuumCostActive = false;
+		VacuumCostBalance = 0;
+	}
+
+	if (MyWorkerInfo)
+	{
+		Oid			dboid,
+					tableoid;
+
+		/* Only log updates to cost-related variables */
+		if (vacuum_cost_delay == original_cost_delay &&
+			vacuum_cost_limit == original_cost_limit)
+			return;
+
+		Assert(!LWLockHeldByMe(AutovacuumLock));
+
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
+		dboid = MyWorkerInfo->wi_dboid;
+		tableoid = MyWorkerInfo->wi_tableoid;
+		LWLockRelease(AutovacuumLock);
+
+		elog(DEBUG2,
+			 "Autovacuum VacuumUpdateCosts(db=%u, rel=%u, dobalance=%s, cost_limit=%d, cost_delay=%g active=%s failsafe=%s)",
+			 dboid, tableoid, pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance) ? "no" : "yes",
+			 vacuum_cost_limit, vacuum_cost_delay,
+			 vacuum_cost_delay > 0 ? "yes" : "no",
+			 VacuumFailsafeActive ? "yes" : "no");
+
+	}
 }
 
 /*
- * autovac_balance_cost
- *		Recalculate the cost limit setting for each active worker.
- *
- * Caller must hold the AutovacuumLock in exclusive mode.
+ * Update vacuum_cost_limit with the correct value for an autovacuum worker,
+ * given the value of other relevant cost limit parameters and the number of
+ * workers across which the limit must be balanced. Autovacuum workers must
+ * call this regularly in case av_nworkersForBalance has been updated by
+ * another worker or by the autovacuum launcher. They must also call it after a
+ * config reload.
  */
-static void
-autovac_balance_cost(void)
+void
+AutoVacuumUpdateCostLimit(void)
 {
+	if (!MyWorkerInfo)
+		return;
+
 	/*
-	 * The idea here is that we ration out I/O equally.  The amount of I/O
-	 * that a worker can consume is determined by cost_limit/cost_delay, so we
-	 * try to equalize those ratios rather than the raw limit settings.
-	 *
 	 * note: in cost_limit, zero also means use value from elsewhere, because
 	 * zero is not a valid value.
 	 */
-	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : VacuumCostLimit);
-	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : VacuumCostDelay);
-	double		cost_total;
-	double		cost_avail;
-	dlist_iter	iter;
-
-	/* not set? nothing to do */
-	if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
-		return;
 
-	/* calculate the total base cost limit of participating active workers */
-	cost_total = 0.0;
-	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
+	if (av_storage_param_cost_limit > 0)
+		vacuum_cost_limit = av_storage_param_cost_limit;
+	else
 	{
-		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
+		int			nworkers_for_balance;
+
+		if (autovacuum_vac_cost_limit > 0)
+			vacuum_cost_limit = autovacuum_vac_cost_limit;
+		else
+			vacuum_cost_limit = VacuumCostLimit;
+
+		/* Only balance limit if no cost-related storage parameters specified */
+		if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
+			return;
+
+		Assert(vacuum_cost_limit > 0);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-			cost_total +=
-				(double) worker->wi_cost_limit_base / worker->wi_cost_delay;
+		nworkers_for_balance = pg_atomic_read_u32(&AutoVacuumShmem->av_nworkersForBalance);
+
+		/* There is at least 1 autovac worker (this worker) */
+		if (nworkers_for_balance <= 0)
+			elog(ERROR, "nworkers_for_balance must be > 0");
+
+		vacuum_cost_limit = Max(vacuum_cost_limit / nworkers_for_balance, 1);
 	}
+}
 
-	/* there are no cost limits -- nothing to do */
-	if (cost_total <= 0)
-		return;
+/*
+ * autovac_recalculate_workers_for_balance
+ *		Recalculate the number of workers to consider, given cost-related
+ *		storage parameters and the current number of active workers.
+ *
+ * Caller must hold the AutovacuumLock in at least shared mode to access
+ * worker->wi_proc.
+ */
+static void
+autovac_recalculate_workers_for_balance(void)
+{
+	dlist_iter	iter;
+	int			orig_nworkers_for_balance;
+	int			nworkers_for_balance = 0;
+
+	Assert(LWLockHeldByMe(AutovacuumLock));
+
+	orig_nworkers_for_balance =
+		pg_atomic_read_u32(&AutoVacuumShmem->av_nworkersForBalance);
 
-	/*
-	 * Adjust cost limit of each active worker to balance the total of cost
-	 * limit to autovacuum_vacuum_cost_limit.
-	 */
-	cost_avail = (double) vac_cost_limit / vac_cost_delay;
 	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
 	{
 		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-		{
-			int			limit = (int)
-			(cost_avail * worker->wi_cost_limit_base / cost_total);
-
-			/*
-			 * We put a lower bound of 1 on the cost_limit, to avoid division-
-			 * by-zero in the vacuum code.  Also, in case of roundoff trouble
-			 * in these calculations, let's be sure we don't ever set
-			 * cost_limit to more than the base value.
-			 */
-			worker->wi_cost_limit = Max(Min(limit,
-											worker->wi_cost_limit_base),
-										1);
-		}
+		if (worker->wi_proc == NULL ||
+			pg_atomic_unlocked_test_flag(&worker->wi_dobalance))
+			continue;
 
-		if (worker->wi_proc != NULL)
-			elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
-				 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
-				 worker->wi_dobalance ? "yes" : "no",
-				 worker->wi_cost_limit, worker->wi_cost_limit_base,
-				 worker->wi_cost_delay);
+		nworkers_for_balance++;
 	}
+
+	if (nworkers_for_balance != orig_nworkers_for_balance)
+		pg_atomic_write_u32(&AutoVacuumShmem->av_nworkersForBalance,
+							nworkers_for_balance);
 }
 
 /*
@@ -2421,23 +2475,34 @@ do_autovacuum(void)
 			continue;
 		}
 
-		/* Must hold AutovacuumLock while mucking with cost balance info */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+		/*
+		 * Save the cost-related storage parameter values in global variables
+		 * for reference when updating vacuum_cost_delay and vacuum_cost_limit
+		 * during vacuuming this table.
+		 */
+		av_storage_param_cost_delay = tab->at_storage_param_vac_cost_delay;
+		av_storage_param_cost_limit = tab->at_storage_param_vac_cost_limit;
 
-		/* advertise my cost delay parameters for the balancing algorithm */
-		MyWorkerInfo->wi_dobalance = tab->at_dobalance;
-		MyWorkerInfo->wi_cost_delay = tab->at_vacuum_cost_delay;
-		MyWorkerInfo->wi_cost_limit = tab->at_vacuum_cost_limit;
-		MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
+		/*
+		 * We only expect this worker to ever set the flag, so don't bother
+		 * checking the return value. We shouldn't have to retry.
+		 */
+		if (tab->at_dobalance)
+			pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
+		else
+			pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
 
-		/* do a balance */
-		autovac_balance_cost();
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
+		autovac_recalculate_workers_for_balance();
+		LWLockRelease(AutovacuumLock);
 
-		/* set the active cost parameters from the result of that */
+		/*
+		 * We wait until this point to update cost delay and cost limit
+		 * values, even though we reloaded the configuration file above, so
+		 * that we can take into account the cost-related storage parameters.
+		 */
 		VacuumUpdateCosts();
 
-		/* done */
-		LWLockRelease(AutovacuumLock);
 
 		/* clean up memory before each iteration */
 		MemoryContextResetAndDeleteChildren(PortalContext);
@@ -2521,16 +2586,17 @@ deleted:
 		pfree(tab);
 
 		/*
-		 * Remove my info from shared memory.  We could, but intentionally
-		 * don't, clear wi_cost_limit and friends --- this is on the
-		 * assumption that we probably have more to do with similar cost
-		 * settings, so we don't want to give up our share of I/O for a very
-		 * short interval and thereby thrash the global balance.
+		 * Remove my info from shared memory.  We set wi_dobalance on the
+		 * assumption that we are more likely than not to vacuum a table with
+		 * no cost-related storage parameters next, so we want to claim our
+		 * share of I/O as soon as possible to avoid thrashing the global
+		 * balance.
 		 */
 		LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
 		MyWorkerInfo->wi_tableoid = InvalidOid;
 		MyWorkerInfo->wi_sharedrel = false;
 		LWLockRelease(AutovacuumScheduleLock);
+		pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
 	}
 
 	/*
@@ -2562,6 +2628,7 @@ deleted:
 		{
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			VacuumUpdateCosts();
 		}
 
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
@@ -2797,8 +2864,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		int			freeze_table_age;
 		int			multixact_freeze_min_age;
 		int			multixact_freeze_table_age;
-		int			vac_cost_limit;
-		double		vac_cost_delay;
 		int			log_min_duration;
 
 		/*
@@ -2808,20 +2873,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 * defaults, autovacuum's own first and plain vacuum second.
 		 */
 
-		/* -1 in autovac setting means use plain vacuum_cost_delay */
-		vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
-			? avopts->vacuum_cost_delay
-			: (autovacuum_vac_cost_delay >= 0)
-			? autovacuum_vac_cost_delay
-			: VacuumCostDelay;
-
-		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
-		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
-			? avopts->vacuum_cost_limit
-			: (autovacuum_vac_cost_limit > 0)
-			? autovacuum_vac_cost_limit
-			: VacuumCostLimit;
-
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
 			? avopts->log_min_duration
@@ -2877,8 +2928,10 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age;
 		tab->at_params.is_wraparound = wraparound;
 		tab->at_params.log_min_duration = log_min_duration;
-		tab->at_vacuum_cost_limit = vac_cost_limit;
-		tab->at_vacuum_cost_delay = vac_cost_delay;
+		tab->at_storage_param_vac_cost_limit = avopts ?
+			avopts->vacuum_cost_limit : 0;
+		tab->at_storage_param_vac_cost_delay = avopts ?
+			avopts->vacuum_cost_delay : -1;
 		tab->at_relname = NULL;
 		tab->at_nspname = NULL;
 		tab->at_datname = NULL;
@@ -3380,10 +3433,18 @@ AutoVacuumShmemInit(void)
 		worker = (WorkerInfo) ((char *) AutoVacuumShmem +
 							   MAXALIGN(sizeof(AutoVacuumShmemStruct)));
 
+
 		/* initialize the WorkerInfo free list */
 		for (i = 0; i < autovacuum_max_workers; i++)
+		{
 			dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 							&worker[i].wi_links);
+
+			pg_atomic_init_flag(&worker[i].wi_dobalance);
+		}
+
+		pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
+
 	}
 	else
 		Assert(found);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 50caf1315d..2a856b0e5e 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -350,6 +350,7 @@ extern IndexBulkDeleteResult *vac_cleanup_one_index(IndexVacuumInfo *ivinfo,
 extern Size vac_max_items_to_alloc_size(int max_items);
 
 /* In postmaster/autovacuum.c */
+extern void AutoVacuumUpdateCostLimit(void);
 extern void VacuumUpdateCosts(void);
 
 /* in commands/vacuumparallel.c */
-- 
2.37.2

v19-0002-Separate-vacuum-cost-variables-from-GUCs.patch (text/x-patch; charset=US-ASCII)
From 92fc87969a496c1a1f5fd612d3c7c32251dec6e1 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Thu, 6 Apr 2023 16:02:12 -0400
Subject: [PATCH v19 2/3] Separate vacuum cost variables from GUCs

Vacuum code run both by autovacuum workers and a backend doing
VACUUM/ANALYZE previously inspected VacuumCostLimit and VacuumCostDelay,
which are the global variables backing the GUCs vacuum_cost_limit and
vacuum_cost_delay.

Autovacuum workers needed to override these variables with their own
values, derived from autovacuum_vacuum_cost_limit and
autovacuum_vacuum_cost_delay and worker cost limit balancing logic. This
led to confusing code which, in some cases, both derived and set a new
value of VacuumCostLimit from VacuumCostLimit.

In preparation for refreshing these GUC values more often, introduce
new, independent global variables and add a function to update them
using the GUCs and existing logic.

Per suggestion by Kyotaro Horiguchi

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
---
 src/backend/commands/vacuum.c         | 29 +++++++++++++++--------
 src/backend/commands/vacuumparallel.c |  3 ++-
 src/backend/postmaster/autovacuum.c   | 34 +++++++++++----------------
 src/include/commands/vacuum.h         |  5 ++++
 src/include/postmaster/autovacuum.h   |  3 ---
 5 files changed, 40 insertions(+), 34 deletions(-)
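
To illustrate how the new vacuum_cost_delay and vacuum_cost_limit variables are consumed, the sketch below reproduces the nap-length arithmetic from the non-parallel path of vacuum_delay_point() as a pure function. The function name and its parameters are hypothetical. The real code reads the globals directly and also handles the shared balance used by parallel vacuum, which is left out here.

```c
#include <assert.h>

/*
 * Hypothetical pure-function version of the nap computation: sleep in
 * proportion to how far the accumulated cost balance has run past the
 * limit, capped at four times the configured delay.
 */
double
vacuum_nap_msec(double cost_delay, int cost_limit, int cost_balance)
{
	double		msec = 0;

	assert(cost_limit > 0);

	/* No nap until the accumulated balance reaches the limit */
	if (cost_balance >= cost_limit)
		msec = cost_delay * cost_balance / cost_limit;

	/* Cap the nap at 4x the delay */
	if (msec > cost_delay * 4)
		msec = cost_delay * 4;

	return msec;
}
```

With a delay of 2 ms and a limit of 200, a balance of 300 would give a 3 ms nap, while a balance of 2000 would be capped at 8 ms.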

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7fc5c19e37..f2be74cdb5 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -72,6 +72,15 @@ int			vacuum_multixact_freeze_table_age;
 int			vacuum_failsafe_age;
 int			vacuum_multixact_failsafe_age;
 
+/*
+ * Variables for cost-based vacuum delay. The defaults differ between
+ * autovacuum and vacuum. They should be set with the appropriate GUC value in
+ * vacuum code. They are initialized here to the defaults for client backends
+ * executing VACUUM or ANALYZE.
+ */
+double		vacuum_cost_delay = 0;
+int			vacuum_cost_limit = 200;
+
 /*
  * VacuumFailsafeActive is defined as a global so that we can determine
  * whether or not to re-enable cost-based vacuum delay when vacuuming a table.
@@ -514,8 +523,9 @@ vacuum(List *relations, VacuumParams *params, BufferAccessStrategy bstrategy,
 	{
 		ListCell   *cur;
 
+		VacuumUpdateCosts();
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
+		VacuumCostActive = (vacuum_cost_delay > 0);
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
@@ -2244,14 +2254,14 @@ vacuum_delay_point(void)
 	 */
 	if (VacuumSharedCostBalance != NULL)
 		msec = compute_parallel_delay();
-	else if (VacuumCostBalance >= VacuumCostLimit)
-		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
+	else if (VacuumCostBalance >= vacuum_cost_limit)
+		msec = vacuum_cost_delay * VacuumCostBalance / vacuum_cost_limit;
 
 	/* Nap if appropriate */
 	if (msec > 0)
 	{
-		if (msec > VacuumCostDelay * 4)
-			msec = VacuumCostDelay * 4;
+		if (msec > vacuum_cost_delay * 4)
+			msec = vacuum_cost_delay * 4;
 
 		pgstat_report_wait_start(WAIT_EVENT_VACUUM_DELAY);
 		pg_usleep(msec * 1000);
@@ -2268,8 +2278,7 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		/* update balance values for workers */
-		AutoVacuumUpdateDelay();
+		VacuumUpdateCosts();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
@@ -2319,11 +2328,11 @@ compute_parallel_delay(void)
 	/* Compute the total local balance for the current worker */
 	VacuumCostBalanceLocal += VacuumCostBalance;
 
-	if ((shared_balance >= VacuumCostLimit) &&
-		(VacuumCostBalanceLocal > 0.5 * ((double) VacuumCostLimit / nworkers)))
+	if ((shared_balance >= vacuum_cost_limit) &&
+		(VacuumCostBalanceLocal > 0.5 * ((double) vacuum_cost_limit / nworkers)))
 	{
 		/* Compute sleep time based on the local cost balance */
-		msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
+		msec = vacuum_cost_delay * VacuumCostBalanceLocal / vacuum_cost_limit;
 		pg_atomic_sub_fetch_u32(VacuumSharedCostBalance, VacuumCostBalanceLocal);
 		VacuumCostBalanceLocal = 0;
 	}
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 563117a8f6..cc0aff7904 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -995,7 +995,8 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												 false);
 
 	/* Set cost-based vacuum delay */
-	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostActive = (vacuum_cost_delay > 0);
+	VacuumUpdateCosts();
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c1e911b1b3..3644b86443 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1773,16 +1773,24 @@ FreeWorkerInfo(int code, Datum arg)
 }
 
 /*
- * Update the cost-based delay parameters, so that multiple workers consume
- * each a fraction of the total available I/O.
+ * Update vacuum cost-based delay-related parameters for autovacuum workers and
+ * backends executing VACUUM or ANALYZE using the value of relevant gucs and
+ * global state. This must be called during setup for vacuum and after every
+ * config reload to ensure up-to-date values.
  */
 void
-AutoVacuumUpdateDelay(void)
+VacuumUpdateCosts(void)
 {
 	if (MyWorkerInfo)
 	{
-		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
-		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+		vacuum_cost_delay = MyWorkerInfo->wi_cost_delay;
+		vacuum_cost_limit = MyWorkerInfo->wi_cost_limit;
+	}
+	else
+	{
+		/* Must be explicit VACUUM or ANALYZE */
+		vacuum_cost_delay = VacuumCostDelay;
+		vacuum_cost_limit = VacuumCostLimit;
 	}
 }
 
@@ -2311,8 +2319,6 @@ do_autovacuum(void)
 		autovac_table *tab;
 		bool		isshared;
 		bool		skipit;
-		double		stdVacuumCostDelay;
-		int			stdVacuumCostLimit;
 		dlist_iter	iter;
 
 		CHECK_FOR_INTERRUPTS();
@@ -2415,14 +2421,6 @@ do_autovacuum(void)
 			continue;
 		}
 
-		/*
-		 * Remember the prevailing values of the vacuum cost GUCs.  We have to
-		 * restore these at the bottom of the loop, else we'll compute wrong
-		 * values in the next iteration of autovac_balance_cost().
-		 */
-		stdVacuumCostDelay = VacuumCostDelay;
-		stdVacuumCostLimit = VacuumCostLimit;
-
 		/* Must hold AutovacuumLock while mucking with cost balance info */
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 
@@ -2436,7 +2434,7 @@ do_autovacuum(void)
 		autovac_balance_cost();
 
 		/* set the active cost parameters from the result of that */
-		AutoVacuumUpdateDelay();
+		VacuumUpdateCosts();
 
 		/* done */
 		LWLockRelease(AutovacuumLock);
@@ -2533,10 +2531,6 @@ deleted:
 		MyWorkerInfo->wi_tableoid = InvalidOid;
 		MyWorkerInfo->wi_sharedrel = false;
 		LWLockRelease(AutovacuumScheduleLock);
-
-		/* restore vacuum cost GUCs for the next iteration */
-		VacuumCostDelay = stdVacuumCostDelay;
-		VacuumCostLimit = stdVacuumCostLimit;
 	}
 
 	/*
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1223d15e0d..50caf1315d 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -307,6 +307,8 @@ extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
 
 extern PGDLLIMPORT bool VacuumFailsafeActive;
+extern PGDLLIMPORT double vacuum_cost_delay;
+extern PGDLLIMPORT int vacuum_cost_limit;
 
 /* in commands/vacuum.c */
 extern void ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel);
@@ -347,6 +349,9 @@ extern IndexBulkDeleteResult *vac_cleanup_one_index(IndexVacuumInfo *ivinfo,
 													IndexBulkDeleteResult *istat);
 extern Size vac_max_items_to_alloc_size(int max_items);
 
+/* In postmaster/autovacuum.c */
+extern void VacuumUpdateCosts(void);
+
 /* in commands/vacuumparallel.c */
 extern ParallelVacuumState *parallel_vacuum_init(Relation rel, Relation *indrels,
 												 int nindexes, int nrequested_workers,
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index c140371b51..65afd1ea1e 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -63,9 +63,6 @@ extern int	StartAutoVacWorker(void);
 /* called from postmaster when a worker could not be forked */
 extern void AutoVacWorkerFailed(void);
 
-/* autovacuum cost-delay balancer */
-extern void AutoVacuumUpdateDelay(void);
-
 #ifdef EXEC_BACKEND
 extern void AutoVacLauncherMain(int argc, char *argv[]) pg_attribute_noreturn();
 extern void AutoVacWorkerMain(int argc, char *argv[]) pg_attribute_noreturn();
-- 
2.37.2

#75Daniel Gustafsson
daniel@yesql.se
In reply to: Melanie Plageman (#74)
Re: Should vacuum process config file reload more often

On 7 Apr 2023, at 00:12, Melanie Plageman <melanieplageman@gmail.com> wrote:

On Thu, Apr 6, 2023 at 5:45 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 6 Apr 2023, at 23:06, Melanie Plageman <melanieplageman@gmail.com> wrote:

Autovacuum workers, at the end of VacuumUpdateCosts(), check if cost
limit or cost delay have been changed. If they have, they assert that
they don't already hold the AutovacuumLock, take it in shared mode, and
do the logging.

Another idea would be to copy the values to local temp variables while holding
the lock, and release the lock before calling elog() to avoid holding the lock
over potential IO.
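
The copy-then-release idea above can be sketched outside of PostgreSQL as follows. This is only an illustration: ToyShared, toy_lock_acquire(), toy_lock_release() and format_cost_message() are hypothetical names, the "lock" is a single-threaded flag rather than an LWLock, and snprintf() stands in for elog().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical shared state; the flag is a toy, single-threaded "lock". */
typedef struct ToyShared
{
    bool         lock_held;
    unsigned int dboid;
    unsigned int tableoid;
} ToyShared;

void
toy_lock_acquire(ToyShared *s)
{
    assert(!s->lock_held);      /* must not already hold the lock */
    s->lock_held = true;
}

void
toy_lock_release(ToyShared *s)
{
    assert(s->lock_held);
    s->lock_held = false;
}

/*
 * Copy the shared fields into locals while the lock is held, release the
 * lock, and only then do the (potentially slow) formatting.
 * Returns the snprintf() result.
 */
int
format_cost_message(ToyShared *s, char *buf, size_t buflen,
                    double cost_delay, int cost_limit)
{
    unsigned int dboid;
    unsigned int tableoid;

    toy_lock_acquire(s);
    dboid = s->dboid;
    tableoid = s->tableoid;
    toy_lock_release(s);

    /* The lock is no longer held, so slow output cannot block others. */
    assert(!s->lock_held);
    return snprintf(buf, buflen,
                    "db %u, rel %u: cost limit %d, cost delay %g",
                    dboid, tableoid, cost_limit, cost_delay);
}
```

The only point of the sketch is the ordering: the critical section contains two assignments and nothing that can block on I/O.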

Good idea. I've done this in attached v19.
Also I looked through the docs and everything still looks correct for
balancing algo.

I had another read-through and test-through of this version, and have applied
it with some minor changes to comments and whitespace. Thanks for the quick
turnaround times on reviews in this thread!

I opted for keeping the three individual commits; squashing them didn't seem
helpful enough to future commitlog readers, and no other combination of the
three made more sense than what has been in the thread.

--
Daniel Gustafsson

#76Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Daniel Gustafsson (#75)
2 attachment(s)
Re: Should vacuum process config file reload more often

On Fri, Apr 7, 2023 at 8:08 AM Daniel Gustafsson <daniel@yesql.se> wrote:

On 7 Apr 2023, at 00:12, Melanie Plageman <melanieplageman@gmail.com> wrote:

On Thu, Apr 6, 2023 at 5:45 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 6 Apr 2023, at 23:06, Melanie Plageman <melanieplageman@gmail.com> wrote:

Autovacuum workers, at the end of VacuumUpdateCosts(), check if cost
limit or cost delay have been changed. If they have, they assert that
they don't already hold the AutovacuumLock, take it in shared mode, and
do the logging.

Another idea would be to copy the values to local temp variables while holding
the lock, and release the lock before calling elog() to avoid holding the lock
over potential IO.

Good idea. I've done this in attached v19.
Also I looked through the docs and everything still looks correct for
balancing algo.

I had another read-through and test-through of this version, and have applied
it with some minor changes to comments and whitespace. Thanks for the quick
turnaround times on reviews in this thread!

Cool!

Regarding the commit 7d71d3dd08, I have one comment:

+       /* Only log updates to cost-related variables */
+       if (vacuum_cost_delay == original_cost_delay &&
+           vacuum_cost_limit == original_cost_limit)
+           return;

IIUC by default, we log not only before starting the vacuum but also
when changing cost-related variables. Which is good, I think, because
logging the initial values would also be helpful for investigation.
However, I think that we don't log the initial vacuum cost values
depending on the values. For example, if the
autovacuum_vacuum_cost_delay storage option is set to 0, we don't log
the initial values. I think that instead of comparing old and new
values, we can write the log only if
message_level_is_interesting(DEBUG2) is true. That way, we don't need
to acquire the lwlock unnecessarily. And the code looks cleaner to me.
I've attached the patch (use_message_level_is_interesting.patch)
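
As a rough illustration of the gating described above (separate from the attached patch), the following sketch skips the lock and the message preparation unless the message would actually be emitted. The level constants, toy_level_is_interesting() and report_costs() are hypothetical stand-ins; the real message_level_is_interesting() takes only the elevel and consults the server's own settings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical levels for this sketch; a larger number is more verbose. */
enum { TOY_LOG = 0, TOY_DEBUG1 = 1, TOY_DEBUG2 = 2 };

/*
 * Toy analogue of message_level_is_interesting(): would a message at
 * "level" be emitted under the configured threshold?
 */
bool
toy_level_is_interesting(int level, int threshold)
{
    return level <= threshold;
}

/*
 * Report the cost settings at TOY_DEBUG2. Returns true if a message would
 * be emitted. *lock_count is incremented each time the (imaginary) lock
 * is taken, so callers can see that it is skipped when nobody listens.
 */
bool
report_costs(int threshold, int *lock_count)
{
    assert(lock_count != NULL);

    /* Nothing to do if the message would be discarded anyway. */
    if (!toy_level_is_interesting(TOY_DEBUG2, threshold))
        return false;

    (*lock_count)++;            /* stands in for acquire, copy, release */
    return true;                /* stands in for the elog() call */
}
```

With the threshold at TOY_LOG the function returns before touching the lock counter; with TOY_DEBUG2 it takes the lock once per call.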

Also, while testing the autovacuum delay with relopt
autovacuum_vacuum_cost_delay = 0, I realized that even if we set
autovacuum_vacuum_cost_delay = 0 on a table, wi_dobalance is set to
true. wi_dobalance comes from the following expression:

    /*
     * If any of the cost delay parameters has been set individually for
     * this table, disable the balancing algorithm.
     */
    tab->at_dobalance =
        !(avopts && (avopts->vacuum_cost_limit > 0 ||
                     avopts->vacuum_cost_delay > 0));

The initial values of both avopts->vacuum_cost_limit and
avopts->vacuum_cost_delay are -1. I think we should use ">= 0" instead
of "> 0". Otherwise, we include the autovacuum worker working on a
table whose autovacuum_vacuum_cost_delay is 0 in the balancing
algorithm. Probably this behavior has existed also on back branches
but I haven't checked it yet.
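
To make the difference between the two comparisons concrete, here is a small standalone sketch. ToyCostOpts and the two function names are hypothetical; only the boolean expressions mirror the ones discussed above, with -1 meaning "not set".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-table options; -1 means "not set". */
typedef struct ToyCostOpts
{
    int    vacuum_cost_limit;
    double vacuum_cost_delay;
} ToyCostOpts;

/* The expression as quoted: a delay of 0 is treated like "not set". */
bool
dobalance_with_gt(const ToyCostOpts *opts)
{
    return !(opts && (opts->vacuum_cost_limit > 0 ||
                      opts->vacuum_cost_delay > 0));
}

/* The expression with the proposed ">= 0" comparisons. */
bool
dobalance_with_ge(const ToyCostOpts *opts)
{
    return !(opts && (opts->vacuum_cost_limit >= 0 ||
                      opts->vacuum_cost_delay >= 0));
}

/* Walk through the case from the message above. */
void
dobalance_demo(void)
{
    ToyCostOpts unset = {-1, -1.0};
    ToyCostOpts zero_delay = {-1, 0.0};

    /* Nothing set on the table: both variants keep balancing enabled. */
    assert(dobalance_with_gt(&unset) && dobalance_with_ge(&unset));

    /* Delay explicitly set to 0: only ">= 0" disables balancing. */
    assert(dobalance_with_gt(&zero_delay));
    assert(!dobalance_with_ge(&zero_delay));
}
```

The zero_delay case is the one from the message: with "> 0" the worker stays in the balancing set even though the table has its own delay setting.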

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

fix.patch (application/octet-stream)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 53c8f8d79c..2036b39ad5 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2951,8 +2951,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 * this table, disable the balancing algorithm.
 		 */
 		tab->at_dobalance =
-			!(avopts && (avopts->vacuum_cost_limit > 0 ||
-						 avopts->vacuum_cost_delay > 0));
+			!(avopts && (avopts->vacuum_cost_limit >= 0 ||
+						 avopts->vacuum_cost_delay >= 0));
 	}
 
 	heap_freetuple(classTup);

use_message_level_is_interesting.patch (application/octet-stream)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 53c8f8d79c..44497f6734 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1785,9 +1785,6 @@ FreeWorkerInfo(int code, Datum arg)
 void
 VacuumUpdateCosts(void)
 {
-	double		original_cost_delay = vacuum_cost_delay;
-	int			original_cost_limit = vacuum_cost_limit;
-
 	if (MyWorkerInfo)
 	{
 		if (av_storage_param_cost_delay >= 0)
@@ -1821,16 +1818,11 @@ VacuumUpdateCosts(void)
 		VacuumCostBalance = 0;
 	}
 
-	if (MyWorkerInfo)
+	if (MyWorkerInfo && message_level_is_interesting(DEBUG2))
 	{
 		Oid			dboid,
 					tableoid;
 
-		/* Only log updates to cost-related variables */
-		if (vacuum_cost_delay == original_cost_delay &&
-			vacuum_cost_limit == original_cost_limit)
-			return;
-
 		Assert(!LWLockHeldByMe(AutovacuumLock));
 
 		LWLockAcquire(AutovacuumLock, LW_SHARED);
#77Daniel Gustafsson
daniel@yesql.se
In reply to: Masahiko Sawada (#76)
Re: Should vacuum process config file reload more often

On 7 Apr 2023, at 08:52, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Apr 7, 2023 at 8:08 AM Daniel Gustafsson <daniel@yesql.se> wrote:

I had another read-through and test-through of this version, and have applied
it with some minor changes to comments and whitespace. Thanks for the quick
turnaround times on reviews in this thread!

Cool!

Regarding the commit 7d71d3dd08, I have one comment:

+       /* Only log updates to cost-related variables */
+       if (vacuum_cost_delay == original_cost_delay &&
+           vacuum_cost_limit == original_cost_limit)
+           return;

IIUC by default, we log not only before starting the vacuum but also
when changing cost-related variables. Which is good, I think, because
logging the initial values would also be helpful for investigation.
However, I think that we don't log the initial vacuum cost values
depending on the values. For example, if the
autovacuum_vacuum_cost_delay storage option is set to 0, we don't log
the initial values. I think that instead of comparing old and new
values, we can write the log only if
message_level_is_interesting(DEBUG2) is true. That way, we don't need
to acquire the lwlock unnecessarily. And the code looks cleaner to me.
I've attached the patch (use_message_level_is_interesting.patch)

That's a good idea; unless Melanie has conflicting opinions, I think we should
go ahead with this. Avoiding taking a lock here is a good save.

Also, while testing the autovacuum delay with relopt
autovacuum_vacuum_cost_delay = 0, I realized that even if we set
autovacuum_vacuum_cost_delay = 0 on a table, wi_dobalance is set to
true. wi_dobalance comes from the following expression:

    /*
     * If any of the cost delay parameters has been set individually for
     * this table, disable the balancing algorithm.
     */
    tab->at_dobalance =
        !(avopts && (avopts->vacuum_cost_limit > 0 ||
                     avopts->vacuum_cost_delay > 0));

The initial values of both avopts->vacuum_cost_limit and
avopts->vacuum_cost_delay are -1. I think we should use ">= 0" instead
of "> 0". Otherwise, we include the autovacuum worker working on a
table whose autovacuum_vacuum_cost_delay is 0 in the balancing
algorithm. Probably this behavior has existed also on back branches
but I haven't checked it yet.

Interesting, good find. Looking quickly at the back branches, I think there is
a variant of this for vacuum_cost_limit even there, but it needs more investigation.

--
Daniel Gustafsson

#78Melanie Plageman
melanieplageman@gmail.com
In reply to: Masahiko Sawada (#76)
Re: Should vacuum process config file reload more often

On Fri, Apr 7, 2023 at 2:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Apr 7, 2023 at 8:08 AM Daniel Gustafsson <daniel@yesql.se> wrote:

On 7 Apr 2023, at 00:12, Melanie Plageman <melanieplageman@gmail.com> wrote:

On Thu, Apr 6, 2023 at 5:45 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 6 Apr 2023, at 23:06, Melanie Plageman <melanieplageman@gmail.com> wrote:

Autovacuum workers, at the end of VacuumUpdateCosts(), check if cost
limit or cost delay have been changed. If they have, they assert that
they don't already hold the AutovacuumLock, take it in shared mode, and
do the logging.

Another idea would be to copy the values to local temp variables while holding
the lock, and release the lock before calling elog() to avoid holding the lock
over potential IO.

Good idea. I've done this in attached v19.
Also I looked through the docs and everything still looks correct for
balancing algo.

I had another read-through and test-through of this version, and have applied
it with some minor changes to comments and whitespace. Thanks for the quick
turnaround times on reviews in this thread!

Cool!

Regarding the commit 7d71d3dd08, I have one comment:

+       /* Only log updates to cost-related variables */
+       if (vacuum_cost_delay == original_cost_delay &&
+           vacuum_cost_limit == original_cost_limit)
+           return;

IIUC by default, we log not only before starting the vacuum but also
when changing cost-related variables. Which is good, I think, because
logging the initial values would also be helpful for investigation.
However, I think that we don't log the initial vacuum cost values
depending on the values. For example, if the
autovacuum_vacuum_cost_delay storage option is set to 0, we don't log
the initial values. I think that instead of comparing old and new
values, we can write the log only if
message_level_is_interesting(DEBUG2) is true. That way, we don't need
to acquire the lwlock unnecessarily. And the code looks cleaner to me.
I've attached the patch (use_message_level_is_interesting.patch)

Thanks for coming up with the case you thought of with storage param for
cost delay = 0. In that case we wouldn't print the message initially and
we should fix that.

I disagree, however, that we should condition it only on
message_level_is_interesting().

Actually, outside of printing initial values when the autovacuum worker
first starts (before vacuuming all tables), I don't think we should log
these values except when they are being updated. Autovacuum workers
could vacuum tons of small tables and having this print out at least
once per table (which I know is how it is on master) would be
distracting. Also, you could be reloading the config to update some
other GUCs and be oblivious to an ongoing autovacuum and get these
messages printed out, which I would also find distracting.

You will have to stare very hard at the logs to tell if your changes to
vacuum cost delay and limit took effect when you reload config. I think
with our changes to update the values more often, we should take the
opportunity to make this logging more useful by making it happen only
when the values are changed.

I would be open to elevating the log level to DEBUG1 for logging only
updates and, perhaps, having an option if you set log level to DEBUG2,
for example, to always log these values in VacuumUpdateCosts().

I'd even argue that, potentially, having the cost-delay related
parameters printed at the beginning of vacuuming could be interesting to
regular VACUUM as well (even though it doesn't benefit from config
reload while in progress).

To fix the issue you mentioned and ensure the logging is printed when
autovacuum workers start up before vacuuming tables, we could either
initialize vacuum_cost_delay and vacuum_cost_limit to something invalid
that will always be different than what they are set to in
VacuumUpdateCosts() (not sure if this poses a problem for VACUUM using
these values since they are set to the defaults for VACUUM). Or, we
could duplicate this logging message in do_autovacuum().
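
The first option (starting from values that can never match a real setting) can be sketched as below. ToyCosts and toy_update_costs() are hypothetical, the numbers are illustrative, and the sketch does not address the open question of whether an invalid starting value would be a problem for plain VACUUM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical holder for the active cost settings. */
typedef struct ToyCosts
{
    double delay;
    int    limit;
} ToyCosts;

/*
 * Install new values and report whether they differ from the previous
 * ones, i.e. whether an "only log on change" policy would log them.
 */
bool
toy_update_costs(ToyCosts *cur, double new_delay, int new_limit)
{
    bool changed;

    assert(cur != NULL);
    assert(new_delay >= 0 && new_limit > 0);    /* valid settings only */

    changed = (new_delay != cur->delay || new_limit != cur->limit);
    cur->delay = new_delay;
    cur->limit = new_limit;
    return changed;
}
```

Starting from {-1.0, -1}, the first call always reports a change, even for a delay of 0, because no valid setting equals the sentinel. Starting from values that happen to equal the first computed setting, for example {0.0, 200} followed by an update to the same pair, the first call reports no change, which is the missing initial log line described earlier in the thread.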

Finally, one other point about message_level_is_interesting(). I liked
the idea of using it a lot, since log level DEBUG2 will not be the
common case. I thought of it but hesitated because all other users of
message_level_is_interesting() are avoiding some memory allocation or
string copying -- not avoiding taking a lock. Making this conditioned on
log level made me a bit uncomfortable. I can't think of a situation when
it would be a problem, but it felt a bit off.

Also, while testing the autovacuum delay with relopt
autovacuum_vacuum_cost_delay = 0, I realized that even if we set
autovacuum_vacuum_cost_delay = 0 on a table, wi_dobalance is set to
true. wi_dobalance comes from the following expression:

    /*
     * If any of the cost delay parameters has been set individually for
     * this table, disable the balancing algorithm.
     */
    tab->at_dobalance =
        !(avopts && (avopts->vacuum_cost_limit > 0 ||
                     avopts->vacuum_cost_delay > 0));

The initial values of both avopts->vacuum_cost_limit and
avopts->vacuum_cost_delay are -1. I think we should use ">= 0" instead
of "> 0". Otherwise, we include the autovacuum worker working on a
table whose autovacuum_vacuum_cost_delay is 0 in the balancing
algorithm. Probably this behavior has existed also on back branches
but I haven't checked it yet.

Thank you for catching this. Indeed this exists in master since
1021bd6a89b which was backpatched. I checked and it is true all the way
back through REL_11_STABLE.

Definitely seems worth fixing as it kind of defeats the purpose of the
original commit. I wish I had noticed before!

Your fix has:

    !(avopts && (avopts->vacuum_cost_limit >= 0 ||
                 avopts->vacuum_cost_delay >= 0));

The delay comparison does need to be >= 0:

    avopts->vacuum_cost_delay >= 0

but the limit comparison does not. It can just be > 0.

postgres=# create table foo (a int) with (autovacuum_vacuum_cost_limit = 0);
ERROR: value 0 out of bounds for option "autovacuum_vacuum_cost_limit"
DETAIL: Valid values are between "1" and "10000".

Though >= is also fine, the rest of the code in all versions always
checks if limit > 0 and delay >= 0 since 0 is a valid value for delay
and not for limit. Probably best we keep it consistent (though the whole
thing is quite confusing).
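
A quick way to convince oneself that the two spellings agree for every value that can actually be stored is sketched below. It assumes the limit is either -1 (not set) or within 1..10000, as the error message above states; the function names and the sample delay values are hypothetical choices for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed domain: limit is -1 (not set) or 1..10000. */
static bool
toy_limit_is_valid(int limit)
{
    return limit == -1 || (limit >= 1 && limit <= 10000);
}

/* Variant keeping "limit > 0" and using "delay >= 0". */
bool
dobalance_limit_gt(int limit, double delay)
{
    assert(toy_limit_is_valid(limit));
    return !(limit > 0 || delay >= 0);
}

/* Variant using ">= 0" for both comparisons. */
bool
dobalance_limit_ge(int limit, double delay)
{
    assert(toy_limit_is_valid(limit));
    return !(limit >= 0 || delay >= 0);
}

/* Count inputs in the assumed domain where the two variants differ. */
int
count_disagreements(void)
{
    static const double delays[] = {-1.0, 0.0, 2.0, 100.0};
    int         bad = 0;

    for (int limit = -1; limit <= 10000; limit++)
    {
        if (limit == 0)
            continue;           /* 0 is rejected for the limit option */
        for (size_t i = 0; i < sizeof(delays) / sizeof(delays[0]); i++)
        {
            if (dobalance_limit_gt(limit, delays[i]) !=
                dobalance_limit_ge(limit, delays[i]))
                bad++;
        }
    }
    return bad;
}
```

The two variants could only differ for a limit of exactly 0, which the option check never lets through, so the choice between them is about consistency with the surrounding code rather than behavior.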

- Melanie

#79Daniel Gustafsson
daniel@yesql.se
In reply to: Melanie Plageman (#78)
Re: Should vacuum process config file reload more often

On 7 Apr 2023, at 15:07, Melanie Plageman <melanieplageman@gmail.com> wrote:
On Fri, Apr 7, 2023 at 2:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

+       /* Only log updates to cost-related variables */
+       if (vacuum_cost_delay == original_cost_delay &&
+           vacuum_cost_limit == original_cost_limit)
+           return;

IIUC by default, we log not only before starting the vacuum but also
when changing cost-related variables. Which is good, I think, because
logging the initial values would also be helpful for investigation.
However, I think that we don't log the initial vacuum cost values
depending on the values. For example, if the
autovacuum_vacuum_cost_delay storage option is set to 0, we don't log
the initial values. I think that instead of comparing old and new
values, we can write the log only if
message_level_is_interesting(DEBUG2) is true. That way, we don't need
to acquire the lwlock unnecessarily. And the code looks cleaner to me.
I've attached the patch (use_message_level_is_interesting.patch)

Thanks for coming up with the case you thought of with storage param for
cost delay = 0. In that case we wouldn't print the message initially and
we should fix that.

I disagree, however, that we should condition it only on
message_level_is_interesting().

I think we should keep the logging frequency as committed, but condition taking
the lock on message_level_is_interesting().

Actually, outside of printing initial values when the autovacuum worker
first starts (before vacuuming all tables), I don't think we should log
these values except when they are being updated. Autovacuum workers
could vacuum tons of small tables and having this print out at least
once per table (which I know is how it is on master) would be
distracting. Also, you could be reloading the config to update some
other GUCs and be oblivious to an ongoing autovacuum and get these
messages printed out, which I would also find distracting.

You will have to stare very hard at the logs to tell if your changes to
vacuum cost delay and limit took effect when you reload config. I think
with our changes to update the values more often, we should take the
opportunity to make this logging more useful by making it happen only
when the values are changed.

I would be open to elevating the log level to DEBUG1 for logging only
updates and, perhaps, having an option if you set log level to DEBUG2,
for example, to always log these values in VacuumUpdateCosts().

I'd even argue that, potentially, having the cost-delay related
parameters printed at the beginning of vacuuming could be interesting to
regular VACUUM as well (even though it doesn't benefit from config
reload while in progress).

To fix the issue you mentioned and ensure the logging is printed when
autovacuum workers start up before vacuuming tables, we could either
initialize vacuum_cost_delay and vacuum_cost_limit to something invalid
that will always be different than what they are set to in
VacuumUpdateCosts() (not sure if this poses a problem for VACUUM using
these values since they are set to the defaults for VACUUM). Or, we
could duplicate this logging message in do_autovacuum().

Duplicating logging, maybe with a slightly tailored message, seems the least
bad option.

Finally, one other point about message_level_is_interesting(). I liked
the idea of using it a lot, since log level DEBUG2 will not be the
common case. I thought of it but hesitated because all other users of
message_level_is_interesting() are avoiding some memory allocation or
string copying -- not avoiding taking a lock. Making this conditioned on
log level made me a bit uncomfortable. I can't think of a situation when
it would be a problem, but it felt a bit off.

Considering how uncommon DEBUG2 will be in production, I think conditioning
taking a lock on it makes sense.

Also, while testing the autovacuum delay with relopt
autovacuum_vacuum_cost_delay = 0, I realized that even if we set
autovacuum_vacuum_cost_delay = 0 on a table, wi_dobalance is set to
true. wi_dobalance comes from the following expression:

    /*
     * If any of the cost delay parameters has been set individually for
     * this table, disable the balancing algorithm.
     */
    tab->at_dobalance =
        !(avopts && (avopts->vacuum_cost_limit > 0 ||
                     avopts->vacuum_cost_delay > 0));

The initial values of both avopts->vacuum_cost_limit and
avopts->vacuum_cost_delay are -1. I think we should use ">= 0" instead
of "> 0". Otherwise, we include the autovacuum worker working on a
table whose autovacuum_vacuum_cost_delay is 0 in the balancing
algorithm. Probably this behavior has existed also on back branches
but I haven't checked it yet.

Thank you for catching this. Indeed this exists in master since
1021bd6a89b which was backpatched. I checked and it is true all the way
back through REL_11_STABLE.

Definitely seems worth fixing as it kind of defeats the purpose of the
original commit. I wish I had noticed before!

Your fix has:

    !(avopts && (avopts->vacuum_cost_limit >= 0 ||
                 avopts->vacuum_cost_delay >= 0));

The delay comparison does need to be >= 0:

    avopts->vacuum_cost_delay >= 0

but the limit comparison does not. It can just be > 0.

postgres=# create table foo (a int) with (autovacuum_vacuum_cost_limit = 0);
ERROR: value 0 out of bounds for option "autovacuum_vacuum_cost_limit"
DETAIL: Valid values are between "1" and "10000".

Though >= is also fine, the rest of the code in all versions always
checks if limit > 0 and delay >= 0 since 0 is a valid value for delay
and not for limit. Probably best we keep it consistent (though the whole
thing is quite confusing).

+1

--
Daniel Gustafsson

#80Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#78)
Re: Should vacuum process config file reload more often

On Fri, Apr 7, 2023 at 9:07 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Fri, Apr 7, 2023 at 2:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Apr 7, 2023 at 8:08 AM Daniel Gustafsson <daniel@yesql.se> wrote:

On 7 Apr 2023, at 00:12, Melanie Plageman <melanieplageman@gmail.com> wrote:

On Thu, Apr 6, 2023 at 5:45 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 6 Apr 2023, at 23:06, Melanie Plageman <melanieplageman@gmail.com> wrote:

Autovacuum workers, at the end of VacuumUpdateCosts(), check if cost
limit or cost delay have been changed. If they have, they assert that
they don't already hold the AutovacuumLock, take it in shared mode, and
do the logging.

Another idea would be to copy the values to local temp variables while holding
the lock, and release the lock before calling elog() to avoid holding the lock
over potential IO.

Good idea. I've done this in attached v19.
Also I looked through the docs and everything still looks correct for
balancing algo.

I had another read-through and test-through of this version, and have applied
it with some minor changes to comments and whitespace. Thanks for the quick
turnaround times on reviews in this thread!

Cool!

Regarding the commit 7d71d3dd08, I have one comment:

+       /* Only log updates to cost-related variables */
+       if (vacuum_cost_delay == original_cost_delay &&
+           vacuum_cost_limit == original_cost_limit)
+           return;

IIUC by default, we log not only before starting the vacuum but also
when changing cost-related variables. Which is good, I think, because
logging the initial values would also be helpful for investigation.
However, I think that we don't log the initial vacuum cost values
depending on the values. For example, if the
autovacuum_vacuum_cost_delay storage option is set to 0, we don't log
the initial values. I think that instead of comparing old and new
values, we can write the log only if
message_level_is_interesting(DEBUG2) is true. That way, we don't need
to acquire the lwlock unnecessarily. And the code looks cleaner to me.
I've attached the patch (use_message_level_is_interesting.patch)

Thanks for coming up with the case you thought of with storage param for
cost delay = 0. In that case we wouldn't print the message initially and
we should fix that.

I disagree, however, that we should condition it only on
message_level_is_interesting().

Actually, outside of printing initial values when the autovacuum worker
first starts (before vacuuming all tables), I don't think we should log
these values except when they are being updated. Autovacuum workers
could vacuum tons of small tables and having this print out at least
once per table (which I know is how it is on master) would be
distracting. Also, you could be reloading the config to update some
other GUCs and be oblivious to an ongoing autovacuum and get these
messages printed out, which I would also find distracting.

You will have to stare very hard at the logs to tell if your changes to
vacuum cost delay and limit took effect when you reload config. I think
with our changes to update the values more often, we should take the
opportunity to make this logging more useful by making it happen only
when the values are changed.

I would be open to elevating the log level to DEBUG1 for logging only
updates and, perhaps, having an option if you set log level to DEBUG2,
for example, to always log these values in VacuumUpdateCosts().

I'd even argue that, potentially, having the cost-delay related
parameters printed at the beginning of vacuuming could be interesting to
regular VACUUM as well (even though it doesn't benefit from config
reload while in progress).

To fix the issue you mentioned and ensure the logging is printed when
autovacuum workers start up before vacuuming tables, we could either
initialize vacuum_cost_delay and vacuum_cost_limit to something invalid
that will always be different than what they are set to in
VacuumUpdateCosts() (not sure if this poses a problem for VACUUM using
these values since they are set to the defaults for VACUUM). Or, we
could duplicate this logging message in do_autovacuum().

Finally, one other point about message_level_is_interesting(). I liked
the idea of using it a lot, since log level DEBUG2 will not be the
common case. I thought of it but hesitated because all other users of
message_level_is_interesting() are avoiding some memory allocation or
string copying -- not avoiding taking a lock. Making this conditioned on
log level made me a bit uncomfortable. I can't think of a situation when
it would be a problem, but it felt a bit off.

Also, while testing the autovacuum delay with relopt
autovacuum_vacuum_cost_delay = 0, I realized that even if we set
autovacuum_vacuum_cost_delay = 0 on a table, wi_dobalance is set to
true. wi_dobalance comes from the following expression:

    /*
     * If any of the cost delay parameters has been set individually for
     * this table, disable the balancing algorithm.
     */
    tab->at_dobalance =
        !(avopts && (avopts->vacuum_cost_limit > 0 ||
                     avopts->vacuum_cost_delay > 0));

The initial values of both avopts->vacuum_cost_limit and
avopts->vacuum_cost_delay are -1. I think we should use ">= 0" instead
of "> 0". Otherwise, we include the autovacuum worker working on a
table whose autovacuum_vacuum_cost_delay is 0 in the balancing
algorithm. Probably this behavior has existed also on back branches
but I haven't checked it yet.

Thank you for catching this. Indeed this exists in master since
1021bd6a89b which was backpatched. I checked and it is true all the way
back through REL_11_STABLE.

Definitely seems worth fixing as it kind of defeats the purpose of the
original commit. I wish I had noticed before!

Your fix has:

    !(avopts && (avopts->vacuum_cost_limit >= 0 ||
                 avopts->vacuum_cost_delay >= 0));

The delay comparison does need to be >= 0:

    avopts->vacuum_cost_delay >= 0

but the limit comparison does not. It can just be > 0.

postgres=# create table foo (a int) with (autovacuum_vacuum_cost_limit = 0);
ERROR: value 0 out of bounds for option "autovacuum_vacuum_cost_limit"
DETAIL: Valid values are between "1" and "10000".

Though >= is also fine, the rest of the code in all versions always
checks if limit > 0 and delay >= 0 since 0 is a valid value for delay
and not for limit. Probably best we keep it consistent (though the whole
thing is quite confusing).

I have created an open item for each of these issues on the wiki
(one for 16 and one under the section "affects stable branches").

- Melanie

#81Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Daniel Gustafsson (#79)
Re: Should vacuum process config file reload more often

On Fri, Apr 7, 2023 at 10:23 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 7 Apr 2023, at 15:07, Melanie Plageman <melanieplageman@gmail.com> wrote:
On Fri, Apr 7, 2023 at 2:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

+       /* Only log updates to cost-related variables */
+       if (vacuum_cost_delay == original_cost_delay &&
+           vacuum_cost_limit == original_cost_limit)
+           return;

IIUC by default, we log not only before starting the vacuum but also
when changing cost-related variables. Which is good, I think, because
logging the initial values would also be helpful for investigation.
However, I think that we don't log the initial vacuum cost values
depending on the values. For example, if the
autovacuum_vacuum_cost_delay storage option is set to 0, we don't log
the initial values. I think that instead of comparing old and new
values, we can write the log only if
message_level_is_interesting(DEBUG2) is true. That way, we don't need
to acquire the lwlock unnecessarily. And the code looks cleaner to me.
I've attached the patch (use_message_level_is_interesting.patch)

Thanks for coming up with the case you thought of with storage param for
cost delay = 0. In that case we wouldn't print the message initially and
we should fix that.

I disagree, however, that we should condition it only on
message_level_is_interesting().

I think we should keep the logging frequency as committed, but condition taking
the lock on message_level_is_interesting().

Actually, outside of printing initial values when the autovacuum worker
first starts (before vacuuming all tables), I don't think we should log
these values except when they are being updated. Autovacuum workers
could vacuum tons of small tables and having this print out at least
once per table (which I know is how it is on master) would be
distracting. Also, you could be reloading the config to update some
other GUCs and be oblivious to an ongoing autovacuum and get these
messages printed out, which I would also find distracting.

You will have to stare very hard at the logs to tell if your changes to
vacuum cost delay and limit took effect when you reload config. I think
with our changes to update the values more often, we should take the
opportunity to make this logging more useful by making it happen only
when the values are changed.

For debugging purposes, I think it could also be important information
that the cost values are not changed. Personally, I prefer to log the
current state rather than deciding for ourselves which events are
important. If always logging these values at DEBUG2 turns out to be
distracting, we might want to lower it to DEBUG3.

I would be open to elevating the log level to DEBUG1 for logging only
updates and, perhaps, having an option if you set log level to DEBUG2,
for example, to always log these values in VacuumUpdateCosts().

I'm not really sure it's a good idea to change the log messages and
events depending on elevel. Do you know if we have any precedents?

I'd even argue that, potentially, having the cost-delay related
parameters printed at the beginning of vacuuming could be interesting to
regular VACUUM as well (even though it doesn't benefit from config
reload while in progress).

To fix the issue you mentioned and ensure the logging is printed when
autovacuum workers start up before vacuuming tables, we could either
initialize vacuum_cost_delay and vacuum_cost_limit to something invalid
that will always be different than what they are set to in
VacuumUpdateCosts() (not sure if this poses a problem for VACUUM using
these values since they are set to the defaults for VACUUM). Or, we
could duplicate this logging message in do_autovacuum().

Duplicating logging, maybe with a slightly tailored message, seems the least
bad option.

Finally, one other point about message_level_is_interesting(). I liked
the idea of using it a lot, since log level DEBUG2 will not be the
common case. I thought of it but hesitated because all other users of
message_level_is_interesting() are avoiding some memory allocation or
string copying -- not avoiding taking a lock. Making this conditioned on
log level made me a bit uncomfortable. I can't think of a situation when
it would be a problem, but it felt a bit off.

Considering how uncommon DEBUG2 will be in production, I think conditioning
taking a lock on it makes sense.

The comment of message_level_is_interesting() says:

* This is useful to short-circuit any expensive preparatory work that
* might be needed for a logging message.

Which can apply to taking a lwlock, I think.
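
To make the idea concrete, here is a minimal sketch of gating a lock
acquisition on the log level. It is not the committed code: the level
constants, mock_level_is_interesting(), the mock lock functions, and
report_cost_settings() are all hypothetical stand-ins for
message_level_is_interesting() and an LWLock-protected read.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are not PostgreSQL's real definitions. */
enum { MOCK_DEBUG2 = 13, MOCK_LOG = 15 };

int  log_min_level = MOCK_LOG;   /* mock of the configured log threshold */
int  lock_acquisitions = 0;      /* counts how often the mock lock is taken */
bool shared_dobalance = true;    /* mock of a lock-protected shared field */

/* Mock of message_level_is_interesting(): would this level be emitted? */
bool mock_level_is_interesting(int elevel)
{
    return elevel >= log_min_level;
}

void mock_lock_acquire(void) { lock_acquisitions++; }
void mock_lock_release(void) { }

/*
 * Read the shared field for a DEBUG2 message, but only take the lock
 * when that message would actually be emitted. Returns true if the
 * message was produced.
 */
bool report_cost_settings(void)
{
    bool dobalance;

    if (!mock_level_is_interesting(MOCK_DEBUG2))
        return false;            /* skip the lock entirely */

    mock_lock_acquire();
    dobalance = shared_dobalance;
    mock_lock_release();

    (void) dobalance;            /* a real caller would format this into the log line */
    return true;
}
```

With the threshold at the mock LOG level, the function returns without
touching the lock; only when the threshold is lowered to the mock DEBUG2
level is the lock taken.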

Also, while testing the autovacuum delay with relopt
autovacuum_vacuum_cost_delay = 0, I realized that even if we set
autovacuum_vacuum_cost_delay = 0 on a table, wi_dobalance is set to
true. wi_dobalance comes from the following expression:

/*
* If any of the cost delay parameters has been set individually for
* this table, disable the balancing algorithm.
*/
tab->at_dobalance =
!(avopts && (avopts->vacuum_cost_limit > 0 ||
avopts->vacuum_cost_delay > 0));

The initial values of both avopts->vacuum_cost_limit and
avopts->vacuum_cost_delay are -1. I think we should use ">= 0" instead
of "> 0". Otherwise, we include the autovacuum worker working on a
table whose autovacuum_vacuum_cost_delay is 0 in the balancing
algorithm. This behavior probably also exists on back branches,
but I haven't checked yet.

Thank you for catching this. Indeed this exists in master since
1021bd6a89b which was backpatched. I checked and it is true all the way
back through REL_11_STABLE.

Thanks for checking!

Definitely seems worth fixing as it kind of defeats the purpose of the
original commit. I wish I had noticed before!

Your fix has:
!(avopts && (avopts->vacuum_cost_limit >= 0 ||
avopts->vacuum_cost_delay >= 0));

And though the delay check does need to be >= 0
avopts->vacuum_cost_delay >= 0

the limit check does not. It can just be > 0.

postgres=# create table foo (a int) with (autovacuum_vacuum_cost_limit = 0);
ERROR: value 0 out of bounds for option "autovacuum_vacuum_cost_limit"
DETAIL: Valid values are between "1" and "10000".

Though >= is also fine, the rest of the code in all versions always
checks if limit > 0 and delay >= 0 since 0 is a valid value for delay
and not for limit. Probably best we keep it consistent (though the whole
thing is quite confusing).

+1

+1. I misunderstood the initial value of autovacuum_vacuum_cost_limit reloption.
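
For readers following along, a small sketch of the two predicates side by
side. MockAutoVacOpts is a simplified, hypothetical stand-in for the
per-table options (the field types are an assumption), and the function
names are made up for illustration; only the comparison operators come
from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for the per-table autovacuum options. */
typedef struct MockAutoVacOpts
{
    int    vacuum_cost_limit;   /* -1 when unset; valid settings start at 1 */
    double vacuum_cost_delay;   /* -1 when unset; 0 is a valid setting */
} MockAutoVacOpts;

/* Condition before the fix: a delay of 0 is not seen as "set". */
bool old_dobalance(const MockAutoVacOpts *avopts)
{
    return !(avopts && (avopts->vacuum_cost_limit > 0 ||
                        avopts->vacuum_cost_delay > 0));
}

/* Condition after the fix: delay >= 0 counts as set, limit stays > 0. */
bool fixed_dobalance(const MockAutoVacOpts *avopts)
{
    return !(avopts && (avopts->vacuum_cost_limit > 0 ||
                        avopts->vacuum_cost_delay >= 0));
}
```

A table with only autovacuum_vacuum_cost_delay = 0 corresponds to
{ -1, 0 } here: the old predicate keeps the worker in the balancing
calculation, while the fixed one excludes it. A table with neither option
set ({ -1, -1 }) is balanced under both.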

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#82Daniel Gustafsson
daniel@yesql.se
In reply to: Masahiko Sawada (#81)
Re: Should vacuum process config file reload more often

On 11 Apr 2023, at 17:05, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

The comment of message_level_is_interesting() says:

* This is useful to short-circuit any expensive preparatory work that
* might be needed for a logging message.

Which can apply to taking a lwlock, I think.

I agree that we can, and should, use message_level_is_interesting to skip
taking this lock. Also, the more I think about it the more I'm convinced that we
should not change the current logging frequency of once per table from what we
ship today. In DEBUG2 the logs should tell the whole story without requiring
extrapolation based on missing entries. So I think we should use your patch to
solve this open item. If there is interest in reducing the logging frequency
we should discuss that in its own thread, instead of it being hidden in here.

--
Daniel Gustafsson

#83Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#81)
1 attachment(s)
Re: Should vacuum process config file reload more often

On Wed, Apr 12, 2023 at 12:05 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Fri, Apr 7, 2023 at 10:23 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 7 Apr 2023, at 15:07, Melanie Plageman <melanieplageman@gmail.com> wrote:
On Fri, Apr 7, 2023 at 2:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Definitely seems worth fixing as it kind of defeats the purpose of the
original commit. I wish I had noticed before!

Your fix has:
!(avopts && (avopts->vacuum_cost_limit >= 0 ||
avopts->vacuum_cost_delay >= 0));

And though the delay check does need to be >= 0
avopts->vacuum_cost_delay >= 0

the limit check does not. It can just be > 0.

postgres=# create table foo (a int) with (autovacuum_vacuum_cost_limit = 0);
ERROR: value 0 out of bounds for option "autovacuum_vacuum_cost_limit"
DETAIL: Valid values are between "1" and "10000".

Though >= is also fine, the rest of the code in all versions always
checks if limit > 0 and delay >= 0 since 0 is a valid value for delay
and not for limit. Probably best we keep it consistent (though the whole
thing is quite confusing).

+1

+1. I misunderstood the initial value of autovacuum_vacuum_cost_limit reloption.

I've attached an updated patch for fixing at_dobalance condition.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

0001-Fix-the-condition-of-joining-autovacuum-workers-to-b.patchapplication/octet-stream; name=0001-Fix-the-condition-of-joining-autovacuum-workers-to-b.patchDownload
From 875e0ca1965fb371a982d0ed9b805c45faeeab22 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Mon, 17 Apr 2023 10:59:17 +0900
Subject: [PATCH] Fix the condition of joining autovacuum workers to balance
 calculation.

Previously, we exclude an autovacuum worker from balance calculation
if the table's autovacuum_vacuum_cost_delay storage option is >
0. However, since the initial value and minimum value of
autovacuum_vacuum_cost_delay storage option are -1 and 0 respectively,
we should exclude it if the table's autovacuum_vacuum_cost_delay
storage option is >= 0.
---
 src/backend/postmaster/autovacuum.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 53c8f8d79c..33d80b067b 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2952,7 +2952,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 */
 		tab->at_dobalance =
 			!(avopts && (avopts->vacuum_cost_limit > 0 ||
-						 avopts->vacuum_cost_delay > 0));
+						 avopts->vacuum_cost_delay >= 0));
 	}
 
 	heap_freetuple(classTup);
-- 
2.31.1

#84Daniel Gustafsson
daniel@yesql.se
In reply to: Masahiko Sawada (#83)
Re: Should vacuum process config file reload more often

On 17 Apr 2023, at 04:04, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I've attached an updated patch for fixing at_dobalance condition.

I revisited this and pushed it to all supported branches after another round of
testing and reading.

--
Daniel Gustafsson

#85Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Daniel Gustafsson (#84)
Re: Should vacuum process config file reload more often

On Tue, Apr 25, 2023 at 9:39 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 17 Apr 2023, at 04:04, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I've attached an updated patch for fixing at_dobalance condition.

I revisited this and pushed it to all supported branches after another round of
testing and reading.

Thanks!

Can we mark the open item "Can't disable autovacuum cost delay through
storage parameter" as resolved?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#86Daniel Gustafsson
daniel@yesql.se
In reply to: Masahiko Sawada (#85)
Re: Should vacuum process config file reload more often

On 25 Apr 2023, at 15:31, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Apr 25, 2023 at 9:39 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 17 Apr 2023, at 04:04, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I've attached an updated patch for fixing at_dobalance condition.

I revisited this and pushed it to all supported branches after another round of
testing and reading.

Thanks!

Can we mark the open item "Can't disable autovacuum cost delay through
storage parameter" as resolved?

Yes, I've gone ahead and done that now.

--
Daniel Gustafsson

#87Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Daniel Gustafsson (#86)
Re: Should vacuum process config file reload more often

On Tue, Apr 25, 2023 at 10:35 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 25 Apr 2023, at 15:31, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Apr 25, 2023 at 9:39 PM Daniel Gustafsson <daniel@yesql.se> wrote:

On 17 Apr 2023, at 04:04, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I've attached an updated patch for fixing at_dobalance condition.

I revisited this and pushed it to all supported branches after another round of
testing and reading.

Thanks!

Can we mark the open item "Can't disable autovacuum cost delay through
storage parameter" as resolved?

Yes, I've gone ahead and done that now.

Great, thank you!

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#88John Naylor
john.naylor@enterprisedb.com
In reply to: Daniel Gustafsson (#75)
Re: Should vacuum process config file reload more often

On Fri, Apr 7, 2023 at 6:08 AM Daniel Gustafsson <daniel@yesql.se> wrote:

I had another read-through and test-through of this version, and have applied
it with some minor changes to comments and whitespace. Thanks for the quick
turnaround times on reviews in this thread!

-       VacuumFailsafeActive = false;
+       Assert(!VacuumFailsafeActive);

I can trigger this assert added in commit 7d71d3dd08.

First build with the patch in [1], then:

session 1:

CREATE EXTENSION xid_wraparound ;

CREATE TABLE autovacuum_disabled(id serial primary key, data text) WITH
(autovacuum_enabled=false);
INSERT INTO autovacuum_disabled(data) SELECT generate_series(1,1000);

-- I can trigger without this, but just make sure it doesn't get vacuumed
BEGIN;
DELETE FROM autovacuum_disabled WHERE id % 2 = 0;

session 2:

-- get to failsafe limit
SELECT consume_xids(1*1000*1000*1000);
INSERT INTO autovacuum_disabled(data) SELECT 1;
SELECT consume_xids(1*1000*1000*1000);
INSERT INTO autovacuum_disabled(data) SELECT 1;

VACUUM autovacuum_disabled;

WARNING: cutoff for removing and freezing tuples is far in the past
HINT: Close open transactions soon to avoid wraparound problems.
You might also need to commit or roll back old prepared transactions, or
drop stale replication slots.
WARNING: bypassing nonessential maintenance of table
"john.public.autovacuum_disabled" as a failsafe after 0 index scans
DETAIL: The table's relfrozenxid or relminmxid is too far in the past.
HINT: Consider increasing configuration parameter "maintenance_work_mem"
or "autovacuum_work_mem".
You might also need to consider other ways for VACUUM to keep up with the
allocation of transaction IDs.
server closed the connection unexpectedly

#0 0x00007ff31f68ebec in __pthread_kill_implementation ()
from /lib64/libc.so.6
#1 0x00007ff31f63e956 in raise () from /lib64/libc.so.6
#2 0x00007ff31f6287f4 in abort () from /lib64/libc.so.6
#3 0x0000000000978032 in ExceptionalCondition (
conditionName=conditionName@entry=0xa4e970 "!VacuumFailsafeActive",
fileName=fileName@entry=0xa4da38
"../src/backend/access/heap/vacuumlazy.c", lineNumber=lineNumber@entry=392)
at ../src/backend/utils/error/assert.c:66
#4 0x000000000058c598 in heap_vacuum_rel (rel=0x7ff31d8a97d0,
params=<optimized out>, bstrategy=<optimized out>)
at ../src/backend/access/heap/vacuumlazy.c:392
#5 0x000000000069af1f in table_relation_vacuum (bstrategy=0x14ddca8,
params=0x7ffec28585f0, rel=0x7ff31d8a97d0)
at ../src/include/access/tableam.h:1705
#6 vacuum_rel (relid=relid@entry=16402, relation=relation@entry=0x0,
params=params@entry=0x7ffec28585f0, skip_privs=skip_privs@entry=true,
bstrategy=bstrategy@entry=0x14ddca8)
at ../src/backend/commands/vacuum.c:2202
#7 0x000000000069b0e4 in vacuum_rel (relid=16398, relation=<optimized out>,
params=params@entry=0x7ffec2858850, skip_privs=skip_privs@entry=false,
bstrategy=bstrategy@entry=0x14ddca8)
at ../src/backend/commands/vacuum.c:2236
#8 0x000000000069c594 in vacuum (relations=0x14dde38,
params=0x7ffec2858850, bstrategy=0x14ddca8, vac_context=0x14ddb90,
isTopLevel=<optimized out>) at ../src/backend/commands/vacuum.c:623

[1]: /messages/by-id/CAD21AoAyYBZOiB1UPCPZJHTLk0-arrq5zqNGj+PrsbpdUy=g-g@mail.gmail.com

--
John Naylor
EDB: http://www.enterprisedb.com

#89Daniel Gustafsson
daniel@yesql.se
In reply to: John Naylor (#88)
Re: Should vacuum process config file reload more often

On 27 Apr 2023, at 11:29, John Naylor <john.naylor@enterprisedb.com> wrote:
On Fri, Apr 7, 2023 at 6:08 AM Daniel Gustafsson <daniel@yesql.se> wrote:

I had another read-through and test-through of this version, and have applied
it with some minor changes to comments and whitespace. Thanks for the quick
turnaround times on reviews in this thread!

-       VacuumFailsafeActive = false;
+       Assert(!VacuumFailsafeActive);

I can trigger this assert added in commit 7d71d3dd08.

First build with the patch in [1], then:

Interesting, thanks for the report! I'll look into it directly.

--
Daniel Gustafsson

#90Masahiko Sawada
sawada.mshk@gmail.com
In reply to: John Naylor (#88)
1 attachment(s)
Re: Should vacuum process config file reload more often

On Thu, Apr 27, 2023 at 6:30 PM John Naylor
<john.naylor@enterprisedb.com> wrote:

On Fri, Apr 7, 2023 at 6:08 AM Daniel Gustafsson <daniel@yesql.se> wrote:

I had another read-through and test-through of this version, and have applied
it with some minor changes to comments and whitespace. Thanks for the quick
turnaround times on reviews in this thread!

-       VacuumFailsafeActive = false;
+       Assert(!VacuumFailsafeActive);

I can trigger this assert added in commit 7d71d3dd08.

First build with the patch in [1], then:

session 1:

CREATE EXTENSION xid_wraparound ;

CREATE TABLE autovacuum_disabled(id serial primary key, data text) WITH (autovacuum_enabled=false);
INSERT INTO autovacuum_disabled(data) SELECT generate_series(1,1000);

-- I can trigger without this, but just make sure it doesn't get vacuumed
BEGIN;
DELETE FROM autovacuum_disabled WHERE id % 2 = 0;

session 2:

-- get to failsafe limit
SELECT consume_xids(1*1000*1000*1000);
INSERT INTO autovacuum_disabled(data) SELECT 1;
SELECT consume_xids(1*1000*1000*1000);
INSERT INTO autovacuum_disabled(data) SELECT 1;

VACUUM autovacuum_disabled;

WARNING: cutoff for removing and freezing tuples is far in the past
HINT: Close open transactions soon to avoid wraparound problems.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
WARNING: bypassing nonessential maintenance of table "john.public.autovacuum_disabled" as a failsafe after 0 index scans
DETAIL: The table's relfrozenxid or relminmxid is too far in the past.
HINT: Consider increasing configuration parameter "maintenance_work_mem" or "autovacuum_work_mem".
You might also need to consider other ways for VACUUM to keep up with the allocation of transaction IDs.
server closed the connection unexpectedly

#0 0x00007ff31f68ebec in __pthread_kill_implementation ()
from /lib64/libc.so.6
#1 0x00007ff31f63e956 in raise () from /lib64/libc.so.6
#2 0x00007ff31f6287f4 in abort () from /lib64/libc.so.6
#3 0x0000000000978032 in ExceptionalCondition (
conditionName=conditionName@entry=0xa4e970 "!VacuumFailsafeActive",
fileName=fileName@entry=0xa4da38 "../src/backend/access/heap/vacuumlazy.c", lineNumber=lineNumber@entry=392) at ../src/backend/utils/error/assert.c:66
#4 0x000000000058c598 in heap_vacuum_rel (rel=0x7ff31d8a97d0,
params=<optimized out>, bstrategy=<optimized out>)
at ../src/backend/access/heap/vacuumlazy.c:392
#5 0x000000000069af1f in table_relation_vacuum (bstrategy=0x14ddca8,
params=0x7ffec28585f0, rel=0x7ff31d8a97d0)
at ../src/include/access/tableam.h:1705
#6 vacuum_rel (relid=relid@entry=16402, relation=relation@entry=0x0,
params=params@entry=0x7ffec28585f0, skip_privs=skip_privs@entry=true,
bstrategy=bstrategy@entry=0x14ddca8)
at ../src/backend/commands/vacuum.c:2202
#7 0x000000000069b0e4 in vacuum_rel (relid=16398, relation=<optimized out>,
params=params@entry=0x7ffec2858850, skip_privs=skip_privs@entry=false,
bstrategy=bstrategy@entry=0x14ddca8)
at ../src/backend/commands/vacuum.c:2236

Good catch. I think the problem is that vacuum_rel() is called
recursively and we don't reset VacuumFailsafeActive before vacuuming
the toast table. I think we should reset it in heap_vacuum_rel()
instead of Assert(). It's possible that we trigger the failsafe mode
only for either one. Please find the attached patch.
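
As a rough illustration of the failure mode, here is a toy model, not
the real vacuum code. MockFailsafeActive stands in for the global
VacuumFailsafeActive, MockRel and both functions are hypothetical, and
the reset_on_entry flag only exists to compare the two behaviours.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

bool MockFailsafeActive = false;   /* stand-in for the global VacuumFailsafeActive */

/* Hypothetical relation: a main table may point at its toast table. */
typedef struct MockRel
{
    bool            triggers_failsafe;
    struct MockRel *toast;          /* NULL if there is no toast table */
} MockRel;

/*
 * Toy heap_vacuum_rel(). Returns whether the flag was clear on entry,
 * which is what Assert(!VacuumFailsafeActive) would have checked.
 */
bool mock_heap_vacuum_rel(const MockRel *rel, bool reset_on_entry)
{
    bool clean_on_entry;

    if (reset_on_entry)
        MockFailsafeActive = false; /* the proposed fix */

    clean_on_entry = !MockFailsafeActive;

    if (rel->triggers_failsafe)
        MockFailsafeActive = true;  /* failsafe kicks in for this relation */

    return clean_on_entry;
}

/* Toy vacuum_rel(): vacuum the main table, then recurse into its toast table. */
bool mock_vacuum_rel(const MockRel *rel, bool reset_on_entry)
{
    bool ok = mock_heap_vacuum_rel(rel, reset_on_entry);

    if (rel->toast != NULL)
        ok = mock_vacuum_rel(rel->toast, reset_on_entry) && ok;

    return ok;
}
```

Without the reset, a main table that triggers the failsafe leaves the flag
set when the recursive call reaches the toast table, so the entry check
fails there. With the reset on entry, each relation starts with a clear
flag.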

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

reset_VacuumFailsafeActive.patchapplication/octet-stream; name=reset_VacuumFailsafeActive.patchDownload
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 0a9ebd22bd..2ba85bd3d6 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -389,7 +389,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	Assert(params->index_cleanup != VACOPTVALUE_UNSPECIFIED);
 	Assert(params->truncate != VACOPTVALUE_UNSPECIFIED &&
 		   params->truncate != VACOPTVALUE_AUTO);
-	Assert(!VacuumFailsafeActive);
+	VacuumFailsafeActive = false;
 	vacrel->consider_bypass_optimization = true;
 	vacrel->do_index_vacuuming = true;
 	vacrel->do_index_cleanup = true;
#91Daniel Gustafsson
daniel@yesql.se
In reply to: Masahiko Sawada (#90)
Re: Should vacuum process config file reload more often

On 27 Apr 2023, at 14:10, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Apr 27, 2023 at 6:30 PM John Naylor
<john.naylor@enterprisedb.com> wrote:

On Fri, Apr 7, 2023 at 6:08 AM Daniel Gustafsson <daniel@yesql.se> wrote:

I had another read-through and test-through of this version, and have applied
it with some minor changes to comments and whitespace. Thanks for the quick
turnaround times on reviews in this thread!

-       VacuumFailsafeActive = false;
+       Assert(!VacuumFailsafeActive);

I can trigger this assert added in commit 7d71d3dd08.

First build with the patch in [1], then:

session 1:

CREATE EXTENSION xid_wraparound ;

CREATE TABLE autovacuum_disabled(id serial primary key, data text) WITH (autovacuum_enabled=false);
INSERT INTO autovacuum_disabled(data) SELECT generate_series(1,1000);

-- I can trigger without this, but just make sure it doesn't get vacuumed
BEGIN;
DELETE FROM autovacuum_disabled WHERE id % 2 = 0;

session 2:

-- get to failsafe limit
SELECT consume_xids(1*1000*1000*1000);
INSERT INTO autovacuum_disabled(data) SELECT 1;
SELECT consume_xids(1*1000*1000*1000);
INSERT INTO autovacuum_disabled(data) SELECT 1;

VACUUM autovacuum_disabled;

WARNING: cutoff for removing and freezing tuples is far in the past
HINT: Close open transactions soon to avoid wraparound problems.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
WARNING: bypassing nonessential maintenance of table "john.public.autovacuum_disabled" as a failsafe after 0 index scans
DETAIL: The table's relfrozenxid or relminmxid is too far in the past.
HINT: Consider increasing configuration parameter "maintenance_work_mem" or "autovacuum_work_mem".
You might also need to consider other ways for VACUUM to keep up with the allocation of transaction IDs.
server closed the connection unexpectedly

#0 0x00007ff31f68ebec in __pthread_kill_implementation ()
from /lib64/libc.so.6
#1 0x00007ff31f63e956 in raise () from /lib64/libc.so.6
#2 0x00007ff31f6287f4 in abort () from /lib64/libc.so.6
#3 0x0000000000978032 in ExceptionalCondition (
conditionName=conditionName@entry=0xa4e970 "!VacuumFailsafeActive",
fileName=fileName@entry=0xa4da38 "../src/backend/access/heap/vacuumlazy.c", lineNumber=lineNumber@entry=392) at ../src/backend/utils/error/assert.c:66
#4 0x000000000058c598 in heap_vacuum_rel (rel=0x7ff31d8a97d0,
params=<optimized out>, bstrategy=<optimized out>)
at ../src/backend/access/heap/vacuumlazy.c:392
#5 0x000000000069af1f in table_relation_vacuum (bstrategy=0x14ddca8,
params=0x7ffec28585f0, rel=0x7ff31d8a97d0)
at ../src/include/access/tableam.h:1705
#6 vacuum_rel (relid=relid@entry=16402, relation=relation@entry=0x0,
params=params@entry=0x7ffec28585f0, skip_privs=skip_privs@entry=true,
bstrategy=bstrategy@entry=0x14ddca8)
at ../src/backend/commands/vacuum.c:2202
#7 0x000000000069b0e4 in vacuum_rel (relid=16398, relation=<optimized out>,
params=params@entry=0x7ffec2858850, skip_privs=skip_privs@entry=false,
bstrategy=bstrategy@entry=0x14ddca8)
at ../src/backend/commands/vacuum.c:2236

Good catch. I think the problem is that vacuum_rel() is called
recursively and we don't reset VacuumFailsafeActive before vacuuming
the toast table. I think we should reset it in heap_vacuum_rel()
instead of Assert(). It's possible that we trigger the failsafe mode
only for either one. Please find the attached patch.

Agreed, that matches my research and testing, I have the same diff here and it
passes testing and works as intended. This was briefly discussed in [0] and
slightly upthread from there but then missed. I will do some more looking and
testing but I'm fairly sure this is the right fix, so unless I find something
else I will go ahead with this.

xid_wraparound is a really nifty testing tool. Very cool.

--
Daniel Gustafsson

[0]: CAAKRu_b1HjGCTsFpUnmwLNS8NeXJ+JnrDLhT1osP+Gq9HCU+Rw@mail.gmail.com

#92Melanie Plageman
melanieplageman@gmail.com
In reply to: Daniel Gustafsson (#91)
Re: Should vacuum process config file reload more often

On Thu, Apr 27, 2023 at 8:55 AM Daniel Gustafsson <daniel@yesql.se> wrote:

On 27 Apr 2023, at 14:10, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Apr 27, 2023 at 6:30 PM John Naylor
<john.naylor@enterprisedb.com> wrote:

On Fri, Apr 7, 2023 at 6:08 AM Daniel Gustafsson <daniel@yesql.se> wrote:

I had another read-through and test-through of this version, and have applied
it with some minor changes to comments and whitespace. Thanks for the quick
turnaround times on reviews in this thread!

-       VacuumFailsafeActive = false;
+       Assert(!VacuumFailsafeActive);

I can trigger this assert added in commit 7d71d3dd08.

First build with the patch in [1], then:

session 1:

CREATE EXTENSION xid_wraparound ;

CREATE TABLE autovacuum_disabled(id serial primary key, data text) WITH (autovacuum_enabled=false);
INSERT INTO autovacuum_disabled(data) SELECT generate_series(1,1000);

-- I can trigger without this, but just make sure it doesn't get vacuumed
BEGIN;
DELETE FROM autovacuum_disabled WHERE id % 2 = 0;

session 2:

-- get to failsafe limit
SELECT consume_xids(1*1000*1000*1000);
INSERT INTO autovacuum_disabled(data) SELECT 1;
SELECT consume_xids(1*1000*1000*1000);
INSERT INTO autovacuum_disabled(data) SELECT 1;

VACUUM autovacuum_disabled;

WARNING: cutoff for removing and freezing tuples is far in the past
HINT: Close open transactions soon to avoid wraparound problems.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
WARNING: bypassing nonessential maintenance of table "john.public.autovacuum_disabled" as a failsafe after 0 index scans
DETAIL: The table's relfrozenxid or relminmxid is too far in the past.
HINT: Consider increasing configuration parameter "maintenance_work_mem" or "autovacuum_work_mem".
You might also need to consider other ways for VACUUM to keep up with the allocation of transaction IDs.
server closed the connection unexpectedly

#0 0x00007ff31f68ebec in __pthread_kill_implementation ()
from /lib64/libc.so.6
#1 0x00007ff31f63e956 in raise () from /lib64/libc.so.6
#2 0x00007ff31f6287f4 in abort () from /lib64/libc.so.6
#3 0x0000000000978032 in ExceptionalCondition (
conditionName=conditionName@entry=0xa4e970 "!VacuumFailsafeActive",
fileName=fileName@entry=0xa4da38 "../src/backend/access/heap/vacuumlazy.c", lineNumber=lineNumber@entry=392) at ../src/backend/utils/error/assert.c:66
#4 0x000000000058c598 in heap_vacuum_rel (rel=0x7ff31d8a97d0,
params=<optimized out>, bstrategy=<optimized out>)
at ../src/backend/access/heap/vacuumlazy.c:392
#5 0x000000000069af1f in table_relation_vacuum (bstrategy=0x14ddca8,
params=0x7ffec28585f0, rel=0x7ff31d8a97d0)
at ../src/include/access/tableam.h:1705
#6 vacuum_rel (relid=relid@entry=16402, relation=relation@entry=0x0,
params=params@entry=0x7ffec28585f0, skip_privs=skip_privs@entry=true,
bstrategy=bstrategy@entry=0x14ddca8)
at ../src/backend/commands/vacuum.c:2202
#7 0x000000000069b0e4 in vacuum_rel (relid=16398, relation=<optimized out>,
params=params@entry=0x7ffec2858850, skip_privs=skip_privs@entry=false,
bstrategy=bstrategy@entry=0x14ddca8)
at ../src/backend/commands/vacuum.c:2236

Good catch. I think the problem is that vacuum_rel() is called
recursively and we don't reset VacuumFailsafeActive before vacuuming
the toast table. I think we should reset it in heap_vacuum_rel()
instead of Assert(). It's possible that we trigger the failsafe mode
only for either one. Please find the attached patch.

Agreed, that matches my research and testing, I have the same diff here and it
passes testing and works as intended. This was briefly discussed in [0] and
slightly upthread from there but then missed. I will do some more looking and
testing but I'm fairly sure this is the right fix, so unless I find something
else I will go ahead with this.

xid_wraparound is a really nifty testing tool. Very cool.

Makes sense to me too.

Fix LGTM.
Though we previously set it to false before this series of patches,
perhaps it is worth adding a comment about why VacuumFailsafeActive must
be reset here even though we reset it before vacuuming each table?

- Melanie

#93Daniel Gustafsson
daniel@yesql.se
In reply to: Melanie Plageman (#92)
Re: Should vacuum process config file reload more often

On 27 Apr 2023, at 16:53, Melanie Plageman <melanieplageman@gmail.com> wrote:
On Thu, Apr 27, 2023 at 8:55 AM Daniel Gustafsson <daniel@yesql.se> wrote:

On 27 Apr 2023, at 14:10, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Good catch. I think the problem is that vacuum_rel() is called
recursively and we don't reset VacuumFailsafeActive before vacuuming
the toast table. I think we should reset it in heap_vacuum_rel()
instead of Assert(). It's possible that we trigger the failsafe mode
only for either one. Please find the attached patch.

Agreed, that matches my research and testing, I have the same diff here and it
passes testing and works as intended. This was briefly discussed in [0] and
slightly upthread from there but then missed. I will do some more looking and
testing but I'm fairly sure this is the right fix, so unless I find something
else I will go ahead with this.

xid_wraparound is a really nifty testing tool. Very cool.

Makes sense to me too.

Fix LGTM.

Thanks for review. I plan to push this in the morning.

Though we previously set it to false before this series of patches,
perhaps it is worth adding a comment about why VacuumFailsafeActive must
be reset here even though we reset it before vacuuming each table?

Agreed.

--
Daniel Gustafsson

#94Daniel Gustafsson
daniel@yesql.se
In reply to: Daniel Gustafsson (#93)
Re: Should vacuum process config file reload more often

On 27 Apr 2023, at 23:25, Daniel Gustafsson <daniel@yesql.se> wrote:

On 27 Apr 2023, at 16:53, Melanie Plageman <melanieplageman@gmail.com> wrote:

Fix LGTM.

Thanks for review. I plan to push this in the morning.

Done, thanks.

--
Daniel Gustafsson