allow changing autovacuum_max_workers without restarting
I frequently hear about scenarios where users with thousands upon thousands
of tables realize that autovacuum is struggling to keep up. When they
inevitably go to bump up autovacuum_max_workers, they discover that it
requires a server restart (i.e., downtime) to take effect, causing further
frustration. For this reason, I think $SUBJECT is a desirable improvement.
I spent some time looking for past discussions about this, and I was
surprised to not find any, so I thought I'd give it a try.
The attached proof-of-concept patch demonstrates what I have in mind.
Instead of trying to dynamically change the global process table, etc., I'm
proposing that we introduce a new GUC that sets the effective maximum
number of autovacuum workers that can be started at any time. This means
there would be two GUCs for the number of autovacuum workers: one for the
number of slots reserved for autovacuum workers, and another that restricts
the number of those slots that can be used. The former would continue to
require a restart to change its value, and users would typically want to
set it relatively high. The latter could be changed at any time and would
allow for raising or lowering the maximum number of active autovacuum
workers, up to the limit set by the other parameter.
The proof-of-concept patch keeps autovacuum_max_workers as the maximum
number of slots to reserve for workers, but I think we should instead
rename this parameter to something else and then reintroduce
autovacuum_max_workers as the new parameter that can be adjusted without
restarting. That way, autovacuum_max_workers continues to work much the
same way as in previous versions.
There are a couple of weird cases with this approach. One is when the
restart-only limit is set lower than the PGC_SIGHUP limit. In that case, I
think we should just use the restart-only limit. The other is when there
are already N active autovacuum workers and the PGC_SIGHUP parameter is
changed to something less than N. For that case, I think we should just
block starting additional workers until the number of workers drops below
the new parameter's value. I don't think we should kill existing workers,
or anything else like that.
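To make the intended semantics concrete, here is a minimal sketch in plain C (not patch code; the function and variable names here are invented for illustration and do not match the identifiers in the attached patch). "slots" stands for the restart-only number of reserved worker slots, "cap" for the new PGC_SIGHUP limit, and "active" for the number of workers currently running:

```c
#include <stdbool.h>

/*
 * Illustrative model of the proposed launch gate.  A cap set above the
 * reserved slot count is silently clamped to the slot count, which is
 * the first "weird case" described above.
 */
static int
effective_limit(int slots, int cap)
{
	return cap < slots ? cap : slots;
}

static bool
can_launch_worker(int slots, int cap, int active)
{
	/*
	 * If the cap was just lowered below "active", this simply returns
	 * false until enough workers exit on their own; nothing is killed.
	 */
	return active < effective_limit(slots, cap);
}
```

Under this model, a reload that changes the cap takes effect at the next launch decision, which matches the "block starting additional workers" behavior described above.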
TBH I've been sitting on this idea for a while now, only because I think it
has a slim chance of acceptance, but IMHO this is a simple change that
could help many users.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachments:
autovac_max_workers_proof_of_concept.patch
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 170b973cc5..e65ddd67c1 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -114,6 +114,7 @@
* GUC parameters
*/
bool autovacuum_start_daemon = false;
+int autovacuum_workers;
int autovacuum_max_workers;
int autovacuum_work_mem = -1;
int autovacuum_naptime;
@@ -289,7 +290,7 @@ typedef struct
{
sig_atomic_t av_signal[AutoVacNumSignals];
pid_t av_launcherpid;
- dlist_head av_freeWorkers;
+ dclist_head av_freeWorkers;
dlist_head av_runningWorkers;
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
@@ -347,6 +348,7 @@ static void autovac_report_activity(autovac_table *tab);
static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
const char *nspname, const char *relname);
static void avl_sigusr2_handler(SIGNAL_ARGS);
+static bool autovac_slot_available(void);
@@ -575,8 +577,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
* wakening conditions.
*/
- launcher_determine_sleep(!dlist_is_empty(&AutoVacuumShmem->av_freeWorkers),
- false, &nap);
+ launcher_determine_sleep(autovac_slot_available(), false, &nap);
/*
* Wait until naptime expires or we get some type of signal (all the
@@ -636,7 +637,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
current_time = GetCurrentTimestamp();
LWLockAcquire(AutovacuumLock, LW_SHARED);
- can_launch = !dlist_is_empty(&AutoVacuumShmem->av_freeWorkers);
+ can_launch = autovac_slot_available();
if (AutoVacuumShmem->av_startingWorker != NULL)
{
@@ -679,8 +680,8 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
worker->wi_sharedrel = false;
worker->wi_proc = NULL;
worker->wi_launchtime = 0;
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker->wi_links);
AutoVacuumShmem->av_startingWorker = NULL;
ereport(WARNING,
errmsg("autovacuum worker took too long to start; canceled"));
@@ -1087,7 +1088,7 @@ do_start_worker(void)
/* return quickly when there are no free workers */
LWLockAcquire(AutovacuumLock, LW_SHARED);
- if (dlist_is_empty(&AutoVacuumShmem->av_freeWorkers))
+ if (!autovac_slot_available())
{
LWLockRelease(AutovacuumLock);
return InvalidOid;
@@ -1240,7 +1241,7 @@ do_start_worker(void)
* Get a worker entry from the freelist. We checked above, so there
* really should be a free slot.
*/
- wptr = dlist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
+ wptr = dclist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
worker = dlist_container(WorkerInfoData, wi_links, wptr);
worker->wi_dboid = avdb->adw_datid;
@@ -1609,8 +1610,8 @@ FreeWorkerInfo(int code, Datum arg)
MyWorkerInfo->wi_proc = NULL;
MyWorkerInfo->wi_launchtime = 0;
pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &MyWorkerInfo->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &MyWorkerInfo->wi_links);
/* not mine anymore */
MyWorkerInfo = NULL;
@@ -3292,7 +3293,7 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
- dlist_init(&AutoVacuumShmem->av_freeWorkers);
+ dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
memset(AutoVacuumShmem->av_workItems, 0,
@@ -3304,8 +3305,8 @@ AutoVacuumShmemInit(void)
/* initialize the WorkerInfo free list */
for (i = 0; i < autovacuum_max_workers; i++)
{
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker[i].wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker[i].wi_links);
pg_atomic_init_flag(&worker[i].wi_dobalance);
}
@@ -3341,3 +3342,12 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+
+static bool
+autovac_slot_available(void)
+{
+ const dclist_head *freelist = &AutoVacuumShmem->av_freeWorkers;
+ int reserved_slots = autovacuum_max_workers - autovacuum_workers;
+
+ return dclist_count(freelist) > Max(0, reserved_slots);
+}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index c68fdc008b..29eb22939a 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3400,14 +3400,23 @@ struct config_int ConfigureNamesInt[] =
400000000, 10000, 2000000000,
NULL, NULL, NULL
},
+ {
+ {"autovacuum_workers", PGC_SIGHUP, AUTOVACUUM,
+ gettext_noop("Sets the maximum number of simultaneously running autovacuum worker processes."),
+ NULL
+ },
+ &autovacuum_workers,
+ 3, 1, MAX_BACKENDS,
+ NULL, NULL, NULL
+ },
{
/* see max_connections */
{"autovacuum_max_workers", PGC_POSTMASTER, AUTOVACUUM,
- gettext_noop("Sets the maximum number of simultaneously running autovacuum worker processes."),
+ gettext_noop("Sets the maximum effective value of autovacuum_workers."),
NULL
},
&autovacuum_max_workers,
- 3, 1, MAX_BACKENDS,
+ 16, 1, MAX_BACKENDS,
check_autovacuum_max_workers, NULL, NULL
},
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 2166ea4a87..f5bc403041 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -658,7 +658,8 @@
#autovacuum = on # Enable autovacuum subprocess? 'on'
# requires track_counts to also be on.
-#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_workers = 16 # effective limit for autovacuum_workers
# (change requires restart)
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index cae1e8b329..fb2936c161 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -28,6 +28,7 @@ typedef enum
/* GUC variables */
extern PGDLLIMPORT bool autovacuum_start_daemon;
+extern PGDLLIMPORT int autovacuum_workers;
extern PGDLLIMPORT int autovacuum_max_workers;
extern PGDLLIMPORT int autovacuum_work_mem;
extern PGDLLIMPORT int autovacuum_naptime;
diff --git a/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
index 37550b67a4..fb20c9084c 100644
--- a/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
+++ b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
@@ -21,7 +21,7 @@ $node->append_conf(
autovacuum = off # run autovacuum only when to anti wraparound
autovacuum_naptime = 1s
# so it's easier to verify the order of operations
-autovacuum_max_workers = 1
+autovacuum_workers = 1
log_autovacuum_min_duration = 0
]);
$node->start;
diff --git a/src/test/modules/xid_wraparound/t/003_wraparounds.pl b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
index 88063b4b52..35f2e1029a 100644
--- a/src/test/modules/xid_wraparound/t/003_wraparounds.pl
+++ b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
@@ -24,7 +24,7 @@ $node->append_conf(
autovacuum = off # run autovacuum only when to anti wraparound
autovacuum_naptime = 1s
# so it's easier to verify the order of operations
-autovacuum_max_workers = 1
+autovacuum_workers = 1
log_autovacuum_min_duration = 0
]);
$node->start;
I did not review the patch in detail yet, but +1 to the idea.
It's not just thousands of tables that suffer from this.
If a user has a few large tables hogging the autovac workers, then other
tables don't get the autovac cycles they require. Users are then forced
to run manual vacuums, which adds complexity to their operations.
max_worker_processes defines a pool of max # of background workers allowed.
parallel workers and extensions that spin up background workers all utilize from
this pool.
Should autovacuum_max_workers be able to utilize from max_worker_processes also?
This will allow autovacuum_max_workers to be dynamic while the user only has
to deal with an already existing GUC. We may want to increase the default value
for max_worker_processes as part of this.
Regards,
Sami
Amazon Web Services (AWS)
On Thu, Apr 11, 2024 at 02:24:18PM +0000, Imseih (AWS), Sami wrote:
max_worker_processes defines a pool of max # of background workers allowed.
parallel workers and extensions that spin up background workers all utilize from
this pool.
Should autovacuum_max_workers be able to utilize from max_worker_processes also?
This will allow autovacuum_max_workers to be dynamic while the user only has
to deal with an already existing GUC. We may want to increase the default value
for max_worker_processes as part of this.
My concern with this approach is that other background workers could use up
all the slots and prevent autovacuum workers from starting, unless of
course we reserve autovacuum_max_workers slots for _only_ autovacuum
workers. I'm not sure if we want to get these parameters tangled up like
this, though...
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Thu, Apr 11, 2024 at 09:42:40AM -0500, Nathan Bossart wrote:
On Thu, Apr 11, 2024 at 02:24:18PM +0000, Imseih (AWS), Sami wrote:
max_worker_processes defines a pool of max # of background workers allowed.
parallel workers and extensions that spin up background workers all utilize from
this pool.
Should autovacuum_max_workers be able to utilize from max_worker_processes also?
This will allow autovacuum_max_workers to be dynamic while the user only has
to deal with an already existing GUC. We may want to increase the default value
for max_worker_processes as part of this.
My concern with this approach is that other background workers could use up
all the slots and prevent autovacuum workers from starting, unless of
course we reserve autovacuum_max_workers slots for _only_ autovacuum
workers. I'm not sure if we want to get these parameters tangled up like
this, though...
I see that the logical replication launcher process uses this pool, but we
take special care to make sure it gets a slot:
/*
* Register the apply launcher. It's probably a good idea to call this
* before any modules had a chance to take the background worker slots.
*/
ApplyLauncherRegister();
I'm not sure there's another way to effectively reserve slots that would
work for the autovacuum workers (which need to restart to connect to
different databases), so that would need to be invented. We'd probably
also want to fail startup if autovacuum_max_workers < max_worker_processes,
which seems like it has the potential to cause problems when folks first
upgrade to v18.
Furthermore, we might have to convert autovacuum workers to background
worker processes for this to work. I've admittedly wondered about whether
we should do that eventually, anyway, but it'd expand the scope of this
work quite a bit.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
My concern with this approach is that other background workers could use up
all the slots and prevent autovacuum workers from starting
That's a good point; the current settings do not guarantee that you
get a worker for a given purpose if none are available,
e.g. with max_parallel_workers_per_gather, you may have 2 workers planned
and 0 launched.
unless of
course we reserve autovacuum_max_workers slots for _only_ autovacuum
workers. I'm not sure if we want to get these parameters tangled up like
this, though...
This will be confusing to describe and we will be reserving autovac workers
implicitly, rather than explicitly with a new GUC.
Regards,
Sami
On Thu, Apr 11, 2024 at 03:37:23PM +0000, Imseih (AWS), Sami wrote:
My concern with this approach is that other background workers could use up
all the slots and prevent autovacuum workers from starting
That's a good point, the current settings do not guarantee that you
get a worker for the purpose if none are available,
i.e. max_parallel_workers_per_gather, you may have 2 workers planned
and 0 launched.
unless of
course we reserve autovacuum_max_workers slots for _only_ autovacuum
workers. I'm not sure if we want to get these parameters tangled up like
this, though...
This will be confusing to describe and we will be reserving autovac workers
implicitly, rather than explicitly with a new GUC.
Yeah, that's probably a good reason to give autovacuum its own worker pool.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
I spent some time reviewing/testing the POC. It is relatively simple with a lot
of obvious value.
I tested with 16 tables that constantly reach the autovacuum threshold, and the
patch did the right thing. I observed the number of concurrent autovacuum workers
matching the setting as I was adjusting it dynamically.
As you mention above, if there are more autovacuums in progress than a newly
applied lower setting allows, we should not take any special action on those
autovacuums; eventually the number of active workers will drop to match the setting.
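That drain-down behavior can be sketched as a toy C model (not patch code; the name and shape of this function are invented for illustration):

```c
/*
 * Toy model: after lowering the cap, launching stays blocked until
 * enough running workers finish on their own.  Returns how many
 * workers must exit before a new one may be started.
 */
static int
workers_to_drain(int active, int cap)
{
	int		drained = 0;

	while (active >= cap)		/* launch blocked: at or over the new cap */
	{
		active--;				/* an existing worker finishes naturally */
		drained++;
	}
	return drained;				/* now a new worker could start */
}
```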
I also tested by allowing user connections to reach max_connections, and observed
the expected number of autovacuum workers spinning up and being correctly adjusted.
Having autovacuum tests like the above would be a good general improvement (unless
I missed something that already exists), but it should not be tied to this patch.
A few comments on the POC patch:
1/ We should emit a log when autovacuum_workers is set higher than the max.
2/ should the name of the restart limit be "reserved_autovacuum_workers"?
Regards,
Sami Imseih
AWS (Amazon Web Services)
On Fri, Apr 12, 2024 at 05:27:40PM +0000, Imseih (AWS), Sami wrote:
A few comments on the POC patch:
Thanks for reviewing.
1/ We should emit a log when autovacuum_workers is set higher than the max.
Hm. Maybe the autovacuum launcher could do that.
2/ should the name of the restart limit be "reserved_autovacuum_workers"?
That's kind-of what I had in mind, although I think we might want to avoid
the word "reserved" because it sounds a bit like reserved_connections and
superuser_reserved_connections. "autovacuum_max_slots" or
"autovacuum_max_worker_slots" might be worth considering, too.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
1/ We should emit a log when autovacuum_workers is set higher than the max.
Hm. Maybe the autovacuum launcher could do that.
Would it be better to use a GUC check_hook that compares the
new value with the maximum allowed value and emits a WARNING?
autovacuum_max_workers already has a check_autovacuum_max_workers
check_hook, which could be repurposed for this.
In the POC patch, this check_hook is kept as-is, which will no longer make sense.
2/ should the name of the restart limit be "reserved_autovacuum_workers"?
That's kind-of what I had in mind, although I think we might want to avoid
the word "reserved" because it sounds a bit like reserved_connections
and superuser_reserved_connections
Yes, I agree. This can be confusing.
"autovacuum_max_slots" or
"autovacuum_max_worker_slots" might be worth considering, too.
"autovacuum_max_worker_slots" is probably the best option because
we should have "worker" in the name of the GUC.
Regards,
Sami
On Fri, Apr 12, 2024 at 10:17:44PM +0000, Imseih (AWS), Sami wrote:
Hm. Maybe the autovacuum launcher could do that.
Would it be better to use a GUC check_hook that compares the
new value with the max allowed values and emits a WARNING ?
autovacuum_max_workers already has a check_autovacuum_max_workers
check_hook, which can be repurposed for this.
In the POC patch, this check_hook is kept as-is, which will no longer make sense.
IIRC using GUC hooks to handle dependencies like this is generally frowned
upon because it tends to not work very well [0]. We could probably get it
to work for this particular case, but IMHO we should still try to avoid
this approach. I didn't find any similar warnings for other GUCs like
max_parallel_workers_per_gather, so it might not be crucial to emit a
WARNING here.
[0]: /messages/by-id/27574.1581015893@sss.pgh.pa.us
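For illustration, the kind of cross-variable check_hook being advised against might look roughly like this (a sketch only, loosely modeled on the hook shape seen in the patch; the global here is a stand-in, and the reload-time ordering of GUC processing is exactly what makes this approach unreliable in practice):

```c
#include <stdbool.h>

/* Stand-in for the other GUC variable (illustrative only). */
static int	autovacuum_max_worker_slots = 16;	/* restart-only limit */

/*
 * A check-style hook for the PGC_SIGHUP parameter that peeks at the
 * other variable.  The hazard: during a reload both values may be
 * changing, and the hook sees whatever the other variable holds at
 * that moment, so a perfectly valid new pair of settings can still
 * trip the check and draw a spurious complaint.
 */
static bool
check_autovacuum_workers_sketch(int *newval)
{
	return *newval <= autovacuum_max_worker_slots;
}
```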
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
IIRC using GUC hooks to handle dependencies like this is generally frowned
upon because it tends to not work very well [0]. We could probably get it
to work for this particular case, but IMHO we should still try to avoid
this approach.
Thanks for pointing this out. I agree, this could lead to false logs being
emitted.
so it might not be crucial to emit a
WARNING here.
As mentioned earlier in the thread, we could let the autovacuum launcher emit the
log, but it would need to be careful not to flood the logs while this condition
exists (i.e., log only the first time the condition is detected, or log every once
in a while). The additional complexity is not worth it.
Regards,
Sami
Here is a first attempt at a proper patch set based on the discussion thus
far. I've split it up into several small patches for ease of review, which
is probably a bit excessive. If this ever makes it to commit, they could
likely be combined.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachments:
v1-0001-Rename-autovacuum_max_workers-to-autovacuum_max_w.patch
From 612e3ccc66f689a2a16e9dbf027541b71454bad3 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Sat, 13 Apr 2024 15:00:08 -0500
Subject: [PATCH v1 1/4] Rename autovacuum_max_workers to
autovacuum_max_worker_slots.
---
doc/src/sgml/config.sgml | 8 ++++----
doc/src/sgml/maintenance.sgml | 4 ++--
doc/src/sgml/runtime.sgml | 12 ++++++------
src/backend/access/transam/xlog.c | 2 +-
src/backend/postmaster/autovacuum.c | 8 ++++----
src/backend/postmaster/postmaster.c | 2 +-
src/backend/storage/lmgr/proc.c | 6 +++---
src/backend/utils/init/postinit.c | 12 ++++++------
src/backend/utils/misc/guc_tables.c | 6 +++---
src/backend/utils/misc/postgresql.conf.sample | 2 +-
src/include/postmaster/autovacuum.h | 2 +-
src/include/utils/guc_hooks.h | 4 ++--
.../modules/xid_wraparound/t/001_emergency_vacuum.pl | 2 +-
src/test/modules/xid_wraparound/t/003_wraparounds.pl | 2 +-
14 files changed, 36 insertions(+), 36 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d8e1282e12..b4d67a93b6 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1914,7 +1914,7 @@ include_dir 'conf.d'
</para>
<para>
Note that when autovacuum runs, up to
- <xref linkend="guc-autovacuum-max-workers"/> times this memory
+ <xref linkend="guc-autovacuum-max-worker-slots"/> times this memory
may be allocated, so be careful not to set the default value
too high. It may be useful to control for this by separately
setting <xref linkend="guc-autovacuum-work-mem"/>.
@@ -8534,10 +8534,10 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
- <varlistentry id="guc-autovacuum-max-workers" xreflabel="autovacuum_max_workers">
- <term><varname>autovacuum_max_workers</varname> (<type>integer</type>)
+ <varlistentry id="guc-autovacuum-max-worker-slots" xreflabel="autovacuum_max_worker_slots">
+ <term><varname>autovacuum_max_worker_slots</varname> (<type>integer</type>)
<indexterm>
- <primary><varname>autovacuum_max_workers</varname> configuration parameter</primary>
+ <primary><varname>autovacuum_max_worker_slots</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 2bfa05b8bc..7b4b3f0087 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -864,9 +864,9 @@ HINT: Execute a database-wide VACUUM in that database.
seconds. (Therefore, if the installation has <replaceable>N</replaceable> databases,
a new worker will be launched every
<varname>autovacuum_naptime</varname>/<replaceable>N</replaceable> seconds.)
- A maximum of <xref linkend="guc-autovacuum-max-workers"/> worker processes
+ A maximum of <xref linkend="guc-autovacuum-max-worker-slots"/> worker processes
are allowed to run at the same time. If there are more than
- <varname>autovacuum_max_workers</varname> databases to be processed,
+ <varname>autovacuum_max_worker_slots</varname> databases to be processed,
the next database will be processed as soon as the first worker finishes.
Each worker process will check each table within its database and
execute <command>VACUUM</command> and/or <command>ANALYZE</command> as needed.
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 6047b8171d..26a02034c8 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -781,13 +781,13 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
<row>
<entry><varname>SEMMNI</varname></entry>
<entry>Maximum number of semaphore identifiers (i.e., sets)</entry>
- <entry>at least <literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16)</literal> plus room for other applications</entry>
+ <entry>at least <literal>ceil((max_connections + autovacuum_max_worker_slots + max_wal_senders + max_worker_processes + 5) / 16)</literal> plus room for other applications</entry>
</row>
<row>
<entry><varname>SEMMNS</varname></entry>
<entry>Maximum number of semaphores system-wide</entry>
- <entry><literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16) * 17</literal> plus room for other applications</entry>
+ <entry><literal>ceil((max_connections + autovacuum_max_worker_slots + max_wal_senders + max_worker_processes + 5) / 16) * 17</literal> plus room for other applications</entry>
</row>
<row>
@@ -838,7 +838,7 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
When using System V semaphores,
<productname>PostgreSQL</productname> uses one semaphore per allowed connection
(<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>) and allowed background
+ (<xref linkend="guc-autovacuum-max-worker-slots"/>) and allowed background
process (<xref linkend="guc-max-worker-processes"/>), in sets of 16.
Each such set will
also contain a 17th semaphore which contains a <quote>magic
@@ -846,13 +846,13 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
other applications. The maximum number of semaphores in the system
is set by <varname>SEMMNS</varname>, which consequently must be at least
as high as <varname>max_connections</varname> plus
- <varname>autovacuum_max_workers</varname> plus <varname>max_wal_senders</varname>,
+ <varname>autovacuum_max_worker_slots</varname> plus <varname>max_wal_senders</varname>,
plus <varname>max_worker_processes</varname>, plus one extra for each 16
allowed connections plus workers (see the formula in <xref
linkend="sysvipc-parameters"/>). The parameter <varname>SEMMNI</varname>
determines the limit on the number of semaphore sets that can
exist on the system at one time. Hence this parameter must be at
- least <literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16)</literal>.
+ least <literal>ceil((max_connections + autovacuum_max_worker_slots + max_wal_senders + max_worker_processes + 5) / 16)</literal>.
Lowering the number
of allowed connections is a temporary workaround for failures,
which are usually confusingly worded <quote>No space
@@ -883,7 +883,7 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
When using POSIX semaphores, the number of semaphores needed is the
same as for System V, that is one semaphore per allowed connection
(<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>) and allowed background
+ (<xref linkend="guc-autovacuum-max-worker-slots"/>) and allowed background
process (<xref linkend="guc-max-worker-processes"/>).
On the platforms where this option is preferred, there is no specific
kernel limit on the number of POSIX semaphores.
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 34a2c71812..9f9ce5da7d 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5362,7 +5362,7 @@ CheckRequiredParameterValues(void)
*/
if (ArchiveRecoveryRequested && EnableHotStandby)
{
- /* We ignore autovacuum_max_workers when we make this test. */
+ /* We ignore autovacuum_max_worker_slots when we make this test. */
RecoveryRequiresIntParameter("max_connections",
MaxConnections,
ControlFile->MaxConnections);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c367ede6f8..af3d1e218e 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -114,7 +114,7 @@
* GUC parameters
*/
bool autovacuum_start_daemon = false;
-int autovacuum_max_workers;
+int autovacuum_max_worker_slots;
int autovacuum_work_mem = -1;
int autovacuum_naptime;
int autovacuum_vac_thresh;
@@ -209,7 +209,7 @@ typedef struct autovac_table
/*-------------
* This struct holds information about a single worker's whereabouts. We keep
* an array of these in shared memory, sized according to
- * autovacuum_max_workers.
+ * autovacuum_max_worker_slots.
*
* wi_links entry into free list or running list
* wi_dboid OID of the database this worker is supposed to work on
@@ -3262,7 +3262,7 @@ AutoVacuumShmemSize(void)
*/
size = sizeof(AutoVacuumShmemStruct);
size = MAXALIGN(size);
- size = add_size(size, mul_size(autovacuum_max_workers,
+ size = add_size(size, mul_size(autovacuum_max_worker_slots,
sizeof(WorkerInfoData)));
return size;
}
@@ -3299,7 +3299,7 @@ AutoVacuumShmemInit(void)
MAXALIGN(sizeof(AutoVacuumShmemStruct)));
/* initialize the WorkerInfo free list */
- for (i = 0; i < autovacuum_max_workers; i++)
+ for (i = 0; i < autovacuum_max_worker_slots; i++)
{
dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
&worker[i].wi_links);
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 7f3170a8f0..0faec534c0 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -4144,7 +4144,7 @@ CreateOptsFile(int argc, char *argv[], char *fullprogname)
int
MaxLivePostmasterChildren(void)
{
- return 2 * (MaxConnections + autovacuum_max_workers + 1 +
+ return 2 * (MaxConnections + autovacuum_max_worker_slots + 1 +
max_wal_senders + max_worker_processes);
}
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 162b1f919d..84339655e9 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -142,7 +142,7 @@ ProcGlobalSemas(void)
* So, now we grab enough semaphores to support the desired max number
* of backends immediately at initialization --- if the sysadmin has set
* MaxConnections, max_worker_processes, max_wal_senders, or
- * autovacuum_max_workers higher than his kernel will support, he'll
+ * autovacuum_max_worker_slots higher than his kernel will support, he'll
* find out sooner rather than later.
*
* Another reason for creating semaphores here is that the semaphore
@@ -242,13 +242,13 @@ InitProcGlobal(void)
dlist_push_tail(&ProcGlobal->freeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->freeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + 1)
+ else if (i < MaxConnections + autovacuum_max_worker_slots + 1)
{
/* PGPROC for AV launcher/worker, add to autovacFreeProcs list */
dlist_push_tail(&ProcGlobal->autovacFreeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->autovacFreeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + 1 + max_worker_processes)
+ else if (i < MaxConnections + autovacuum_max_worker_slots + 1 + max_worker_processes)
{
/* PGPROC for bgworker, add to bgworkerFreeProcs list */
dlist_push_tail(&ProcGlobal->bgworkerFreeProcs, &proc->links);
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 0805398e24..c05653262f 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -577,7 +577,7 @@ InitializeMaxBackends(void)
Assert(MaxBackends == 0);
/* the extra unit accounts for the autovacuum launcher */
- MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
+ MaxBackends = MaxConnections + autovacuum_max_worker_slots + 1 +
max_worker_processes + max_wal_senders;
/* internal error because the values were all checked previously */
@@ -591,17 +591,17 @@ InitializeMaxBackends(void)
bool
check_max_connections(int *newval, void **extra, GucSource source)
{
- if (*newval + autovacuum_max_workers + 1 +
+ if (*newval + autovacuum_max_worker_slots + 1 +
max_worker_processes + max_wal_senders > MAX_BACKENDS)
return false;
return true;
}
/*
- * GUC check_hook for autovacuum_max_workers
+ * GUC check_hook for autovacuum_max_worker_slots
*/
bool
-check_autovacuum_max_workers(int *newval, void **extra, GucSource source)
+check_autovacuum_max_worker_slots(int *newval, void **extra, GucSource source)
{
if (MaxConnections + *newval + 1 +
max_worker_processes + max_wal_senders > MAX_BACKENDS)
@@ -615,7 +615,7 @@ check_autovacuum_max_workers(int *newval, void **extra, GucSource source)
bool
check_max_worker_processes(int *newval, void **extra, GucSource source)
{
- if (MaxConnections + autovacuum_max_workers + 1 +
+ if (MaxConnections + autovacuum_max_worker_slots + 1 +
*newval + max_wal_senders > MAX_BACKENDS)
return false;
return true;
@@ -627,7 +627,7 @@ check_max_worker_processes(int *newval, void **extra, GucSource source)
bool
check_max_wal_senders(int *newval, void **extra, GucSource source)
{
- if (MaxConnections + autovacuum_max_workers + 1 +
+ if (MaxConnections + autovacuum_max_worker_slots + 1 +
max_worker_processes + *newval > MAX_BACKENDS)
return false;
return true;
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index c68fdc008b..92dea7061a 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3402,13 +3402,13 @@ struct config_int ConfigureNamesInt[] =
},
{
/* see max_connections */
- {"autovacuum_max_workers", PGC_POSTMASTER, AUTOVACUUM,
+ {"autovacuum_max_worker_slots", PGC_POSTMASTER, AUTOVACUUM,
gettext_noop("Sets the maximum number of simultaneously running autovacuum worker processes."),
NULL
},
- &autovacuum_max_workers,
+ &autovacuum_max_worker_slots,
3, 1, MAX_BACKENDS,
- check_autovacuum_max_workers, NULL, NULL
+ check_autovacuum_max_worker_slots, NULL, NULL
},
{
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 2166ea4a87..c37767cecf 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -658,7 +658,7 @@
#autovacuum = on # Enable autovacuum subprocess? 'on'
# requires track_counts to also be on.
-#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_worker_slots = 3 # max number of autovacuum subprocesses
# (change requires restart)
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index cae1e8b329..754d04485d 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -28,7 +28,7 @@ typedef enum
/* GUC variables */
extern PGDLLIMPORT bool autovacuum_start_daemon;
-extern PGDLLIMPORT int autovacuum_max_workers;
+extern PGDLLIMPORT int autovacuum_max_worker_slots;
extern PGDLLIMPORT int autovacuum_work_mem;
extern PGDLLIMPORT int autovacuum_naptime;
extern PGDLLIMPORT int autovacuum_vac_thresh;
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index d64dc5fcdb..22d4c50bc6 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -29,8 +29,8 @@ extern bool check_application_name(char **newval, void **extra,
GucSource source);
extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
-extern bool check_autovacuum_max_workers(int *newval, void **extra,
- GucSource source);
+extern bool check_autovacuum_max_worker_slots(int *newval, void **extra,
+ GucSource source);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
diff --git a/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
index 37550b67a4..f9cdd50c19 100644
--- a/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
+++ b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
@@ -21,7 +21,7 @@ $node->append_conf(
autovacuum = off # run autovacuum only when to anti wraparound
autovacuum_naptime = 1s
# so it's easier to verify the order of operations
-autovacuum_max_workers = 1
+autovacuum_max_worker_slots = 1
log_autovacuum_min_duration = 0
]);
$node->start;
diff --git a/src/test/modules/xid_wraparound/t/003_wraparounds.pl b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
index 88063b4b52..99f76229d5 100644
--- a/src/test/modules/xid_wraparound/t/003_wraparounds.pl
+++ b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
@@ -24,7 +24,7 @@ $node->append_conf(
autovacuum = off # run autovacuum only when to anti wraparound
autovacuum_naptime = 1s
# so it's easier to verify the order of operations
-autovacuum_max_workers = 1
+autovacuum_max_worker_slots = 1
log_autovacuum_min_duration = 0
]);
$node->start;
--
2.25.1
Attachment: v1-0002-Convert-autovacuum-s-free-workers-list-to-a-dclis.patch (text/x-diff)
From 2da218f260afdd68820ec2708fe7279ec2339d8a Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Sat, 13 Apr 2024 21:48:53 -0500
Subject: [PATCH v1 2/4] Convert autovacuum's free workers list to a dclist.
---
src/backend/postmaster/autovacuum.c | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index af3d1e218e..e925eff1e4 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -289,7 +289,7 @@ typedef struct
{
sig_atomic_t av_signal[AutoVacNumSignals];
pid_t av_launcherpid;
- dlist_head av_freeWorkers;
+ dclist_head av_freeWorkers;
dlist_head av_runningWorkers;
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
@@ -575,7 +575,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
* wakening conditions.
*/
- launcher_determine_sleep(!dlist_is_empty(&AutoVacuumShmem->av_freeWorkers),
+ launcher_determine_sleep(!dclist_is_empty(&AutoVacuumShmem->av_freeWorkers),
false, &nap);
/*
@@ -636,7 +636,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
current_time = GetCurrentTimestamp();
LWLockAcquire(AutovacuumLock, LW_SHARED);
- can_launch = !dlist_is_empty(&AutoVacuumShmem->av_freeWorkers);
+ can_launch = !dclist_is_empty(&AutoVacuumShmem->av_freeWorkers);
if (AutoVacuumShmem->av_startingWorker != NULL)
{
@@ -679,8 +679,8 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
worker->wi_sharedrel = false;
worker->wi_proc = NULL;
worker->wi_launchtime = 0;
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker->wi_links);
AutoVacuumShmem->av_startingWorker = NULL;
ereport(WARNING,
errmsg("autovacuum worker took too long to start; canceled"));
@@ -1087,7 +1087,7 @@ do_start_worker(void)
/* return quickly when there are no free workers */
LWLockAcquire(AutovacuumLock, LW_SHARED);
- if (dlist_is_empty(&AutoVacuumShmem->av_freeWorkers))
+ if (dclist_is_empty(&AutoVacuumShmem->av_freeWorkers))
{
LWLockRelease(AutovacuumLock);
return InvalidOid;
@@ -1240,7 +1240,7 @@ do_start_worker(void)
* Get a worker entry from the freelist. We checked above, so there
* really should be a free slot.
*/
- wptr = dlist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
+ wptr = dclist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
worker = dlist_container(WorkerInfoData, wi_links, wptr);
worker->wi_dboid = avdb->adw_datid;
@@ -1609,8 +1609,8 @@ FreeWorkerInfo(int code, Datum arg)
MyWorkerInfo->wi_proc = NULL;
MyWorkerInfo->wi_launchtime = 0;
pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &MyWorkerInfo->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &MyWorkerInfo->wi_links);
/* not mine anymore */
MyWorkerInfo = NULL;
@@ -3289,7 +3289,7 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
- dlist_init(&AutoVacuumShmem->av_freeWorkers);
+ dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
memset(AutoVacuumShmem->av_workItems, 0,
@@ -3301,8 +3301,8 @@ AutoVacuumShmemInit(void)
/* initialize the WorkerInfo free list */
for (i = 0; i < autovacuum_max_worker_slots; i++)
{
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker[i].wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker[i].wi_links);
pg_atomic_init_flag(&worker[i].wi_dobalance);
}
--
2.25.1
Attachment: v1-0003-Move-free-autovacuum-worker-checks-to-a-helper-fu.patch (text/x-diff)
From b54ccd2f461fd325037f13806be16bea58dad321 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Sun, 14 Apr 2024 09:04:01 -0500
Subject: [PATCH v1 3/4] Move free autovacuum worker checks to a helper
function.
---
src/backend/postmaster/autovacuum.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index e925eff1e4..f80365faff 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -347,6 +347,7 @@ static void autovac_report_activity(autovac_table *tab);
static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
const char *nspname, const char *relname);
static void avl_sigusr2_handler(SIGNAL_ARGS);
+static bool av_worker_available(void);
@@ -575,8 +576,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
* wakening conditions.
*/
- launcher_determine_sleep(!dclist_is_empty(&AutoVacuumShmem->av_freeWorkers),
- false, &nap);
+ launcher_determine_sleep(av_worker_available(), false, &nap);
/*
* Wait until naptime expires or we get some type of signal (all the
@@ -636,7 +636,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
current_time = GetCurrentTimestamp();
LWLockAcquire(AutovacuumLock, LW_SHARED);
- can_launch = !dclist_is_empty(&AutoVacuumShmem->av_freeWorkers);
+ can_launch = av_worker_available();
if (AutoVacuumShmem->av_startingWorker != NULL)
{
@@ -1087,7 +1087,7 @@ do_start_worker(void)
/* return quickly when there are no free workers */
LWLockAcquire(AutovacuumLock, LW_SHARED);
- if (dclist_is_empty(&AutoVacuumShmem->av_freeWorkers))
+ if (!av_worker_available())
{
LWLockRelease(AutovacuumLock);
return InvalidOid;
@@ -3338,3 +3338,14 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+
+/*
+ * Returns whether there is a free autovacuum worker slot available.
+ */
+static bool
+av_worker_available(void)
+{
+ const dclist_head *freelist = &AutoVacuumShmem->av_freeWorkers;
+
+ return dclist_count(freelist) > 0;
+}
--
2.25.1
Attachment: v1-0004-Reintroduce-autovacuum_max_workers-as-a-PGC_SIGHU.patch (text/x-diff)
From 259615d1b03ce2f27ddd17d9210147e39cd7a4cf Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Sat, 13 Apr 2024 21:42:33 -0500
Subject: [PATCH v1 4/4] Reintroduce autovacuum_max_workers as a PGC_SIGHUP
parameter.
---
doc/src/sgml/config.sgml | 25 ++++++++++++++++++-
doc/src/sgml/maintenance.sgml | 4 +--
src/backend/postmaster/autovacuum.c | 4 ++-
src/backend/utils/misc/guc_tables.c | 15 ++++++++---
src/backend/utils/misc/postgresql.conf.sample | 3 ++-
src/include/postmaster/autovacuum.h | 1 +
.../xid_wraparound/t/001_emergency_vacuum.pl | 2 +-
.../xid_wraparound/t/003_wraparounds.pl | 2 +-
8 files changed, 46 insertions(+), 10 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b4d67a93b6..569b090593 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1914,7 +1914,7 @@ include_dir 'conf.d'
</para>
<para>
Note that when autovacuum runs, up to
- <xref linkend="guc-autovacuum-max-worker-slots"/> times this memory
+ <xref linkend="guc-autovacuum-max-workers"/> times this memory
may be allocated, so be careful not to set the default value
too high. It may be useful to control for this by separately
setting <xref linkend="guc-autovacuum-work-mem"/>.
@@ -8540,12 +8540,35 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
<primary><varname>autovacuum_max_worker_slots</varname> configuration parameter</primary>
</indexterm>
</term>
+ <listitem>
+ <para>
+ Specifies the number of backend slots to reserve for autovacuum worker
+ processes. The default is 32. This parameter can only be set at server
+ start.
+ </para>
+ <para>
+ Note that the value of <xref linkend="guc-autovacuum-max-workers"/> is
+ silently capped to this value.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry id="guc-autovacuum-max-workers" xreflabel="autovacuum_max_workers">
+ <term><varname>autovacuum_max_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_workers</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
<listitem>
<para>
Specifies the maximum number of autovacuum processes (other than the
autovacuum launcher) that may be running at any one time. The default
is three. This parameter can only be set at server start.
</para>
+ <para>
+ Note that this value is silently capped to the value of
+ <xref linkend="guc-autovacuum-max-worker-slots"/>.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 7b4b3f0087..2bfa05b8bc 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -864,9 +864,9 @@ HINT: Execute a database-wide VACUUM in that database.
seconds. (Therefore, if the installation has <replaceable>N</replaceable> databases,
a new worker will be launched every
<varname>autovacuum_naptime</varname>/<replaceable>N</replaceable> seconds.)
- A maximum of <xref linkend="guc-autovacuum-max-worker-slots"/> worker processes
+ A maximum of <xref linkend="guc-autovacuum-max-workers"/> worker processes
are allowed to run at the same time. If there are more than
- <varname>autovacuum_max_worker_slots</varname> databases to be processed,
+ <varname>autovacuum_max_workers</varname> databases to be processed,
the next database will be processed as soon as the first worker finishes.
Each worker process will check each table within its database and
execute <command>VACUUM</command> and/or <command>ANALYZE</command> as needed.
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f80365faff..ed7e2b462f 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -115,6 +115,7 @@
*/
bool autovacuum_start_daemon = false;
int autovacuum_max_worker_slots;
+int autovacuum_max_workers;
int autovacuum_work_mem = -1;
int autovacuum_naptime;
int autovacuum_vac_thresh;
@@ -3346,6 +3347,7 @@ static bool
av_worker_available(void)
{
const dclist_head *freelist = &AutoVacuumShmem->av_freeWorkers;
+ int reserved_slots = autovacuum_max_worker_slots - autovacuum_max_workers;
- return dclist_count(freelist) > 0;
+ return dclist_count(freelist) > Max(0, reserved_slots);
}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 92dea7061a..92d4d10fe9 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3403,13 +3403,22 @@ struct config_int ConfigureNamesInt[] =
{
/* see max_connections */
{"autovacuum_max_worker_slots", PGC_POSTMASTER, AUTOVACUUM,
- gettext_noop("Sets the maximum number of simultaneously running autovacuum worker processes."),
- NULL
+ gettext_noop("Sets the number of backend slots to allocate for autovacuum workers."),
+ gettext_noop("autovacuum_max_workers is silently capped to this value.")
},
&autovacuum_max_worker_slots,
- 3, 1, MAX_BACKENDS,
+ 32, 1, MAX_BACKENDS,
check_autovacuum_max_worker_slots, NULL, NULL
},
+ {
+ {"autovacuum_max_workers", PGC_SIGHUP, AUTOVACUUM,
+ gettext_noop("Sets the maximum number of simultaneously running autovacuum worker processes."),
+ gettext_noop("This value is silently capped to autovacuum_max_worker_slots.")
+ },
+ &autovacuum_max_workers,
+ 3, 1, MAX_BACKENDS,
+ NULL, NULL, NULL
+ },
{
{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_ASYNCHRONOUS,
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index c37767cecf..c46d245153 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -658,8 +658,9 @@
#autovacuum = on # Enable autovacuum subprocess? 'on'
# requires track_counts to also be on.
-#autovacuum_max_worker_slots = 3 # max number of autovacuum subprocesses
+#autovacuum_max_worker_slots = 32 # autovacuum worker slots to allocate
# (change requires restart)
+#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 754d04485d..598782fd34 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -29,6 +29,7 @@ typedef enum
/* GUC variables */
extern PGDLLIMPORT bool autovacuum_start_daemon;
extern PGDLLIMPORT int autovacuum_max_worker_slots;
+extern PGDLLIMPORT int autovacuum_max_workers;
extern PGDLLIMPORT int autovacuum_work_mem;
extern PGDLLIMPORT int autovacuum_naptime;
extern PGDLLIMPORT int autovacuum_vac_thresh;
diff --git a/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
index f9cdd50c19..37550b67a4 100644
--- a/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
+++ b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
@@ -21,7 +21,7 @@ $node->append_conf(
autovacuum = off # run autovacuum only when to anti wraparound
autovacuum_naptime = 1s
# so it's easier to verify the order of operations
-autovacuum_max_worker_slots = 1
+autovacuum_max_workers = 1
log_autovacuum_min_duration = 0
]);
$node->start;
diff --git a/src/test/modules/xid_wraparound/t/003_wraparounds.pl b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
index 99f76229d5..88063b4b52 100644
--- a/src/test/modules/xid_wraparound/t/003_wraparounds.pl
+++ b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
@@ -24,7 +24,7 @@ $node->append_conf(
autovacuum = off # run autovacuum only when to anti wraparound
autovacuum_naptime = 1s
# so it's easier to verify the order of operations
-autovacuum_max_worker_slots = 1
+autovacuum_max_workers = 1
log_autovacuum_min_duration = 0
]);
$node->start;
--
2.25.1
On Wed, Apr 10, 2024 at 04:23:44PM -0500, Nathan Bossart wrote:
The attached proof-of-concept patch demonstrates what I have in mind.
Instead of trying to dynamically change the global process table, etc., I'm
proposing that we introduce a new GUC that sets the effective maximum
number of autovacuum workers that can be started at any time. This means
there would be two GUCs for the number of autovacuum workers: one for the
number of slots reserved for autovacuum workers, and another that restricts
the number of those slots that can be used. The former would continue to
require a restart to change its value, and users would typically want to
set it relatively high. The latter could be changed at any time and would
allow for raising or lowering the maximum number of active autovacuum
workers, up to the limit set by the other parameter.

The proof-of-concept patch keeps autovacuum_max_workers as the maximum
number of slots to reserve for workers, but I think we should instead
rename this parameter to something else and then reintroduce
autovacuum_max_workers as the new parameter that can be adjusted without
restarting. That way, autovacuum_max_workers continues to work much the
same way as in previous versions.
When I thought about this, I considered proposing to add a new GUC for
"autovacuum_policy_workers".
autovacuum_max_workers would be the same as before, requiring a restart
to change. The policy GUC would be the soft limit, changeable at runtime
up to the hard limit of autovacuum_max_workers (or maybe any policy
value exceeding autovacuum_max_workers would be ignored).
We'd probably change autovacuum_max_workers to default to a higher value
(8, or 32 as in your patch), and have autovacuum_policy_workers default to
3, for consistency with historic behavior. Maybe
autovacuum_policy_workers=-1 would mean to use all workers.
There's the existing idea to change autovacuum thresholds during the
busy period of the day vs. off hours. This would allow something
similar with nworkers rather than thresholds: if the goal were to reduce
the resource use of vacuum, the admin could set max_workers=8, with
policy_workers=2 during the busy period.
--
Justin
On Mon, Apr 15, 2024 at 08:33:33AM -0500, Justin Pryzby wrote:
On Wed, Apr 10, 2024 at 04:23:44PM -0500, Nathan Bossart wrote:
The proof-of-concept patch keeps autovacuum_max_workers as the maximum
number of slots to reserve for workers, but I think we should instead
rename this parameter to something else and then reintroduce
autovacuum_max_workers as the new parameter that can be adjusted without
restarting. That way, autovacuum_max_workers continues to work much the
same way as in previous versions.

When I thought about this, I considered proposing to add a new GUC for
"autovacuum_policy_workers".

autovacuum_max_workers would be the same as before, requiring a restart
to change. The policy GUC would be the soft limit, changeable at runtime
up to the hard limit of autovacuum_max_workers (or maybe any policy
value exceeding autovacuum_max_workers would be ignored).

We'd probably change autovacuum_max_workers to default to a higher value
(8, or 32 as in your patch), and have autovacuum_policy_workers default to
3, for consistency with historic behavior. Maybe
autovacuum_policy_workers=-1 would mean to use all workers.
This sounds like roughly the same idea, although it is backwards from what
I'm proposing in the v1 patch set. My thinking is that by making a new
restart-only GUC that would by default be set higher than the vast majority
of systems should ever need, we could simplify migrating to these
parameters. The autovacuum_max_workers parameter would effectively retain
its original meaning, and existing settings would continue to work
normally on v18, but users could now adjust it without restarting. If we
did it the other way, users would need to bump up autovacuum_max_workers
and restart prior to being able to raise autovacuum_policy_workers beyond
what they previously had set for autovacuum_max_workers. That being said,
I'm open to doing it this way if folks prefer this approach, as I think it
is still an improvement.
There's the existing idea to change autovacuum thresholds during the
busy period of the day vs. off hours. This would allow something
similar with nworkers rather than thresholds: if the goal were to reduce
the resource use of vacuum, the admin could set max_workers=8, with
policy_workers=2 during the busy period.
Precisely.
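For concreteness, the soft cap in patch 0004 comes down to a reserved-slot
check. Below is a minimal, standalone model of that check (the GUC values
here are illustrative, and the real av_worker_available() reads the shared
freelist count under AutovacuumLock rather than taking it as an argument):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the two GUCs (example values). */
static int autovacuum_max_worker_slots = 32;	/* restart-only hard limit */
static int autovacuum_max_workers = 3;			/* PGC_SIGHUP soft limit */

/*
 * Model of av_worker_available() from patch 0004: treat the top
 * (slots - soft limit) freelist entries as reserved, so that at most
 * autovacuum_max_workers slots are ever in use at once.
 */
static bool
worker_available(int free_count)
{
	int			reserved = autovacuum_max_worker_slots - autovacuum_max_workers;

	/* soft limit set above the hard limit: just use every slot */
	if (reserved < 0)
		reserved = 0;
	return free_count > reserved;
}
```

With all 32 slots free a worker may start; once 3 are running (29 free) the
soft cap blocks further launches, and raising autovacuum_max_workers via
SIGHUP immediately creates headroom without touching shared memory.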
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Mon, Apr 15, 2024 at 11:28:33AM -0500, Nathan Bossart wrote:
On Mon, Apr 15, 2024 at 08:33:33AM -0500, Justin Pryzby wrote:
On Wed, Apr 10, 2024 at 04:23:44PM -0500, Nathan Bossart wrote:
The proof-of-concept patch keeps autovacuum_max_workers as the maximum
number of slots to reserve for workers, but I think we should instead
rename this parameter to something else and then reintroduce
autovacuum_max_workers as the new parameter that can be adjusted without
restarting. That way, autovacuum_max_workers continues to work much the
same way as in previous versions.

When I thought about this, I considered proposing to add a new GUC for
"autovacuum_policy_workers".

autovacuum_max_workers would be the same as before, requiring a restart
to change. The policy GUC would be the soft limit, changeable at runtime
up to the hard limit of autovacuum_max_workers (or maybe any policy
value exceeding autovacuum_max_workers would be ignored).

We'd probably change autovacuum_max_workers to default to a higher value
(8, or 32 as in your patch), and have autovacuum_policy_workers default to
3, for consistency with historic behavior. Maybe
autovacuum_policy_workers=-1 would mean to use all workers.

This sounds like roughly the same idea, although it is backwards from what
I'm proposing in the v1 patch set. My thinking is that by making a new
restart-only GUC that would by default be set higher than the vast majority
of systems should ever need, we could simplify migrating to these
parameters. The autovacuum_max_workers parameter would effectively retain
its original meaning, and existing settings would continue to work
normally on v18, but users could now adjust it without restarting. If we
did it the other way, users would need to bump up autovacuum_max_workers
and restart prior to being able to raise autovacuum_policy_workers beyond
what they previously had set for autovacuum_max_workers. That being said,
I'm open to doing it this way if folks prefer this approach, as I think it
is still an improvement.
Another option could be to just remove the restart-only GUC and hard-code
the upper limit of autovacuum_max_workers to 64 or 128 or something. While
that would simplify matters, I suspect it would be hard to choose an
appropriate limit that won't quickly become outdated.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Another option could be to just remove the restart-only GUC and hard-code
the upper limit of autovacuum_max_workers to 64 or 128 or something. While
that would simplify matters, I suspect it would be hard to choose an
appropriate limit that won't quickly become outdated.
Hardcoded values are usually hard to deal with because they are hidden either
in code or in docs.
When I thought about this, I considered proposing to add a new GUC for
"autovacuum_policy_workers".
autovacuum_max_workers would be the same as before, requiring a restart
to change. The policy GUC would be the soft limit, changeable at runtime
I think autovacuum_max_workers should still be the GUC that controls
the number of concurrent autovacuums. This parameter is already well
established and changing the meaning now will be confusing.
I suspect most users will be glad it's now dynamic, but will probably
be annoyed if it's no longer doing what it's supposed to.
Regards,
Sami
Agree, +1. From a DBA perspective, I would prefer that this parameter can be
dynamically modified rather than adding a new parameter. What is more
difficult is how to smoothly reach the target value when the setting is
considered too large and needs to be lowered.
Regards
On Tue, 16 Apr 2024 at 01:41, Imseih (AWS), Sami <simseih@amazon.com> wrote:
Another option could be to just remove the restart-only GUC and hard-code
the upper limit of autovacuum_max_workers to 64 or 128 or something. While
that would simplify matters, I suspect it would be hard to choose an
appropriate limit that won't quickly become outdated.

Hardcoded values are usually hard to deal with because they are hidden either
in code or in docs.

When I thought about this, I considered proposing to add a new GUC for
"autovacuum_policy_workers". autovacuum_max_workers would be the same as
before, requiring a restart to change. The policy GUC would be the soft
limit, changeable at runtime.

I think autovacuum_max_workers should still be the GUC that controls
the number of concurrent autovacuums. This parameter is already well
established and changing the meaning now will be confusing.

I suspect most users will be glad it's now dynamic, but will probably
be annoyed if it's no longer doing what it's supposed to.

Regards,
Sami
Here is a first attempt at a proper patch set based on the discussion thus
far. I've split it up into several small patches for ease of review, which
is probably a bit excessive. If this ever makes it to commit, they could
likely be combined.
I looked at the patch set. With the help of DEBUG2 output, I tested to ensure
that the autovacuum_cost_limit balance adjusts correctly when the
autovacuum_max_workers value increases/decreases. I did not think the
patch would break this behavior, but it's important to verify this.
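For readers following along, the behavior under test is roughly this: the
configured cost limit is divided among the workers that are actually
running, so each worker's share shrinks as workers are added. A simplified
sketch (the real autovac_balance_cost() also honors per-table cost settings
and skips workers with the wi_dobalance flag cleared):

```c
#include <assert.h>

/*
 * Simplified model of autovacuum cost-limit balancing: each active
 * worker gets an even share of the configured limit, clamped to at
 * least 1 so no worker stalls entirely.
 */
static int
per_worker_cost_limit(int vac_cost_limit, int active_workers)
{
	int			limit = vac_cost_limit / active_workers;

	return (limit > 0) ? limit : 1;
}
```

So with a limit of 200 and two workers, each gets 100; with eight workers,
each gets 25. This is why raising autovacuum_max_workers alone does not
increase the total I/O budget autovacuum may consume.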
Some comments on the patch:
1. A nit. There should be a tab here.
- dlist_head av_freeWorkers;
+ dclist_head av_freeWorkers;
2. autovacuum_max_worker_slots documentation:
+ <para>
+ Note that the value of <xref linkend="guc-autovacuum-max-workers"/> is
+ silently capped to this value.
+ </para>
This comment looks redundant in the docs, since the entry
for autovacuum_max_workers that follows mentions the
same.
3. The docs for autovacuum_max_workers should mention that when
the value changes, users should consider adjusting the
autovacuum_cost_limit/cost_delay settings.
This is not something new. Even in the current state, users should think about
these settings. However, it seems even more important if this value is to be
dynamically adjusted.
Regards,
Sami
On Thu, Apr 18, 2024 at 05:05:03AM +0000, Imseih (AWS), Sami wrote:
I looked at the patch set. With the help of DEBUG2 output, I tested to ensure
that the autovacuum_cost_limit balance adjusts correctly when the
autovacuum_max_workers value increases/decreases. I did not think the
patch would break this behavior, but it's important to verify this.
Great.
1. A nit. There should be a tab here.
- dlist_head av_freeWorkers;
+ dclist_head av_freeWorkers;
I dare not argue with pgindent.
2. autovacuum_max_worker_slots documentation:
+ <para>
+ Note that the value of <xref linkend="guc-autovacuum-max-workers"/> is
+ silently capped to this value.
+ </para>

This comment looks redundant in the docs, since the entry
for autovacuum_max_workers that follows mentions the
same.
Removed in v2. I also noticed that I forgot to update the part about when
autovacuum_max_workers can be changed. *facepalm*
3. The docs for autovacuum_max_workers should mention that when
the value changes, users should consider adjusting the
autovacuum_cost_limit/cost_delay settings.

This is not something new. Even in the current state, users should think about
these settings. However, it seems even more important if this value is to be
dynamically adjusted.
I don't necessarily disagree that it might be worth mentioning these
parameters, but I would argue that this should be proposed in a separate
thread.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachments:
v2-0001-Rename-autovacuum_max_workers-to-autovacuum_max_w.patch (text/x-diff)
From 466f31a23605755a8e3d17c362c9f4940bf91da0 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Sat, 13 Apr 2024 15:00:08 -0500
Subject: [PATCH v2 1/4] Rename autovacuum_max_workers to
autovacuum_max_worker_slots.
---
doc/src/sgml/config.sgml | 8 ++++----
doc/src/sgml/maintenance.sgml | 4 ++--
doc/src/sgml/runtime.sgml | 12 ++++++------
src/backend/access/transam/xlog.c | 2 +-
src/backend/postmaster/autovacuum.c | 8 ++++----
src/backend/postmaster/postmaster.c | 2 +-
src/backend/storage/lmgr/proc.c | 6 +++---
src/backend/utils/init/postinit.c | 12 ++++++------
src/backend/utils/misc/guc_tables.c | 6 +++---
src/backend/utils/misc/postgresql.conf.sample | 2 +-
src/include/postmaster/autovacuum.h | 2 +-
src/include/utils/guc_hooks.h | 4 ++--
.../modules/xid_wraparound/t/001_emergency_vacuum.pl | 2 +-
src/test/modules/xid_wraparound/t/003_wraparounds.pl | 2 +-
14 files changed, 36 insertions(+), 36 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d8e1282e12..b4d67a93b6 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1914,7 +1914,7 @@ include_dir 'conf.d'
</para>
<para>
Note that when autovacuum runs, up to
- <xref linkend="guc-autovacuum-max-workers"/> times this memory
+ <xref linkend="guc-autovacuum-max-worker-slots"/> times this memory
may be allocated, so be careful not to set the default value
too high. It may be useful to control for this by separately
setting <xref linkend="guc-autovacuum-work-mem"/>.
@@ -8534,10 +8534,10 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
- <varlistentry id="guc-autovacuum-max-workers" xreflabel="autovacuum_max_workers">
- <term><varname>autovacuum_max_workers</varname> (<type>integer</type>)
+ <varlistentry id="guc-autovacuum-max-worker-slots" xreflabel="autovacuum_max_worker_slots">
+ <term><varname>autovacuum_max_worker_slots</varname> (<type>integer</type>)
<indexterm>
- <primary><varname>autovacuum_max_workers</varname> configuration parameter</primary>
+ <primary><varname>autovacuum_max_worker_slots</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 0be90bdc7e..5373acba41 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -864,9 +864,9 @@ HINT: Execute a database-wide VACUUM in that database.
seconds. (Therefore, if the installation has <replaceable>N</replaceable> databases,
a new worker will be launched every
<varname>autovacuum_naptime</varname>/<replaceable>N</replaceable> seconds.)
- A maximum of <xref linkend="guc-autovacuum-max-workers"/> worker processes
+ A maximum of <xref linkend="guc-autovacuum-max-worker-slots"/> worker processes
are allowed to run at the same time. If there are more than
- <varname>autovacuum_max_workers</varname> databases to be processed,
+ <varname>autovacuum_max_worker_slots</varname> databases to be processed,
the next database will be processed as soon as the first worker finishes.
Each worker process will check each table within its database and
execute <command>VACUUM</command> and/or <command>ANALYZE</command> as needed.
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 6047b8171d..26a02034c8 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -781,13 +781,13 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
<row>
<entry><varname>SEMMNI</varname></entry>
<entry>Maximum number of semaphore identifiers (i.e., sets)</entry>
- <entry>at least <literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16)</literal> plus room for other applications</entry>
+ <entry>at least <literal>ceil((max_connections + autovacuum_max_worker_slots + max_wal_senders + max_worker_processes + 5) / 16)</literal> plus room for other applications</entry>
</row>
<row>
<entry><varname>SEMMNS</varname></entry>
<entry>Maximum number of semaphores system-wide</entry>
- <entry><literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16) * 17</literal> plus room for other applications</entry>
+ <entry><literal>ceil((max_connections + autovacuum_max_worker_slots + max_wal_senders + max_worker_processes + 5) / 16) * 17</literal> plus room for other applications</entry>
</row>
<row>
@@ -838,7 +838,7 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
When using System V semaphores,
<productname>PostgreSQL</productname> uses one semaphore per allowed connection
(<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>) and allowed background
+ (<xref linkend="guc-autovacuum-max-worker-slots"/>) and allowed background
process (<xref linkend="guc-max-worker-processes"/>), in sets of 16.
Each such set will
also contain a 17th semaphore which contains a <quote>magic
@@ -846,13 +846,13 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
other applications. The maximum number of semaphores in the system
is set by <varname>SEMMNS</varname>, which consequently must be at least
as high as <varname>max_connections</varname> plus
- <varname>autovacuum_max_workers</varname> plus <varname>max_wal_senders</varname>,
+ <varname>autovacuum_max_worker_slots</varname> plus <varname>max_wal_senders</varname>,
plus <varname>max_worker_processes</varname>, plus one extra for each 16
allowed connections plus workers (see the formula in <xref
linkend="sysvipc-parameters"/>). The parameter <varname>SEMMNI</varname>
determines the limit on the number of semaphore sets that can
exist on the system at one time. Hence this parameter must be at
- least <literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16)</literal>.
+ least <literal>ceil((max_connections + autovacuum_max_worker_slots + max_wal_senders + max_worker_processes + 5) / 16)</literal>.
Lowering the number
of allowed connections is a temporary workaround for failures,
which are usually confusingly worded <quote>No space
@@ -883,7 +883,7 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
When using POSIX semaphores, the number of semaphores needed is the
same as for System V, that is one semaphore per allowed connection
(<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>) and allowed background
+ (<xref linkend="guc-autovacuum-max-worker-slots"/>) and allowed background
process (<xref linkend="guc-max-worker-processes"/>).
On the platforms where this option is preferred, there is no specific
kernel limit on the number of POSIX semaphores.
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 34a2c71812..9f9ce5da7d 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5362,7 +5362,7 @@ CheckRequiredParameterValues(void)
*/
if (ArchiveRecoveryRequested && EnableHotStandby)
{
- /* We ignore autovacuum_max_workers when we make this test. */
+ /* We ignore autovacuum_max_worker_slots when we make this test. */
RecoveryRequiresIntParameter("max_connections",
MaxConnections,
ControlFile->MaxConnections);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index c367ede6f8..af3d1e218e 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -114,7 +114,7 @@
* GUC parameters
*/
bool autovacuum_start_daemon = false;
-int autovacuum_max_workers;
+int autovacuum_max_worker_slots;
int autovacuum_work_mem = -1;
int autovacuum_naptime;
int autovacuum_vac_thresh;
@@ -209,7 +209,7 @@ typedef struct autovac_table
/*-------------
* This struct holds information about a single worker's whereabouts. We keep
* an array of these in shared memory, sized according to
- * autovacuum_max_workers.
+ * autovacuum_max_worker_slots.
*
* wi_links entry into free list or running list
* wi_dboid OID of the database this worker is supposed to work on
@@ -3262,7 +3262,7 @@ AutoVacuumShmemSize(void)
*/
size = sizeof(AutoVacuumShmemStruct);
size = MAXALIGN(size);
- size = add_size(size, mul_size(autovacuum_max_workers,
+ size = add_size(size, mul_size(autovacuum_max_worker_slots,
sizeof(WorkerInfoData)));
return size;
}
@@ -3299,7 +3299,7 @@ AutoVacuumShmemInit(void)
MAXALIGN(sizeof(AutoVacuumShmemStruct)));
/* initialize the WorkerInfo free list */
- for (i = 0; i < autovacuum_max_workers; i++)
+ for (i = 0; i < autovacuum_max_worker_slots; i++)
{
dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
&worker[i].wi_links);
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 7f3170a8f0..0faec534c0 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -4144,7 +4144,7 @@ CreateOptsFile(int argc, char *argv[], char *fullprogname)
int
MaxLivePostmasterChildren(void)
{
- return 2 * (MaxConnections + autovacuum_max_workers + 1 +
+ return 2 * (MaxConnections + autovacuum_max_worker_slots + 1 +
max_wal_senders + max_worker_processes);
}
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index e4f256c63c..4587c5a508 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -142,7 +142,7 @@ ProcGlobalSemas(void)
* So, now we grab enough semaphores to support the desired max number
* of backends immediately at initialization --- if the sysadmin has set
* MaxConnections, max_worker_processes, max_wal_senders, or
- * autovacuum_max_workers higher than his kernel will support, he'll
+ * autovacuum_max_worker_slots higher than his kernel will support, he'll
* find out sooner rather than later.
*
* Another reason for creating semaphores here is that the semaphore
@@ -242,13 +242,13 @@ InitProcGlobal(void)
dlist_push_tail(&ProcGlobal->freeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->freeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + 1)
+ else if (i < MaxConnections + autovacuum_max_worker_slots + 1)
{
/* PGPROC for AV launcher/worker, add to autovacFreeProcs list */
dlist_push_tail(&ProcGlobal->autovacFreeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->autovacFreeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + 1 + max_worker_processes)
+ else if (i < MaxConnections + autovacuum_max_worker_slots + 1 + max_worker_processes)
{
/* PGPROC for bgworker, add to bgworkerFreeProcs list */
dlist_push_tail(&ProcGlobal->bgworkerFreeProcs, &proc->links);
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 0805398e24..c05653262f 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -577,7 +577,7 @@ InitializeMaxBackends(void)
Assert(MaxBackends == 0);
/* the extra unit accounts for the autovacuum launcher */
- MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
+ MaxBackends = MaxConnections + autovacuum_max_worker_slots + 1 +
max_worker_processes + max_wal_senders;
/* internal error because the values were all checked previously */
@@ -591,17 +591,17 @@ InitializeMaxBackends(void)
bool
check_max_connections(int *newval, void **extra, GucSource source)
{
- if (*newval + autovacuum_max_workers + 1 +
+ if (*newval + autovacuum_max_worker_slots + 1 +
max_worker_processes + max_wal_senders > MAX_BACKENDS)
return false;
return true;
}
/*
- * GUC check_hook for autovacuum_max_workers
+ * GUC check_hook for autovacuum_max_worker_slots
*/
bool
-check_autovacuum_max_workers(int *newval, void **extra, GucSource source)
+check_autovacuum_max_worker_slots(int *newval, void **extra, GucSource source)
{
if (MaxConnections + *newval + 1 +
max_worker_processes + max_wal_senders > MAX_BACKENDS)
@@ -615,7 +615,7 @@ check_autovacuum_max_workers(int *newval, void **extra, GucSource source)
bool
check_max_worker_processes(int *newval, void **extra, GucSource source)
{
- if (MaxConnections + autovacuum_max_workers + 1 +
+ if (MaxConnections + autovacuum_max_worker_slots + 1 +
*newval + max_wal_senders > MAX_BACKENDS)
return false;
return true;
@@ -627,7 +627,7 @@ check_max_worker_processes(int *newval, void **extra, GucSource source)
bool
check_max_wal_senders(int *newval, void **extra, GucSource source)
{
- if (MaxConnections + autovacuum_max_workers + 1 +
+ if (MaxConnections + autovacuum_max_worker_slots + 1 +
max_worker_processes + *newval > MAX_BACKENDS)
return false;
return true;
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index c68fdc008b..92dea7061a 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3402,13 +3402,13 @@ struct config_int ConfigureNamesInt[] =
},
{
/* see max_connections */
- {"autovacuum_max_workers", PGC_POSTMASTER, AUTOVACUUM,
+ {"autovacuum_max_worker_slots", PGC_POSTMASTER, AUTOVACUUM,
gettext_noop("Sets the maximum number of simultaneously running autovacuum worker processes."),
NULL
},
- &autovacuum_max_workers,
+ &autovacuum_max_worker_slots,
3, 1, MAX_BACKENDS,
- check_autovacuum_max_workers, NULL, NULL
+ check_autovacuum_max_worker_slots, NULL, NULL
},
{
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 2166ea4a87..c37767cecf 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -658,7 +658,7 @@
#autovacuum = on # Enable autovacuum subprocess? 'on'
# requires track_counts to also be on.
-#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_worker_slots = 3 # max number of autovacuum subprocesses
# (change requires restart)
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index cae1e8b329..754d04485d 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -28,7 +28,7 @@ typedef enum
/* GUC variables */
extern PGDLLIMPORT bool autovacuum_start_daemon;
-extern PGDLLIMPORT int autovacuum_max_workers;
+extern PGDLLIMPORT int autovacuum_max_worker_slots;
extern PGDLLIMPORT int autovacuum_work_mem;
extern PGDLLIMPORT int autovacuum_naptime;
extern PGDLLIMPORT int autovacuum_vac_thresh;
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index d64dc5fcdb..22d4c50bc6 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -29,8 +29,8 @@ extern bool check_application_name(char **newval, void **extra,
GucSource source);
extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
-extern bool check_autovacuum_max_workers(int *newval, void **extra,
- GucSource source);
+extern bool check_autovacuum_max_worker_slots(int *newval, void **extra,
+ GucSource source);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
diff --git a/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
index 37550b67a4..f9cdd50c19 100644
--- a/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
+++ b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
@@ -21,7 +21,7 @@ $node->append_conf(
autovacuum = off # run autovacuum only when to anti wraparound
autovacuum_naptime = 1s
# so it's easier to verify the order of operations
-autovacuum_max_workers = 1
+autovacuum_max_worker_slots = 1
log_autovacuum_min_duration = 0
]);
$node->start;
diff --git a/src/test/modules/xid_wraparound/t/003_wraparounds.pl b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
index 88063b4b52..99f76229d5 100644
--- a/src/test/modules/xid_wraparound/t/003_wraparounds.pl
+++ b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
@@ -24,7 +24,7 @@ $node->append_conf(
autovacuum = off # run autovacuum only when to anti wraparound
autovacuum_naptime = 1s
# so it's easier to verify the order of operations
-autovacuum_max_workers = 1
+autovacuum_max_worker_slots = 1
log_autovacuum_min_duration = 0
]);
$node->start;
--
2.25.1
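For anyone tuning kernel parameters against the renamed GUC, the SEMMNI/SEMMNS formulas in the runtime.sgml hunk above reduce to a small calculation. A quick sketch, with illustrative parameter values (not values this patch sets):

```python
import math

def sysv_semaphore_needs(max_connections, autovacuum_max_worker_slots,
                         max_wal_senders, max_worker_processes):
    """Compute minimum SEMMNI/SEMMNS per the runtime.sgml formulas,
    with autovacuum_max_worker_slots in place of the old GUC name."""
    backends = (max_connections + autovacuum_max_worker_slots +
                max_wal_senders + max_worker_processes + 5)
    semmni = math.ceil(backends / 16)   # semaphore sets of 16
    semmns = semmni * 17                # each set also has a 17th "magic" semaphore
    return semmni, semmns

# e.g. 100 connections, 32 worker slots, 10 WAL senders, 8 bgworkers:
print(sysv_semaphore_needs(100, 32, 10, 8))  # → (10, 170)
```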
v2-0002-Convert-autovacuum-s-free-workers-list-to-a-dclis.patch (text/x-diff)
From 6ba708a66dfaf964f8330a81417e3efdfeef9c92 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Sat, 13 Apr 2024 21:48:53 -0500
Subject: [PATCH v2 2/4] Convert autovacuum's free workers list to a dclist.
---
src/backend/postmaster/autovacuum.c | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index af3d1e218e..e925eff1e4 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -289,7 +289,7 @@ typedef struct
{
sig_atomic_t av_signal[AutoVacNumSignals];
pid_t av_launcherpid;
- dlist_head av_freeWorkers;
+ dclist_head av_freeWorkers;
dlist_head av_runningWorkers;
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
@@ -575,7 +575,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
* wakening conditions.
*/
- launcher_determine_sleep(!dlist_is_empty(&AutoVacuumShmem->av_freeWorkers),
+ launcher_determine_sleep(!dclist_is_empty(&AutoVacuumShmem->av_freeWorkers),
false, &nap);
/*
@@ -636,7 +636,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
current_time = GetCurrentTimestamp();
LWLockAcquire(AutovacuumLock, LW_SHARED);
- can_launch = !dlist_is_empty(&AutoVacuumShmem->av_freeWorkers);
+ can_launch = !dclist_is_empty(&AutoVacuumShmem->av_freeWorkers);
if (AutoVacuumShmem->av_startingWorker != NULL)
{
@@ -679,8 +679,8 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
worker->wi_sharedrel = false;
worker->wi_proc = NULL;
worker->wi_launchtime = 0;
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker->wi_links);
AutoVacuumShmem->av_startingWorker = NULL;
ereport(WARNING,
errmsg("autovacuum worker took too long to start; canceled"));
@@ -1087,7 +1087,7 @@ do_start_worker(void)
/* return quickly when there are no free workers */
LWLockAcquire(AutovacuumLock, LW_SHARED);
- if (dlist_is_empty(&AutoVacuumShmem->av_freeWorkers))
+ if (dclist_is_empty(&AutoVacuumShmem->av_freeWorkers))
{
LWLockRelease(AutovacuumLock);
return InvalidOid;
@@ -1240,7 +1240,7 @@ do_start_worker(void)
* Get a worker entry from the freelist. We checked above, so there
* really should be a free slot.
*/
- wptr = dlist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
+ wptr = dclist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
worker = dlist_container(WorkerInfoData, wi_links, wptr);
worker->wi_dboid = avdb->adw_datid;
@@ -1609,8 +1609,8 @@ FreeWorkerInfo(int code, Datum arg)
MyWorkerInfo->wi_proc = NULL;
MyWorkerInfo->wi_launchtime = 0;
pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &MyWorkerInfo->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &MyWorkerInfo->wi_links);
/* not mine anymore */
MyWorkerInfo = NULL;
@@ -3289,7 +3289,7 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
- dlist_init(&AutoVacuumShmem->av_freeWorkers);
+ dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
memset(AutoVacuumShmem->av_workItems, 0,
@@ -3301,8 +3301,8 @@ AutoVacuumShmemInit(void)
/* initialize the WorkerInfo free list */
for (i = 0; i < autovacuum_max_worker_slots; i++)
{
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker[i].wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker[i].wi_links);
pg_atomic_init_flag(&worker[i].wi_dobalance);
}
--
2.25.1
v2-0003-Move-free-autovacuum-worker-checks-to-a-helper-fu.patch (text/x-diff)
From 6ecfb0aebb0b1d0e7d2d9f8b65038ae8c2c98f88 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Sun, 14 Apr 2024 09:04:01 -0500
Subject: [PATCH v2 3/4] Move free autovacuum worker checks to a helper
function.
---
src/backend/postmaster/autovacuum.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index e925eff1e4..f80365faff 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -347,6 +347,7 @@ static void autovac_report_activity(autovac_table *tab);
static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
const char *nspname, const char *relname);
static void avl_sigusr2_handler(SIGNAL_ARGS);
+static bool av_worker_available(void);
@@ -575,8 +576,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
* wakening conditions.
*/
- launcher_determine_sleep(!dclist_is_empty(&AutoVacuumShmem->av_freeWorkers),
- false, &nap);
+ launcher_determine_sleep(av_worker_available(), false, &nap);
/*
* Wait until naptime expires or we get some type of signal (all the
@@ -636,7 +636,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
current_time = GetCurrentTimestamp();
LWLockAcquire(AutovacuumLock, LW_SHARED);
- can_launch = !dclist_is_empty(&AutoVacuumShmem->av_freeWorkers);
+ can_launch = av_worker_available();
if (AutoVacuumShmem->av_startingWorker != NULL)
{
@@ -1087,7 +1087,7 @@ do_start_worker(void)
/* return quickly when there are no free workers */
LWLockAcquire(AutovacuumLock, LW_SHARED);
- if (dclist_is_empty(&AutoVacuumShmem->av_freeWorkers))
+ if (!av_worker_available())
{
LWLockRelease(AutovacuumLock);
return InvalidOid;
@@ -3338,3 +3338,14 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+
+/*
+ * Returns whether there is a free autovacuum worker slot available.
+ */
+static bool
+av_worker_available(void)
+{
+ const dclist_head *freelist = &AutoVacuumShmem->av_freeWorkers;
+
+ return dclist_count(freelist) > 0;
+}
--
2.25.1
v2-0004-Reintroduce-autovacuum_max_workers-as-a-PGC_SIGHU.patch (text/x-diff)
From 915e1594439328d47a04096c9811dd43a4526efe Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Sat, 13 Apr 2024 21:42:33 -0500
Subject: [PATCH v2 4/4] Reintroduce autovacuum_max_workers as a PGC_SIGHUP
parameter.
---
doc/src/sgml/config.sgml | 24 +++++++++++++++++--
doc/src/sgml/maintenance.sgml | 4 ++--
src/backend/postmaster/autovacuum.c | 4 +++-
src/backend/utils/misc/guc_tables.c | 15 +++++++++---
src/backend/utils/misc/postgresql.conf.sample | 3 ++-
src/include/postmaster/autovacuum.h | 1 +
.../xid_wraparound/t/001_emergency_vacuum.pl | 2 +-
.../xid_wraparound/t/003_wraparounds.pl | 2 +-
8 files changed, 44 insertions(+), 11 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b4d67a93b6..0f022a8056 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1914,7 +1914,7 @@ include_dir 'conf.d'
</para>
<para>
Note that when autovacuum runs, up to
- <xref linkend="guc-autovacuum-max-worker-slots"/> times this memory
+ <xref linkend="guc-autovacuum-max-workers"/> times this memory
may be allocated, so be careful not to set the default value
too high. It may be useful to control for this by separately
setting <xref linkend="guc-autovacuum-work-mem"/>.
@@ -8540,11 +8540,31 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
<primary><varname>autovacuum_max_worker_slots</varname> configuration parameter</primary>
</indexterm>
</term>
+ <listitem>
+ <para>
+ Specifies the number of backend slots to reserve for autovacuum worker
+ processes. The default is 32. This parameter can only be set at server
+ start.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry id="guc-autovacuum-max-workers" xreflabel="autovacuum_max_workers">
+ <term><varname>autovacuum_max_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_workers</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
<listitem>
<para>
Specifies the maximum number of autovacuum processes (other than the
autovacuum launcher) that may be running at any one time. The default
- is three. This parameter can only be set at server start.
+ is three. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command line.
+ </para>
+ <para>
+ Note that this value is silently capped to the value of
+ <xref linkend="guc-autovacuum-max-worker-slots"/>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 5373acba41..0be90bdc7e 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -864,9 +864,9 @@ HINT: Execute a database-wide VACUUM in that database.
seconds. (Therefore, if the installation has <replaceable>N</replaceable> databases,
a new worker will be launched every
<varname>autovacuum_naptime</varname>/<replaceable>N</replaceable> seconds.)
- A maximum of <xref linkend="guc-autovacuum-max-worker-slots"/> worker processes
+ A maximum of <xref linkend="guc-autovacuum-max-workers"/> worker processes
are allowed to run at the same time. If there are more than
- <varname>autovacuum_max_worker_slots</varname> databases to be processed,
+ <varname>autovacuum_max_workers</varname> databases to be processed,
the next database will be processed as soon as the first worker finishes.
Each worker process will check each table within its database and
execute <command>VACUUM</command> and/or <command>ANALYZE</command> as needed.
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f80365faff..ed7e2b462f 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -115,6 +115,7 @@
*/
bool autovacuum_start_daemon = false;
int autovacuum_max_worker_slots;
+int autovacuum_max_workers;
int autovacuum_work_mem = -1;
int autovacuum_naptime;
int autovacuum_vac_thresh;
@@ -3346,6 +3347,7 @@ static bool
av_worker_available(void)
{
const dclist_head *freelist = &AutoVacuumShmem->av_freeWorkers;
+ int reserved_slots = autovacuum_max_worker_slots - autovacuum_max_workers;
- return dclist_count(freelist) > 0;
+ return dclist_count(freelist) > Max(0, reserved_slots);
}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 92dea7061a..92d4d10fe9 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3403,13 +3403,22 @@ struct config_int ConfigureNamesInt[] =
{
/* see max_connections */
{"autovacuum_max_worker_slots", PGC_POSTMASTER, AUTOVACUUM,
- gettext_noop("Sets the maximum number of simultaneously running autovacuum worker processes."),
- NULL
+ gettext_noop("Sets the number of backend slots to allocate for autovacuum workers."),
+ gettext_noop("autovacuum_max_workers is silently capped to this value.")
},
&autovacuum_max_worker_slots,
- 3, 1, MAX_BACKENDS,
+ 32, 1, MAX_BACKENDS,
check_autovacuum_max_worker_slots, NULL, NULL
},
+ {
+ {"autovacuum_max_workers", PGC_SIGHUP, AUTOVACUUM,
+ gettext_noop("Sets the maximum number of simultaneously running autovacuum worker processes."),
+ gettext_noop("This value is silently capped to autovacuum_max_worker_slots.")
+ },
+ &autovacuum_max_workers,
+ 3, 1, MAX_BACKENDS,
+ NULL, NULL, NULL
+ },
{
{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_ASYNCHRONOUS,
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index c37767cecf..c46d245153 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -658,8 +658,9 @@
#autovacuum = on # Enable autovacuum subprocess? 'on'
# requires track_counts to also be on.
-#autovacuum_max_worker_slots = 3 # max number of autovacuum subprocesses
+autovacuum_max_worker_slots = 32 # autovacuum worker slots to allocate
# (change requires restart)
+#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 754d04485d..598782fd34 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -29,6 +29,7 @@ typedef enum
/* GUC variables */
extern PGDLLIMPORT bool autovacuum_start_daemon;
extern PGDLLIMPORT int autovacuum_max_worker_slots;
+extern PGDLLIMPORT int autovacuum_max_workers;
extern PGDLLIMPORT int autovacuum_work_mem;
extern PGDLLIMPORT int autovacuum_naptime;
extern PGDLLIMPORT int autovacuum_vac_thresh;
diff --git a/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
index f9cdd50c19..37550b67a4 100644
--- a/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
+++ b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
@@ -21,7 +21,7 @@ $node->append_conf(
autovacuum = off # run autovacuum only when to anti wraparound
autovacuum_naptime = 1s
# so it's easier to verify the order of operations
-autovacuum_max_worker_slots = 1
+autovacuum_max_workers = 1
log_autovacuum_min_duration = 0
]);
$node->start;
diff --git a/src/test/modules/xid_wraparound/t/003_wraparounds.pl b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
index 99f76229d5..88063b4b52 100644
--- a/src/test/modules/xid_wraparound/t/003_wraparounds.pl
+++ b/src/test/modules/xid_wraparound/t/003_wraparounds.pl
@@ -24,7 +24,7 @@ $node->append_conf(
autovacuum = off # run autovacuum only when to anti wraparound
autovacuum_naptime = 1s
# so it's easier to verify the order of operations
-autovacuum_max_worker_slots = 1
+autovacuum_max_workers = 1
log_autovacuum_min_duration = 0
]);
$node->start;
--
2.25.1
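The revised av_worker_available() in 0004 is the crux of the capping scheme: a launch is permitted only while enough slots remain free that the running-worker count stays within autovacuum_max_workers. A rough sketch of that arithmetic (simplified from the C above; free_slot_count stands in for the dclist_count of the freelist):

```python
def av_worker_available(free_slot_count, autovacuum_max_worker_slots,
                        autovacuum_max_workers):
    """Mirror of the patched check: reserve (slots - max_workers) slots
    that may never be handed out, so at most autovacuum_max_workers
    workers run at once.  A max_workers above the slot count is
    effectively capped at the slot count (reserved goes to zero)."""
    reserved = autovacuum_max_worker_slots - autovacuum_max_workers
    return free_slot_count > max(0, reserved)

# With 32 slots and autovacuum_max_workers = 3, a launch is allowed only
# while at least 30 slots are free, i.e. while 2 or fewer workers run.
assert av_worker_available(30, 32, 3) is True
assert av_worker_available(29, 32, 3) is False
# The "weird case" from upthread: max_workers > slots degrades to the
# plain free-slot check.
assert av_worker_available(1, 32, 64) is True
```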
On Fri, Apr 19, 2024 at 11:43 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:
Removed in v2. I also noticed that I forgot to update the part about when
autovacuum_max_workers can be changed. *facepalm*
I think this could help a bunch of users, but I'd still like to
complain, not so much with the desire to kill this patch as with the
desire to broaden the conversation.
Part of the underlying problem here is that, AFAIK, neither PostgreSQL
as a piece of software nor we as human beings who operate PostgreSQL
databases have much understanding of how autovacuum_max_workers should
be set. It's relatively easy to hose yourself by raising
autovacuum_max_workers to try to make things go faster, only to produce
the exact opposite effect due to how the cost balancing works.
But, even if you have the correct use case for autovacuum_max_workers,
something like a few large tables that take a long time to vacuum plus
a bunch of smaller ones that can't get starved just because the big
tables are in the midst of being processed, you might well ask
yourself why it's your job to figure out the correct number of
workers.
Now, before this patch, there is a fairly good reason for that, which
is that we need to reserve shared memory resources for each autovacuum
worker that might potentially run, and the system can't know how much
shared memory you'd like to reserve for that purpose. But if that were
the only problem, then this patch would probably just be proposing to
crank up the default value of that parameter rather than introducing a
second one. I bet Nathan isn't proposing that because his intuition is
that it will work out badly, and I think he's right. I bet that
cranking up the number of allowed workers will often result in running
more workers than we really should. One possible negative consequence
is that we'll end up with multiple processes fighting over the disk in
a situation where they should just take turns. I suspect there are
also ways that we can be harmed - in broadly similar fashion - by cost
balancing.
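To make the cost-balancing hazard concrete: the active workers divide one global cost budget among themselves, so launching more workers slows each one down rather than adding aggregate throughput. A simplified sketch (the real balancing also weighs per-table cost settings):

```python
def per_worker_cost_limit(autovacuum_vacuum_cost_limit, active_workers):
    """Roughly, autovacuum splits one global cost budget across all
    active workers; each worker's share shrinks as more launch."""
    return autovacuum_vacuum_cost_limit / max(1, active_workers)

# Doubling the workers halves each worker's budget: total I/O pacing is
# unchanged, so extra workers don't vacuum faster overall.
assert per_worker_cost_limit(200, 2) == 100
assert per_worker_cost_limit(200, 4) == 50
```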
So I feel like what this proposal reveals is that we know that our
algorithm for ramping up the number of running workers doesn't really
work. And maybe that's just a consequence of the general problem that
we have no global information about how much vacuuming work there is
to be done at any given time, and therefore we cannot take any kind of
sensible guess about whether 1 more worker will help or hurt. Or,
maybe there's some way to do better than what we do today without a
big rewrite. I'm not sure. I don't think this patch should be burdened
with solving the general problem here. But I do think the general
problem is worth some discussion.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Fri, Apr 19, 2024 at 02:42:13PM -0400, Robert Haas wrote:
I think this could help a bunch of users, but I'd still like to
complain, not so much with the desire to kill this patch as with the
desire to broaden the conversation.
I think I subconsciously hoped this would spark a bigger discussion...
Now, before this patch, there is a fairly good reason for that, which
is that we need to reserve shared memory resources for each autovacuum
worker that might potentially run, and the system can't know how much
shared memory you'd like to reserve for that purpose. But if that were
the only problem, then this patch would probably just be proposing to
crank up the default value of that parameter rather than introducing a
second one. I bet Nathan isn't proposing that because his intuition is
that it will work out badly, and I think he's right. I bet that
cranking up the number of allowed workers will often result in running
more workers than we really should. One possible negative consequence
is that we'll end up with multiple processes fighting over the disk in
a situation where they should just take turns. I suspect there are
also ways that we can be harmed - in broadly similar fashion - by cost
balancing.
Even if we were content to bump up the default value of
autovacuum_max_workers and tell folks to just mess with the cost settings,
there are still probably many cases where bumping up the number of workers
further would be necessary. If you have a zillion tables, turning
cost-based vacuuming off completely may be insufficient to keep up, at
which point your options become limited. It can be difficult to tell
whether you might end up in this situation over time as your workload
evolves. In any case, it's not clear to me that bumping up the default
value of autovacuum_max_workers would do more good than harm. I get the
idea that the default of 3 is sufficient for a lot of clusters, so there'd
really be little upside to changing it AFAICT. (I guess this proves your
point about my intuition.)
So I feel like what this proposal reveals is that we know that our
algorithm for ramping up the number of running workers doesn't really
work. And maybe that's just a consequence of the general problem that
we have no global information about how much vacuuming work there is
to be done at any given time, and therefore we cannot take any kind of
sensible guess about whether 1 more worker will help or hurt. Or,
maybe there's some way to do better than what we do today without a
big rewrite. I'm not sure. I don't think this patch should be burdened
with solving the general problem here. But I do think the general
problem is worth some discussion.
I certainly don't want to hold up $SUBJECT for a larger rewrite of
autovacuum scheduling, but I also don't want to shy away from a larger
rewrite if it's an idea whose time has come. I'm looking forward to
hearing your ideas in your pgconf.dev talk.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Part of the underlying problem here is that, AFAIK, neither PostgreSQL
as a piece of software nor we as human beings who operate PostgreSQL
databases have much understanding of how autovacuum_max_workers should
be set. It's relatively easy to hose yourself by raising
autovacuum_max_workers to try to make things go faster, but produce
the exact opposite effect due to how the cost balancing stuff works.
Yeah, this patch will not fix this problem. Anyone who raises
autovacuum_max_workers should think about adjusting autovacuum_vacuum_cost_delay
as well. This was discussed up the thread [4], and even without this patch, I
think it's necessary to add more documentation on the relationship between
workers and cost.

[4] /messages/by-id/20240419154322.GA3988554@nathanxps13
So I feel like what this proposal reveals is that we know that our
algorithm for ramping up the number of running workers doesn't really
work. And maybe that's just a consequence of the general problem that
we have no global information about how much vacuuming work there is
to be done at any given time, and therefore we cannot take any kind of
sensible guess about whether 1 more worker will help or hurt. Or,
maybe there's some way to do better than what we do today without a
big rewrite. I'm not sure. I don't think this patch should be burdened
with solving the general problem here. But I do think the general
problem is worth some discussion.
This patch is only solving the operational problem of adjusting
autovacuum_max_workers, and it does so without introducing complexity.
A proposal that relieves users of the burden of having to think about
autovacuum_max_workers, cost_delay, and cost_limit settings would be great.
This patch may be the basis for such dynamic "auto-tuning" of autovacuum workers.
Regards,
Sami
On Fri, Apr 19, 2024 at 4:29 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
I certainly don't want to hold up $SUBJECT for a larger rewrite of
autovacuum scheduling, but I also don't want to shy away from a larger
rewrite if it's an idea whose time has come. I'm looking forward to
hearing your ideas in your pgconf.dev talk.
Yeah, I suppose I was hoping you were going to tell me the all the
answers and thus make the talk a lot easier to write, but I guess life
isn't that simple. :-)
--
Robert Haas
EDB: http://www.enterprisedb.com
On Mon, Apr 15, 2024 at 05:41:04PM +0000, Imseih (AWS), Sami wrote:
Another option could be to just remove the restart-only GUC and hard-code
the upper limit of autovacuum_max_workers to 64 or 128 or something. While
that would simplify matters, I suspect it would be hard to choose an
appropriate limit that won't quickly become outdated.
Hardcoded values are usually hard to deal with because they are hidden either
in code or in docs.
That's true, but using a hard-coded limit means we no longer need to add a
new GUC. Always allocating, say, 256 slots might require a few additional
kilobytes of shared memory, most of which will go unused, but that seems
unlikely to be a problem for the systems that will run Postgres v18.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
That's true, but using a hard-coded limit means we no longer need to add a
new GUC. Always allocating, say, 256 slots might require a few additional
kilobytes of shared memory, most of which will go unused, but that seems
unlikely to be a problem for the systems that will run Postgres v18.
I agree with this.
Regards,
Sami
On Fri, May 03, 2024 at 12:57:18PM +0000, Imseih (AWS), Sami wrote:
That's true, but using a hard-coded limit means we no longer need to add a
new GUC. Always allocating, say, 256 slots might require a few additional
kilobytes of shared memory, most of which will go unused, but that seems
unlikely to be a problem for the systems that will run Postgres v18.
I agree with this.
Here's what this might look like. I chose an upper limit of 1024, which
seems like it "ought to be enough for anybody," at least for now.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Attachments:
v3-0001-allow-changing-autovacuum_max_workers-without-res.patch (text/x-diff)
From 72e0496294ef0390c77cef8031ae51c1a44ebde8 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Tue, 7 May 2024 10:59:24 -0500
Subject: [PATCH v3 1/1] allow changing autovacuum_max_workers without
restarting
---
doc/src/sgml/config.sgml | 3 +-
doc/src/sgml/runtime.sgml | 15 ++++---
src/backend/access/transam/xlog.c | 2 +-
src/backend/postmaster/autovacuum.c | 44 ++++++++++++-------
src/backend/postmaster/postmaster.c | 2 +-
src/backend/storage/lmgr/proc.c | 9 ++--
src/backend/utils/init/postinit.c | 20 ++-------
src/backend/utils/misc/guc_tables.c | 7 ++-
src/backend/utils/misc/postgresql.conf.sample | 1 -
src/include/postmaster/autovacuum.h | 8 ++++
src/include/utils/guc_hooks.h | 2 -
11 files changed, 58 insertions(+), 55 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e93208b2e6..8e2a1d6902 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8528,7 +8528,8 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
<para>
Specifies the maximum number of autovacuum processes (other than the
autovacuum launcher) that may be running at any one time. The default
- is three. This parameter can only be set at server start.
+ is three. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command line.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 6047b8171d..8a672a8383 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -781,13 +781,13 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
<row>
<entry><varname>SEMMNI</varname></entry>
<entry>Maximum number of semaphore identifiers (i.e., sets)</entry>
- <entry>at least <literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16)</literal> plus room for other applications</entry>
+ <entry>at least <literal>ceil((max_connections + max_wal_senders + max_worker_processes + 1029) / 16)</literal> plus room for other applications</entry>
</row>
<row>
<entry><varname>SEMMNS</varname></entry>
<entry>Maximum number of semaphores system-wide</entry>
- <entry><literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16) * 17</literal> plus room for other applications</entry>
+ <entry><literal>ceil((max_connections + max_wal_senders + max_worker_processes + 1029) / 16) * 17</literal> plus room for other applications</entry>
</row>
<row>
@@ -838,7 +838,7 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
When using System V semaphores,
<productname>PostgreSQL</productname> uses one semaphore per allowed connection
(<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>) and allowed background
+ (1024) and allowed background
process (<xref linkend="guc-max-worker-processes"/>), in sets of 16.
Each such set will
also contain a 17th semaphore which contains a <quote>magic
@@ -846,13 +846,14 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
other applications. The maximum number of semaphores in the system
is set by <varname>SEMMNS</varname>, which consequently must be at least
as high as <varname>max_connections</varname> plus
- <varname>autovacuum_max_workers</varname> plus <varname>max_wal_senders</varname>,
- plus <varname>max_worker_processes</varname>, plus one extra for each 16
+ <varname>max_wal_senders</varname>,
+ plus <varname>max_worker_processes</varname>, plus 1024 for autovacuum
+ worker processes, plus one extra for each 16
allowed connections plus workers (see the formula in <xref
linkend="sysvipc-parameters"/>). The parameter <varname>SEMMNI</varname>
determines the limit on the number of semaphore sets that can
exist on the system at one time. Hence this parameter must be at
- least <literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16)</literal>.
+ least <literal>ceil((max_connections + max_wal_senders + max_worker_processes + 1029) / 16)</literal>.
Lowering the number
of allowed connections is a temporary workaround for failures,
which are usually confusingly worded <quote>No space
@@ -883,7 +884,7 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
When using POSIX semaphores, the number of semaphores needed is the
same as for System V, that is one semaphore per allowed connection
(<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>) and allowed background
+ (1024) and allowed background
process (<xref linkend="guc-max-worker-processes"/>).
On the platforms where this option is preferred, there is no specific
kernel limit on the number of POSIX semaphores.
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index c3fd9c1eae..5b0312b2a6 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5362,7 +5362,7 @@ CheckRequiredParameterValues(void)
*/
if (ArchiveRecoveryRequested && EnableHotStandby)
{
- /* We ignore autovacuum_max_workers when we make this test. */
+ /* We ignore autovacuum workers when we make this test. */
RecoveryRequiresIntParameter("max_connections",
MaxConnections,
ControlFile->MaxConnections);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 9a925a10cd..1ab97b9903 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -208,8 +208,7 @@ typedef struct autovac_table
/*-------------
* This struct holds information about a single worker's whereabouts. We keep
- * an array of these in shared memory, sized according to
- * autovacuum_max_workers.
+ * an array of these in shared memory.
*
* wi_links entry into free list or running list
* wi_dboid OID of the database this worker is supposed to work on
@@ -289,7 +288,7 @@ typedef struct
{
sig_atomic_t av_signal[AutoVacNumSignals];
pid_t av_launcherpid;
- dlist_head av_freeWorkers;
+ dclist_head av_freeWorkers;
dlist_head av_runningWorkers;
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
@@ -347,6 +346,7 @@ static void autovac_report_activity(autovac_table *tab);
static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
const char *nspname, const char *relname);
static void avl_sigusr2_handler(SIGNAL_ARGS);
+static bool av_worker_available(void);
@@ -575,8 +575,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
* wakening conditions.
*/
- launcher_determine_sleep(!dlist_is_empty(&AutoVacuumShmem->av_freeWorkers),
- false, &nap);
+ launcher_determine_sleep(av_worker_available(), false, &nap);
/*
* Wait until naptime expires or we get some type of signal (all the
@@ -636,7 +635,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
current_time = GetCurrentTimestamp();
LWLockAcquire(AutovacuumLock, LW_SHARED);
- can_launch = !dlist_is_empty(&AutoVacuumShmem->av_freeWorkers);
+ can_launch = av_worker_available();
if (AutoVacuumShmem->av_startingWorker != NULL)
{
@@ -679,8 +678,8 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
worker->wi_sharedrel = false;
worker->wi_proc = NULL;
worker->wi_launchtime = 0;
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker->wi_links);
AutoVacuumShmem->av_startingWorker = NULL;
ereport(WARNING,
errmsg("autovacuum worker took too long to start; canceled"));
@@ -1087,7 +1086,7 @@ do_start_worker(void)
/* return quickly when there are no free workers */
LWLockAcquire(AutovacuumLock, LW_SHARED);
- if (dlist_is_empty(&AutoVacuumShmem->av_freeWorkers))
+ if (!av_worker_available())
{
LWLockRelease(AutovacuumLock);
return InvalidOid;
@@ -1240,7 +1239,7 @@ do_start_worker(void)
* Get a worker entry from the freelist. We checked above, so there
* really should be a free slot.
*/
- wptr = dlist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
+ wptr = dclist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
worker = dlist_container(WorkerInfoData, wi_links, wptr);
worker->wi_dboid = avdb->adw_datid;
@@ -1609,8 +1608,8 @@ FreeWorkerInfo(int code, Datum arg)
MyWorkerInfo->wi_proc = NULL;
MyWorkerInfo->wi_launchtime = 0;
pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &MyWorkerInfo->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &MyWorkerInfo->wi_links);
/* not mine anymore */
MyWorkerInfo = NULL;
@@ -3265,7 +3264,7 @@ AutoVacuumShmemSize(void)
*/
size = sizeof(AutoVacuumShmemStruct);
size = MAXALIGN(size);
- size = add_size(size, mul_size(autovacuum_max_workers,
+ size = add_size(size, mul_size(AUTOVAC_MAX_WORKER_SLOTS,
sizeof(WorkerInfoData)));
return size;
}
@@ -3292,7 +3291,7 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
- dlist_init(&AutoVacuumShmem->av_freeWorkers);
+ dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
memset(AutoVacuumShmem->av_workItems, 0,
@@ -3302,10 +3301,10 @@ AutoVacuumShmemInit(void)
MAXALIGN(sizeof(AutoVacuumShmemStruct)));
/* initialize the WorkerInfo free list */
- for (i = 0; i < autovacuum_max_workers; i++)
+ for (i = 0; i < AUTOVAC_MAX_WORKER_SLOTS; i++)
{
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker[i].wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker[i].wi_links);
pg_atomic_init_flag(&worker[i].wi_dobalance);
}
@@ -3341,3 +3340,14 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+
+/*
+ * Returns whether there is a free autovacuum worker slot available.
+ */
+static bool
+av_worker_available(void)
+{
+ int reserved_slots = AUTOVAC_MAX_WORKER_SLOTS - autovacuum_max_workers;
+
+ return dclist_count(&AutoVacuumShmem->av_freeWorkers) > reserved_slots;
+}
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 7f3170a8f0..cab8a8a5f4 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -4144,7 +4144,7 @@ CreateOptsFile(int argc, char *argv[], char *fullprogname)
int
MaxLivePostmasterChildren(void)
{
- return 2 * (MaxConnections + autovacuum_max_workers + 1 +
+ return 2 * (MaxConnections + AUTOVAC_MAX_WORKER_SLOTS + 1 +
max_wal_senders + max_worker_processes);
}
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index e4f256c63c..fdf7373232 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -141,9 +141,8 @@ ProcGlobalSemas(void)
* running out when trying to start another backend is a common failure.
* So, now we grab enough semaphores to support the desired max number
* of backends immediately at initialization --- if the sysadmin has set
- * MaxConnections, max_worker_processes, max_wal_senders, or
- * autovacuum_max_workers higher than his kernel will support, he'll
- * find out sooner rather than later.
+ * MaxConnections, max_worker_processes, or max_wal_senders higher than
+ * his kernel will support, he'll find out sooner rather than later.
*
* Another reason for creating semaphores here is that the semaphore
* implementation typically requires us to create semaphores in the
@@ -242,13 +241,13 @@ InitProcGlobal(void)
dlist_push_tail(&ProcGlobal->freeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->freeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + 1)
+ else if (i < MaxConnections + AUTOVAC_MAX_WORKER_SLOTS + 1)
{
/* PGPROC for AV launcher/worker, add to autovacFreeProcs list */
dlist_push_tail(&ProcGlobal->autovacFreeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->autovacFreeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + 1 + max_worker_processes)
+ else if (i < MaxConnections + AUTOVAC_MAX_WORKER_SLOTS + 1 + max_worker_processes)
{
/* PGPROC for bgworker, add to bgworkerFreeProcs list */
dlist_push_tail(&ProcGlobal->bgworkerFreeProcs, &proc->links);
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 0805398e24..684b22e3a5 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -577,7 +577,7 @@ InitializeMaxBackends(void)
Assert(MaxBackends == 0);
/* the extra unit accounts for the autovacuum launcher */
- MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
+ MaxBackends = MaxConnections + AUTOVAC_MAX_WORKER_SLOTS + 1 +
max_worker_processes + max_wal_senders;
/* internal error because the values were all checked previously */
@@ -591,19 +591,7 @@ InitializeMaxBackends(void)
bool
check_max_connections(int *newval, void **extra, GucSource source)
{
- if (*newval + autovacuum_max_workers + 1 +
- max_worker_processes + max_wal_senders > MAX_BACKENDS)
- return false;
- return true;
-}
-
-/*
- * GUC check_hook for autovacuum_max_workers
- */
-bool
-check_autovacuum_max_workers(int *newval, void **extra, GucSource source)
-{
- if (MaxConnections + *newval + 1 +
+ if (*newval + AUTOVAC_MAX_WORKER_SLOTS + 1 +
max_worker_processes + max_wal_senders > MAX_BACKENDS)
return false;
return true;
@@ -615,7 +603,7 @@ check_autovacuum_max_workers(int *newval, void **extra, GucSource source)
bool
check_max_worker_processes(int *newval, void **extra, GucSource source)
{
- if (MaxConnections + autovacuum_max_workers + 1 +
+ if (MaxConnections + AUTOVAC_MAX_WORKER_SLOTS + 1 +
*newval + max_wal_senders > MAX_BACKENDS)
return false;
return true;
@@ -627,7 +615,7 @@ check_max_worker_processes(int *newval, void **extra, GucSource source)
bool
check_max_wal_senders(int *newval, void **extra, GucSource source)
{
- if (MaxConnections + autovacuum_max_workers + 1 +
+ if (MaxConnections + AUTOVAC_MAX_WORKER_SLOTS + 1 +
max_worker_processes + *newval > MAX_BACKENDS)
return false;
return true;
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index ea2b0577bc..1e255fe263 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3380,14 +3380,13 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
{
- /* see max_connections */
- {"autovacuum_max_workers", PGC_POSTMASTER, AUTOVACUUM,
+ {"autovacuum_max_workers", PGC_SIGHUP, AUTOVACUUM,
gettext_noop("Sets the maximum number of simultaneously running autovacuum worker processes."),
NULL
},
&autovacuum_max_workers,
- 3, 1, MAX_BACKENDS,
- check_autovacuum_max_workers, NULL, NULL
+ 3, 1, AUTOVAC_MAX_WORKER_SLOTS,
+ NULL, NULL, NULL
},
{
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 83d5df8e46..439859e3ff 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -659,7 +659,6 @@
#autovacuum = on # Enable autovacuum subprocess? 'on'
# requires track_counts to also be on.
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
- # (change requires restart)
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index cae1e8b329..23a7a729aa 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -66,4 +66,12 @@ extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
+/*
+ * Number of autovacuum worker slots to allocate. This is the upper limit of
+ * autovacuum_max_workers.
+ *
+ * NB: This must be less than MAX_BACKENDS.
+ */
+#define AUTOVAC_MAX_WORKER_SLOTS (1024)
+
#endif /* AUTOVACUUM_H */
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index d64dc5fcdb..76346eb05e 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -29,8 +29,6 @@ extern bool check_application_name(char **newval, void **extra,
GucSource source);
extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
-extern bool check_autovacuum_max_workers(int *newval, void **extra,
- GucSource source);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
--
2.25.1
That's true, but using a hard-coded limit means we no longer need to add a
new GUC. Always allocating, say, 256 slots might require a few additional
kilobytes of shared memory, most of which will go unused, but that seems
unlikely to be a problem for the systems that will run Postgres v18.
I agree with this.
Here's what this might look like. I chose an upper limit of 1024, which
seems like it "ought to be enough for anybody," at least for now.
I thought 256 was a good enough limit. In practice, I doubt anyone will
benefit from more than a few dozen autovacuum workers.
I think 1024 is way too high to even allow.
Besides that the overall patch looks good to me, but I have
some comments on the documentation.
I don't think combining 1024 + 5 = 1029 is a good idea in docs.
Breaking down the allotment and using the name of the constant
is much more clear.
I suggest
" max_connections + max_wal_senders + max_worker_processes + AUTOVAC_MAX_WORKER_SLOTS + 5"
and in other places in the docs, we should mention the actual
value of AUTOVAC_MAX_WORKER_SLOTS. Maybe in the
below section?
Instead of:
- (<xref linkend="guc-autovacuum-max-workers"/>) and allowed background
+ (1024) and allowed background
do something like:
- (<xref linkend="guc-autovacuum-max-workers"/>) and allowed background
+ AUTOVAC_MAX_WORKER_SLOTS (1024) and allowed background
Also, replace the 1024 here with AUTOVAC_MAX_WORKER_SLOTS.
+ <varname>max_wal_senders</varname>,
+ plus <varname>max_worker_processes</varname>, plus 1024 for autovacuum
+ worker processes, plus one extra for each 16
Also, I'm not sure if I am mistaken here, but the "+ 5" in the existing docs
seems wrong.
If it refers to NUM_AUXILIARY_PROCS defined in
include/storage/proc.h, it should be a "6":
#define NUM_AUXILIARY_PROCS 6
This is not a consequence of this patch, and can be dealt with
in a separate thread if my understanding is correct.
Regards,
Sami
On Thu, May 16, 2024 at 04:37:10PM +0000, Imseih (AWS), Sami wrote:
I thought 256 was a good enough limit. In practice, I doubt anyone will
benefit from more than a few dozen autovacuum workers.
I think 1024 is way too high to even allow.
WFM
I don't think combining 1024 + 5 = 1029 is a good idea in docs.
Breaking down the allotment and using the name of the constant
is much more clear.
I suggest
" max_connections + max_wal_senders + max_worker_processes + AUTOVAC_MAX_WORKER_SLOTS + 5"
and in other places in the docs, we should mention the actual
value of AUTOVAC_MAX_WORKER_SLOTS. Maybe in the
below section?
Instead of:
- (<xref linkend="guc-autovacuum-max-workers"/>) and allowed background
+ (1024) and allowed background
do something like:
- (<xref linkend="guc-autovacuum-max-workers"/>) and allowed background
+ AUTOVAC_MAX_WORKER_SLOTS (1024) and allowed background
Also, replace the 1024 here with AUTOVAC_MAX_WORKER_SLOTS.
+ <varname>max_wal_senders</varname>,
+ plus <varname>max_worker_processes</varname>, plus 1024 for autovacuum
+ worker processes, plus one extra for each 16
Part of me wonders whether documenting the exact formula is worthwhile.
This portion of the docs is rather complicated, and I can't recall ever
having to do the arithmetic it describes. Plus, see below...
Also, I'm not sure if I am mistaken here, but the "+ 5" in the existing docs
seems wrong.
If it refers to NUM_AUXILIARY_PROCS defined in
include/storage/proc.h, it should be a "6":
#define NUM_AUXILIARY_PROCS 6
This is not a consequence of this patch, and can be dealt with
In a separate thread if my understanding is correct.
Ha, I think it should actually be "+ 7"! The value is calculated as
MaxConnections + autovacuum_max_workers + 1 + max_worker_processes + max_wal_senders + 6
Looking at the history, this documentation tends to be wrong quite often.
In v9.2, the checkpointer was introduced, and these formulas were not
updated. In v9.3, background worker processes were introduced, and the
formulas were still not updated. Finally, in v9.6, it was fixed in commit
597f7e3. Then, in v14, the archiver process was made an auxiliary process
(commit d75288f), making the formulas out-of-date again. And in v17, the
WAL summarizer was added.
On top of this, IIUC you actually need even more semaphores if your system
doesn't support atomics, and from a quick skim this doesn't seem to be
covered in this documentation.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Thu, May 16, 2024 at 09:16:46PM -0500, Nathan Bossart wrote:
On Thu, May 16, 2024 at 04:37:10PM +0000, Imseih (AWS), Sami wrote:
I thought 256 was a good enough limit. In practice, I doubt anyone will
benefit from more than a few dozen autovacuum workers.
I think 1024 is way too high to even allow.
WFM
Here is an updated patch that uses 256 as the upper limit.
I don't think combining 1024 + 5 = 1029 is a good idea in docs.
Breaking down the allotment and using the name of the constant
is much more clear.
I plan to further improve this section of the documentation in v18, so I've
left the constant unexplained for now.
--
nathan
Attachments:
v4-0001-allow-changing-autovacuum_max_workers-without-res.patch (text/plain)
From 056ad035c5d213f7ae49f5feb28229f35086430f Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Tue, 7 May 2024 10:59:24 -0500
Subject: [PATCH v4 1/1] allow changing autovacuum_max_workers without
restarting
---
doc/src/sgml/config.sgml | 3 +-
doc/src/sgml/runtime.sgml | 15 ++++---
src/backend/access/transam/xlog.c | 1 -
src/backend/postmaster/autovacuum.c | 44 ++++++++++++-------
src/backend/postmaster/postmaster.c | 2 +-
src/backend/storage/lmgr/proc.c | 9 ++--
src/backend/utils/init/postinit.c | 20 ++-------
src/backend/utils/misc/guc_tables.c | 7 ++-
src/backend/utils/misc/postgresql.conf.sample | 1 -
src/include/postmaster/autovacuum.h | 8 ++++
src/include/utils/guc_hooks.h | 2 -
11 files changed, 57 insertions(+), 55 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 698169afdb..b79a855729 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8535,7 +8535,8 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
<para>
Specifies the maximum number of autovacuum processes (other than the
autovacuum launcher) that may be running at any one time. The default
- is three. This parameter can only be set at server start.
+ is three. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command line.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 883a849e6f..862f4f1f45 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -781,13 +781,13 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
<row>
<entry><varname>SEMMNI</varname></entry>
<entry>Maximum number of semaphore identifiers (i.e., sets)</entry>
- <entry>at least <literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 7) / 16)</literal> plus room for other applications</entry>
+ <entry>at least <literal>ceil((max_connections + max_wal_senders + max_worker_processes + 263) / 16)</literal> plus room for other applications</entry>
</row>
<row>
<entry><varname>SEMMNS</varname></entry>
<entry>Maximum number of semaphores system-wide</entry>
- <entry><literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 7) / 16) * 17</literal> plus room for other applications</entry>
+ <entry><literal>ceil((max_connections + max_wal_senders + max_worker_processes + 263) / 16) * 17</literal> plus room for other applications</entry>
</row>
<row>
@@ -838,7 +838,7 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
When using System V semaphores,
<productname>PostgreSQL</productname> uses one semaphore per allowed connection
(<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>), allowed WAL sender process
+ (256), allowed WAL sender process
(<xref linkend="guc-max-wal-senders"/>), and allowed background
process (<xref linkend="guc-max-worker-processes"/>), in sets of 16.
Each such set will
@@ -847,13 +847,14 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
other applications. The maximum number of semaphores in the system
is set by <varname>SEMMNS</varname>, which consequently must be at least
as high as <varname>max_connections</varname> plus
- <varname>autovacuum_max_workers</varname> plus <varname>max_wal_senders</varname>,
- plus <varname>max_worker_processes</varname>, plus one extra for each 16
+ <varname>max_wal_senders</varname>,
+ plus <varname>max_worker_processes</varname>, plus 256 for autovacuum
+ worker processes, plus one extra for each 16
allowed connections plus workers (see the formula in <xref
linkend="sysvipc-parameters"/>). The parameter <varname>SEMMNI</varname>
determines the limit on the number of semaphore sets that can
exist on the system at one time. Hence this parameter must be at
- least <literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 7) / 16)</literal>.
+ least <literal>ceil((max_connections + max_wal_senders + max_worker_processes + 263) / 16)</literal>.
Lowering the number
of allowed connections is a temporary workaround for failures,
which are usually confusingly worded <quote>No space
@@ -884,7 +885,7 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
When using POSIX semaphores, the number of semaphores needed is the
same as for System V, that is one semaphore per allowed connection
(<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>) and allowed background
+ (256) and allowed background
process (<xref linkend="guc-max-worker-processes"/>).
On the platforms where this option is preferred, there is no specific
kernel limit on the number of POSIX semaphores.
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 330e058c5f..b6511d0658 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5362,7 +5362,6 @@ CheckRequiredParameterValues(void)
*/
if (ArchiveRecoveryRequested && EnableHotStandby)
{
- /* We ignore autovacuum_max_workers when we make this test. */
RecoveryRequiresIntParameter("max_connections",
MaxConnections,
ControlFile->MaxConnections);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 9a925a10cd..1ab97b9903 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -208,8 +208,7 @@ typedef struct autovac_table
/*-------------
* This struct holds information about a single worker's whereabouts. We keep
- * an array of these in shared memory, sized according to
- * autovacuum_max_workers.
+ * an array of these in shared memory.
*
* wi_links entry into free list or running list
* wi_dboid OID of the database this worker is supposed to work on
@@ -289,7 +288,7 @@ typedef struct
{
sig_atomic_t av_signal[AutoVacNumSignals];
pid_t av_launcherpid;
- dlist_head av_freeWorkers;
+ dclist_head av_freeWorkers;
dlist_head av_runningWorkers;
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
@@ -347,6 +346,7 @@ static void autovac_report_activity(autovac_table *tab);
static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
const char *nspname, const char *relname);
static void avl_sigusr2_handler(SIGNAL_ARGS);
+static bool av_worker_available(void);
@@ -575,8 +575,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
* wakening conditions.
*/
- launcher_determine_sleep(!dlist_is_empty(&AutoVacuumShmem->av_freeWorkers),
- false, &nap);
+ launcher_determine_sleep(av_worker_available(), false, &nap);
/*
* Wait until naptime expires or we get some type of signal (all the
@@ -636,7 +635,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
current_time = GetCurrentTimestamp();
LWLockAcquire(AutovacuumLock, LW_SHARED);
- can_launch = !dlist_is_empty(&AutoVacuumShmem->av_freeWorkers);
+ can_launch = av_worker_available();
if (AutoVacuumShmem->av_startingWorker != NULL)
{
@@ -679,8 +678,8 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
worker->wi_sharedrel = false;
worker->wi_proc = NULL;
worker->wi_launchtime = 0;
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker->wi_links);
AutoVacuumShmem->av_startingWorker = NULL;
ereport(WARNING,
errmsg("autovacuum worker took too long to start; canceled"));
@@ -1087,7 +1086,7 @@ do_start_worker(void)
/* return quickly when there are no free workers */
LWLockAcquire(AutovacuumLock, LW_SHARED);
- if (dlist_is_empty(&AutoVacuumShmem->av_freeWorkers))
+ if (!av_worker_available())
{
LWLockRelease(AutovacuumLock);
return InvalidOid;
@@ -1240,7 +1239,7 @@ do_start_worker(void)
* Get a worker entry from the freelist. We checked above, so there
* really should be a free slot.
*/
- wptr = dlist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
+ wptr = dclist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
worker = dlist_container(WorkerInfoData, wi_links, wptr);
worker->wi_dboid = avdb->adw_datid;
@@ -1609,8 +1608,8 @@ FreeWorkerInfo(int code, Datum arg)
MyWorkerInfo->wi_proc = NULL;
MyWorkerInfo->wi_launchtime = 0;
pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &MyWorkerInfo->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &MyWorkerInfo->wi_links);
/* not mine anymore */
MyWorkerInfo = NULL;
@@ -3265,7 +3264,7 @@ AutoVacuumShmemSize(void)
*/
size = sizeof(AutoVacuumShmemStruct);
size = MAXALIGN(size);
- size = add_size(size, mul_size(autovacuum_max_workers,
+ size = add_size(size, mul_size(AUTOVAC_MAX_WORKER_SLOTS,
sizeof(WorkerInfoData)));
return size;
}
@@ -3292,7 +3291,7 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
- dlist_init(&AutoVacuumShmem->av_freeWorkers);
+ dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
memset(AutoVacuumShmem->av_workItems, 0,
@@ -3302,10 +3301,10 @@ AutoVacuumShmemInit(void)
MAXALIGN(sizeof(AutoVacuumShmemStruct)));
/* initialize the WorkerInfo free list */
- for (i = 0; i < autovacuum_max_workers; i++)
+ for (i = 0; i < AUTOVAC_MAX_WORKER_SLOTS; i++)
{
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker[i].wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker[i].wi_links);
pg_atomic_init_flag(&worker[i].wi_dobalance);
}
@@ -3341,3 +3340,14 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+
+/*
+ * Returns whether there is a free autovacuum worker slot available.
+ */
+static bool
+av_worker_available(void)
+{
+ int reserved_slots = AUTOVAC_MAX_WORKER_SLOTS - autovacuum_max_workers;
+
+ return dclist_count(&AutoVacuumShmem->av_freeWorkers) > reserved_slots;
+}
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index bf0241aed0..efa15954ec 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -4144,7 +4144,7 @@ CreateOptsFile(int argc, char *argv[], char *fullprogname)
int
MaxLivePostmasterChildren(void)
{
- return 2 * (MaxConnections + autovacuum_max_workers + 1 +
+ return 2 * (MaxConnections + AUTOVAC_MAX_WORKER_SLOTS + 1 +
max_wal_senders + max_worker_processes);
}
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index ce29da9012..90cd770caf 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -141,9 +141,8 @@ ProcGlobalSemas(void)
* running out when trying to start another backend is a common failure.
* So, now we grab enough semaphores to support the desired max number
* of backends immediately at initialization --- if the sysadmin has set
- * MaxConnections, max_worker_processes, max_wal_senders, or
- * autovacuum_max_workers higher than his kernel will support, he'll
- * find out sooner rather than later.
+ * MaxConnections, max_worker_processes, or max_wal_senders higher than
+ * his kernel will support, he'll find out sooner rather than later.
*
* Another reason for creating semaphores here is that the semaphore
* implementation typically requires us to create semaphores in the
@@ -242,13 +241,13 @@ InitProcGlobal(void)
dlist_push_tail(&ProcGlobal->freeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->freeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + 1)
+ else if (i < MaxConnections + AUTOVAC_MAX_WORKER_SLOTS + 1)
{
/* PGPROC for AV launcher/worker, add to autovacFreeProcs list */
dlist_push_tail(&ProcGlobal->autovacFreeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->autovacFreeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + 1 + max_worker_processes)
+ else if (i < MaxConnections + AUTOVAC_MAX_WORKER_SLOTS + 1 + max_worker_processes)
{
/* PGPROC for bgworker, add to bgworkerFreeProcs list */
dlist_push_tail(&ProcGlobal->bgworkerFreeProcs, &proc->links);
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 0805398e24..684b22e3a5 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -577,7 +577,7 @@ InitializeMaxBackends(void)
Assert(MaxBackends == 0);
/* the extra unit accounts for the autovacuum launcher */
- MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
+ MaxBackends = MaxConnections + AUTOVAC_MAX_WORKER_SLOTS + 1 +
max_worker_processes + max_wal_senders;
/* internal error because the values were all checked previously */
@@ -591,19 +591,7 @@ InitializeMaxBackends(void)
bool
check_max_connections(int *newval, void **extra, GucSource source)
{
- if (*newval + autovacuum_max_workers + 1 +
- max_worker_processes + max_wal_senders > MAX_BACKENDS)
- return false;
- return true;
-}
-
-/*
- * GUC check_hook for autovacuum_max_workers
- */
-bool
-check_autovacuum_max_workers(int *newval, void **extra, GucSource source)
-{
- if (MaxConnections + *newval + 1 +
+ if (*newval + AUTOVAC_MAX_WORKER_SLOTS + 1 +
max_worker_processes + max_wal_senders > MAX_BACKENDS)
return false;
return true;
@@ -615,7 +603,7 @@ check_autovacuum_max_workers(int *newval, void **extra, GucSource source)
bool
check_max_worker_processes(int *newval, void **extra, GucSource source)
{
- if (MaxConnections + autovacuum_max_workers + 1 +
+ if (MaxConnections + AUTOVAC_MAX_WORKER_SLOTS + 1 +
*newval + max_wal_senders > MAX_BACKENDS)
return false;
return true;
@@ -627,7 +615,7 @@ check_max_worker_processes(int *newval, void **extra, GucSource source)
bool
check_max_wal_senders(int *newval, void **extra, GucSource source)
{
- if (MaxConnections + autovacuum_max_workers + 1 +
+ if (MaxConnections + AUTOVAC_MAX_WORKER_SLOTS + 1 +
max_worker_processes + *newval > MAX_BACKENDS)
return false;
return true;
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 46c258be28..d97ad55c4a 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3380,14 +3380,13 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
{
- /* see max_connections */
- {"autovacuum_max_workers", PGC_POSTMASTER, AUTOVACUUM,
+ {"autovacuum_max_workers", PGC_SIGHUP, AUTOVACUUM,
gettext_noop("Sets the maximum number of simultaneously running autovacuum worker processes."),
NULL
},
&autovacuum_max_workers,
- 3, 1, MAX_BACKENDS,
- check_autovacuum_max_workers, NULL, NULL
+ 3, 1, AUTOVAC_MAX_WORKER_SLOTS,
+ NULL, NULL, NULL
},
{
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 83d5df8e46..439859e3ff 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -659,7 +659,6 @@
#autovacuum = on # Enable autovacuum subprocess? 'on'
# requires track_counts to also be on.
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
- # (change requires restart)
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index cae1e8b329..f71c82830c 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -66,4 +66,12 @@ extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
+/*
+ * Number of autovacuum worker slots to allocate. This is the upper limit of
+ * autovacuum_max_workers.
+ *
+ * NB: This must be less than MAX_BACKENDS.
+ */
+#define AUTOVAC_MAX_WORKER_SLOTS (256)
+
#endif /* AUTOVACUUM_H */
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index d64dc5fcdb..76346eb05e 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -29,8 +29,6 @@ extern bool check_application_name(char **newval, void **extra,
GucSource source);
extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
-extern bool check_autovacuum_max_workers(int *newval, void **extra,
- GucSource source);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
--
2.39.3 (Apple Git-146)
Hi,
On 2024-06-03 13:52:29 -0500, Nathan Bossart wrote:
Here is an updated patch that uses 256 as the upper limit.
I don't have time to read through the entire thread right now - it'd be good
for the commit message of a patch like this to include justification for why
it's ok to make such a change. Even before actually committing it, so
reviewers have an easier time catching up.
Why do we think that increasing the number of PGPROC slots, heavyweight locks
etc by 256 isn't going to cause issues? That's not an insubstantial amount of
memory to dedicate to something that will practically never be used.
ISTM that at the very least we ought to exclude the reserved slots from the
computation of things like the number of locks resulting from
max_locks_per_transaction. It's very common to increase
max_locks_per_transaction substantially, adding ~250 to the multiplier can be
a good amount of memory. And AV workers should never need a meaningful number.
Increasing e.g. the size of the heavyweight lock table has consequences
besides the increase in memory usage, the size increase can make it less
likely for the table to fit largely into L3, thus decreasing performance.
Greetings,
Andres Freund
On Mon, Jun 03, 2024 at 12:08:52PM -0700, Andres Freund wrote:
I don't have time to read through the entire thread right now - it'd be good
for the commit message of a patch like this to include justification for why
it's ok to make such a change. Even before actually committing it, so
reviewers have an easier time catching up.
Sorry about that. I think the main question (besides "should we do this?")
is whether we ought to make the upper limit configurable. My initial idea
was to split autovacuum_max_workers into two GUCs: one for the upper limit
that can only be changed at server start and another for the effective limit
that can be changed up to the upper limit without restarting the server.
If we can just set a sufficiently high upper limit and avoid the extra GUC
without causing problems, that might be preferable, but I sense that you
are about to tell me that it will indeed cause problems. :)
Why do we think that increasing the number of PGPROC slots, heavyweight locks
etc by 256 isn't going to cause issues? That's not an insubstantial amount of
memory to dedicate to something that will practically never be used.
I personally have not observed problems with these kinds of bumps in
resource usage, although I may be biased towards larger systems where it
doesn't matter as much.
ISTM that at the very least we ought to exclude the reserved slots from the
computation of things like the number of locks resulting from
max_locks_per_transaction. It's very common to increase
max_locks_per_transaction substantially, adding ~250 to the multiplier can be
a good amount of memory. And AV workers should never need a meaningful number.
This is an interesting idea.
Increasing e.g. the size of the heavyweight lock table has consequences
besides the increase in memory usage, the size increase can make it less
likely for the table to fit largely into L3, thus decreasing performance.
IMHO this might be a good argument for making the upper limit configurable
and setting it relatively low by default. That's not quite as nice from a
user experience perspective, but weird, hard-to-diagnose performance issues
are certainly not nice, either.
--
nathan
Hi,
On 2024-06-03 14:28:13 -0500, Nathan Bossart wrote:
On Mon, Jun 03, 2024 at 12:08:52PM -0700, Andres Freund wrote:
Why do we think that increasing the number of PGPROC slots, heavyweight locks
etc by 256 isn't going to cause issues? That's not an insubstantial amount of
memory to dedicate to something that will practically never be used.
I personally have not observed problems with these kinds of bumps in
resource usage, although I may be biased towards larger systems where it
doesn't matter as much.
IME it matters *more* on larger systems. Or at least used to, I haven't
experimented with this in quite a while.
It's possible that we improved a bunch of things sufficiently for this to not
matter anymore.
Greetings,
Andres Freund
On Mon, Jun 03, 2024 at 04:24:27PM -0700, Andres Freund wrote:
On 2024-06-03 14:28:13 -0500, Nathan Bossart wrote:
On Mon, Jun 03, 2024 at 12:08:52PM -0700, Andres Freund wrote:
Why do we think that increasing the number of PGPROC slots, heavyweight locks
etc by 256 isn't going to cause issues? That's not an insubstantial amount of
memory to dedicate to something that will practically never be used.
I personally have not observed problems with these kinds of bumps in
resource usage, although I may be biased towards larger systems where it
doesn't matter as much.
IME it matters *more* on larger systems. Or at least used to, I haven't
experimented with this in quite a while.
It's possible that we improved a bunch of things sufficiently for this to not
matter anymore.
I'm curious if there is something specific you would look into to verify
this. IIUC one concern is the lock table not fitting into L3. Is there
anything else? Any particular workloads you have in mind?
--
nathan
Hi,
On 2024-06-18 14:00:00 -0500, Nathan Bossart wrote:
On Mon, Jun 03, 2024 at 04:24:27PM -0700, Andres Freund wrote:
On 2024-06-03 14:28:13 -0500, Nathan Bossart wrote:
On Mon, Jun 03, 2024 at 12:08:52PM -0700, Andres Freund wrote:
Why do we think that increasing the number of PGPROC slots, heavyweight locks
etc by 256 isn't going to cause issues? That's not an insubstantial amount of
memory to dedicate to something that will practically never be used.
I personally have not observed problems with these kinds of bumps in
resource usage, although I may be biased towards larger systems where it
doesn't matter as much.
IME it matters *more* on larger systems. Or at least used to, I haven't
experimented with this in quite a while.
It's possible that we improved a bunch of things sufficiently for this to not
matter anymore.
I'm curious if there is something specific you would look into to verify
this. IIUC one concern is the lock table not fitting into L3. Is there
anything else? Any particular workloads you have in mind?
That was the main thing I was thinking of.
But I think I just thought of one more: It's going to *substantially* increase
the resource usage for tap tests. Right now Cluster.pm has
# conservative settings to ensure we can run multiple postmasters:
print $conf "shared_buffers = 1MB\n";
print $conf "max_connections = 10\n";
for nodes that allow streaming.
Adding 256 extra backend slots increases the shared memory usage from ~5MB to
~18MB.
I just don't see much point in reserving 256 worker "possibilities", tbh. I
can't think of any practical system where it makes sense to use this much (nor
do I think it's going to be reasonable in the next 10 years) and it's just
going to waste memory and startup time for everyone.
Nor does it make sense to me to have the max autovac workers be independent of
max_connections.
Greetings,
Andres Freund
On Tue, Jun 18, 2024 at 01:43:34PM -0700, Andres Freund wrote:
I just don't see much point in reserving 256 worker "possibilities", tbh. I
can't think of any practical system where it makes sense to use this much (nor
do I think it's going to be reasonable in the next 10 years) and it's just
going to waste memory and startup time for everyone.
Given this, here are some options I see for moving this forward:
* lower the cap to, say, 64 or 32
* exclude autovacuum worker slots from computing number of locks, etc.
* make the cap configurable and default it to something low (e.g., 8)
My intent with a reserved set of 256 slots was to prevent users from
needing to deal with two GUCs. For all practical purposes, it would be
possible to change autovacuum_max_workers whenever you want. But if the
extra resource requirements are too much of a tax, I'm content to change
course.
--
nathan
Hi,
On 2024-06-18 16:09:09 -0500, Nathan Bossart wrote:
On Tue, Jun 18, 2024 at 01:43:34PM -0700, Andres Freund wrote:
I just don't see much point in reserving 256 worker "possibilities", tbh. I
can't think of any practical system where it makes sense to use this much (nor
do I think it's going to be reasonable in the next 10 years) and it's just
going to waste memory and startup time for everyone.Given this, here are some options I see for moving this forward:
* lower the cap to, say, 64 or 32
* exclude autovacuum worker slots from computing number of locks, etc.
That seems good regardless
* make the cap configurable and default it to something low (e.g., 8)
Another one:
Have a general cap of 64, but additionally limit it to something like
max(1, min(WORKER_CAP, max_connections / 4))
so that cases like tap tests don't end up allocating vastly more worker slots
than actual connection slots.
My intent with a reserved set of 256 slots was to prevent users from
needing to deal with two GUCs. For all practical purposes, it would be
possible to change autovacuum_max_workers whenever you want. But if the
extra resource requirements are too much of a tax, I'm content to change
course.
Approximately tripling shared memory usage for tap test instances does seem
too much to me.
Greetings,
Andres Freund
On Tue, Jun 18, 2024 at 02:33:31PM -0700, Andres Freund wrote:
Another one:
Have a general cap of 64, but additionally limit it to something like
max(1, min(WORKER_CAP, max_connections / 4))
so that cases like tap tests don't end up allocating vastly more worker slots
than actual connection slots.
That's a clever idea. My only concern would be that we are tethering two
parameters that aren't super closely related, but I'm unsure whether it
would cause any problems in practice.
--
nathan
On Tue, Jun 18, 2024 at 07:43:36PM -0500, Nathan Bossart wrote:
On Tue, Jun 18, 2024 at 02:33:31PM -0700, Andres Freund wrote:
Another one:
Have a general cap of 64, but additionally limit it to something like
max(1, min(WORKER_CAP, max_connections / 4))
so that cases like tap tests don't end up allocating vastly more worker slots
than actual connection slots.
That's a clever idea. My only concern would be that we are tethering two
parameters that aren't super closely related, but I'm unsure whether it
would cause any problems in practice.
Here is an attempt at doing this. I've added 0001 [0] and 0002 [1] as
prerequisite patches, which helps simplify 0003 a bit. It probably doesn't
work correctly for EXEC_BACKEND builds yet.
I'm still not sure about this approach. At the moment, I'm leaning towards
something more like v2 [2] where the upper limit is a PGC_POSTMASTER GUC
(that we would set very low for TAP tests).
[0]: https://commitfest.postgresql.org/48/4998/
[1]: https://commitfest.postgresql.org/48/5059/
[2]: /messages/by-id/20240419154322.GA3988554@nathanxps13
--
nathan
Attachments:
v5-0001-add-num_os_semaphores-GUC.patch (text/plain)
From 69245426585052af1317c1bf3e564cae0c019f52 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Tue, 21 May 2024 14:02:22 -0500
Subject: [PATCH v5 1/3] add num_os_semaphores GUC
---
doc/src/sgml/config.sgml | 14 +++++++++++
doc/src/sgml/runtime.sgml | 39 ++++++++++++++++-------------
src/backend/storage/ipc/ipci.c | 6 ++++-
src/backend/utils/misc/guc_tables.c | 12 +++++++++
4 files changed, 53 insertions(+), 18 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0c7a9082c5..087385cb4e 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -11234,6 +11234,20 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
</listitem>
</varlistentry>
+ <varlistentry id="guc-num-os-semaphores" xreflabel="num_os_semaphores">
+ <term><varname>num_os_semaphores</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>num_os_semaphores</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Reports the number of semaphores that are needed for the server based
+ on the number of allowed connections, worker processes, etc.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-ssl-library" xreflabel="ssl_library">
<term><varname>ssl_library</varname> (<type>string</type>)
<indexterm>
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 2f7c618886..26f3cbe555 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -781,13 +781,13 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
<row>
<entry><varname>SEMMNI</varname></entry>
<entry>Maximum number of semaphore identifiers (i.e., sets)</entry>
- <entry>at least <literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 7) / 16)</literal> plus room for other applications</entry>
+ <entry>at least <literal>ceil(num_os_semaphores / 16)</literal> plus room for other applications</entry>
</row>
<row>
<entry><varname>SEMMNS</varname></entry>
<entry>Maximum number of semaphores system-wide</entry>
- <entry><literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 7) / 16) * 17</literal> plus room for other applications</entry>
+ <entry><literal>ceil(num_os_semaphores / 16) * 17</literal> plus room for other applications</entry>
</row>
<row>
@@ -836,30 +836,38 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
<para>
When using System V semaphores,
- <productname>PostgreSQL</productname> uses one semaphore per allowed connection
- (<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>), allowed WAL sender process
- (<xref linkend="guc-max-wal-senders"/>), and allowed background
- process (<xref linkend="guc-max-worker-processes"/>), in sets of 16.
+ <productname>PostgreSQL</productname> uses one semaphore per allowed connection,
+ worker process, etc., in sets of 16.
Each such set will
also contain a 17th semaphore which contains a <quote>magic
number</quote>, to detect collision with semaphore sets used by
other applications. The maximum number of semaphores in the system
is set by <varname>SEMMNS</varname>, which consequently must be at least
- as high as <varname>max_connections</varname> plus
- <varname>autovacuum_max_workers</varname> plus <varname>max_wal_senders</varname>,
- plus <varname>max_worker_processes</varname>, plus one extra for each 16
- allowed connections plus workers (see the formula in <xref
+ as high as <xref linkend="guc-num-os-semaphores"/> plus one extra for
+ each set of 16 required semaphores (see the formula in <xref
linkend="sysvipc-parameters"/>). The parameter <varname>SEMMNI</varname>
determines the limit on the number of semaphore sets that can
exist on the system at one time. Hence this parameter must be at
- least <literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 7) / 16)</literal>.
+ least <literal>ceil(num_os_semaphores / 16)</literal>.
Lowering the number
of allowed connections is a temporary workaround for failures,
which are usually confusingly worded <quote>No space
left on device</quote>, from the function <function>semget</function>.
</para>
+ <para>
+ The number of semaphores required by <productname>PostgreSQL</productname>
+ is provided by the runtime-computed parameter
+ <varname>num_os_semaphores</varname>, which can be determined before
+ starting the server with a <command>postgres</command> command like:
+<programlisting>
+$ <userinput>postgres -D $PGDATA -C num_os_semaphores</userinput>
+</programlisting>
+ The value of <varname>num_os_semaphores</varname> should be input into
+ the aforementioned formulas to determine appropriate values for
+ <varname>SEMMNI</varname> and <varname>SEMMNS</varname>.
+ </para>
+
<para>
In some cases it might also be necessary to increase
<varname>SEMMAP</varname> to be at least on the order of
@@ -882,11 +890,8 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
<para>
When using POSIX semaphores, the number of semaphores needed is the
- same as for System V, that is one semaphore per allowed connection
- (<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>), allowed WAL sender process
- (<xref linkend="guc-max-wal-senders"/>), and allowed background
- process (<xref linkend="guc-max-worker-processes"/>).
+ same as for System V, that is one semaphore per allowed connection,
+ worker process, etc.
On the platforms where this option is preferred, there is no specific
kernel limit on the number of POSIX semaphores.
</para>
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 521ed5418c..0314513aa6 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -372,11 +372,12 @@ InitializeShmemGUCs(void)
Size size_b;
Size size_mb;
Size hp_size;
+ int num_semas;
/*
* Calculate the shared memory size and round up to the nearest megabyte.
*/
- size_b = CalculateShmemSize(NULL);
+ size_b = CalculateShmemSize(&num_semas);
size_mb = add_size(size_b, (1024 * 1024) - 1) / (1024 * 1024);
sprintf(buf, "%zu", size_mb);
SetConfigOption("shared_memory_size", buf,
@@ -395,4 +396,7 @@ InitializeShmemGUCs(void)
SetConfigOption("shared_memory_size_in_huge_pages", buf,
PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
}
+
+ sprintf(buf, "%d", num_semas);
+ SetConfigOption("num_os_semaphores", buf, PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 46c258be28..80e77cbac9 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -599,6 +599,7 @@ static int segment_size;
static int shared_memory_size_mb;
static int shared_memory_size_in_huge_pages;
static int wal_block_size;
+static int num_os_semaphores;
static bool data_checksums;
static bool integer_datetimes;
@@ -2291,6 +2292,17 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"num_os_semaphores", PGC_INTERNAL, PRESET_OPTIONS,
+ gettext_noop("Shows the number of semaphores required for the server."),
+ NULL,
+ GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
+ },
+ &num_os_semaphores,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_timestamp_buffers", PGC_POSTMASTER, RESOURCES_MEM,
gettext_noop("Sets the size of the dedicated buffer pool used for the commit timestamp cache."),
--
2.39.3 (Apple Git-146)
v5-0002-remove-check-hooks-for-GUCs-that-contribute-to-Ma.patch (text/plain)
From b65ae84fc36c440451c42ef61b383bcd98a825bd Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Wed, 19 Jun 2024 13:39:18 -0500
Subject: [PATCH v5 2/3] remove check hooks for GUCs that contribute to
MaxBackends
---
src/backend/utils/init/postinit.c | 57 ++++-------------------------
src/backend/utils/misc/guc_tables.c | 8 ++--
src/include/utils/guc_hooks.h | 6 ---
3 files changed, 11 insertions(+), 60 deletions(-)
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 0805398e24..8a629982c4 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -580,57 +580,14 @@ InitializeMaxBackends(void)
MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
max_worker_processes + max_wal_senders;
- /* internal error because the values were all checked previously */
if (MaxBackends > MAX_BACKENDS)
- elog(ERROR, "too many backends configured");
-}
-
-/*
- * GUC check_hook for max_connections
- */
-bool
-check_max_connections(int *newval, void **extra, GucSource source)
-{
- if (*newval + autovacuum_max_workers + 1 +
- max_worker_processes + max_wal_senders > MAX_BACKENDS)
- return false;
- return true;
-}
-
-/*
- * GUC check_hook for autovacuum_max_workers
- */
-bool
-check_autovacuum_max_workers(int *newval, void **extra, GucSource source)
-{
- if (MaxConnections + *newval + 1 +
- max_worker_processes + max_wal_senders > MAX_BACKENDS)
- return false;
- return true;
-}
-
-/*
- * GUC check_hook for max_worker_processes
- */
-bool
-check_max_worker_processes(int *newval, void **extra, GucSource source)
-{
- if (MaxConnections + autovacuum_max_workers + 1 +
- *newval + max_wal_senders > MAX_BACKENDS)
- return false;
- return true;
-}
-
-/*
- * GUC check_hook for max_wal_senders
- */
-bool
-check_max_wal_senders(int *newval, void **extra, GucSource source)
-{
- if (MaxConnections + autovacuum_max_workers + 1 +
- max_worker_processes + *newval > MAX_BACKENDS)
- return false;
- return true;
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("too many backends configured"),
+ errdetail("\"max_connections\" (%d) plus \"autovacuum_max_workers\" (%d) plus \"max_worker_processes\" (%d) plus \"max_wal_senders\" (%d) must be less than %d.",
+ MaxConnections, autovacuum_max_workers,
+ max_worker_processes, max_wal_senders,
+ MAX_BACKENDS - 1)));
}
/*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 80e77cbac9..57df7767ad 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2208,7 +2208,7 @@ struct config_int ConfigureNamesInt[] =
},
&MaxConnections,
100, 1, MAX_BACKENDS,
- check_max_connections, NULL, NULL
+ NULL, NULL, NULL
},
{
@@ -2935,7 +2935,7 @@ struct config_int ConfigureNamesInt[] =
},
&max_wal_senders,
10, 0, MAX_BACKENDS,
- check_max_wal_senders, NULL, NULL
+ NULL, NULL, NULL
},
{
@@ -3165,7 +3165,7 @@ struct config_int ConfigureNamesInt[] =
},
&max_worker_processes,
8, 0, MAX_BACKENDS,
- check_max_worker_processes, NULL, NULL
+ NULL, NULL, NULL
},
{
@@ -3399,7 +3399,7 @@ struct config_int ConfigureNamesInt[] =
},
&autovacuum_max_workers,
3, 1, MAX_BACKENDS,
- check_autovacuum_max_workers, NULL, NULL
+ NULL, NULL, NULL
},
{
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index d64dc5fcdb..6304f0679b 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -29,8 +29,6 @@ extern bool check_application_name(char **newval, void **extra,
GucSource source);
extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
-extern bool check_autovacuum_max_workers(int *newval, void **extra,
- GucSource source);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
@@ -84,13 +82,9 @@ extern const char *show_log_timezone(void);
extern bool check_maintenance_io_concurrency(int *newval, void **extra,
GucSource source);
extern void assign_maintenance_io_concurrency(int newval, void *extra);
-extern bool check_max_connections(int *newval, void **extra, GucSource source);
-extern bool check_max_wal_senders(int *newval, void **extra, GucSource source);
extern bool check_max_slot_wal_keep_size(int *newval, void **extra,
GucSource source);
extern void assign_max_wal_size(int newval, void *extra);
-extern bool check_max_worker_processes(int *newval, void **extra,
- GucSource source);
extern bool check_max_stack_depth(int *newval, void **extra, GucSource source);
extern void assign_max_stack_depth(int newval, void *extra);
extern bool check_multixact_member_buffers(int *newval, void **extra,
--
2.39.3 (Apple Git-146)
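The consolidated check that replaces the four per-GUC check hooks amounts to the following arithmetic. This is a Python sketch, not the actual C; MAX_BACKENDS is PostgreSQL's compile-time cap (0x3FFFF at the time of writing):

```python
MAX_BACKENDS = 0x3FFFF  # compile-time cap from postmaster.h

def initialize_max_backends(max_connections, autovacuum_max_workers,
                            max_worker_processes, max_wal_senders):
    # The extra unit accounts for the autovacuum launcher.
    max_backends = (max_connections + autovacuum_max_workers + 1 +
                    max_worker_processes + max_wal_senders)
    if max_backends > MAX_BACKENDS:
        raise ValueError("too many backends configured: "
                         f"{max_backends} > {MAX_BACKENDS}")
    return max_backends
```

Doing the check once here, instead of in four interdependent check hooks, avoids the ordering hazards of hooks that read other GUCs' current values.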
Attachment: v5-0003-allow-changing-autovacuum_max_workers-without-res.patch
From 48f7d56b99f2d98533c8767d5891a78c8411872d Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Fri, 21 Jun 2024 15:29:28 -0500
Subject: [PATCH v5 3/3] allow changing autovacuum_max_workers without
restarting
---
doc/src/sgml/config.sgml | 8 +++-
src/backend/postmaster/autovacuum.c | 25 ++++++++++---
src/backend/postmaster/postmaster.c | 2 +-
src/backend/storage/lmgr/proc.c | 4 +-
src/backend/utils/init/postinit.c | 37 +++++++++++++++++--
src/backend/utils/misc/guc_tables.c | 4 +-
src/backend/utils/misc/postgresql.conf.sample | 1 -
src/include/postmaster/autovacuum.h | 1 +
src/include/utils/guc_hooks.h | 2 +
9 files changed, 69 insertions(+), 15 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 087385cb4e..289c2b63c4 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8554,7 +8554,13 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
<para>
Specifies the maximum number of autovacuum processes (other than the
autovacuum launcher) that may be running at any one time. The default
- is three. This parameter can only be set at server start.
+ is three. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command line.
+ </para>
+ <para>
+ This parameter is limited to 50% of
+ <xref linkend="guc-max-connections"/> (but no less than 1 or greater
+ than 64).
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 9a925a10cd..8bf95981a0 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -130,6 +130,7 @@ double autovacuum_vac_cost_delay;
int autovacuum_vac_cost_limit;
int Log_autovacuum_min_duration = 600000;
+int autovacuum_worker_slots = -1;
/* the minimum allowed time between two awakenings of the launcher */
#define MIN_AUTOVAC_SLEEPTIME 100.0 /* milliseconds */
@@ -347,6 +348,7 @@ static void autovac_report_activity(autovac_table *tab);
static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
const char *nspname, const char *relname);
static void avl_sigusr2_handler(SIGNAL_ARGS);
+static bool av_worker_available(void);
@@ -575,7 +577,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
* wakening conditions.
*/
- launcher_determine_sleep(!dlist_is_empty(&AutoVacuumShmem->av_freeWorkers),
+ launcher_determine_sleep(av_worker_available(),
false, &nap);
/*
@@ -636,7 +638,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
current_time = GetCurrentTimestamp();
LWLockAcquire(AutovacuumLock, LW_SHARED);
- can_launch = !dlist_is_empty(&AutoVacuumShmem->av_freeWorkers);
+ can_launch = av_worker_available();
if (AutoVacuumShmem->av_startingWorker != NULL)
{
@@ -1087,7 +1089,7 @@ do_start_worker(void)
/* return quickly when there are no free workers */
LWLockAcquire(AutovacuumLock, LW_SHARED);
- if (dlist_is_empty(&AutoVacuumShmem->av_freeWorkers))
+ if (!av_worker_available())
{
LWLockRelease(AutovacuumLock);
return InvalidOid;
@@ -3265,7 +3267,7 @@ AutoVacuumShmemSize(void)
*/
size = sizeof(AutoVacuumShmemStruct);
size = MAXALIGN(size);
- size = add_size(size, mul_size(autovacuum_max_workers,
+ size = add_size(size, mul_size(autovacuum_worker_slots,
sizeof(WorkerInfoData)));
return size;
}
@@ -3302,7 +3304,7 @@ AutoVacuumShmemInit(void)
MAXALIGN(sizeof(AutoVacuumShmemStruct)));
/* initialize the WorkerInfo free list */
- for (i = 0; i < autovacuum_max_workers; i++)
+ for (i = 0; i < autovacuum_worker_slots; i++)
{
dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
&worker[i].wi_links);
@@ -3341,3 +3343,16 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+
+static bool
+av_worker_available(void)
+{
+ int reserved = autovacuum_worker_slots - autovacuum_max_workers;
+ int count = 0;
+ dlist_iter iter;
+
+ dlist_foreach(iter, &AutoVacuumShmem->av_freeWorkers)
+ count++;
+
+ return count > reserved;
+}
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index bf0241aed0..2588a68fbc 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -4144,7 +4144,7 @@ CreateOptsFile(int argc, char *argv[], char *fullprogname)
int
MaxLivePostmasterChildren(void)
{
- return 2 * (MaxConnections + autovacuum_max_workers + 1 +
+ return 2 * (MaxConnections + autovacuum_worker_slots + 1 +
max_wal_senders + max_worker_processes);
}
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index ce29da9012..10e0b681cf 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -242,13 +242,13 @@ InitProcGlobal(void)
dlist_push_tail(&ProcGlobal->freeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->freeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + 1)
+ else if (i < MaxConnections + autovacuum_worker_slots + 1)
{
/* PGPROC for AV launcher/worker, add to autovacFreeProcs list */
dlist_push_tail(&ProcGlobal->autovacFreeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->autovacFreeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + 1 + max_worker_processes)
+ else if (i < MaxConnections + autovacuum_worker_slots + 1 + max_worker_processes)
{
/* PGPROC for bgworker, add to bgworkerFreeProcs list */
dlist_push_tail(&ProcGlobal->bgworkerFreeProcs, &proc->links);
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 8a629982c4..d206dd1533 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -576,20 +576,51 @@ InitializeMaxBackends(void)
{
Assert(MaxBackends == 0);
+ autovacuum_worker_slots = Max(1, Min(64, MaxConnections / 2));
+ if (autovacuum_max_workers > autovacuum_worker_slots)
+ {
+ /* keep in sync with check_autovacuum_max_workers() */
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid setting for \"autovacuum_max_workers\""),
+ errdetail("\"autovacuum_max_workers\" (%d) is limited to 50%% of \"max_connections\" (but no less than 1 or greater than 64) (%d)",
+ autovacuum_max_workers, autovacuum_worker_slots)));
+ }
+
/* the extra unit accounts for the autovacuum launcher */
- MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
+ MaxBackends = MaxConnections + autovacuum_worker_slots + 1 +
max_worker_processes + max_wal_senders;
if (MaxBackends > MAX_BACKENDS)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("too many backends configured"),
- errdetail("\"max_connections\" (%d) plus \"autovacuum_max_workers\" (%d) plus \"max_worker_processes\" (%d) plus \"max_wal_senders\" (%d) must be less than %d.",
- MaxConnections, autovacuum_max_workers,
+ errdetail("\"max_connections\" (%d) plus autovacuum worker slots (%d) plus \"max_worker_processes\" (%d) plus \"max_wal_senders\" (%d) must be less than %d.",
+ MaxConnections, autovacuum_worker_slots,
max_worker_processes, max_wal_senders,
MAX_BACKENDS - 1)));
}
+/*
+ * GUC check_hook for autovacuum_max_workers
+ */
+bool
+check_autovacuum_max_workers(int *newval, void **extra, GucSource source)
+{
+ /*
+ * If autovacuum_worker_slots has not yet been initialized, just return
+ * true. We'll fail startup in PostmasterMain() if needed.
+ */
+ if (autovacuum_worker_slots == -1 || *newval <= autovacuum_worker_slots)
+ return true;
+
+ /* keep in sync with matching error message in InitializeMaxBackends() */
+ GUC_check_errmsg("invalid setting for \"autovacuum_max_workers\"");
+ GUC_check_errdetail("\"autovacuum_max_workers\" (%d) is limited to 50%% of \"max_connections\" (but no less than 1 or greater than 64) (%d)",
+ autovacuum_max_workers, autovacuum_worker_slots);
+ return false;
+}
+
/*
* Early initialization of a backend (either standalone or under postmaster).
* This happens even before InitPostgres.
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 57df7767ad..f1babff58e 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3393,13 +3393,13 @@ struct config_int ConfigureNamesInt[] =
},
{
/* see max_connections */
- {"autovacuum_max_workers", PGC_POSTMASTER, AUTOVACUUM,
+ {"autovacuum_max_workers", PGC_SIGHUP, AUTOVACUUM,
gettext_noop("Sets the maximum number of simultaneously running autovacuum worker processes."),
NULL
},
&autovacuum_max_workers,
3, 1, MAX_BACKENDS,
- NULL, NULL, NULL
+ check_autovacuum_max_workers, NULL, NULL
},
{
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index e0567de219..fec719fe79 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -659,7 +659,6 @@
#autovacuum = on # Enable autovacuum subprocess? 'on'
# requires track_counts to also be on.
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
- # (change requires restart)
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index cae1e8b329..2b2aa1b713 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -46,6 +46,7 @@ extern PGDLLIMPORT int autovacuum_vac_cost_limit;
extern PGDLLIMPORT int AutovacuumLauncherPid;
extern PGDLLIMPORT int Log_autovacuum_min_duration;
+extern PGDLLIMPORT int autovacuum_worker_slots;
/* Status inquiry functions */
extern bool AutoVacuumingActive(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 6304f0679b..b3e1f9f2cb 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -29,6 +29,8 @@ extern bool check_application_name(char **newval, void **extra,
GucSource source);
extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
+extern bool check_autovacuum_max_workers(int *newval, void **extra,
+ GucSource source);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
--
2.39.3 (Apple Git-146)
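The heart of this approach is av_worker_available(): any slots in excess of autovacuum_max_workers are treated as reserved, and a new worker may launch only while more than that many slots remain free. A hypothetical Python rendering of that test (the Max(0, ...) clamp, which guards against autovacuum_max_workers exceeding the slot count, appears in the v6 revision):

```python
def av_worker_available(free_slots, worker_slots, max_workers):
    # Slots beyond autovacuum_max_workers are off-limits; launching is
    # allowed only while the free list holds more than the reserved count.
    reserved = max(0, worker_slots - max_workers)
    return free_slots > reserved
```

With worker_slots=16 and max_workers=3, launching stops once three workers are running (13 slots free); raising max_workers via SIGHUP immediately shrinks the reserved count, so more workers can start without touching shared memory.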
On Fri, Jun 21, 2024 at 03:44:07PM -0500, Nathan Bossart wrote:
I'm still not sure about this approach. At the moment, I'm leaning towards
something more like v2 [2] where the upper limit is a PGC_POSTMASTER GUC
(that we would set very low for TAP tests).
Like so.
--
nathan
Attachments:
Attachment: v6-0001-allow-changing-autovacuum_max_workers-without-res.patch
From c1c33c6c157a7cec81180714369b2978b09e402f Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Sat, 22 Jun 2024 15:05:44 -0500
Subject: [PATCH v6 1/1] allow changing autovacuum_max_workers without
restarting
---
doc/src/sgml/config.sgml | 28 +++++++++++-
doc/src/sgml/runtime.sgml | 12 ++---
src/backend/access/transam/xlog.c | 2 +-
src/backend/postmaster/autovacuum.c | 44 ++++++++++++-------
src/backend/postmaster/postmaster.c | 2 +-
src/backend/storage/lmgr/proc.c | 6 +--
src/backend/utils/init/postinit.c | 12 ++---
src/backend/utils/misc/guc_tables.c | 13 +++++-
src/backend/utils/misc/postgresql.conf.sample | 3 +-
src/include/postmaster/autovacuum.h | 1 +
src/include/utils/guc_hooks.h | 4 +-
src/test/perl/PostgreSQL/Test/Cluster.pm | 1 +
12 files changed, 89 insertions(+), 39 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0c7a9082c5..64411918a4 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8544,6 +8544,25 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-worker-slots" xreflabel="autovacuum_worker_slots">
+ <term><varname>autovacuum_worker_slots</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_worker_slots</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specifies the number of backend slots to reserve for autovacuum worker
+ processes. The default is 16. This parameter can only be set at server
+ start.
+ </para>
+ <para>
+ When changing this value, consider also adjusting
+ <xref linkend="guc-autovacuum-max-workers"/>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-autovacuum-max-workers" xreflabel="autovacuum_max_workers">
<term><varname>autovacuum_max_workers</varname> (<type>integer</type>)
<indexterm>
@@ -8554,7 +8573,14 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
<para>
Specifies the maximum number of autovacuum processes (other than the
autovacuum launcher) that may be running at any one time. The default
- is three. This parameter can only be set at server start.
+ is three. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command line.
+ </para>
+ <para>
+ Note that a setting for this value which is higher than
+ <xref linkend="guc-autovacuum-worker-slots"/> will have no effect,
+ since autovacuum workers are taken from the pool of slots established
+ by that setting.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 2f7c618886..4bb37faffe 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -781,13 +781,13 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
<row>
<entry><varname>SEMMNI</varname></entry>
<entry>Maximum number of semaphore identifiers (i.e., sets)</entry>
- <entry>at least <literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 7) / 16)</literal> plus room for other applications</entry>
+ <entry>at least <literal>ceil((max_connections + autovacuum_worker_slots + max_wal_senders + max_worker_processes + 7) / 16)</literal> plus room for other applications</entry>
</row>
<row>
<entry><varname>SEMMNS</varname></entry>
<entry>Maximum number of semaphores system-wide</entry>
- <entry><literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 7) / 16) * 17</literal> plus room for other applications</entry>
+ <entry><literal>ceil((max_connections + autovacuum_worker_slots + max_wal_senders + max_worker_processes + 7) / 16) * 17</literal> plus room for other applications</entry>
</row>
<row>
@@ -838,7 +838,7 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
When using System V semaphores,
<productname>PostgreSQL</productname> uses one semaphore per allowed connection
(<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>), allowed WAL sender process
+ (<xref linkend="guc-autovacuum-worker-slots"/>), allowed WAL sender process
(<xref linkend="guc-max-wal-senders"/>), and allowed background
process (<xref linkend="guc-max-worker-processes"/>), in sets of 16.
Each such set will
@@ -847,13 +847,13 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
other applications. The maximum number of semaphores in the system
is set by <varname>SEMMNS</varname>, which consequently must be at least
as high as <varname>max_connections</varname> plus
- <varname>autovacuum_max_workers</varname> plus <varname>max_wal_senders</varname>,
+ <varname>autovacuum_worker_slots</varname> plus <varname>max_wal_senders</varname>,
plus <varname>max_worker_processes</varname>, plus one extra for each 16
allowed connections plus workers (see the formula in <xref
linkend="sysvipc-parameters"/>). The parameter <varname>SEMMNI</varname>
determines the limit on the number of semaphore sets that can
exist on the system at one time. Hence this parameter must be at
- least <literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 7) / 16)</literal>.
+ least <literal>ceil((max_connections + autovacuum_worker_slots + max_wal_senders + max_worker_processes + 7) / 16)</literal>.
Lowering the number
of allowed connections is a temporary workaround for failures,
which are usually confusingly worded <quote>No space
@@ -884,7 +884,7 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
When using POSIX semaphores, the number of semaphores needed is the
same as for System V, that is one semaphore per allowed connection
(<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>), allowed WAL sender process
+ (<xref linkend="guc-autovacuum-worker-slots"/>), allowed WAL sender process
(<xref linkend="guc-max-wal-senders"/>), and allowed background
process (<xref linkend="guc-max-worker-processes"/>).
On the platforms where this option is preferred, there is no specific
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 330e058c5f..fec72086b9 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5362,7 +5362,7 @@ CheckRequiredParameterValues(void)
*/
if (ArchiveRecoveryRequested && EnableHotStandby)
{
- /* We ignore autovacuum_max_workers when we make this test. */
+ /* We ignore autovacuum_worker_slots when we make this test. */
RecoveryRequiresIntParameter("max_connections",
MaxConnections,
ControlFile->MaxConnections);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 9a925a10cd..5f413305a2 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -114,6 +114,7 @@
* GUC parameters
*/
bool autovacuum_start_daemon = false;
+int autovacuum_worker_slots;
int autovacuum_max_workers;
int autovacuum_work_mem = -1;
int autovacuum_naptime;
@@ -209,7 +210,7 @@ typedef struct autovac_table
/*-------------
* This struct holds information about a single worker's whereabouts. We keep
* an array of these in shared memory, sized according to
- * autovacuum_max_workers.
+ * autovacuum_worker_slots.
*
* wi_links entry into free list or running list
* wi_dboid OID of the database this worker is supposed to work on
@@ -289,7 +290,7 @@ typedef struct
{
sig_atomic_t av_signal[AutoVacNumSignals];
pid_t av_launcherpid;
- dlist_head av_freeWorkers;
+ dclist_head av_freeWorkers;
dlist_head av_runningWorkers;
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
@@ -347,6 +348,7 @@ static void autovac_report_activity(autovac_table *tab);
static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
const char *nspname, const char *relname);
static void avl_sigusr2_handler(SIGNAL_ARGS);
+static bool av_worker_available(void);
@@ -575,8 +577,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
* wakening conditions.
*/
- launcher_determine_sleep(!dlist_is_empty(&AutoVacuumShmem->av_freeWorkers),
- false, &nap);
+ launcher_determine_sleep(av_worker_available(), false, &nap);
/*
* Wait until naptime expires or we get some type of signal (all the
@@ -636,7 +637,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
current_time = GetCurrentTimestamp();
LWLockAcquire(AutovacuumLock, LW_SHARED);
- can_launch = !dlist_is_empty(&AutoVacuumShmem->av_freeWorkers);
+ can_launch = av_worker_available();
if (AutoVacuumShmem->av_startingWorker != NULL)
{
@@ -679,8 +680,8 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
worker->wi_sharedrel = false;
worker->wi_proc = NULL;
worker->wi_launchtime = 0;
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker->wi_links);
AutoVacuumShmem->av_startingWorker = NULL;
ereport(WARNING,
errmsg("autovacuum worker took too long to start; canceled"));
@@ -1087,7 +1088,7 @@ do_start_worker(void)
/* return quickly when there are no free workers */
LWLockAcquire(AutovacuumLock, LW_SHARED);
- if (dlist_is_empty(&AutoVacuumShmem->av_freeWorkers))
+ if (!av_worker_available())
{
LWLockRelease(AutovacuumLock);
return InvalidOid;
@@ -1240,7 +1241,7 @@ do_start_worker(void)
* Get a worker entry from the freelist. We checked above, so there
* really should be a free slot.
*/
- wptr = dlist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
+ wptr = dclist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
worker = dlist_container(WorkerInfoData, wi_links, wptr);
worker->wi_dboid = avdb->adw_datid;
@@ -1609,8 +1610,8 @@ FreeWorkerInfo(int code, Datum arg)
MyWorkerInfo->wi_proc = NULL;
MyWorkerInfo->wi_launchtime = 0;
pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &MyWorkerInfo->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &MyWorkerInfo->wi_links);
/* not mine anymore */
MyWorkerInfo = NULL;
@@ -3265,7 +3266,7 @@ AutoVacuumShmemSize(void)
*/
size = sizeof(AutoVacuumShmemStruct);
size = MAXALIGN(size);
- size = add_size(size, mul_size(autovacuum_max_workers,
+ size = add_size(size, mul_size(autovacuum_worker_slots,
sizeof(WorkerInfoData)));
return size;
}
@@ -3292,7 +3293,7 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
- dlist_init(&AutoVacuumShmem->av_freeWorkers);
+ dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
memset(AutoVacuumShmem->av_workItems, 0,
@@ -3302,10 +3303,10 @@ AutoVacuumShmemInit(void)
MAXALIGN(sizeof(AutoVacuumShmemStruct)));
/* initialize the WorkerInfo free list */
- for (i = 0; i < autovacuum_max_workers; i++)
+ for (i = 0; i < autovacuum_worker_slots; i++)
{
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker[i].wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker[i].wi_links);
pg_atomic_init_flag(&worker[i].wi_dobalance);
}
@@ -3341,3 +3342,14 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+
+/*
+ * Returns whether there is a free autovacuum worker slot available.
+ */
+static bool
+av_worker_available(void)
+{
+ int reserved = autovacuum_worker_slots - autovacuum_max_workers;
+
+ return dclist_count(&AutoVacuumShmem->av_freeWorkers) > Max(0, reserved);
+}
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index bf0241aed0..2588a68fbc 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -4144,7 +4144,7 @@ CreateOptsFile(int argc, char *argv[], char *fullprogname)
int
MaxLivePostmasterChildren(void)
{
- return 2 * (MaxConnections + autovacuum_max_workers + 1 +
+ return 2 * (MaxConnections + autovacuum_worker_slots + 1 +
max_wal_senders + max_worker_processes);
}
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index ce29da9012..80dbd793b5 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -142,7 +142,7 @@ ProcGlobalSemas(void)
* So, now we grab enough semaphores to support the desired max number
* of backends immediately at initialization --- if the sysadmin has set
* MaxConnections, max_worker_processes, max_wal_senders, or
- * autovacuum_max_workers higher than his kernel will support, he'll
+ * autovacuum_worker_slots higher than his kernel will support, he'll
* find out sooner rather than later.
*
* Another reason for creating semaphores here is that the semaphore
@@ -242,13 +242,13 @@ InitProcGlobal(void)
dlist_push_tail(&ProcGlobal->freeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->freeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + 1)
+ else if (i < MaxConnections + autovacuum_worker_slots + 1)
{
/* PGPROC for AV launcher/worker, add to autovacFreeProcs list */
dlist_push_tail(&ProcGlobal->autovacFreeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->autovacFreeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + 1 + max_worker_processes)
+ else if (i < MaxConnections + autovacuum_worker_slots + 1 + max_worker_processes)
{
/* PGPROC for bgworker, add to bgworkerFreeProcs list */
dlist_push_tail(&ProcGlobal->bgworkerFreeProcs, &proc->links);
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 0805398e24..8fb7753fe0 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -577,7 +577,7 @@ InitializeMaxBackends(void)
Assert(MaxBackends == 0);
/* the extra unit accounts for the autovacuum launcher */
- MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
+ MaxBackends = MaxConnections + autovacuum_worker_slots + 1 +
max_worker_processes + max_wal_senders;
/* internal error because the values were all checked previously */
@@ -591,17 +591,17 @@ InitializeMaxBackends(void)
bool
check_max_connections(int *newval, void **extra, GucSource source)
{
- if (*newval + autovacuum_max_workers + 1 +
+ if (*newval + autovacuum_worker_slots + 1 +
max_worker_processes + max_wal_senders > MAX_BACKENDS)
return false;
return true;
}
/*
- * GUC check_hook for autovacuum_max_workers
+ * GUC check_hook for autovacuum_worker_slots
*/
bool
-check_autovacuum_max_workers(int *newval, void **extra, GucSource source)
+check_autovacuum_worker_slots(int *newval, void **extra, GucSource source)
{
if (MaxConnections + *newval + 1 +
max_worker_processes + max_wal_senders > MAX_BACKENDS)
@@ -615,7 +615,7 @@ check_autovacuum_max_workers(int *newval, void **extra, GucSource source)
bool
check_max_worker_processes(int *newval, void **extra, GucSource source)
{
- if (MaxConnections + autovacuum_max_workers + 1 +
+ if (MaxConnections + autovacuum_worker_slots + 1 +
*newval + max_wal_senders > MAX_BACKENDS)
return false;
return true;
@@ -627,7 +627,7 @@ check_max_worker_processes(int *newval, void **extra, GucSource source)
bool
check_max_wal_senders(int *newval, void **extra, GucSource source)
{
- if (MaxConnections + autovacuum_max_workers + 1 +
+ if (MaxConnections + autovacuum_worker_slots + 1 +
max_worker_processes + *newval > MAX_BACKENDS)
return false;
return true;
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 46c258be28..d9fa3bc8db 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3381,13 +3381,22 @@ struct config_int ConfigureNamesInt[] =
},
{
/* see max_connections */
- {"autovacuum_max_workers", PGC_POSTMASTER, AUTOVACUUM,
+ {"autovacuum_worker_slots", PGC_POSTMASTER, AUTOVACUUM,
+ gettext_noop("Sets the number of backend slots to allocate for autovacuum workers."),
+ NULL
+ },
+ &autovacuum_worker_slots,
+ 16, 1, MAX_BACKENDS,
+ check_autovacuum_worker_slots, NULL, NULL
+ },
+ {
+ {"autovacuum_max_workers", PGC_SIGHUP, AUTOVACUUM,
gettext_noop("Sets the maximum number of simultaneously running autovacuum worker processes."),
NULL
},
&autovacuum_max_workers,
3, 1, MAX_BACKENDS,
- check_autovacuum_max_workers, NULL, NULL
+ NULL, NULL, NULL
},
{
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index e0567de219..4084e80cb9 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -658,8 +658,9 @@
#autovacuum = on # Enable autovacuum subprocess? 'on'
# requires track_counts to also be on.
-#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
+#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index cae1e8b329..190baa699d 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -28,6 +28,7 @@ typedef enum
/* GUC variables */
extern PGDLLIMPORT bool autovacuum_start_daemon;
+extern PGDLLIMPORT int autovacuum_worker_slots;
extern PGDLLIMPORT int autovacuum_max_workers;
extern PGDLLIMPORT int autovacuum_work_mem;
extern PGDLLIMPORT int autovacuum_naptime;
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index d64dc5fcdb..aa5da2c286 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -29,8 +29,8 @@ extern bool check_application_name(char **newval, void **extra,
GucSource source);
extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
-extern bool check_autovacuum_max_workers(int *newval, void **extra,
- GucSource source);
+extern bool check_autovacuum_worker_slots(int *newval, void **extra,
+ GucSource source);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 83f385a487..9428e9cd1a 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -608,6 +608,7 @@ sub init
}
print $conf "max_wal_senders = 10\n";
print $conf "max_replication_slots = 10\n";
+ print $conf "autovacuum_worker_slots = 3\n";
print $conf "wal_log_hints = on\n";
print $conf "hot_standby = on\n";
# conservative settings to ensure we can run multiple postmasters:
--
2.39.3 (Apple Git-146)
Here is a rebased patch.
One thing that still bugs me is that there is no feedback sent to the user
when autovacuum_max_workers is set higher than autovacuum_worker_slots. I
think we should at least emit a WARNING, perhaps from the autovacuum
launcher, i.e., once when the launcher starts and then again as needed via
HandleAutoVacLauncherInterrupts(). Or we could fail to start in
PostmasterMain() and then ignore later misconfigurations via a GUC check
hook. I'm not too thrilled about adding more GUC check hooks that depend
on the value of other GUCs, but I do like the idea of failing instead of
silently proceeding with a different value than the user configured. Any
thoughts?
--
nathan
Attachments:
v7-0001-allow-changing-autovacuum_max_workers-without-res.patch (text/plain)
From bd486d1ab302c4654b9cfbc57230bcf9b140711e Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Sat, 22 Jun 2024 15:05:44 -0500
Subject: [PATCH v7 1/1] allow changing autovacuum_max_workers without
restarting
---
doc/src/sgml/config.sgml | 28 +++++++++++-
doc/src/sgml/runtime.sgml | 12 ++---
src/backend/access/transam/xlog.c | 2 +-
src/backend/postmaster/autovacuum.c | 44 ++++++++++++-------
src/backend/postmaster/postmaster.c | 2 +-
src/backend/storage/lmgr/proc.c | 6 +--
src/backend/utils/init/postinit.c | 6 +--
src/backend/utils/misc/guc_tables.c | 11 ++++-
src/backend/utils/misc/postgresql.conf.sample | 3 +-
src/include/postmaster/autovacuum.h | 1 +
src/test/perl/PostgreSQL/Test/Cluster.pm | 1 +
11 files changed, 83 insertions(+), 33 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f627a3e63c..da30d1ea4f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8544,6 +8544,25 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-worker-slots" xreflabel="autovacuum_worker_slots">
+ <term><varname>autovacuum_worker_slots</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_worker_slots</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specifies the number of backend slots to reserve for autovacuum worker
+ processes. The default is 16. This parameter can only be set at server
+ start.
+ </para>
+ <para>
+ When changing this value, consider also adjusting
+ <xref linkend="guc-autovacuum-max-workers"/>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-autovacuum-max-workers" xreflabel="autovacuum_max_workers">
<term><varname>autovacuum_max_workers</varname> (<type>integer</type>)
<indexterm>
@@ -8554,7 +8573,14 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
<para>
Specifies the maximum number of autovacuum processes (other than the
autovacuum launcher) that may be running at any one time. The default
- is three. This parameter can only be set at server start.
+ is three. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command line.
+ </para>
+ <para>
+ Note that a setting for this value which is higher than
+ <xref linkend="guc-autovacuum-worker-slots"/> will have no effect,
+ since autovacuum workers are taken from the pool of slots established
+ by that setting.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 2f7c618886..4bb37faffe 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -781,13 +781,13 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
<row>
<entry><varname>SEMMNI</varname></entry>
<entry>Maximum number of semaphore identifiers (i.e., sets)</entry>
- <entry>at least <literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 7) / 16)</literal> plus room for other applications</entry>
+ <entry>at least <literal>ceil((max_connections + autovacuum_worker_slots + max_wal_senders + max_worker_processes + 7) / 16)</literal> plus room for other applications</entry>
</row>
<row>
<entry><varname>SEMMNS</varname></entry>
<entry>Maximum number of semaphores system-wide</entry>
- <entry><literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 7) / 16) * 17</literal> plus room for other applications</entry>
+ <entry><literal>ceil((max_connections + autovacuum_worker_slots + max_wal_senders + max_worker_processes + 7) / 16) * 17</literal> plus room for other applications</entry>
</row>
<row>
@@ -838,7 +838,7 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
When using System V semaphores,
<productname>PostgreSQL</productname> uses one semaphore per allowed connection
(<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>), allowed WAL sender process
+ (<xref linkend="guc-autovacuum-worker-slots"/>), allowed WAL sender process
(<xref linkend="guc-max-wal-senders"/>), and allowed background
process (<xref linkend="guc-max-worker-processes"/>), in sets of 16.
Each such set will
@@ -847,13 +847,13 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
other applications. The maximum number of semaphores in the system
is set by <varname>SEMMNS</varname>, which consequently must be at least
as high as <varname>max_connections</varname> plus
- <varname>autovacuum_max_workers</varname> plus <varname>max_wal_senders</varname>,
+ <varname>autovacuum_worker_slots</varname> plus <varname>max_wal_senders</varname>,
plus <varname>max_worker_processes</varname>, plus one extra for each 16
allowed connections plus workers (see the formula in <xref
linkend="sysvipc-parameters"/>). The parameter <varname>SEMMNI</varname>
determines the limit on the number of semaphore sets that can
exist on the system at one time. Hence this parameter must be at
- least <literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 7) / 16)</literal>.
+ least <literal>ceil((max_connections + autovacuum_worker_slots + max_wal_senders + max_worker_processes + 7) / 16)</literal>.
Lowering the number
of allowed connections is a temporary workaround for failures,
which are usually confusingly worded <quote>No space
@@ -884,7 +884,7 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
When using POSIX semaphores, the number of semaphores needed is the
same as for System V, that is one semaphore per allowed connection
(<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>), allowed WAL sender process
+ (<xref linkend="guc-autovacuum-worker-slots"/>), allowed WAL sender process
(<xref linkend="guc-max-wal-senders"/>), and allowed background
process (<xref linkend="guc-max-worker-processes"/>).
On the platforms where this option is preferred, there is no specific
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 33e27a6e72..816f9f2b4b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5362,7 +5362,7 @@ CheckRequiredParameterValues(void)
*/
if (ArchiveRecoveryRequested && EnableHotStandby)
{
- /* We ignore autovacuum_max_workers when we make this test. */
+ /* We ignore autovacuum_worker_slots when we make this test. */
RecoveryRequiresIntParameter("max_connections",
MaxConnections,
ControlFile->MaxConnections);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 928754b51c..565e14ca9b 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -114,6 +114,7 @@
* GUC parameters
*/
bool autovacuum_start_daemon = false;
+int autovacuum_worker_slots;
int autovacuum_max_workers;
int autovacuum_work_mem = -1;
int autovacuum_naptime;
@@ -209,7 +210,7 @@ typedef struct autovac_table
/*-------------
* This struct holds information about a single worker's whereabouts. We keep
* an array of these in shared memory, sized according to
- * autovacuum_max_workers.
+ * autovacuum_worker_slots.
*
* wi_links entry into free list or running list
* wi_dboid OID of the database this worker is supposed to work on
@@ -289,7 +290,7 @@ typedef struct
{
sig_atomic_t av_signal[AutoVacNumSignals];
pid_t av_launcherpid;
- dlist_head av_freeWorkers;
+ dclist_head av_freeWorkers;
dlist_head av_runningWorkers;
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
@@ -347,6 +348,7 @@ static void autovac_report_activity(autovac_table *tab);
static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
const char *nspname, const char *relname);
static void avl_sigusr2_handler(SIGNAL_ARGS);
+static bool av_worker_available(void);
@@ -575,8 +577,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
* wakening conditions.
*/
- launcher_determine_sleep(!dlist_is_empty(&AutoVacuumShmem->av_freeWorkers),
- false, &nap);
+ launcher_determine_sleep(av_worker_available(), false, &nap);
/*
* Wait until naptime expires or we get some type of signal (all the
@@ -636,7 +637,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
current_time = GetCurrentTimestamp();
LWLockAcquire(AutovacuumLock, LW_SHARED);
- can_launch = !dlist_is_empty(&AutoVacuumShmem->av_freeWorkers);
+ can_launch = av_worker_available();
if (AutoVacuumShmem->av_startingWorker != NULL)
{
@@ -679,8 +680,8 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
worker->wi_sharedrel = false;
worker->wi_proc = NULL;
worker->wi_launchtime = 0;
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker->wi_links);
AutoVacuumShmem->av_startingWorker = NULL;
ereport(WARNING,
errmsg("autovacuum worker took too long to start; canceled"));
@@ -1087,7 +1088,7 @@ do_start_worker(void)
/* return quickly when there are no free workers */
LWLockAcquire(AutovacuumLock, LW_SHARED);
- if (dlist_is_empty(&AutoVacuumShmem->av_freeWorkers))
+ if (!av_worker_available())
{
LWLockRelease(AutovacuumLock);
return InvalidOid;
@@ -1240,7 +1241,7 @@ do_start_worker(void)
* Get a worker entry from the freelist. We checked above, so there
* really should be a free slot.
*/
- wptr = dlist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
+ wptr = dclist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
worker = dlist_container(WorkerInfoData, wi_links, wptr);
worker->wi_dboid = avdb->adw_datid;
@@ -1609,8 +1610,8 @@ FreeWorkerInfo(int code, Datum arg)
MyWorkerInfo->wi_proc = NULL;
MyWorkerInfo->wi_launchtime = 0;
pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &MyWorkerInfo->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &MyWorkerInfo->wi_links);
/* not mine anymore */
MyWorkerInfo = NULL;
@@ -3265,7 +3266,7 @@ AutoVacuumShmemSize(void)
*/
size = sizeof(AutoVacuumShmemStruct);
size = MAXALIGN(size);
- size = add_size(size, mul_size(autovacuum_max_workers,
+ size = add_size(size, mul_size(autovacuum_worker_slots,
sizeof(WorkerInfoData)));
return size;
}
@@ -3292,7 +3293,7 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
- dlist_init(&AutoVacuumShmem->av_freeWorkers);
+ dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
memset(AutoVacuumShmem->av_workItems, 0,
@@ -3302,10 +3303,10 @@ AutoVacuumShmemInit(void)
MAXALIGN(sizeof(AutoVacuumShmemStruct)));
/* initialize the WorkerInfo free list */
- for (i = 0; i < autovacuum_max_workers; i++)
+ for (i = 0; i < autovacuum_worker_slots; i++)
{
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker[i].wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker[i].wi_links);
pg_atomic_init_flag(&worker[i].wi_dobalance);
}
@@ -3341,3 +3342,14 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+
+/*
+ * Returns whether there is a free autovacuum worker slot available.
+ */
+static bool
+av_worker_available(void)
+{
+ int reserved = autovacuum_worker_slots - autovacuum_max_workers;
+
+ return dclist_count(&AutoVacuumShmem->av_freeWorkers) > Max(0, reserved);
+}
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 6f974a8d21..e4c824fcb1 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -4144,7 +4144,7 @@ CreateOptsFile(int argc, char *argv[], char *fullprogname)
int
MaxLivePostmasterChildren(void)
{
- return 2 * (MaxConnections + autovacuum_max_workers + 1 +
+ return 2 * (MaxConnections + autovacuum_worker_slots + 1 +
max_wal_senders + max_worker_processes);
}
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 1b23efb26f..c20c9338ec 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -142,7 +142,7 @@ ProcGlobalSemas(void)
* So, now we grab enough semaphores to support the desired max number
* of backends immediately at initialization --- if the sysadmin has set
* MaxConnections, max_worker_processes, max_wal_senders, or
- * autovacuum_max_workers higher than his kernel will support, he'll
+ * autovacuum_worker_slots higher than his kernel will support, he'll
* find out sooner rather than later.
*
* Another reason for creating semaphores here is that the semaphore
@@ -242,13 +242,13 @@ InitProcGlobal(void)
dlist_push_tail(&ProcGlobal->freeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->freeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + 1)
+ else if (i < MaxConnections + autovacuum_worker_slots + 1)
{
/* PGPROC for AV launcher/worker, add to autovacFreeProcs list */
dlist_push_tail(&ProcGlobal->autovacFreeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->autovacFreeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + 1 + max_worker_processes)
+ else if (i < MaxConnections + autovacuum_worker_slots + 1 + max_worker_processes)
{
/* PGPROC for bgworker, add to bgworkerFreeProcs list */
dlist_push_tail(&ProcGlobal->bgworkerFreeProcs, &proc->links);
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 25867c8bd5..acbae29baf 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -577,15 +577,15 @@ InitializeMaxBackends(void)
Assert(MaxBackends == 0);
/* the extra unit accounts for the autovacuum launcher */
- MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
+ MaxBackends = MaxConnections + autovacuum_worker_slots + 1 +
max_worker_processes + max_wal_senders;
if (MaxBackends > MAX_BACKENDS)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("too many server processes configured"),
- errdetail("\"max_connections\" (%d) plus \"autovacuum_max_workers\" (%d) plus \"max_worker_processes\" (%d) plus \"max_wal_senders\" (%d) must be less than %d.",
- MaxConnections, autovacuum_max_workers,
+ errdetail("\"max_connections\" (%d) plus \"autovacuum_worker_slots\" (%d) plus \"max_worker_processes\" (%d) plus \"max_wal_senders\" (%d) must be less than %d.",
+ MaxConnections, autovacuum_worker_slots,
max_worker_processes, max_wal_senders,
MAX_BACKENDS)));
}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 630ed0f162..6ffca198e9 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3383,7 +3383,16 @@ struct config_int ConfigureNamesInt[] =
},
{
/* see max_connections */
- {"autovacuum_max_workers", PGC_POSTMASTER, AUTOVACUUM,
+ {"autovacuum_worker_slots", PGC_POSTMASTER, AUTOVACUUM,
+ gettext_noop("Sets the number of backend slots to allocate for autovacuum workers."),
+ NULL
+ },
+ &autovacuum_worker_slots,
+ 16, 1, MAX_BACKENDS,
+ NULL, NULL, NULL
+ },
+ {
+ {"autovacuum_max_workers", PGC_SIGHUP, AUTOVACUUM,
gettext_noop("Sets the maximum number of simultaneously running autovacuum worker processes."),
NULL
},
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 9ec9f97e92..52a4d44b59 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -658,8 +658,9 @@
#autovacuum = on # Enable autovacuum subprocess? 'on'
# requires track_counts to also be on.
-#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
+#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index cae1e8b329..190baa699d 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -28,6 +28,7 @@ typedef enum
/* GUC variables */
extern PGDLLIMPORT bool autovacuum_start_daemon;
+extern PGDLLIMPORT int autovacuum_worker_slots;
extern PGDLLIMPORT int autovacuum_max_workers;
extern PGDLLIMPORT int autovacuum_work_mem;
extern PGDLLIMPORT int autovacuum_naptime;
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 0135c5a795..98a5039709 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -608,6 +608,7 @@ sub init
}
print $conf "max_wal_senders = 10\n";
print $conf "max_replication_slots = 10\n";
+ print $conf "autovacuum_worker_slots = 3\n";
print $conf "wal_log_hints = on\n";
print $conf "hot_standby = on\n";
# conservative settings to ensure we can run multiple postmasters:
--
2.39.3 (Apple Git-146)
On Mon, Jul 08, 2024 at 02:29:16PM -0500, Nathan Bossart wrote:
> One thing that still bugs me is that there is no feedback sent to the user
> when autovacuum_max_workers is set higher than autovacuum_worker_slots. I
> think we should at least emit a WARNING, perhaps from the autovacuum
> launcher, i.e., once when the launcher starts and then again as needed via
> HandleAutoVacLauncherInterrupts(). Or we could fail to start in
> PostmasterMain() and then ignore later misconfigurations via a GUC check
> hook. I'm not too thrilled about adding more GUC check hooks that depend
> on the value of other GUCs, but I do like the idea of failing instead of
> silently proceeding with a different value than the user configured. Any
> thoughts?
From recent discussions, it sounds like there isn't much appetite for GUC
check hooks that depend on the values of other GUCs. Here is a new version
of the patch that adds the WARNING described above.
--
nathan
Attachments:
v8-0001-allow-changing-autovacuum_max_workers-without-res.patch (text/plain)
From e59c8199858b0331c2d9ec7a40d26f0e89657bf4 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Sat, 22 Jun 2024 15:05:44 -0500
Subject: [PATCH v8 1/1] allow changing autovacuum_max_workers without
restarting
---
doc/src/sgml/config.sgml | 28 ++++++-
doc/src/sgml/runtime.sgml | 12 +--
src/backend/access/transam/xlog.c | 2 +-
src/backend/postmaster/autovacuum.c | 76 +++++++++++++++----
src/backend/postmaster/postmaster.c | 2 +-
src/backend/storage/lmgr/proc.c | 6 +-
src/backend/utils/init/postinit.c | 6 +-
src/backend/utils/misc/guc_tables.c | 11 ++-
src/backend/utils/misc/postgresql.conf.sample | 3 +-
src/include/postmaster/autovacuum.h | 1 +
src/test/perl/PostgreSQL/Test/Cluster.pm | 1 +
11 files changed, 115 insertions(+), 33 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 3dec0b7cfe..d5101dd76c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8552,6 +8552,25 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-worker-slots" xreflabel="autovacuum_worker_slots">
+ <term><varname>autovacuum_worker_slots</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_worker_slots</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specifies the number of backend slots to reserve for autovacuum worker
+ processes. The default is 16. This parameter can only be set at server
+ start.
+ </para>
+ <para>
+ When changing this value, consider also adjusting
+ <xref linkend="guc-autovacuum-max-workers"/>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-autovacuum-max-workers" xreflabel="autovacuum_max_workers">
<term><varname>autovacuum_max_workers</varname> (<type>integer</type>)
<indexterm>
@@ -8562,7 +8581,14 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
<para>
Specifies the maximum number of autovacuum processes (other than the
autovacuum launcher) that may be running at any one time. The default
- is three. This parameter can only be set at server start.
+ is three. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command line.
+ </para>
+ <para>
+ Note that a setting for this value which is higher than
+ <xref linkend="guc-autovacuum-worker-slots"/> will have no effect,
+ since autovacuum workers are taken from the pool of slots established
+ by that setting.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 2f7c618886..4bb37faffe 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -781,13 +781,13 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
<row>
<entry><varname>SEMMNI</varname></entry>
<entry>Maximum number of semaphore identifiers (i.e., sets)</entry>
- <entry>at least <literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 7) / 16)</literal> plus room for other applications</entry>
+ <entry>at least <literal>ceil((max_connections + autovacuum_worker_slots + max_wal_senders + max_worker_processes + 7) / 16)</literal> plus room for other applications</entry>
</row>
<row>
<entry><varname>SEMMNS</varname></entry>
<entry>Maximum number of semaphores system-wide</entry>
- <entry><literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 7) / 16) * 17</literal> plus room for other applications</entry>
+ <entry><literal>ceil((max_connections + autovacuum_worker_slots + max_wal_senders + max_worker_processes + 7) / 16) * 17</literal> plus room for other applications</entry>
</row>
<row>
@@ -838,7 +838,7 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
When using System V semaphores,
<productname>PostgreSQL</productname> uses one semaphore per allowed connection
(<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>), allowed WAL sender process
+ (<xref linkend="guc-autovacuum-worker-slots"/>), allowed WAL sender process
(<xref linkend="guc-max-wal-senders"/>), and allowed background
process (<xref linkend="guc-max-worker-processes"/>), in sets of 16.
Each such set will
@@ -847,13 +847,13 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
other applications. The maximum number of semaphores in the system
is set by <varname>SEMMNS</varname>, which consequently must be at least
as high as <varname>max_connections</varname> plus
- <varname>autovacuum_max_workers</varname> plus <varname>max_wal_senders</varname>,
+ <varname>autovacuum_worker_slots</varname> plus <varname>max_wal_senders</varname>,
plus <varname>max_worker_processes</varname>, plus one extra for each 16
allowed connections plus workers (see the formula in <xref
linkend="sysvipc-parameters"/>). The parameter <varname>SEMMNI</varname>
determines the limit on the number of semaphore sets that can
exist on the system at one time. Hence this parameter must be at
- least <literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 7) / 16)</literal>.
+ least <literal>ceil((max_connections + autovacuum_worker_slots + max_wal_senders + max_worker_processes + 7) / 16)</literal>.
Lowering the number
of allowed connections is a temporary workaround for failures,
which are usually confusingly worded <quote>No space
@@ -884,7 +884,7 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
When using POSIX semaphores, the number of semaphores needed is the
same as for System V, that is one semaphore per allowed connection
(<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>), allowed WAL sender process
+ (<xref linkend="guc-autovacuum-worker-slots"/>), allowed WAL sender process
(<xref linkend="guc-max-wal-senders"/>), and allowed background
process (<xref linkend="guc-max-worker-processes"/>).
On the platforms where this option is preferred, there is no specific
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 636be5ca4d..67cabd214d 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5362,7 +5362,7 @@ CheckRequiredParameterValues(void)
*/
if (ArchiveRecoveryRequested && EnableHotStandby)
{
- /* We ignore autovacuum_max_workers when we make this test. */
+ /* We ignore autovacuum_worker_slots when we make this test. */
RecoveryRequiresIntParameter("max_connections",
MaxConnections,
ControlFile->MaxConnections);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 4e4a0ccbef..937cd940aa 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -115,6 +115,7 @@
* GUC parameters
*/
bool autovacuum_start_daemon = false;
+int autovacuum_worker_slots;
int autovacuum_max_workers;
int autovacuum_work_mem = -1;
int autovacuum_naptime;
@@ -210,7 +211,7 @@ typedef struct autovac_table
/*-------------
* This struct holds information about a single worker's whereabouts. We keep
* an array of these in shared memory, sized according to
- * autovacuum_max_workers.
+ * autovacuum_worker_slots.
*
* wi_links entry into free list or running list
* wi_dboid OID of the database this worker is supposed to work on
@@ -290,7 +291,7 @@ typedef struct
{
sig_atomic_t av_signal[AutoVacNumSignals];
pid_t av_launcherpid;
- dlist_head av_freeWorkers;
+ dclist_head av_freeWorkers;
dlist_head av_runningWorkers;
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
@@ -348,6 +349,8 @@ static void autovac_report_activity(autovac_table *tab);
static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
const char *nspname, const char *relname);
static void avl_sigusr2_handler(SIGNAL_ARGS);
+static bool av_worker_available(void);
+static void CheckAutovacuumWorkerGUCs(void);
@@ -424,6 +427,12 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
ALLOCSET_DEFAULT_SIZES);
MemoryContextSwitchTo(AutovacMemCxt);
+ /*
+ * Emit a WARNING if autovacuum_worker_slots < autovacuum_max_workers. We
+ * do this on startup and on subsequent configuration reloads as needed.
+ */
+ CheckAutovacuumWorkerGUCs();
+
/*
* If an exception is encountered, processing resumes here.
*
@@ -576,8 +585,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
* wakening conditions.
*/
- launcher_determine_sleep(!dlist_is_empty(&AutoVacuumShmem->av_freeWorkers),
- false, &nap);
+ launcher_determine_sleep(av_worker_available(), false, &nap);
/*
* Wait until naptime expires or we get some type of signal (all the
@@ -637,7 +645,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
current_time = GetCurrentTimestamp();
LWLockAcquire(AutovacuumLock, LW_SHARED);
- can_launch = !dlist_is_empty(&AutoVacuumShmem->av_freeWorkers);
+ can_launch = av_worker_available();
if (AutoVacuumShmem->av_startingWorker != NULL)
{
@@ -680,8 +688,8 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
worker->wi_sharedrel = false;
worker->wi_proc = NULL;
worker->wi_launchtime = 0;
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker->wi_links);
AutoVacuumShmem->av_startingWorker = NULL;
ereport(WARNING,
errmsg("autovacuum worker took too long to start; canceled"));
@@ -746,6 +754,8 @@ HandleAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
+ int autovacuum_max_workers_prev = autovacuum_max_workers;
+
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -753,6 +763,14 @@ HandleAutoVacLauncherInterrupts(void)
if (!AutoVacuumingActive())
AutoVacLauncherShutdown();
+ /*
+ * If autovacuum_max_workers changed, emit a WARNING if
+ * autovacuum_worker_slots < autovacuum_max_workers. If it didn't
+ * change, skip this to avoid too many repeated log messages.
+ */
+ if (autovacuum_max_workers_prev != autovacuum_max_workers)
+ CheckAutovacuumWorkerGUCs();
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1088,7 +1106,7 @@ do_start_worker(void)
/* return quickly when there are no free workers */
LWLockAcquire(AutovacuumLock, LW_SHARED);
- if (dlist_is_empty(&AutoVacuumShmem->av_freeWorkers))
+ if (!av_worker_available())
{
LWLockRelease(AutovacuumLock);
return InvalidOid;
@@ -1241,7 +1259,7 @@ do_start_worker(void)
* Get a worker entry from the freelist. We checked above, so there
* really should be a free slot.
*/
- wptr = dlist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
+ wptr = dclist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
worker = dlist_container(WorkerInfoData, wi_links, wptr);
worker->wi_dboid = avdb->adw_datid;
@@ -1610,8 +1628,8 @@ FreeWorkerInfo(int code, Datum arg)
MyWorkerInfo->wi_proc = NULL;
MyWorkerInfo->wi_launchtime = 0;
pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &MyWorkerInfo->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &MyWorkerInfo->wi_links);
/* not mine anymore */
MyWorkerInfo = NULL;
@@ -3272,7 +3290,7 @@ AutoVacuumShmemSize(void)
*/
size = sizeof(AutoVacuumShmemStruct);
size = MAXALIGN(size);
- size = add_size(size, mul_size(autovacuum_max_workers,
+ size = add_size(size, mul_size(autovacuum_worker_slots,
sizeof(WorkerInfoData)));
return size;
}
@@ -3299,7 +3317,7 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
- dlist_init(&AutoVacuumShmem->av_freeWorkers);
+ dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
memset(AutoVacuumShmem->av_workItems, 0,
@@ -3309,10 +3327,10 @@ AutoVacuumShmemInit(void)
MAXALIGN(sizeof(AutoVacuumShmemStruct)));
/* initialize the WorkerInfo free list */
- for (i = 0; i < autovacuum_max_workers; i++)
+ for (i = 0; i < autovacuum_worker_slots; i++)
{
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker[i].wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker[i].wi_links);
pg_atomic_init_flag(&worker[i].wi_dobalance);
}
@@ -3348,3 +3366,29 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+
+/*
+ * Returns whether there is a free autovacuum worker slot available.
+ */
+static bool
+av_worker_available(void)
+{
+ int reserved = autovacuum_worker_slots - autovacuum_max_workers;
+
+ return dclist_count(&AutoVacuumShmem->av_freeWorkers) > Max(0, reserved);
+}
+
+/*
+ * Emit a WARNING if autovacuum_worker_slots < autovacuum_max_workers.
+ */
+static void
+CheckAutovacuumWorkerGUCs(void)
+{
+ if (autovacuum_worker_slots < autovacuum_max_workers)
+ ereport(WARNING,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("\"autovacuum_max_workers\" (%d) should be less than or equal to \"autovacuum_worker_slots\" (%d)",
+ autovacuum_max_workers, autovacuum_worker_slots),
+ errdetail("The server will continue running but will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
+ autovacuum_worker_slots)));
+}
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 6f974a8d21..e4c824fcb1 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -4144,7 +4144,7 @@ CreateOptsFile(int argc, char *argv[], char *fullprogname)
int
MaxLivePostmasterChildren(void)
{
- return 2 * (MaxConnections + autovacuum_max_workers + 1 +
+ return 2 * (MaxConnections + autovacuum_worker_slots + 1 +
max_wal_senders + max_worker_processes);
}
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 1b23efb26f..c20c9338ec 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -142,7 +142,7 @@ ProcGlobalSemas(void)
* So, now we grab enough semaphores to support the desired max number
* of backends immediately at initialization --- if the sysadmin has set
* MaxConnections, max_worker_processes, max_wal_senders, or
- * autovacuum_max_workers higher than his kernel will support, he'll
+ * autovacuum_worker_slots higher than his kernel will support, he'll
* find out sooner rather than later.
*
* Another reason for creating semaphores here is that the semaphore
@@ -242,13 +242,13 @@ InitProcGlobal(void)
dlist_push_tail(&ProcGlobal->freeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->freeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + 1)
+ else if (i < MaxConnections + autovacuum_worker_slots + 1)
{
/* PGPROC for AV launcher/worker, add to autovacFreeProcs list */
dlist_push_tail(&ProcGlobal->autovacFreeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->autovacFreeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + 1 + max_worker_processes)
+ else if (i < MaxConnections + autovacuum_worker_slots + 1 + max_worker_processes)
{
/* PGPROC for bgworker, add to bgworkerFreeProcs list */
dlist_push_tail(&ProcGlobal->bgworkerFreeProcs, &proc->links);
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 25867c8bd5..acbae29baf 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -577,15 +577,15 @@ InitializeMaxBackends(void)
Assert(MaxBackends == 0);
/* the extra unit accounts for the autovacuum launcher */
- MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
+ MaxBackends = MaxConnections + autovacuum_worker_slots + 1 +
max_worker_processes + max_wal_senders;
if (MaxBackends > MAX_BACKENDS)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("too many server processes configured"),
- errdetail("\"max_connections\" (%d) plus \"autovacuum_max_workers\" (%d) plus \"max_worker_processes\" (%d) plus \"max_wal_senders\" (%d) must be less than %d.",
- MaxConnections, autovacuum_max_workers,
+ errdetail("\"max_connections\" (%d) plus \"autovacuum_worker_slots\" (%d) plus \"max_worker_processes\" (%d) plus \"max_wal_senders\" (%d) must be less than %d.",
+ MaxConnections, autovacuum_worker_slots,
max_worker_processes, max_wal_senders,
MAX_BACKENDS)));
}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 630ed0f162..6ffca198e9 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3383,7 +3383,16 @@ struct config_int ConfigureNamesInt[] =
},
{
/* see max_connections */
- {"autovacuum_max_workers", PGC_POSTMASTER, AUTOVACUUM,
+ {"autovacuum_worker_slots", PGC_POSTMASTER, AUTOVACUUM,
+ gettext_noop("Sets the number of backend slots to allocate for autovacuum workers."),
+ NULL
+ },
+ &autovacuum_worker_slots,
+ 16, 1, MAX_BACKENDS,
+ NULL, NULL, NULL
+ },
+ {
+ {"autovacuum_max_workers", PGC_SIGHUP, AUTOVACUUM,
gettext_noop("Sets the maximum number of simultaneously running autovacuum worker processes."),
NULL
},
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 9ec9f97e92..52a4d44b59 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -658,8 +658,9 @@
#autovacuum = on # Enable autovacuum subprocess? 'on'
# requires track_counts to also be on.
-#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
+#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index cae1e8b329..190baa699d 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -28,6 +28,7 @@ typedef enum
/* GUC variables */
extern PGDLLIMPORT bool autovacuum_start_daemon;
+extern PGDLLIMPORT int autovacuum_worker_slots;
extern PGDLLIMPORT int autovacuum_max_workers;
extern PGDLLIMPORT int autovacuum_work_mem;
extern PGDLLIMPORT int autovacuum_naptime;
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 32ee98aebc..a5047038f3 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -617,6 +617,7 @@ sub init
}
print $conf "max_wal_senders = 10\n";
print $conf "max_replication_slots = 10\n";
+ print $conf "autovacuum_worker_slots = 3\n";
print $conf "wal_log_hints = on\n";
print $conf "hot_standby = on\n";
# conservative settings to ensure we can run multiple postmasters:
--
2.39.3 (Apple Git-146)
rebased
--
nathan
Attachments:
v9-0001-allow-changing-autovacuum_max_workers-without-res.patch (text/plain)
From 61513f744012c2b9b59085ce8c4a960da9e56ee7 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Sat, 22 Jun 2024 15:05:44 -0500
Subject: [PATCH v9 1/1] allow changing autovacuum_max_workers without
restarting
---
doc/src/sgml/config.sgml | 28 ++++++-
doc/src/sgml/runtime.sgml | 4 +-
src/backend/access/transam/xlog.c | 2 +-
src/backend/postmaster/autovacuum.c | 76 +++++++++++++++----
src/backend/postmaster/postmaster.c | 2 +-
src/backend/storage/lmgr/proc.c | 6 +-
src/backend/utils/init/postinit.c | 6 +-
src/backend/utils/misc/guc_tables.c | 11 ++-
src/backend/utils/misc/postgresql.conf.sample | 3 +-
src/include/postmaster/autovacuum.h | 1 +
src/test/perl/PostgreSQL/Test/Cluster.pm | 1 +
11 files changed, 111 insertions(+), 29 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 57cd7bb972..7fc270c0ae 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8552,6 +8552,25 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-worker-slots" xreflabel="autovacuum_worker_slots">
+ <term><varname>autovacuum_worker_slots</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_worker_slots</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specifies the number of backend slots to reserve for autovacuum worker
+ processes. The default is 16. This parameter can only be set at server
+ start.
+ </para>
+ <para>
+ When changing this value, consider also adjusting
+ <xref linkend="guc-autovacuum-max-workers"/>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-autovacuum-max-workers" xreflabel="autovacuum_max_workers">
<term><varname>autovacuum_max_workers</varname> (<type>integer</type>)
<indexterm>
@@ -8562,7 +8581,14 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
<para>
Specifies the maximum number of autovacuum processes (other than the
autovacuum launcher) that may be running at any one time. The default
- is three. This parameter can only be set at server start.
+ is three. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command line.
+ </para>
+ <para>
+ Note that a setting for this value which is higher than
+ <xref linkend="guc-autovacuum-worker-slots"/> will have no effect,
+ since autovacuum workers are taken from the pool of slots established
+ by that setting.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 2c4d5ef640..a1f43556ac 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -838,7 +838,7 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
When using System V semaphores,
<productname>PostgreSQL</productname> uses one semaphore per allowed connection
(<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>), allowed WAL sender process
+ (<xref linkend="guc-autovacuum-worker-slots"/>), allowed WAL sender process
(<xref linkend="guc-max-wal-senders"/>), allowed background
process (<xref linkend="guc-max-worker-processes"/>), etc., in sets of 16.
The runtime-computed parameter <xref linkend="guc-num-os-semaphores"/>
@@ -891,7 +891,7 @@ $ <userinput>postgres -D $PGDATA -C num_os_semaphores</userinput>
When using POSIX semaphores, the number of semaphores needed is the
same as for System V, that is one semaphore per allowed connection
(<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>), allowed WAL sender process
+ (<xref linkend="guc-autovacuum-worker-slots"/>), allowed WAL sender process
(<xref linkend="guc-max-wal-senders"/>), allowed background
process (<xref linkend="guc-max-worker-processes"/>), etc.
On the platforms where this option is preferred, there is no specific
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f86f4b5c4b..64c1edb798 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5373,7 +5373,7 @@ CheckRequiredParameterValues(void)
*/
if (ArchiveRecoveryRequested && EnableHotStandby)
{
- /* We ignore autovacuum_max_workers when we make this test. */
+ /* We ignore autovacuum_worker_slots when we make this test. */
RecoveryRequiresIntParameter("max_connections",
MaxConnections,
ControlFile->MaxConnections);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 4e4a0ccbef..937cd940aa 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -115,6 +115,7 @@
* GUC parameters
*/
bool autovacuum_start_daemon = false;
+int autovacuum_worker_slots;
int autovacuum_max_workers;
int autovacuum_work_mem = -1;
int autovacuum_naptime;
@@ -210,7 +211,7 @@ typedef struct autovac_table
/*-------------
* This struct holds information about a single worker's whereabouts. We keep
* an array of these in shared memory, sized according to
- * autovacuum_max_workers.
+ * autovacuum_worker_slots.
*
* wi_links entry into free list or running list
* wi_dboid OID of the database this worker is supposed to work on
@@ -290,7 +291,7 @@ typedef struct
{
sig_atomic_t av_signal[AutoVacNumSignals];
pid_t av_launcherpid;
- dlist_head av_freeWorkers;
+ dclist_head av_freeWorkers;
dlist_head av_runningWorkers;
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
@@ -348,6 +349,8 @@ static void autovac_report_activity(autovac_table *tab);
static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
const char *nspname, const char *relname);
static void avl_sigusr2_handler(SIGNAL_ARGS);
+static bool av_worker_available(void);
+static void CheckAutovacuumWorkerGUCs(void);
@@ -424,6 +427,12 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
ALLOCSET_DEFAULT_SIZES);
MemoryContextSwitchTo(AutovacMemCxt);
+ /*
+ * Emit a WARNING if autovacuum_worker_slots < autovacuum_max_workers. We
+ * do this on startup and on subsequent configuration reloads as needed.
+ */
+ CheckAutovacuumWorkerGUCs();
+
/*
* If an exception is encountered, processing resumes here.
*
@@ -576,8 +585,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
* wakening conditions.
*/
- launcher_determine_sleep(!dlist_is_empty(&AutoVacuumShmem->av_freeWorkers),
- false, &nap);
+ launcher_determine_sleep(av_worker_available(), false, &nap);
/*
* Wait until naptime expires or we get some type of signal (all the
@@ -637,7 +645,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
current_time = GetCurrentTimestamp();
LWLockAcquire(AutovacuumLock, LW_SHARED);
- can_launch = !dlist_is_empty(&AutoVacuumShmem->av_freeWorkers);
+ can_launch = av_worker_available();
if (AutoVacuumShmem->av_startingWorker != NULL)
{
@@ -680,8 +688,8 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
worker->wi_sharedrel = false;
worker->wi_proc = NULL;
worker->wi_launchtime = 0;
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker->wi_links);
AutoVacuumShmem->av_startingWorker = NULL;
ereport(WARNING,
errmsg("autovacuum worker took too long to start; canceled"));
@@ -746,6 +754,8 @@ HandleAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
+ int autovacuum_max_workers_prev = autovacuum_max_workers;
+
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -753,6 +763,14 @@ HandleAutoVacLauncherInterrupts(void)
if (!AutoVacuumingActive())
AutoVacLauncherShutdown();
+ /*
+ * If autovacuum_max_workers changed, emit a WARNING if
+ * autovacuum_worker_slots < autovacuum_max_workers. If it didn't
+ * change, skip this to avoid too many repeated log messages.
+ */
+ if (autovacuum_max_workers_prev != autovacuum_max_workers)
+ CheckAutovacuumWorkerGUCs();
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1088,7 +1106,7 @@ do_start_worker(void)
/* return quickly when there are no free workers */
LWLockAcquire(AutovacuumLock, LW_SHARED);
- if (dlist_is_empty(&AutoVacuumShmem->av_freeWorkers))
+ if (!av_worker_available())
{
LWLockRelease(AutovacuumLock);
return InvalidOid;
@@ -1241,7 +1259,7 @@ do_start_worker(void)
* Get a worker entry from the freelist. We checked above, so there
* really should be a free slot.
*/
- wptr = dlist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
+ wptr = dclist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
worker = dlist_container(WorkerInfoData, wi_links, wptr);
worker->wi_dboid = avdb->adw_datid;
@@ -1610,8 +1628,8 @@ FreeWorkerInfo(int code, Datum arg)
MyWorkerInfo->wi_proc = NULL;
MyWorkerInfo->wi_launchtime = 0;
pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &MyWorkerInfo->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &MyWorkerInfo->wi_links);
/* not mine anymore */
MyWorkerInfo = NULL;
@@ -3272,7 +3290,7 @@ AutoVacuumShmemSize(void)
*/
size = sizeof(AutoVacuumShmemStruct);
size = MAXALIGN(size);
- size = add_size(size, mul_size(autovacuum_max_workers,
+ size = add_size(size, mul_size(autovacuum_worker_slots,
sizeof(WorkerInfoData)));
return size;
}
@@ -3299,7 +3317,7 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
- dlist_init(&AutoVacuumShmem->av_freeWorkers);
+ dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
memset(AutoVacuumShmem->av_workItems, 0,
@@ -3309,10 +3327,10 @@ AutoVacuumShmemInit(void)
MAXALIGN(sizeof(AutoVacuumShmemStruct)));
/* initialize the WorkerInfo free list */
- for (i = 0; i < autovacuum_max_workers; i++)
+ for (i = 0; i < autovacuum_worker_slots; i++)
{
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker[i].wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker[i].wi_links);
pg_atomic_init_flag(&worker[i].wi_dobalance);
}
@@ -3348,3 +3366,29 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+
+/*
+ * Returns whether there is a free autovacuum worker slot available.
+ */
+static bool
+av_worker_available(void)
+{
+ int reserved = autovacuum_worker_slots - autovacuum_max_workers;
+
+ return dclist_count(&AutoVacuumShmem->av_freeWorkers) > Max(0, reserved);
+}
+
+/*
+ * Emit a WARNING if autovacuum_worker_slots < autovacuum_max_workers.
+ */
+static void
+CheckAutovacuumWorkerGUCs(void)
+{
+ if (autovacuum_worker_slots < autovacuum_max_workers)
+ ereport(WARNING,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("\"autovacuum_max_workers\" (%d) should be less than or equal to \"autovacuum_worker_slots\" (%d)",
+ autovacuum_max_workers, autovacuum_worker_slots),
+ errdetail("The server will continue running but will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
+ autovacuum_worker_slots)));
+}
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 6f974a8d21..e4c824fcb1 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -4144,7 +4144,7 @@ CreateOptsFile(int argc, char *argv[], char *fullprogname)
int
MaxLivePostmasterChildren(void)
{
- return 2 * (MaxConnections + autovacuum_max_workers + 1 +
+ return 2 * (MaxConnections + autovacuum_worker_slots + 1 +
max_wal_senders + max_worker_processes);
}
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 1b23efb26f..c20c9338ec 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -142,7 +142,7 @@ ProcGlobalSemas(void)
* So, now we grab enough semaphores to support the desired max number
* of backends immediately at initialization --- if the sysadmin has set
* MaxConnections, max_worker_processes, max_wal_senders, or
- * autovacuum_max_workers higher than his kernel will support, he'll
+ * autovacuum_worker_slots higher than his kernel will support, he'll
* find out sooner rather than later.
*
* Another reason for creating semaphores here is that the semaphore
@@ -242,13 +242,13 @@ InitProcGlobal(void)
dlist_push_tail(&ProcGlobal->freeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->freeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + 1)
+ else if (i < MaxConnections + autovacuum_worker_slots + 1)
{
/* PGPROC for AV launcher/worker, add to autovacFreeProcs list */
dlist_push_tail(&ProcGlobal->autovacFreeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->autovacFreeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + 1 + max_worker_processes)
+ else if (i < MaxConnections + autovacuum_worker_slots + 1 + max_worker_processes)
{
/* PGPROC for bgworker, add to bgworkerFreeProcs list */
dlist_push_tail(&ProcGlobal->bgworkerFreeProcs, &proc->links);
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 25867c8bd5..acbae29baf 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -577,15 +577,15 @@ InitializeMaxBackends(void)
Assert(MaxBackends == 0);
/* the extra unit accounts for the autovacuum launcher */
- MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
+ MaxBackends = MaxConnections + autovacuum_worker_slots + 1 +
max_worker_processes + max_wal_senders;
if (MaxBackends > MAX_BACKENDS)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("too many server processes configured"),
- errdetail("\"max_connections\" (%d) plus \"autovacuum_max_workers\" (%d) plus \"max_worker_processes\" (%d) plus \"max_wal_senders\" (%d) must be less than %d.",
- MaxConnections, autovacuum_max_workers,
+ errdetail("\"max_connections\" (%d) plus \"autovacuum_worker_slots\" (%d) plus \"max_worker_processes\" (%d) plus \"max_wal_senders\" (%d) must be less than %d.",
+ MaxConnections, autovacuum_worker_slots,
max_worker_processes, max_wal_senders,
MAX_BACKENDS)));
}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 6a623f5f34..19f6384638 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3385,7 +3385,16 @@ struct config_int ConfigureNamesInt[] =
},
{
/* see max_connections */
- {"autovacuum_max_workers", PGC_POSTMASTER, AUTOVACUUM,
+ {"autovacuum_worker_slots", PGC_POSTMASTER, AUTOVACUUM,
+ gettext_noop("Sets the number of backend slots to allocate for autovacuum workers."),
+ NULL
+ },
+ &autovacuum_worker_slots,
+ 16, 1, MAX_BACKENDS,
+ NULL, NULL, NULL
+ },
+ {
+ {"autovacuum_max_workers", PGC_SIGHUP, AUTOVACUUM,
gettext_noop("Sets the maximum number of simultaneously running autovacuum worker processes."),
NULL
},
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 9ec9f97e92..52a4d44b59 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -658,8 +658,9 @@
#autovacuum = on # Enable autovacuum subprocess? 'on'
# requires track_counts to also be on.
-#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
+#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index cae1e8b329..190baa699d 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -28,6 +28,7 @@ typedef enum
/* GUC variables */
extern PGDLLIMPORT bool autovacuum_start_daemon;
+extern PGDLLIMPORT int autovacuum_worker_slots;
extern PGDLLIMPORT int autovacuum_max_workers;
extern PGDLLIMPORT int autovacuum_work_mem;
extern PGDLLIMPORT int autovacuum_naptime;
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 32ee98aebc..a5047038f3 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -617,6 +617,7 @@ sub init
}
print $conf "max_wal_senders = 10\n";
print $conf "max_replication_slots = 10\n";
+ print $conf "autovacuum_worker_slots = 3\n";
print $conf "wal_log_hints = on\n";
print $conf "hot_standby = on\n";
# conservative settings to ensure we can run multiple postmasters:
--
2.39.3 (Apple Git-146)
If there are no remaining concerns, I'd like to move forward with
committing v9 in September's commitfest.
--
nathan
rebased
--
nathan
Attachments:
v10-0001-allow-changing-autovacuum_max_workers-without-re.patch (text/plain)
From e877271830e076338f999ee72b9d8148e469d5d2 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Sat, 22 Jun 2024 15:05:44 -0500
Subject: [PATCH v10 1/1] allow changing autovacuum_max_workers without
restarting
---
doc/src/sgml/config.sgml | 28 ++++++-
doc/src/sgml/runtime.sgml | 4 +-
src/backend/access/transam/xlog.c | 2 +-
src/backend/postmaster/autovacuum.c | 76 +++++++++++++++----
src/backend/postmaster/pmchild.c | 4 +-
src/backend/storage/lmgr/proc.c | 6 +-
src/backend/utils/init/postinit.c | 6 +-
src/backend/utils/misc/guc_tables.c | 11 ++-
src/backend/utils/misc/postgresql.conf.sample | 3 +-
src/include/postmaster/autovacuum.h | 1 +
src/test/perl/PostgreSQL/Test/Cluster.pm | 1 +
11 files changed, 112 insertions(+), 30 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a84e60c09b..7db171198a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8590,6 +8590,25 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-worker-slots" xreflabel="autovacuum_worker_slots">
+ <term><varname>autovacuum_worker_slots</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_worker_slots</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specifies the number of backend slots to reserve for autovacuum worker
+ processes. The default is 16. This parameter can only be set at server
+ start.
+ </para>
+ <para>
+ When changing this value, consider also adjusting
+ <xref linkend="guc-autovacuum-max-workers"/>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-autovacuum-max-workers" xreflabel="autovacuum_max_workers">
<term><varname>autovacuum_max_workers</varname> (<type>integer</type>)
<indexterm>
@@ -8600,7 +8619,14 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
<para>
Specifies the maximum number of autovacuum processes (other than the
autovacuum launcher) that may be running at any one time. The default
- is three. This parameter can only be set at server start.
+ is three. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command line.
+ </para>
+ <para>
+ Note that a setting for this value which is higher than
+ <xref linkend="guc-autovacuum-worker-slots"/> will have no effect,
+ since autovacuum workers are taken from the pool of slots established
+ by that setting.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index bcd81e2415..8b7ae27908 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -839,7 +839,7 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
When using System V semaphores,
<productname>PostgreSQL</productname> uses one semaphore per allowed connection
(<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>), allowed WAL sender process
+ (<xref linkend="guc-autovacuum-worker-slots"/>), allowed WAL sender process
(<xref linkend="guc-max-wal-senders"/>), allowed background
process (<xref linkend="guc-max-worker-processes"/>), etc., in sets of 16.
The runtime-computed parameter <xref linkend="guc-num-os-semaphores"/>
@@ -892,7 +892,7 @@ $ <userinput>postgres -D $PGDATA -C num_os_semaphores</userinput>
When using POSIX semaphores, the number of semaphores needed is the
same as for System V, that is one semaphore per allowed connection
(<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>), allowed WAL sender process
+ (<xref linkend="guc-autovacuum-worker-slots"/>), allowed WAL sender process
(<xref linkend="guc-max-wal-senders"/>), allowed background
process (<xref linkend="guc-max-worker-processes"/>), etc.
On the platforms where this option is preferred, there is no specific
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 6f58412bca..706f2127de 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5403,7 +5403,7 @@ CheckRequiredParameterValues(void)
*/
if (ArchiveRecoveryRequested && EnableHotStandby)
{
- /* We ignore autovacuum_max_workers when we make this test. */
+ /* We ignore autovacuum_worker_slots when we make this test. */
RecoveryRequiresIntParameter("max_connections",
MaxConnections,
ControlFile->MaxConnections);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index dc3cf87aba..963924cbc7 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -115,6 +115,7 @@
* GUC parameters
*/
bool autovacuum_start_daemon = false;
+int autovacuum_worker_slots;
int autovacuum_max_workers;
int autovacuum_work_mem = -1;
int autovacuum_naptime;
@@ -210,7 +211,7 @@ typedef struct autovac_table
/*-------------
* This struct holds information about a single worker's whereabouts. We keep
* an array of these in shared memory, sized according to
- * autovacuum_max_workers.
+ * autovacuum_worker_slots.
*
* wi_links entry into free list or running list
* wi_dboid OID of the database this worker is supposed to work on
@@ -291,7 +292,7 @@ typedef struct
{
sig_atomic_t av_signal[AutoVacNumSignals];
pid_t av_launcherpid;
- dlist_head av_freeWorkers;
+ dclist_head av_freeWorkers;
dlist_head av_runningWorkers;
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
@@ -349,6 +350,8 @@ static void autovac_report_activity(autovac_table *tab);
static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
const char *nspname, const char *relname);
static void avl_sigusr2_handler(SIGNAL_ARGS);
+static bool av_worker_available(void);
+static void CheckAutovacuumWorkerGUCs(void);
@@ -425,6 +428,12 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
ALLOCSET_DEFAULT_SIZES);
MemoryContextSwitchTo(AutovacMemCxt);
+ /*
+ * Emit a WARNING if autovacuum_worker_slots < autovacuum_max_workers. We
+ * do this on startup and on subsequent configuration reloads as needed.
+ */
+ CheckAutovacuumWorkerGUCs();
+
/*
* If an exception is encountered, processing resumes here.
*
@@ -577,8 +586,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
* wakening conditions.
*/
- launcher_determine_sleep(!dlist_is_empty(&AutoVacuumShmem->av_freeWorkers),
- false, &nap);
+ launcher_determine_sleep(av_worker_available(), false, &nap);
/*
* Wait until naptime expires or we get some type of signal (all the
@@ -638,7 +646,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
current_time = GetCurrentTimestamp();
LWLockAcquire(AutovacuumLock, LW_SHARED);
- can_launch = !dlist_is_empty(&AutoVacuumShmem->av_freeWorkers);
+ can_launch = av_worker_available();
if (AutoVacuumShmem->av_startingWorker != NULL)
{
@@ -681,8 +689,8 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
worker->wi_sharedrel = false;
worker->wi_proc = NULL;
worker->wi_launchtime = 0;
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker->wi_links);
AutoVacuumShmem->av_startingWorker = NULL;
ereport(WARNING,
errmsg("autovacuum worker took too long to start; canceled"));
@@ -747,6 +755,8 @@ HandleAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
+ int autovacuum_max_workers_prev = autovacuum_max_workers;
+
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -754,6 +764,14 @@ HandleAutoVacLauncherInterrupts(void)
if (!AutoVacuumingActive())
AutoVacLauncherShutdown();
+ /*
+ * If autovacuum_max_workers changed, emit a WARNING if
+ * autovacuum_worker_slots < autovacuum_max_workers. If it didn't
+ * change, skip this to avoid too many repeated log messages.
+ */
+ if (autovacuum_max_workers_prev != autovacuum_max_workers)
+ CheckAutovacuumWorkerGUCs();
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1089,7 +1107,7 @@ do_start_worker(void)
/* return quickly when there are no free workers */
LWLockAcquire(AutovacuumLock, LW_SHARED);
- if (dlist_is_empty(&AutoVacuumShmem->av_freeWorkers))
+ if (!av_worker_available())
{
LWLockRelease(AutovacuumLock);
return InvalidOid;
@@ -1242,7 +1260,7 @@ do_start_worker(void)
* Get a worker entry from the freelist. We checked above, so there
* really should be a free slot.
*/
- wptr = dlist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
+ wptr = dclist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
worker = dlist_container(WorkerInfoData, wi_links, wptr);
worker->wi_dboid = avdb->adw_datid;
@@ -1611,8 +1629,8 @@ FreeWorkerInfo(int code, Datum arg)
MyWorkerInfo->wi_proc = NULL;
MyWorkerInfo->wi_launchtime = 0;
pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &MyWorkerInfo->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &MyWorkerInfo->wi_links);
/* not mine anymore */
MyWorkerInfo = NULL;
@@ -3273,7 +3291,7 @@ AutoVacuumShmemSize(void)
*/
size = sizeof(AutoVacuumShmemStruct);
size = MAXALIGN(size);
- size = add_size(size, mul_size(autovacuum_max_workers,
+ size = add_size(size, mul_size(autovacuum_worker_slots,
sizeof(WorkerInfoData)));
return size;
}
@@ -3300,7 +3318,7 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
- dlist_init(&AutoVacuumShmem->av_freeWorkers);
+ dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
memset(AutoVacuumShmem->av_workItems, 0,
@@ -3310,10 +3328,10 @@ AutoVacuumShmemInit(void)
MAXALIGN(sizeof(AutoVacuumShmemStruct)));
/* initialize the WorkerInfo free list */
- for (i = 0; i < autovacuum_max_workers; i++)
+ for (i = 0; i < autovacuum_worker_slots; i++)
{
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker[i].wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker[i].wi_links);
pg_atomic_init_flag(&worker[i].wi_dobalance);
}
@@ -3349,3 +3367,29 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+
+/*
+ * Returns whether there is a free autovacuum worker slot available.
+ */
+static bool
+av_worker_available(void)
+{
+ int reserved = autovacuum_worker_slots - autovacuum_max_workers;
+
+ return dclist_count(&AutoVacuumShmem->av_freeWorkers) > Max(0, reserved);
+}
+
+/*
+ * Emit a WARNING if autovacuum_worker_slots < autovacuum_max_workers.
+ */
+static void
+CheckAutovacuumWorkerGUCs(void)
+{
+ if (autovacuum_worker_slots < autovacuum_max_workers)
+ ereport(WARNING,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("\"autovacuum_max_workers\" (%d) should be less than or equal to \"autovacuum_worker_slots\" (%d)",
+ autovacuum_max_workers, autovacuum_worker_slots),
+ errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
+ autovacuum_worker_slots)));
+}
diff --git a/src/backend/postmaster/pmchild.c b/src/backend/postmaster/pmchild.c
index 381cf005a9..821c225aad 100644
--- a/src/backend/postmaster/pmchild.c
+++ b/src/backend/postmaster/pmchild.c
@@ -8,7 +8,7 @@
* child process is allocated a PMChild struct from a fixed pool of structs.
* The size of the pool is determined by various settings that configure how
* many worker processes and backend connections are allowed, i.e.
- * autovacuum_max_workers, max_worker_processes, max_wal_senders, and
+ * autovacuum_worker_slots, max_worker_processes, max_wal_senders, and
* max_connections.
*
* Dead-end backends are handled slightly differently. There is no limit
@@ -99,7 +99,7 @@ InitPostmasterChildSlots(void)
*/
pmchild_pools[B_BACKEND].size = 2 * (MaxConnections + max_wal_senders);
- pmchild_pools[B_AUTOVAC_WORKER].size = autovacuum_max_workers;
+ pmchild_pools[B_AUTOVAC_WORKER].size = autovacuum_worker_slots;
pmchild_pools[B_BG_WORKER].size = max_worker_processes;
/*
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 720ef99ee8..b617db1a8c 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -150,7 +150,7 @@ ProcGlobalSemas(void)
* So, now we grab enough semaphores to support the desired max number
* of backends immediately at initialization --- if the sysadmin has set
* MaxConnections, max_worker_processes, max_wal_senders, or
- * autovacuum_max_workers higher than his kernel will support, he'll
+ * autovacuum_worker_slots higher than his kernel will support, he'll
* find out sooner rather than later.
*
* Another reason for creating semaphores here is that the semaphore
@@ -282,13 +282,13 @@ InitProcGlobal(void)
dlist_push_tail(&ProcGlobal->freeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->freeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + 1)
+ else if (i < MaxConnections + autovacuum_worker_slots + 1)
{
/* PGPROC for AV launcher/worker, add to autovacFreeProcs list */
dlist_push_tail(&ProcGlobal->autovacFreeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->autovacFreeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + 1 + max_worker_processes)
+ else if (i < MaxConnections + autovacuum_worker_slots + 1 + max_worker_processes)
{
/* PGPROC for bgworker, add to bgworkerFreeProcs list */
dlist_push_tail(&ProcGlobal->bgworkerFreeProcs, &proc->links);
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 5b657a3f13..729756b84c 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -545,15 +545,15 @@ InitializeMaxBackends(void)
Assert(MaxBackends == 0);
/* the extra unit accounts for the autovacuum launcher */
- MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
+ MaxBackends = MaxConnections + autovacuum_worker_slots + 1 +
max_worker_processes + max_wal_senders;
if (MaxBackends > MAX_BACKENDS)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("too many server processes configured"),
- errdetail("\"max_connections\" (%d) plus \"autovacuum_max_workers\" (%d) plus \"max_worker_processes\" (%d) plus \"max_wal_senders\" (%d) must be less than %d.",
- MaxConnections, autovacuum_max_workers,
+ errdetail("\"max_connections\" (%d) plus \"autovacuum_worker_slots\" (%d) plus \"max_worker_processes\" (%d) plus \"max_wal_senders\" (%d) must be less than %d.",
+ MaxConnections, autovacuum_worker_slots,
max_worker_processes, max_wal_senders,
MAX_BACKENDS)));
}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8a67f01200..4b37c0d43c 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3448,7 +3448,16 @@ struct config_int ConfigureNamesInt[] =
},
{
/* see max_connections */
- {"autovacuum_max_workers", PGC_POSTMASTER, AUTOVACUUM,
+ {"autovacuum_worker_slots", PGC_POSTMASTER, AUTOVACUUM,
+ gettext_noop("Sets the number of backend slots to allocate for autovacuum workers."),
+ NULL
+ },
+ &autovacuum_worker_slots,
+ 16, 1, MAX_BACKENDS,
+ NULL, NULL, NULL
+ },
+ {
+ {"autovacuum_max_workers", PGC_SIGHUP, AUTOVACUUM,
gettext_noop("Sets the maximum number of simultaneously running autovacuum worker processes."),
NULL
},
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 39a3ac2312..6225a24102 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -659,8 +659,9 @@
#autovacuum = on # Enable autovacuum subprocess? 'on'
# requires track_counts to also be on.
-#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
+#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index cae1e8b329..190baa699d 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -28,6 +28,7 @@ typedef enum
/* GUC variables */
extern PGDLLIMPORT bool autovacuum_start_daemon;
+extern PGDLLIMPORT int autovacuum_worker_slots;
extern PGDLLIMPORT int autovacuum_max_workers;
extern PGDLLIMPORT int autovacuum_work_mem;
extern PGDLLIMPORT int autovacuum_naptime;
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 508e5e3917..e827da5342 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -707,6 +707,7 @@ sub init
}
print $conf "max_wal_senders = 10\n";
print $conf "max_replication_slots = 10\n";
+ print $conf "autovacuum_worker_slots = 3\n";
print $conf "wal_log_hints = on\n";
print $conf "hot_standby = on\n";
# conservative settings to ensure we can run multiple postmasters:
--
2.39.5 (Apple Git-154)
The following review has been posted through the commitfest application:
make installcheck-world: tested, failed
Implements feature: tested, failed
Spec compliant: tested, failed
Documentation: not tested
Hi,
- Tested the patch with check-world.
- Verified the CheckAutovacuumWorkerGUCs functionality; the expected WARNING was reported.
- For feature-specific testing, I created multiple tables and generated bloat. The expected behavior was observed.
A lower autovacuum_worker_slots setting than the default of 16 may be better suited to start with.
Thanks
Yogesh
I think I've been saying I would commit this since August, but now I am
planning to do so first thing in the new year. In v11 of the patch, I
moved the initial startup WARNING to autovac_init() to avoid repeatedly
logging when the launcher restarts (e.g., for emergency vacuums when the
autovacuum parameter is disabled). Otherwise, I just made a couple of
cosmetic alterations and added a commit message.
--
nathan
Attachments:
v11-0001-Allow-changing-autovacuum_max_workers-without-re.patch (text/plain; charset=us-ascii)
From dad43c4138c2cc0712006565bb4984c2a211e8fe Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Fri, 20 Dec 2024 13:33:26 -0600
Subject: [PATCH v11 1/1] Allow changing autovacuum_max_workers without
restarting.
This commit introduces a new parameter named
autovacuum_worker_slots that controls how many autovacuum worker
slots to reserve during server startup. Modifying this new
parameter's value does require a server restart, but it should
typically be set to the upper bound of what you might realistically
need to set autovacuum_max_workers. With that new parameter in
place, autovacuum_max_workers can now be changed with a SIGHUP
(e.g., pg_ctl reload).
If autovacuum_max_workers is set higher than
autovacuum_worker_slots, a WARNING is emitted, and the server will
only start up to autovacuum_worker_slots workers at a given time.
If autovacuum_max_workers is set to a value less than the number of
currently-running autovacuum workers, the existing workers will
continue running, but no new workers will be started until the
number of running autovacuum workers drops below
autovacuum_max_workers.
Reviewed-by: Sami Imseih, Justin Pryzby, Robert Haas, Andres Freund, Yogesh Sharma
Discussion: https://postgr.es/m/20240410212344.GA1824549%40nathanxps13
---
doc/src/sgml/config.sgml | 28 ++++++-
doc/src/sgml/runtime.sgml | 4 +-
src/backend/access/transam/xlog.c | 2 +-
src/backend/postmaster/autovacuum.c | 82 +++++++++++++++----
src/backend/postmaster/pmchild.c | 4 +-
src/backend/storage/lmgr/proc.c | 6 +-
src/backend/utils/init/postinit.c | 6 +-
src/backend/utils/misc/guc_tables.c | 11 ++-
src/backend/utils/misc/postgresql.conf.sample | 3 +-
src/include/postmaster/autovacuum.h | 1 +
src/test/perl/PostgreSQL/Test/Cluster.pm | 1 +
11 files changed, 117 insertions(+), 31 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index fbdd6ce5740..740ff5d5044 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8630,6 +8630,25 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-worker-slots" xreflabel="autovacuum_worker_slots">
+ <term><varname>autovacuum_worker_slots</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_worker_slots</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specifies the number of backend slots to reserve for autovacuum worker
+ processes. The default is 16. This parameter can only be set at server
+ start.
+ </para>
+ <para>
+ When changing this value, consider also adjusting
+ <xref linkend="guc-autovacuum-max-workers"/>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-autovacuum-max-workers" xreflabel="autovacuum_max_workers">
<term><varname>autovacuum_max_workers</varname> (<type>integer</type>)
<indexterm>
@@ -8640,7 +8659,14 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
<para>
Specifies the maximum number of autovacuum processes (other than the
autovacuum launcher) that may be running at any one time. The default
- is three. This parameter can only be set at server start.
+ is three. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command line.
+ </para>
+ <para>
+ Note that a setting for this value which is higher than
+ <xref linkend="guc-autovacuum-worker-slots"/> will have no effect,
+ since autovacuum workers are taken from the pool of slots established
+ by that setting.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 94135e9d5ee..537e356315d 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -839,7 +839,7 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
When using System V semaphores,
<productname>PostgreSQL</productname> uses one semaphore per allowed connection
(<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>), allowed WAL sender process
+ (<xref linkend="guc-autovacuum-worker-slots"/>), allowed WAL sender process
(<xref linkend="guc-max-wal-senders"/>), allowed background
process (<xref linkend="guc-max-worker-processes"/>), etc., in sets of 16.
The runtime-computed parameter <xref linkend="guc-num-os-semaphores"/>
@@ -892,7 +892,7 @@ $ <userinput>postgres -D $PGDATA -C num_os_semaphores</userinput>
When using POSIX semaphores, the number of semaphores needed is the
same as for System V, that is one semaphore per allowed connection
(<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>), allowed WAL sender process
+ (<xref linkend="guc-autovacuum-worker-slots"/>), allowed WAL sender process
(<xref linkend="guc-max-wal-senders"/>), allowed background
process (<xref linkend="guc-max-worker-processes"/>), etc.
On the platforms where this option is preferred, there is no specific
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 6f58412bcab..706f2127def 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5403,7 +5403,7 @@ CheckRequiredParameterValues(void)
*/
if (ArchiveRecoveryRequested && EnableHotStandby)
{
- /* We ignore autovacuum_max_workers when we make this test. */
+ /* We ignore autovacuum_worker_slots when we make this test. */
RecoveryRequiresIntParameter("max_connections",
MaxConnections,
ControlFile->MaxConnections);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index dc3cf87abab..0349e51042f 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -115,6 +115,7 @@
* GUC parameters
*/
bool autovacuum_start_daemon = false;
+int autovacuum_worker_slots;
int autovacuum_max_workers;
int autovacuum_work_mem = -1;
int autovacuum_naptime;
@@ -210,7 +211,7 @@ typedef struct autovac_table
/*-------------
* This struct holds information about a single worker's whereabouts. We keep
* an array of these in shared memory, sized according to
- * autovacuum_max_workers.
+ * autovacuum_worker_slots.
*
* wi_links entry into free list or running list
* wi_dboid OID of the database this worker is supposed to work on
@@ -291,7 +292,7 @@ typedef struct
{
sig_atomic_t av_signal[AutoVacNumSignals];
pid_t av_launcherpid;
- dlist_head av_freeWorkers;
+ dclist_head av_freeWorkers;
dlist_head av_runningWorkers;
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
@@ -349,6 +350,8 @@ static void autovac_report_activity(autovac_table *tab);
static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
const char *nspname, const char *relname);
static void avl_sigusr2_handler(SIGNAL_ARGS);
+static bool av_worker_available(void);
+static void check_av_worker_gucs(void);
@@ -577,8 +580,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
* wakening conditions.
*/
- launcher_determine_sleep(!dlist_is_empty(&AutoVacuumShmem->av_freeWorkers),
- false, &nap);
+ launcher_determine_sleep(av_worker_available(), false, &nap);
/*
* Wait until naptime expires or we get some type of signal (all the
@@ -638,7 +640,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
current_time = GetCurrentTimestamp();
LWLockAcquire(AutovacuumLock, LW_SHARED);
- can_launch = !dlist_is_empty(&AutoVacuumShmem->av_freeWorkers);
+ can_launch = av_worker_available();
if (AutoVacuumShmem->av_startingWorker != NULL)
{
@@ -681,8 +683,8 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
worker->wi_sharedrel = false;
worker->wi_proc = NULL;
worker->wi_launchtime = 0;
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker->wi_links);
AutoVacuumShmem->av_startingWorker = NULL;
ereport(WARNING,
errmsg("autovacuum worker took too long to start; canceled"));
@@ -747,6 +749,8 @@ HandleAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
+ int autovacuum_max_workers_prev = autovacuum_max_workers;
+
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -754,6 +758,14 @@ HandleAutoVacLauncherInterrupts(void)
if (!AutoVacuumingActive())
AutoVacLauncherShutdown();
+ /*
+ * If autovacuum_max_workers changed, emit a WARNING if
+ * autovacuum_worker_slots < autovacuum_max_workers. If it didn't
+ * change, skip this to avoid too many repeated log messages.
+ */
+ if (autovacuum_max_workers_prev != autovacuum_max_workers)
+ check_av_worker_gucs();
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1089,7 +1101,7 @@ do_start_worker(void)
/* return quickly when there are no free workers */
LWLockAcquire(AutovacuumLock, LW_SHARED);
- if (dlist_is_empty(&AutoVacuumShmem->av_freeWorkers))
+ if (!av_worker_available())
{
LWLockRelease(AutovacuumLock);
return InvalidOid;
@@ -1242,7 +1254,7 @@ do_start_worker(void)
* Get a worker entry from the freelist. We checked above, so there
* really should be a free slot.
*/
- wptr = dlist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
+ wptr = dclist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
worker = dlist_container(WorkerInfoData, wi_links, wptr);
worker->wi_dboid = avdb->adw_datid;
@@ -1611,8 +1623,8 @@ FreeWorkerInfo(int code, Datum arg)
MyWorkerInfo->wi_proc = NULL;
MyWorkerInfo->wi_launchtime = 0;
pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &MyWorkerInfo->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &MyWorkerInfo->wi_links);
/* not mine anymore */
MyWorkerInfo = NULL;
@@ -3253,10 +3265,14 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
void
autovac_init(void)
{
- if (autovacuum_start_daemon && !pgstat_track_counts)
+ if (!autovacuum_start_daemon)
+ return;
+ else if (!pgstat_track_counts)
ereport(WARNING,
(errmsg("autovacuum not started because of misconfiguration"),
errhint("Enable the \"track_counts\" option.")));
+ else
+ check_av_worker_gucs();
}
/*
@@ -3273,7 +3289,7 @@ AutoVacuumShmemSize(void)
*/
size = sizeof(AutoVacuumShmemStruct);
size = MAXALIGN(size);
- size = add_size(size, mul_size(autovacuum_max_workers,
+ size = add_size(size, mul_size(autovacuum_worker_slots,
sizeof(WorkerInfoData)));
return size;
}
@@ -3300,7 +3316,7 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
- dlist_init(&AutoVacuumShmem->av_freeWorkers);
+ dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
memset(AutoVacuumShmem->av_workItems, 0,
@@ -3310,10 +3326,10 @@ AutoVacuumShmemInit(void)
MAXALIGN(sizeof(AutoVacuumShmemStruct)));
/* initialize the WorkerInfo free list */
- for (i = 0; i < autovacuum_max_workers; i++)
+ for (i = 0; i < autovacuum_worker_slots; i++)
{
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker[i].wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker[i].wi_links);
pg_atomic_init_flag(&worker[i].wi_dobalance);
}
@@ -3349,3 +3365,35 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+
+/*
+ * Returns whether there is a free autovacuum worker slot available.
+ */
+static bool
+av_worker_available(void)
+{
+ int free_slots;
+ int reserved_slots;
+
+ free_slots = dclist_count(&AutoVacuumShmem->av_freeWorkers);
+
+ reserved_slots = autovacuum_worker_slots - autovacuum_max_workers;
+ reserved_slots = Max(0, reserved_slots);
+
+ return free_slots > reserved_slots;
+}
+
+/*
+ * Emits a WARNING if autovacuum_worker_slots < autovacuum_max_workers.
+ */
+static void
+check_av_worker_gucs(void)
+{
+ if (autovacuum_worker_slots < autovacuum_max_workers)
+ ereport(WARNING,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("\"autovacuum_max_workers\" (%d) should be less than or equal to \"autovacuum_worker_slots\" (%d)",
+ autovacuum_max_workers, autovacuum_worker_slots),
+ errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
+ autovacuum_worker_slots)));
+}
diff --git a/src/backend/postmaster/pmchild.c b/src/backend/postmaster/pmchild.c
index 381cf005a9b..821c225aad3 100644
--- a/src/backend/postmaster/pmchild.c
+++ b/src/backend/postmaster/pmchild.c
@@ -8,7 +8,7 @@
* child process is allocated a PMChild struct from a fixed pool of structs.
* The size of the pool is determined by various settings that configure how
* many worker processes and backend connections are allowed, i.e.
- * autovacuum_max_workers, max_worker_processes, max_wal_senders, and
+ * autovacuum_worker_slots, max_worker_processes, max_wal_senders, and
* max_connections.
*
* Dead-end backends are handled slightly differently. There is no limit
@@ -99,7 +99,7 @@ InitPostmasterChildSlots(void)
*/
pmchild_pools[B_BACKEND].size = 2 * (MaxConnections + max_wal_senders);
- pmchild_pools[B_AUTOVAC_WORKER].size = autovacuum_max_workers;
+ pmchild_pools[B_AUTOVAC_WORKER].size = autovacuum_worker_slots;
pmchild_pools[B_BG_WORKER].size = max_worker_processes;
/*
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 720ef99ee83..b617db1a8ce 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -150,7 +150,7 @@ ProcGlobalSemas(void)
* So, now we grab enough semaphores to support the desired max number
* of backends immediately at initialization --- if the sysadmin has set
* MaxConnections, max_worker_processes, max_wal_senders, or
- * autovacuum_max_workers higher than his kernel will support, he'll
+ * autovacuum_worker_slots higher than his kernel will support, he'll
* find out sooner rather than later.
*
* Another reason for creating semaphores here is that the semaphore
@@ -282,13 +282,13 @@ InitProcGlobal(void)
dlist_push_tail(&ProcGlobal->freeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->freeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + 1)
+ else if (i < MaxConnections + autovacuum_worker_slots + 1)
{
/* PGPROC for AV launcher/worker, add to autovacFreeProcs list */
dlist_push_tail(&ProcGlobal->autovacFreeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->autovacFreeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + 1 + max_worker_processes)
+ else if (i < MaxConnections + autovacuum_worker_slots + 1 + max_worker_processes)
{
/* PGPROC for bgworker, add to bgworkerFreeProcs list */
dlist_push_tail(&ProcGlobal->bgworkerFreeProcs, &proc->links);
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 5b657a3f135..729756b84c9 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -545,15 +545,15 @@ InitializeMaxBackends(void)
Assert(MaxBackends == 0);
/* the extra unit accounts for the autovacuum launcher */
- MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
+ MaxBackends = MaxConnections + autovacuum_worker_slots + 1 +
max_worker_processes + max_wal_senders;
if (MaxBackends > MAX_BACKENDS)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("too many server processes configured"),
- errdetail("\"max_connections\" (%d) plus \"autovacuum_max_workers\" (%d) plus \"max_worker_processes\" (%d) plus \"max_wal_senders\" (%d) must be less than %d.",
- MaxConnections, autovacuum_max_workers,
+ errdetail("\"max_connections\" (%d) plus \"autovacuum_worker_slots\" (%d) plus \"max_worker_processes\" (%d) plus \"max_wal_senders\" (%d) must be less than %d.",
+ MaxConnections, autovacuum_worker_slots,
max_worker_processes, max_wal_senders,
MAX_BACKENDS)));
}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8cf1afbad20..e2da6487e43 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3467,7 +3467,16 @@ struct config_int ConfigureNamesInt[] =
},
{
/* see max_connections */
- {"autovacuum_max_workers", PGC_POSTMASTER, AUTOVACUUM,
+ {"autovacuum_worker_slots", PGC_POSTMASTER, AUTOVACUUM,
+ gettext_noop("Sets the number of backend slots to allocate for autovacuum workers."),
+ NULL
+ },
+ &autovacuum_worker_slots,
+ 16, 1, MAX_BACKENDS,
+ NULL, NULL, NULL
+ },
+ {
+ {"autovacuum_max_workers", PGC_SIGHUP, AUTOVACUUM,
gettext_noop("Sets the maximum number of simultaneously running autovacuum worker processes."),
NULL
},
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a2ac7575ca7..b2bc43383db 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -661,8 +661,9 @@
#autovacuum = on # Enable autovacuum subprocess? 'on'
# requires track_counts to also be on.
-#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
+#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index cae1e8b3294..190baa699da 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -28,6 +28,7 @@ typedef enum
/* GUC variables */
extern PGDLLIMPORT bool autovacuum_start_daemon;
+extern PGDLLIMPORT int autovacuum_worker_slots;
extern PGDLLIMPORT int autovacuum_max_workers;
extern PGDLLIMPORT int autovacuum_work_mem;
extern PGDLLIMPORT int autovacuum_naptime;
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 508e5e3917a..e827da53422 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -707,6 +707,7 @@ sub init
}
print $conf "max_wal_senders = 10\n";
print $conf "max_replication_slots = 10\n";
+ print $conf "autovacuum_worker_slots = 3\n";
print $conf "wal_log_hints = on\n";
print $conf "hot_standby = on\n";
# conservative settings to ensure we can run multiple postmasters:
--
2.39.5 (Apple Git-154)
On Fri, Dec 20, 2024 at 01:46:20PM -0600, Nathan Bossart wrote:
I think I've been saying I would commit this since August, but now I am
planning to do so first thing in the new year. In v11 of the patch, I
moved the initial startup WARNING to autovac_init() to avoid repeatedly
logging when the launcher restarts (e.g., for emergency vacuums when the
autovacuum parameter is disabled). Otherwise, I just made a couple of
cosmetic alterations and added a commit message.
This one needed a rebase after commit 2bdf1b2.
--
nathan
Attachments:
v12-0001-Allow-changing-autovacuum_max_workers-without-re.patch (text/plain; charset=us-ascii)
From 46ea1dbcbb3f344c656db7fead89e3e9753dfadc Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Fri, 20 Dec 2024 13:33:26 -0600
Subject: [PATCH v12 1/1] Allow changing autovacuum_max_workers without
restarting.
This commit introduces a new parameter named
autovacuum_worker_slots that controls how many autovacuum worker
slots to reserve during server startup. Modifying this new
parameter's value does require a server restart, but it should
typically be set to the upper bound of what you might realistically
need to set autovacuum_max_workers. With that new parameter in
place, autovacuum_max_workers can now be changed with a SIGHUP
(e.g., pg_ctl reload).
If autovacuum_max_workers is set higher than
autovacuum_worker_slots, a WARNING is emitted, and the server will
only start up to autovacuum_worker_slots workers at a given time.
If autovacuum_max_workers is set to a value less than the number of
currently-running autovacuum workers, the existing workers will
continue running, but no new workers will be started until the
number of running autovacuum workers drops below
autovacuum_max_workers.
Reviewed-by: Sami Imseih, Justin Pryzby, Robert Haas, Andres Freund, Yogesh Sharma
Discussion: https://postgr.es/m/20240410212344.GA1824549%40nathanxps13
---
doc/src/sgml/config.sgml | 28 ++++++-
doc/src/sgml/runtime.sgml | 4 +-
src/backend/access/transam/xlog.c | 2 +-
src/backend/postmaster/autovacuum.c | 82 +++++++++++++++----
src/backend/postmaster/pmchild.c | 4 +-
src/backend/storage/lmgr/proc.c | 6 +-
src/backend/utils/init/postinit.c | 6 +-
src/backend/utils/misc/guc_tables.c | 11 ++-
src/backend/utils/misc/postgresql.conf.sample | 3 +-
src/include/postmaster/autovacuum.h | 1 +
src/test/perl/PostgreSQL/Test/Cluster.pm | 1 +
11 files changed, 117 insertions(+), 31 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index fbdd6ce5740..740ff5d5044 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8630,6 +8630,25 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-worker-slots" xreflabel="autovacuum_worker_slots">
+ <term><varname>autovacuum_worker_slots</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_worker_slots</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specifies the number of backend slots to reserve for autovacuum worker
+ processes. The default is 16. This parameter can only be set at server
+ start.
+ </para>
+ <para>
+ When changing this value, consider also adjusting
+ <xref linkend="guc-autovacuum-max-workers"/>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-autovacuum-max-workers" xreflabel="autovacuum_max_workers">
<term><varname>autovacuum_max_workers</varname> (<type>integer</type>)
<indexterm>
@@ -8640,7 +8659,14 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
<para>
Specifies the maximum number of autovacuum processes (other than the
autovacuum launcher) that may be running at any one time. The default
- is three. This parameter can only be set at server start.
+ is three. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command line.
+ </para>
+ <para>
+ Note that a setting for this value which is higher than
+ <xref linkend="guc-autovacuum-worker-slots"/> will have no effect,
+ since autovacuum workers are taken from the pool of slots established
+ by that setting.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 8750044852d..59f39e89924 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -839,7 +839,7 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
When using System V semaphores,
<productname>PostgreSQL</productname> uses one semaphore per allowed connection
(<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>), allowed WAL sender process
+ (<xref linkend="guc-autovacuum-worker-slots"/>), allowed WAL sender process
(<xref linkend="guc-max-wal-senders"/>), allowed background
process (<xref linkend="guc-max-worker-processes"/>), etc., in sets of 19.
The runtime-computed parameter <xref linkend="guc-num-os-semaphores"/>
@@ -892,7 +892,7 @@ $ <userinput>postgres -D $PGDATA -C num_os_semaphores</userinput>
When using POSIX semaphores, the number of semaphores needed is the
same as for System V, that is one semaphore per allowed connection
(<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
- (<xref linkend="guc-autovacuum-max-workers"/>), allowed WAL sender process
+ (<xref linkend="guc-autovacuum-worker-slots"/>), allowed WAL sender process
(<xref linkend="guc-max-wal-senders"/>), allowed background
process (<xref linkend="guc-max-worker-processes"/>), etc.
On the platforms where this option is preferred, there is no specific
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b9ea92a5427..bf3dbda901d 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5403,7 +5403,7 @@ CheckRequiredParameterValues(void)
*/
if (ArchiveRecoveryRequested && EnableHotStandby)
{
- /* We ignore autovacuum_max_workers when we make this test. */
+ /* We ignore autovacuum_worker_slots when we make this test. */
RecoveryRequiresIntParameter("max_connections",
MaxConnections,
ControlFile->MaxConnections);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 3f826532b88..0ab921a169b 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -115,6 +115,7 @@
* GUC parameters
*/
bool autovacuum_start_daemon = false;
+int autovacuum_worker_slots;
int autovacuum_max_workers;
int autovacuum_work_mem = -1;
int autovacuum_naptime;
@@ -210,7 +211,7 @@ typedef struct autovac_table
/*-------------
* This struct holds information about a single worker's whereabouts. We keep
* an array of these in shared memory, sized according to
- * autovacuum_max_workers.
+ * autovacuum_worker_slots.
*
* wi_links entry into free list or running list
* wi_dboid OID of the database this worker is supposed to work on
@@ -291,7 +292,7 @@ typedef struct
{
sig_atomic_t av_signal[AutoVacNumSignals];
pid_t av_launcherpid;
- dlist_head av_freeWorkers;
+ dclist_head av_freeWorkers;
dlist_head av_runningWorkers;
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
@@ -349,6 +350,8 @@ static void autovac_report_activity(autovac_table *tab);
static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
const char *nspname, const char *relname);
static void avl_sigusr2_handler(SIGNAL_ARGS);
+static bool av_worker_available(void);
+static void check_av_worker_gucs(void);
@@ -577,8 +580,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
* wakening conditions.
*/
- launcher_determine_sleep(!dlist_is_empty(&AutoVacuumShmem->av_freeWorkers),
- false, &nap);
+ launcher_determine_sleep(av_worker_available(), false, &nap);
/*
* Wait until naptime expires or we get some type of signal (all the
@@ -638,7 +640,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
current_time = GetCurrentTimestamp();
LWLockAcquire(AutovacuumLock, LW_SHARED);
- can_launch = !dlist_is_empty(&AutoVacuumShmem->av_freeWorkers);
+ can_launch = av_worker_available();
if (AutoVacuumShmem->av_startingWorker != NULL)
{
@@ -681,8 +683,8 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
worker->wi_sharedrel = false;
worker->wi_proc = NULL;
worker->wi_launchtime = 0;
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker->wi_links);
AutoVacuumShmem->av_startingWorker = NULL;
ereport(WARNING,
errmsg("autovacuum worker took too long to start; canceled"));
@@ -747,6 +749,8 @@ HandleAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
+ int autovacuum_max_workers_prev = autovacuum_max_workers;
+
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -754,6 +758,14 @@ HandleAutoVacLauncherInterrupts(void)
if (!AutoVacuumingActive())
AutoVacLauncherShutdown();
+ /*
+ * If autovacuum_max_workers changed, emit a WARNING if
+ * autovacuum_worker_slots < autovacuum_max_workers. If it didn't
+ * change, skip this to avoid too many repeated log messages.
+ */
+ if (autovacuum_max_workers_prev != autovacuum_max_workers)
+ check_av_worker_gucs();
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1089,7 +1101,7 @@ do_start_worker(void)
/* return quickly when there are no free workers */
LWLockAcquire(AutovacuumLock, LW_SHARED);
- if (dlist_is_empty(&AutoVacuumShmem->av_freeWorkers))
+ if (!av_worker_available())
{
LWLockRelease(AutovacuumLock);
return InvalidOid;
@@ -1242,7 +1254,7 @@ do_start_worker(void)
* Get a worker entry from the freelist. We checked above, so there
* really should be a free slot.
*/
- wptr = dlist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
+ wptr = dclist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
worker = dlist_container(WorkerInfoData, wi_links, wptr);
worker->wi_dboid = avdb->adw_datid;
@@ -1615,8 +1627,8 @@ FreeWorkerInfo(int code, Datum arg)
MyWorkerInfo->wi_proc = NULL;
MyWorkerInfo->wi_launchtime = 0;
pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &MyWorkerInfo->wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &MyWorkerInfo->wi_links);
/* not mine anymore */
MyWorkerInfo = NULL;
@@ -3248,10 +3260,14 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
void
autovac_init(void)
{
- if (autovacuum_start_daemon && !pgstat_track_counts)
+ if (!autovacuum_start_daemon)
+ return;
+ else if (!pgstat_track_counts)
ereport(WARNING,
(errmsg("autovacuum not started because of misconfiguration"),
errhint("Enable the \"track_counts\" option.")));
+ else
+ check_av_worker_gucs();
}
/*
@@ -3268,7 +3284,7 @@ AutoVacuumShmemSize(void)
*/
size = sizeof(AutoVacuumShmemStruct);
size = MAXALIGN(size);
- size = add_size(size, mul_size(autovacuum_max_workers,
+ size = add_size(size, mul_size(autovacuum_worker_slots,
sizeof(WorkerInfoData)));
return size;
}
@@ -3295,7 +3311,7 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
- dlist_init(&AutoVacuumShmem->av_freeWorkers);
+ dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
memset(AutoVacuumShmem->av_workItems, 0,
@@ -3305,10 +3321,10 @@ AutoVacuumShmemInit(void)
MAXALIGN(sizeof(AutoVacuumShmemStruct)));
/* initialize the WorkerInfo free list */
- for (i = 0; i < autovacuum_max_workers; i++)
+ for (i = 0; i < autovacuum_worker_slots; i++)
{
- dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
- &worker[i].wi_links);
+ dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
+ &worker[i].wi_links);
pg_atomic_init_flag(&worker[i].wi_dobalance);
}
@@ -3344,3 +3360,35 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+
+/*
+ * Returns whether there is a free autovacuum worker slot available.
+ */
+static bool
+av_worker_available(void)
+{
+ int free_slots;
+ int reserved_slots;
+
+ free_slots = dclist_count(&AutoVacuumShmem->av_freeWorkers);
+
+ reserved_slots = autovacuum_worker_slots - autovacuum_max_workers;
+ reserved_slots = Max(0, reserved_slots);
+
+ return free_slots > reserved_slots;
+}
+
+/*
+ * Emits a WARNING if autovacuum_worker_slots < autovacuum_max_workers.
+ */
+static void
+check_av_worker_gucs(void)
+{
+ if (autovacuum_worker_slots < autovacuum_max_workers)
+ ereport(WARNING,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("\"autovacuum_max_workers\" (%d) should be less than or equal to \"autovacuum_worker_slots\" (%d)",
+ autovacuum_max_workers, autovacuum_worker_slots),
+ errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
+ autovacuum_worker_slots)));
+}
diff --git a/src/backend/postmaster/pmchild.c b/src/backend/postmaster/pmchild.c
index 0d53812406c..0d473226c3a 100644
--- a/src/backend/postmaster/pmchild.c
+++ b/src/backend/postmaster/pmchild.c
@@ -8,7 +8,7 @@
* child process is allocated a PMChild struct from a fixed pool of structs.
* The size of the pool is determined by various settings that configure how
* many worker processes and backend connections are allowed, i.e.
- * autovacuum_max_workers, max_worker_processes, max_wal_senders, and
+ * autovacuum_worker_slots, max_worker_processes, max_wal_senders, and
* max_connections.
*
* Dead-end backends are handled slightly differently. There is no limit
@@ -99,7 +99,7 @@ InitPostmasterChildSlots(void)
*/
pmchild_pools[B_BACKEND].size = 2 * (MaxConnections + max_wal_senders);
- pmchild_pools[B_AUTOVAC_WORKER].size = autovacuum_max_workers;
+ pmchild_pools[B_AUTOVAC_WORKER].size = autovacuum_worker_slots;
pmchild_pools[B_BG_WORKER].size = max_worker_processes;
/*
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index c7a972df7dd..49204f91a20 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -150,7 +150,7 @@ ProcGlobalSemas(void)
* So, now we grab enough semaphores to support the desired max number
* of backends immediately at initialization --- if the sysadmin has set
* MaxConnections, max_worker_processes, max_wal_senders, or
- * autovacuum_max_workers higher than his kernel will support, he'll
+ * autovacuum_worker_slots higher than his kernel will support, he'll
* find out sooner rather than later.
*
* Another reason for creating semaphores here is that the semaphore
@@ -284,13 +284,13 @@ InitProcGlobal(void)
dlist_push_tail(&ProcGlobal->freeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->freeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + NUM_SPECIAL_WORKER_PROCS)
+ else if (i < MaxConnections + autovacuum_worker_slots + NUM_SPECIAL_WORKER_PROCS)
{
/* PGPROC for AV or special worker, add to autovacFreeProcs list */
dlist_push_tail(&ProcGlobal->autovacFreeProcs, &proc->links);
proc->procgloballist = &ProcGlobal->autovacFreeProcs;
}
- else if (i < MaxConnections + autovacuum_max_workers + NUM_SPECIAL_WORKER_PROCS + max_worker_processes)
+ else if (i < MaxConnections + autovacuum_worker_slots + NUM_SPECIAL_WORKER_PROCS + max_worker_processes)
{
/* PGPROC for bgworker, add to bgworkerFreeProcs list */
dlist_push_tail(&ProcGlobal->bgworkerFreeProcs, &proc->links);
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 46dbc46a97e..01bb6a410cb 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -547,15 +547,15 @@ InitializeMaxBackends(void)
Assert(MaxBackends == 0);
/* Note that this does not include "auxiliary" processes */
- MaxBackends = MaxConnections + autovacuum_max_workers +
+ MaxBackends = MaxConnections + autovacuum_worker_slots +
max_worker_processes + max_wal_senders + NUM_SPECIAL_WORKER_PROCS;
if (MaxBackends > MAX_BACKENDS)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("too many server processes configured"),
- errdetail("\"max_connections\" (%d) plus \"autovacuum_max_workers\" (%d) plus \"max_worker_processes\" (%d) plus \"max_wal_senders\" (%d) must be less than %d.",
- MaxConnections, autovacuum_max_workers,
+ errdetail("\"max_connections\" (%d) plus \"autovacuum_worker_slots\" (%d) plus \"max_worker_processes\" (%d) plus \"max_wal_senders\" (%d) must be less than %d.",
+ MaxConnections, autovacuum_worker_slots,
max_worker_processes, max_wal_senders,
MAX_BACKENDS - (NUM_SPECIAL_WORKER_PROCS - 1))));
}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 22f16a3b462..c9d8cd796a8 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3467,7 +3467,16 @@ struct config_int ConfigureNamesInt[] =
},
{
/* see max_connections */
- {"autovacuum_max_workers", PGC_POSTMASTER, AUTOVACUUM,
+ {"autovacuum_worker_slots", PGC_POSTMASTER, AUTOVACUUM,
+ gettext_noop("Sets the number of backend slots to allocate for autovacuum workers."),
+ NULL
+ },
+ &autovacuum_worker_slots,
+ 16, 1, MAX_BACKENDS,
+ NULL, NULL, NULL
+ },
+ {
+ {"autovacuum_max_workers", PGC_SIGHUP, AUTOVACUUM,
gettext_noop("Sets the maximum number of simultaneously running autovacuum worker processes."),
NULL
},
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a2ac7575ca7..b2bc43383db 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -661,8 +661,9 @@
#autovacuum = on # Enable autovacuum subprocess? 'on'
# requires track_counts to also be on.
-#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
+#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index edac746f3cf..54e01c81d68 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -28,6 +28,7 @@ typedef enum
/* GUC variables */
extern PGDLLIMPORT bool autovacuum_start_daemon;
+extern PGDLLIMPORT int autovacuum_worker_slots;
extern PGDLLIMPORT int autovacuum_max_workers;
extern PGDLLIMPORT int autovacuum_work_mem;
extern PGDLLIMPORT int autovacuum_naptime;
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 74bf3408682..08b89a4cdff 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -707,6 +707,7 @@ sub init
}
print $conf "max_wal_senders = 10\n";
print $conf "max_replication_slots = 10\n";
+ print $conf "autovacuum_worker_slots = 3\n";
print $conf "wal_log_hints = on\n";
print $conf "hot_standby = on\n";
# conservative settings to ensure we can run multiple postmasters:
--
2.39.5 (Apple Git-154)
Nathan Bossart <nathandbossart@gmail.com> writes:
Committed.
Unsurprisingly, this has completely broken buildfarm member sawshark:
you added 13 new semaphores to the system's default requirements,
and we only had headroom for about 4 (cf. 38da05346).
Now maybe we should just abandon the notion that we ought to be
able to start up under OpenBSD/NetBSD's tiny default value of SEMMNS.
If so I'd be inclined to go revert 38da05346, at least in HEAD.
But I kind of wonder whether this feature actually brings value
commensurate with causing installation problems on real-world OSes.
regards, tom lane
On Mon, Jan 06, 2025 at 04:29:37PM -0500, Tom Lane wrote:
Unsurprisingly, this has completely broken buildfarm member sawshark:
you added 13 new semaphores to the system's default requirements,
and we only had headroom for about 4 (cf. 38da05346).
Oh wow, I missed that the defaults were so low on some systems.
Now maybe we should just abandon the notion that we ought to be
able to start up under OpenBSD/NetBSD's tiny default value of SEMMNS.
If so I'd be inclined to go revert 38da05346, at least in HEAD.
But I kind of wonder whether this feature actually brings value
commensurate with causing installation problems on real-world OSes.
I'm obviously biased, but I think it would be unfortunate to block features
like this one because of low settings that would otherwise be unsuitable
for any reasonable production workload. If we do want to at least support
check-world on these systems, another option could be to simply lower the
default of autovacuum_worker_slots to 7 (or maybe lower). Of course, that
only helps until the next time more semaphores are required, but that's not
a new problem.
--
nathan
On Mon, Jan 06, 2025 at 03:50:24PM -0600, Nathan Bossart wrote:
I'm obviously biased, but I think it would be unfortunate to block features
like this one because of low settings that would otherwise be unsuitable
for any reasonable production workload. If we do want to at least support
check-world on these systems, another option could be to simply lower the
default of autovacuum_worker_slots to 7 (or maybe lower). Of course, that
only helps until the next time more semaphores are required, but that's not
a new problem.
I've attached a patch to lower the default to 5. That at least gives a
little bit of wiggle room for autovacuum_max_workers (and for a couple of
new auxiliary processes). FWIW the reason I originally set the default to
16 was to prevent most users from ever needing to think about adjusting
autovacuum_worker_slots (which requires a restart and is a completely new
parameter that most will be unfamiliar with).
--
nathan
Attachments:
v1-0001-lower-default-of-autovacuum_worker_slots-to-5.patch
From 3f3c783c025e2a931bb766dc6d13c7186c957ebc Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Mon, 6 Jan 2025 16:09:49 -0600
Subject: [PATCH v1 1/1] lower default of autovacuum_worker_slots to 5
---
doc/src/sgml/config.sgml | 2 +-
src/backend/utils/misc/guc_tables.c | 2 +-
src/backend/utils/misc/postgresql.conf.sample | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 740ff5d5044..4fda5e6ae79 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8639,7 +8639,7 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
<listitem>
<para>
Specifies the number of backend slots to reserve for autovacuum worker
- processes. The default is 16. This parameter can only be set at server
+ processes. The default is 5. This parameter can only be set at server
start.
</para>
<para>
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index c9d8cd796a8..c75359b6bc4 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3472,7 +3472,7 @@ struct config_int ConfigureNamesInt[] =
NULL
},
&autovacuum_worker_slots,
- 16, 1, MAX_BACKENDS,
+ 5, 1, MAX_BACKENDS,
NULL, NULL, NULL
},
{
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index b2bc43383db..c7d3852c040 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -661,7 +661,7 @@
#autovacuum = on # Enable autovacuum subprocess? 'on'
# requires track_counts to also be on.
-autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
+autovacuum_worker_slots = 5 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
#autovacuum_naptime = 1min # time between autovacuum runs
--
2.39.5 (Apple Git-154)
Hi,
On January 6, 2025 5:15:25 PM EST, Nathan Bossart <nathandbossart@gmail.com> wrote:
On Mon, Jan 06, 2025 at 03:50:24PM -0600, Nathan Bossart wrote:
I'm obviously biased, but I think it would be unfortunate to block features
like this one because of low settings that would otherwise be unsuitable
for any reasonable production workload. If we do want to at least support
check-world on these systems, another option could be to simply lower the
default of autovacuum_worker_slots to 7 (or maybe lower). Of course, that
only helps until the next time more semaphores are required, but that's not
a new problem.I've attached a patch to lower the default to 5. That at least gives a
little bit of wiggle room for autovacuum_max_workers (and for a couple of
new auxiliary processes). FWIW the reason I originally set the default to
16 was to prevent most users from ever needing to think about adjusting
autovacuum_worker_slots (which requires a restart and is a completely new
parameter that most will be unfamiliar with).
How about trying the higher setting first in initdb? On any sane system that won't cost anything because it'll succeed with the higher value.
Greetings,
Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Andres Freund <andres@anarazel.de> writes:
How about trying the higher setting first in initdb? On any sane system that won't cost anything because it'll succeed with the higher value.
That might be a good compromise. You'd have to think about how
it should interact with initdb's probes for workable values of
max_connections. My first thought about that is to have initdb
set autovacuum_worker_slots to max_connections / 8 or thereabouts
as it works down the list of max_connections values to try. Or
you could do something more complicated, but I don't see a reason
to make it too complex.
regards, tom lane
On Mon, Jan 06, 2025 at 05:36:17PM -0500, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
How about trying the higher setting first in initdb? On any sane system
that won't cost anything because it'll succeed with the higher value.
That might be a good compromise.
+1, I like the idea.
You'd have to think about how
it should interact with initdb's probes for workable values of
max_connections. My first thought about that is to have initdb
set autovacuum_worker_slots to max_connections / 8 or thereabouts
as it works down the list of max_connections values to try. Or
you could do something more complicated, but I don't see a reason
to make it too complex.
My first instinct was just to set it to the lowest default we'd consider
during the max_connections tests (which I'm assuming is 3 due to the
current default for autovacuum_max_workers). That way, the max_connections
default won't change from version to version on affected systems, but you
might get some extra autovacuum slots.
--
nathan
Attachments:
v1-0001-Lower-default-value-of-autovacuum_worker_slots-in.patch
From e35b4c5425cbe1120d3c38619be4f1523ae1f2c9 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Mon, 6 Jan 2025 17:14:54 -0600
Subject: [PATCH v1 1/1] Lower default value of autovacuum_worker_slots in
initdb as needed.
---
doc/src/sgml/config.sgml | 5 +++--
src/bin/initdb/initdb.c | 41 +++++++++++++++++++++++++++++++++++-----
2 files changed, 39 insertions(+), 7 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 740ff5d5044..8683f0bdf53 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8639,8 +8639,9 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
<listitem>
<para>
Specifies the number of backend slots to reserve for autovacuum worker
- processes. The default is 16. This parameter can only be set at server
- start.
+ processes. The default is typically 16 slots, but might be less if
+ your kernel settings will not support it (as determined during initdb).
+ This parameter can only be set at server start.
</para>
<para>
When changing this value, consider also adjusting
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 4e4b7ede190..1214f1b492a 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -197,6 +197,7 @@ static char *pgdata_native;
/* defaults */
static int n_connections = 10;
static int n_buffers = 50;
+static int n_av_slots = 16;
static const char *dynamic_shared_memory_type = NULL;
static const char *default_timezone = NULL;
@@ -273,7 +274,8 @@ static void check_input(char *path);
static void write_version_file(const char *extrapath);
static void set_null_conf(void);
static void test_config_settings(void);
-static bool test_specific_config_settings(int test_conns, int test_buffs);
+static bool test_specific_config_settings(int test_conns, int test_buffs,
+ int test_av_slots);
static void setup_config(void);
static void bootstrap_template1(void);
static void setup_auth(FILE *cmdfd);
@@ -1118,6 +1120,13 @@ test_config_settings(void)
*/
#define MIN_BUFS_FOR_CONNS(nconns) ((nconns) * 10)
+ /*
+ * This macro defines the minimum default autovacuum_worker_slots we are
+ * willing to consider. It should be kept >= the default for
+ * autovacuum_max_workers.
+ */
+#define MIN_AV_WORKER_SLOTS (3)
+
static const int trial_conns[] = {
100, 50, 40, 30, 25
};
@@ -1155,7 +1164,9 @@ test_config_settings(void)
test_conns = trial_conns[i];
test_buffs = MIN_BUFS_FOR_CONNS(test_conns);
- if (test_specific_config_settings(test_conns, test_buffs))
+ if (test_specific_config_settings(test_conns,
+ test_buffs,
+ MIN_AV_WORKER_SLOTS))
{
ok_buffers = test_buffs;
break;
@@ -1180,7 +1191,9 @@ test_config_settings(void)
break;
}
- if (test_specific_config_settings(n_connections, test_buffs))
+ if (test_specific_config_settings(n_connections,
+ test_buffs,
+ MIN_AV_WORKER_SLOTS))
break;
}
n_buffers = test_buffs;
@@ -1190,6 +1203,19 @@ test_config_settings(void)
else
printf("%dkB\n", n_buffers * (BLCKSZ / 1024));
+ printf(_("selecting default \"autovacuum_worker_slots\" ... "));
+ fflush(stdout);
+
+ for (; n_av_slots > MIN_AV_WORKER_SLOTS; n_av_slots--)
+ {
+ if (test_specific_config_settings(n_connections,
+ n_buffers,
+ n_av_slots))
+ break;
+ }
+
+ printf("%d\n", n_av_slots);
+
printf(_("selecting default time zone ... "));
fflush(stdout);
default_timezone = select_default_timezone(share_path);
@@ -1200,7 +1226,7 @@ test_config_settings(void)
* Test a specific combination of configuration settings.
*/
static bool
-test_specific_config_settings(int test_conns, int test_buffs)
+test_specific_config_settings(int test_conns, int test_buffs, int test_av_slots)
{
PQExpBufferData cmd;
_stringlist *gnames,
@@ -1214,9 +1240,10 @@ test_specific_config_settings(int test_conns, int test_buffs)
"\"%s\" --check %s %s "
"-c max_connections=%d "
"-c shared_buffers=%d "
+ "-c autovacuum_worker_slots=%d "
"-c dynamic_shared_memory_type=%s",
backend_exec, boot_options, extra_options,
- test_conns, test_buffs,
+ test_conns, test_buffs, test_av_slots,
dynamic_shared_memory_type);
/* Add any user-given setting overrides */
@@ -1289,6 +1316,10 @@ setup_config(void)
conflines = replace_guc_value(conflines, "shared_buffers",
repltok, false);
+ snprintf(repltok, sizeof(repltok), "%d", n_av_slots);
+ conflines = replace_guc_value(conflines, "autovacuum_worker_slots",
+ repltok, false);
+
conflines = replace_guc_value(conflines, "lc_messages",
lc_messages, false);
--
2.39.5 (Apple Git-154)
Nathan Bossart <nathandbossart@gmail.com> writes:
On Mon, Jan 06, 2025 at 05:36:17PM -0500, Tom Lane wrote:
You'd have to think about how
it should interact with initdb's probes for workable values of
max_connections. My first thought about that is to have initdb
set autovacuum_worker_slots to max_connections / 8 or thereabouts
as it works down the list of max_connections values to try. Or
you could do something more complicated, but I don't see a reason
to make it too complex.
My first instinct was just to set it to the lowest default we'd consider
during the max_connections tests (which I'm assuming is 3 due to the
current default for autovacuum_max_workers). That way, the max_connections
default won't change from version to version on affected systems, but you
might get some extra autovacuum slots.
My only objection to this algorithm is it adds cycles to initdb,
in the form of at least one additional "postgres --check" step.
Admittedly that's not hugely expensive, but it'll add up over time
in the buildfarm, and I'm not sure this issue is worth that.
We already changed the max_connections default for affected systems
as a consequence of 38da05346, so I don't think the argument about not
changing it holds much water.
regards, tom lane
On Mon, Jan 06, 2025 at 06:36:43PM -0500, Tom Lane wrote:
Nathan Bossart <nathandbossart@gmail.com> writes:
My first instinct was just to set it to the lowest default we'd consider
during the max_connections tests (which I'm assuming is 3 due to the
current default for autovacuum_max_workers). That way, the max_connections
default won't change from version to version on affected systems, but you
might get some extra autovacuum slots.
My only objection to this algorithm is it adds cycles to initdb,
in the form of at least one additional "postgres --check" step.
Admittedly that's not hugely expensive, but it'll add up over time
in the buildfarm, and I'm not sure this issue is worth that.
We already changed the max_connections default for affected systems
as a consequence of 38da05346, so I don't think the argument about not
changing it holds much water.
I see. Here's a version that uses your max_connections / 8 idea. I've
lowered the initial default value of autovacuum_worker_slots to 12 to keep
the code as simple as possible. I considered trying 16 in the first
iteration or constructing a complicated formula like
autovacuum_worker_slots = (max_connections * 13) / 75 - 1
but this stuff is pretty fragile already, so I felt that simplicity was
desirable in this case.
--
nathan
Attachments:
v2-0001-Lower-default-value-of-autovacuum_worker_slots-in.patch
From a2cb9161ccd7bf08c5273e4b4e012861a410ce51 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Mon, 6 Jan 2025 17:14:54 -0600
Subject: [PATCH v2 1/1] Lower default value of autovacuum_worker_slots in
initdb as needed.
---
doc/src/sgml/config.sgml | 5 +--
src/backend/utils/misc/guc_tables.c | 2 +-
src/backend/utils/misc/postgresql.conf.sample | 2 +-
src/bin/initdb/initdb.c | 34 +++++++++++++++----
4 files changed, 33 insertions(+), 10 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 740ff5d5044..a47c4dbfcb2 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8639,8 +8639,9 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
<listitem>
<para>
Specifies the number of backend slots to reserve for autovacuum worker
- processes. The default is 16. This parameter can only be set at server
- start.
+ processes. The default is typically 12 slots, but might be less if
+ your kernel settings will not support it (as determined during initdb).
+ This parameter can only be set at server start.
</para>
<para>
When changing this value, consider also adjusting
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index c9d8cd796a8..8781495a622 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3472,7 +3472,7 @@ struct config_int ConfigureNamesInt[] =
NULL
},
&autovacuum_worker_slots,
- 16, 1, MAX_BACKENDS,
+ 12, 1, MAX_BACKENDS,
NULL, NULL, NULL
},
{
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index b2bc43383db..3405ca0e726 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -661,7 +661,7 @@
#autovacuum = on # Enable autovacuum subprocess? 'on'
# requires track_counts to also be on.
-autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
+autovacuum_worker_slots = 12 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
#autovacuum_naptime = 1min # time between autovacuum runs
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 4e4b7ede190..b5023484771 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -196,6 +196,7 @@ static char *pgdata_native;
/* defaults */
static int n_connections = 10;
+static int n_av_slots = 12;
static int n_buffers = 50;
static const char *dynamic_shared_memory_type = NULL;
static const char *default_timezone = NULL;
@@ -273,7 +274,8 @@ static void check_input(char *path);
static void write_version_file(const char *extrapath);
static void set_null_conf(void);
static void test_config_settings(void);
-static bool test_specific_config_settings(int test_conns, int test_buffs);
+static bool test_specific_config_settings(int test_conns, int test_av_slots,
+ int test_buffs);
static void setup_config(void);
static void bootstrap_template1(void);
static void setup_auth(FILE *cmdfd);
@@ -1118,6 +1120,12 @@ test_config_settings(void)
*/
#define MIN_BUFS_FOR_CONNS(nconns) ((nconns) * 10)
+ /*
+ * This macro defines the default value of autovacuum_worker_slots we want
+ * for a given max_connections value.
+ */
+#define AV_SLOTS_FOR_CONNS(nconns) ((nconns) / 8)
+
static const int trial_conns[] = {
100, 50, 40, 30, 25
};
@@ -1145,7 +1153,8 @@ test_config_settings(void)
/*
* Probe for max_connections before shared_buffers, since it is subject to
- * more constraints than shared_buffers.
+ * more constraints than shared_buffers. We also choose the default
+ * autovacuum_worker_slots here.
*/
printf(_("selecting default \"max_connections\" ... "));
fflush(stdout);
@@ -1154,8 +1163,9 @@ test_config_settings(void)
{
test_conns = trial_conns[i];
test_buffs = MIN_BUFS_FOR_CONNS(test_conns);
+ n_av_slots = AV_SLOTS_FOR_CONNS(test_conns);
- if (test_specific_config_settings(test_conns, test_buffs))
+ if (test_specific_config_settings(test_conns, n_av_slots, test_buffs))
{
ok_buffers = test_buffs;
break;
@@ -1167,6 +1177,13 @@ test_config_settings(void)
printf("%d\n", n_connections);
+ /*
+ * We chose the default for autovacuum_worker_slots during the
+ * max_connections tests above, but we print a progress message anyway.
+ */
+ printf(_("selecting default \"autovacuum_worker_slots\" ... %d\n"),
+ n_av_slots);
+
printf(_("selecting default \"shared_buffers\" ... "));
fflush(stdout);
@@ -1180,7 +1197,7 @@ test_config_settings(void)
break;
}
- if (test_specific_config_settings(n_connections, test_buffs))
+ if (test_specific_config_settings(n_connections, n_av_slots, test_buffs))
break;
}
n_buffers = test_buffs;
@@ -1200,7 +1217,7 @@ test_config_settings(void)
* Test a specific combination of configuration settings.
*/
static bool
-test_specific_config_settings(int test_conns, int test_buffs)
+test_specific_config_settings(int test_conns, int test_av_slots, int test_buffs)
{
PQExpBufferData cmd;
_stringlist *gnames,
@@ -1213,10 +1230,11 @@ test_specific_config_settings(int test_conns, int test_buffs)
printfPQExpBuffer(&cmd,
"\"%s\" --check %s %s "
"-c max_connections=%d "
+ "-c autovacuum_worker_slots=%d "
"-c shared_buffers=%d "
"-c dynamic_shared_memory_type=%s",
backend_exec, boot_options, extra_options,
- test_conns, test_buffs,
+ test_conns, test_av_slots, test_buffs,
dynamic_shared_memory_type);
/* Add any user-given setting overrides */
@@ -1280,6 +1298,10 @@ setup_config(void)
conflines = replace_guc_value(conflines, "max_connections",
repltok, false);
+ snprintf(repltok, sizeof(repltok), "%d", n_av_slots);
+ conflines = replace_guc_value(conflines, "autovacuum_worker_slots",
+ repltok, false);
+
if ((n_buffers * (BLCKSZ / 1024)) % 1024 == 0)
snprintf(repltok, sizeof(repltok), "%dMB",
(n_buffers * (BLCKSZ / 1024)) / 1024);
--
2.39.5 (Apple Git-154)
Nathan Bossart <nathandbossart@gmail.com> writes:
On Mon, Jan 06, 2025 at 06:36:43PM -0500, Tom Lane wrote:
We already changed the max_connections default for affected systems
as a consequence of 38da05346, so I don't think the argument about not
changing it holds much water.
I see. Here's a version that uses your max_connections / 8 idea. I've
lowered the initial default value of autovacuum_worker_slots to 12 to keep
the code as simple as possible. I considered trying 16 in the first
iteration or constructing a complicated formula like
autovacuum_worker_slots = (max_connections * 13) / 75 - 1
but this stuff is pretty fragile already, so I felt that simplicity was
desirable in this case.
+1 for simplicity ... but on reflection, what do you think about
using max_connections / 6? That would keep autovacuum_worker_slots
at 100 / 6 = 16 for the vast majority of systems. For the worst case
*BSD machines, we'd select 25 / 6 = 4 which results in consuming one
more semaphore than where we were yesterday. I'm willing to accept
that outcome though, since we still have 3 or so to spare.
Other than the specific magic number, your patch LGTM.
regards, tom lane
On Mon, Jan 06, 2025 at 10:29:07PM -0500, Tom Lane wrote:
+1 for simplicity ... but on reflection, what do you think about
using max_connections / 6? That would keep autovacuum_worker_slots
at 100 / 6 = 16 for the vast majority of systems. For the worst case
*BSD machines, we'd select 25 / 6 = 4 which results in consuming one
more semaphore than where we were yesterday. I'm willing to accept
that outcome though, since we still have 3 or so to spare.
WFM. I'm kicking myself for not having thought of that...
Other than the specific magic number, your patch LGTM.
Here's a new version of the patch with some small cosmetic changes
(including more commentary about the formula) and the constant changed to
6. I'll go commit this shortly.
--
nathan
Attachments:
v3-0001-Lower-default-value-of-autovacuum_worker_slots-in.patch (text/plain; charset=us-ascii)
From d8b9fe6d3c166c14d9a8d17f43418be33c4e0784 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Tue, 7 Jan 2025 11:15:22 -0600
Subject: [PATCH v3 1/1] Lower default value of autovacuum_worker_slots in
initdb as needed.
TODO
Reported-by: Tom Lane
Suggested-by: Andres Freund
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/1346002.1736198977%40sss.pgh.pa.us
---
doc/src/sgml/config.sgml | 5 +++--
src/bin/initdb/initdb.c | 40 ++++++++++++++++++++++++++++++++++------
2 files changed, 37 insertions(+), 8 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 740ff5d5044..8683f0bdf53 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8639,8 +8639,9 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
<listitem>
<para>
Specifies the number of backend slots to reserve for autovacuum worker
- processes. The default is 16. This parameter can only be set at server
- start.
+ processes. The default is typically 16 slots, but might be less if
+ your kernel settings will not support it (as determined during initdb).
+ This parameter can only be set at server start.
</para>
<para>
When changing this value, consider also adjusting
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 4e4b7ede190..f2b9d50e9b3 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -196,6 +196,7 @@ static char *pgdata_native;
/* defaults */
static int n_connections = 10;
+static int n_av_slots = 16;
static int n_buffers = 50;
static const char *dynamic_shared_memory_type = NULL;
static const char *default_timezone = NULL;
@@ -273,7 +274,8 @@ static void check_input(char *path);
static void write_version_file(const char *extrapath);
static void set_null_conf(void);
static void test_config_settings(void);
-static bool test_specific_config_settings(int test_conns, int test_buffs);
+static bool test_specific_config_settings(int test_conns, int test_av_slots,
+ int test_buffs);
static void setup_config(void);
static void bootstrap_template1(void);
static void setup_auth(FILE *cmdfd);
@@ -1118,6 +1120,18 @@ test_config_settings(void)
*/
#define MIN_BUFS_FOR_CONNS(nconns) ((nconns) * 10)
+ /*
+ * This macro defines the default value of autovacuum_worker_slots we want
+ * for a given max_connections value. Note that it has been carefully
+ * crafted to provide specific values for the associated values in
+ * trial_conns. We want it to return autovacuum_worker_slot's initial
+ * default value (16) for the maximum value in trial_conns (100), and we
+ * want it to return close to the minimum value we'd consider (3, which is
+ * the default of autovacuum_max_workers) for the minimum value in
+ * trial_conns (25).
+ */
+#define AV_SLOTS_FOR_CONNS(nconns) ((nconns) / 6)
+
static const int trial_conns[] = {
100, 50, 40, 30, 25
};
@@ -1145,7 +1159,8 @@ test_config_settings(void)
/*
* Probe for max_connections before shared_buffers, since it is subject to
- * more constraints than shared_buffers.
+ * more constraints than shared_buffers. We also choose the default
+ * autovacuum_worker_slots here.
*/
printf(_("selecting default \"max_connections\" ... "));
fflush(stdout);
@@ -1153,9 +1168,10 @@ test_config_settings(void)
for (i = 0; i < connslen; i++)
{
test_conns = trial_conns[i];
+ n_av_slots = AV_SLOTS_FOR_CONNS(test_conns);
test_buffs = MIN_BUFS_FOR_CONNS(test_conns);
- if (test_specific_config_settings(test_conns, test_buffs))
+ if (test_specific_config_settings(test_conns, n_av_slots, test_buffs))
{
ok_buffers = test_buffs;
break;
@@ -1167,6 +1183,13 @@ test_config_settings(void)
printf("%d\n", n_connections);
+ /*
+ * We chose the default for autovacuum_worker_slots during the
+ * max_connections tests above, but we print a progress message anyway.
+ */
+ printf(_("selecting default \"autovacuum_worker_slots\" ... %d\n"),
+ n_av_slots);
+
printf(_("selecting default \"shared_buffers\" ... "));
fflush(stdout);
@@ -1180,7 +1203,7 @@ test_config_settings(void)
break;
}
- if (test_specific_config_settings(n_connections, test_buffs))
+ if (test_specific_config_settings(n_connections, n_av_slots, test_buffs))
break;
}
n_buffers = test_buffs;
@@ -1200,7 +1223,7 @@ test_config_settings(void)
* Test a specific combination of configuration settings.
*/
static bool
-test_specific_config_settings(int test_conns, int test_buffs)
+test_specific_config_settings(int test_conns, int test_av_slots, int test_buffs)
{
PQExpBufferData cmd;
_stringlist *gnames,
@@ -1213,10 +1236,11 @@ test_specific_config_settings(int test_conns, int test_buffs)
printfPQExpBuffer(&cmd,
"\"%s\" --check %s %s "
"-c max_connections=%d "
+ "-c autovacuum_worker_slots=%d "
"-c shared_buffers=%d "
"-c dynamic_shared_memory_type=%s",
backend_exec, boot_options, extra_options,
- test_conns, test_buffs,
+ test_conns, test_av_slots, test_buffs,
dynamic_shared_memory_type);
/* Add any user-given setting overrides */
@@ -1280,6 +1304,10 @@ setup_config(void)
conflines = replace_guc_value(conflines, "max_connections",
repltok, false);
+ snprintf(repltok, sizeof(repltok), "%d", n_av_slots);
+ conflines = replace_guc_value(conflines, "autovacuum_worker_slots",
+ repltok, false);
+
if ((n_buffers * (BLCKSZ / 1024)) % 1024 == 0)
snprintf(repltok, sizeof(repltok), "%dMB",
(n_buffers * (BLCKSZ / 1024)) / 1024);
--
2.39.5 (Apple Git-154)
Nathan Bossart <nathandbossart@gmail.com> writes:
Here's a new version of the patch with some small cosmetic changes
(including more commentary about the formula) and the constant changed to
6. I'll go commit this shortly.
This one WFM. Thanks!
regards, tom lane
On Tue, Jan 07, 2025 at 02:22:42PM -0500, Tom Lane wrote:
This one WFM. Thanks!
Committed, thanks for the report/review.
--
nathan
On 07.01.25 18:23, Nathan Bossart wrote:
+	/*
+	 * We chose the default for autovacuum_worker_slots during the
+	 * max_connections tests above, but we print a progress message anyway.
+	 */
+	printf(_("selecting default \"autovacuum_worker_slots\" ... %d\n"),
+		   n_av_slots);
+
This initdb output seems, well, kinda fake, which it is by its own
admission. Could we do this less fake maybe like this:
selecting default "max_connections", "autovacuum_worker_slots" ... 100, 16
with the actual wait at the "..."?
(It doesn't seem impossible that someone will want to add more default
selecting for various worker or process slots, and this would allow adding
these easily, versus adding more "fake" output lines.)
Peter Eisentraut <peter@eisentraut.org> writes:
This initdb output seems, well, kinda fake, which it is by its own
admission.
Agreed.
Could we do this less fake maybe like this:
selecting default "max_connections", "autovacuum_worker_slots" ... 100, 16
with the actual wait at the "..."?
Perhaps that would be all right ...
(It doesn't seem impossible that someone will want to add more default
selecting for various worker or process slots, and this would allow adding
these easily, versus adding more "fake" output lines.)
... but I can't see this approach scaling to three or four or five
outputs. The line would get unreasonably long.
My own proposal given the way it works now is to just print
max_connections and not mention autovacuum_worker_slots at all.
Our choice for max_connections is worth reporting, but I don't
feel that everything derived from it needs to be reported.
regards, tom lane
On Mon, Apr 28, 2025 at 09:14:54AM -0400, Tom Lane wrote:
Peter Eisentraut <peter@eisentraut.org> writes:
This initdb output seems, well, kinda fake, which it is by its own
admission.
Agreed.
Could we do this less fake maybe like this:
selecting default "max_connections", "autovacuum_worker_slots" ... 100, 16
with the actual wait at the "..."?
Perhaps that would be all right ...
(It doesn't seem impossible that someone will want to add more default
selecting for various worker or process slots, and this would allow adding
these easily, versus adding more "fake" output lines.)
... but I can't see this approach scaling to three or four or five
outputs. The line would get unreasonably long.
My own proposal given the way it works now is to just print
max_connections and not mention autovacuum_worker_slots at all.
Our choice for max_connections is worth reporting, but I don't
feel that everything derived from it needs to be reported.
I'm fine with either of these ideas. If I had to choose one, I'd just
remove the autovacuum_worker_slots report for the reasons Tom noted.
However, weren't we considering reverting some of this stuff [0]? I see
that sawshark is now choosing max_connections = 40 and
autovacuum_worker_slots = 6, and since there are no other apparent related
buildfarm failures, I'm assuming that nobody else is testing the 60
semaphores case anymore.
[0]: /messages/by-id/618497.1742347456@sss.pgh.pa.us
--
nathan
On 28.04.25 16:41, Nathan Bossart wrote:
On Mon, Apr 28, 2025 at 09:14:54AM -0400, Tom Lane wrote:
Peter Eisentraut <peter@eisentraut.org> writes:
This initdb output seems, well, kinda fake, which it is by its own
admission.
Agreed.
Could we do this less fake maybe like this:
selecting default "max_connections", "autovacuum_worker_slots" ... 100, 16
with the actual wait at the "..."?
Perhaps that would be all right ...
(It doesn't seem impossible that someone will want to add more default
selecting for various worker or process slots, and this would allow adding
these easily, versus adding more "fake" output lines.)
... but I can't see this approach scaling to three or four or five
outputs. The line would get unreasonably long.
My own proposal given the way it works now is to just print
max_connections and not mention autovacuum_worker_slots at all.
Our choice for max_connections is worth reporting, but I don't
feel that everything derived from it needs to be reported.
I'm fine with either of these ideas. If I had to choose one, I'd just
remove the autovacuum_worker_slots report for the reasons Tom noted.
Yes, removing the report is also fine by me.
However, weren't we considering reverting some of this stuff [0]? I see
that sawshark is now choosing max_connections = 40 and
autovacuum_worker_slots = 6, and since there are no other apparent related
buildfarm failures, I'm assuming that nobody else is testing the 60
semaphores case anymore.
(I don't have any thoughts on this.)
On Tue, Apr 29, 2025 at 08:36:48AM +0200, Peter Eisentraut wrote:
On 28.04.25 16:41, Nathan Bossart wrote:
On Mon, Apr 28, 2025 at 09:14:54AM -0400, Tom Lane wrote:
My own proposal given the way it works now is to just print
max_connections and not mention autovacuum_worker_slots at all.
Our choice for max_connections is worth reporting, but I don't
feel that everything derived from it needs to be reported.
I'm fine with either of these ideas. If I had to choose one, I'd just
remove the autovacuum_worker_slots report for the reasons Tom noted.
Yes, removing the report is also fine by me.
Committed.
--
nathan
Peter Eisentraut <peter@eisentraut.org> writes:
On 28.04.25 16:41, Nathan Bossart wrote:
However, weren't we considering reverting some of this stuff [0]? I see
that sawshark is now choosing max_connections = 40 and
autovacuum_worker_slots = 6, and since there are no other apparent related
buildfarm failures, I'm assuming that nobody else is testing the 60
semaphores case anymore.
(I don't have any thoughts on this.)
Andres seemed lukewarm about reverting 38da05346 or 6d0154196, so
I left it be for the moment. But I still feel the argument is good
that "these will do little except confuse future hackers". Barring
objection, I'll go revert them.
regards, tom lane
On Tue, Apr 29, 2025 at 01:19:18PM -0400, Tom Lane wrote:
Peter Eisentraut <peter@eisentraut.org> writes:
On 28.04.25 16:41, Nathan Bossart wrote:
However, weren't we considering reverting some of this stuff [0]? I see
that sawshark is now choosing max_connections = 40 and
autovacuum_worker_slots = 6, and since there are no other apparent related
buildfarm failures, I'm assuming that nobody else is testing the 60
semaphores case anymore.(I don't have any thoughts on this.)
Andres seemed lukewarm about reverting 38da05346 or 6d0154196, so
I left it be for the moment. But I still feel the argument is good
that "these will do little except confuse future hackers". Barring
objection, I'll go revert them.
+1, I almost threatened the same but wasn't totally positive where the
discussion stood.
--
nathan
I wrote:
Andres seemed lukewarm about reverting 38da05346 or 6d0154196, so
I left it be for the moment. But I still feel the argument is good
that "these will do little except confuse future hackers". Barring
objection, I'll go revert them.
Actually ... on looking again at 6d0154196 ("Lower default value of
autovacuum_worker_slots in initdb as needed"), it doesn't look that
silly. If we're unable to allocate max_connections = 100, turning
it down while still insisting on 16 AV worker slots doesn't seem
terribly sane. Maybe we'd choose a formula other than
"(max_connections / 6)" if we were doing it afresh, but not scaling
autovacuum_worker_slots at all doesn't seem like the best answer.
So now I'm inclined to leave that one alone. I'd still revert
38da05346, which means the comment added by 6d0154196 needs some minor
adjustments. But I think we can stick with the "(max_connections /
6)" formula --- it will produce 3 with trial_conns = 20, but that's
enough.
regards, tom lane
On Tue, Apr 29, 2025 at 01:31:55PM -0400, Tom Lane wrote:
I wrote:
Andres seemed lukewarm about reverting 38da05346 or 6d0154196, so
I left it be for the moment. But I still feel the argument is good
that "these will do little except confuse future hackers". Barring
objection, I'll go revert them.
Actually ... on looking again at 6d0154196 ("Lower default value of
autovacuum_worker_slots in initdb as needed"), it doesn't look that
silly. If we're unable to allocate max_connections = 100, turning
it down while still insisting on 16 AV worker slots doesn't seem
terribly sane. Maybe we'd choose a formula other than
"(max_connections / 6)" if we were doing it afresh, but not scaling
autovacuum_worker_slots at all doesn't seem like the best answer.
Fair point.
So now I'm inclined to leave that one alone. I'd still revert
38da05346, which means the comment added by 6d0154196 needs some minor
adjustments. But I think we can stick with the "(max_connections /
6)" formula --- it will produce 3 with trial_conns = 20, but that's
enough.
Yup, as long as the lowest possible default is >= the default for
autovacuum_max_workers (3), we're good.
--
nathan
Nathan Bossart <nathandbossart@gmail.com> writes:
On Tue, Apr 29, 2025 at 01:31:55PM -0400, Tom Lane wrote:
So now I'm inclined to leave that one alone. I'd still revert
38da05346, which means the comment added by 6d0154196 needs some minor
adjustments. But I think we can stick with the "(max_connections /
6)" formula --- it will produce 3 with trial_conns = 20, but that's
enough.
Yup, as long as the lowest possible default is >= the default for
autovacuum_max_workers (3), we're good.
Pushed. I realized that the text about SEMMNI/SEMMNS in runtime.sgml
needed some work too, since it still said that increasing them was
optional on NetBSD/OpenBSD.
regards, tom lane
On Tue, Apr 29, 2025 at 05:31:01PM -0400, Tom Lane wrote:
Pushed. I realized that the text about SEMMNI/SEMMNS in runtime.sgml
needed some work too, since it still said that increasing them was
optional on NetBSD/OpenBSD.
Thanks. Your updates look good to me.
--
nathan