POC: Parallel processing of indexes in autovacuum
Hi!
The VACUUM command can be executed with the PARALLEL option. As the
documentation states, it will perform the index vacuum and index cleanup
phases of VACUUM in parallel using *integer* background workers. But this
useful feature is not available to autovacuum. After a quick look at
the source code, it became clear to me that when the parallel option was
added, the corresponding support for autovacuum wasn't implemented, although
there are no clear obstacles to doing so.
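For reference, here is how the manual command is invoked (a toy example; the
table name is arbitrary):

    -- index vacuum and index cleanup may use up to 4 background workers
    VACUUM (PARALLEL 4) my_table;

Autovacuum has no equivalent today.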
Actually, one of our customers ran into a problem with autovacuum on a
table with many indexes and relatively long transactions. Of course, long
transactions are an ultimate evil and the problem can be worked around by
running manual VACUUM from a cron task, but, I think, we can do better.
Anyhow, what about adding a parallel option for autovacuum? Here is a POC
patch for the proposed functionality. For the sake of simplicity, several
GUCs have been added. It would be good to think through the parallel
launch condition without them.
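To illustrate, with the patch applied the feature would be enabled roughly
like this in postgresql.conf (the GUC names come from the attached patch; the
values here are arbitrary):

    max_parallel_index_autovac_workers = 4    # 0 disables the feature
    autovac_idx_parallel_min_rows = 100000    # min dead tuples in the table
    autovac_idx_parallel_min_indexes = 4      # min indexes on the table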
As always, any thoughts and opinions are very welcome!
--
Best regards,
Maxim Orlov.
Attachments:
WIP-Allow-autovacuum-to-process-indexes-of-single-table.patch (application/octet-stream)
From 58dd9a144f065b3619615efc4c2afc1cc6721617 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Tue, 1 Apr 2025 14:39:49 +0700
Subject: [PATCH] Allow autovacuum to process indexes of single table in
parallel mode
Author: Daniil Davydov <3danissimo@gmail.com>
Author: Maxim Orlov <orlovmg@gmail.com>
---
src/backend/commands/vacuum.c | 27 +
src/backend/commands/vacuumparallel.c | 290 ++++++-
src/backend/postmaster/autovacuum.c | 801 +++++++++++++++++-
src/backend/utils/misc/guc_tables.c | 30 +
src/backend/utils/misc/postgresql.conf.sample | 6 +
src/include/postmaster/autovacuum.h | 25 +
src/test/modules/autovacuum/.gitignore | 1 +
src/test/modules/autovacuum/Makefile | 14 +
.../autovacuum/t/001_autovac_parallel.pl | 137 +++
9 files changed, 1285 insertions(+), 46 deletions(-)
create mode 100644 src/test/modules/autovacuum/.gitignore
create mode 100644 src/test/modules/autovacuum/Makefile
create mode 100644 src/test/modules/autovacuum/t/001_autovac_parallel.pl
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index db5da3ce826..5f51a65967c 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2232,6 +2232,33 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params,
else
toast_relid = InvalidOid;
+ /*
+ * Decide whether we need to process the table with the given OID in
+ * parallel mode during autovacuum.
+ */
+ if (AmAutoVacuumWorkerProcess() &&
+ params->index_cleanup != VACOPTVALUE_DISABLED)
+ {
+ PgStat_StatTabEntry *tabentry;
+
+ /* fetch the pgstat table entry */
+ tabentry = pgstat_fetch_stat_tabentry_ext(rel->rd_rel->relisshared,
+ rel->rd_id);
+ if (tabentry && tabentry->dead_tuples >= autovac_idx_parallel_min_rows)
+ {
+ List *indexes = RelationGetIndexList(rel);
+ int num_indexes = list_length(indexes);
+
+ list_free(indexes);
+
+ if (num_indexes >= autovac_idx_parallel_min_indexes &&
+ max_parallel_index_autovac_workers > 0)
+ {
+ params->nworkers = max_parallel_index_autovac_workers;
+ }
+ }
+ }
+
/*
* Switch to the table owner's userid, so that any index functions are run
* as that user. Also lock down security-restricted operations and
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 2b9d548cdeb..6094d6c649b 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,20 +1,23 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel [auto]vacuum execution.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
*
- * In a parallel vacuum, we perform both index bulk deletion and index cleanup
- * with parallel worker processes. Individual indexes are processed by one
- * vacuum process. ParallelVacuumState contains shared information as well as
- * the memory space for storing dead items allocated in the DSA area. We
+ * In a parallel [auto]vacuum, we perform both index bulk deletion and index
+ * cleanup with parallel worker processes. Individual indexes are processed by
+ * one vacuum process. ParallelVacuumState contains shared information as well
+ * as the memory space for storing dead items allocated in the DSA area. We
* launch parallel worker processes at the start of parallel index
* bulk-deletion and index cleanup and once all indexes are processed, the
* parallel worker processes exit. Each time we process indexes in parallel,
* the parallel context is re-initialized so that the same DSM can be used for
- * multiple passes of index bulk-deletion and index cleanup.
+ * multiple passes of index bulk-deletion and index cleanup. For maintenance
+ * vacuum, we launch workers manually (using dynamic bgworkers machinery), and
+ * for autovacuum we send signals to the autovacuum launcher (all logic for
+ * communication among parallel autovacuum processes is in autovacuum.c).
*
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -34,9 +37,11 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
+#include "utils/memutils.h"
#include "utils/rel.h"
/*
@@ -157,11 +162,20 @@ typedef struct PVIndStats
} PVIndStats;
/*
- * Struct for maintaining a parallel vacuum state. typedef appears in vacuum.h.
+ * Struct for maintaining a parallel [auto]vacuum state. typedef appears in
+ * vacuum.h.
*/
struct ParallelVacuumState
{
- /* NULL for worker processes */
+ /* Is this structure used for maintenance vacuum or autovacuum */
+ bool is_autovacuum;
+
+ /*
+ * NULL for worker processes.
+ *
+ * NOTE: Parallel autovacuum only needs a subset of the maintenance vacuum
+ * functionality.
+ */
ParallelContext *pcxt;
/* Parent Heap Relation */
@@ -221,6 +235,10 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static ParallelContext *CreateParallelAutoVacContext(int nworkers);
+static void InitializeParallelAutoVacDSM(ParallelContext *pcxt);
+static void DestroyParallelAutoVacContext(ParallelContext *pcxt);
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -280,15 +298,21 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
}
pvs = (ParallelVacuumState *) palloc0(sizeof(ParallelVacuumState));
+ pvs->is_autovacuum = AmAutoVacuumWorkerProcess();
pvs->indrels = indrels;
pvs->nindexes = nindexes;
pvs->will_parallel_vacuum = will_parallel_vacuum;
pvs->bstrategy = bstrategy;
pvs->heaprel = rel;
- EnterParallelMode();
- pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
- parallel_workers);
+ if (pvs->is_autovacuum)
+ pcxt = CreateParallelAutoVacContext(parallel_workers);
+ else
+ {
+ EnterParallelMode();
+ pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
+ parallel_workers);
+ }
Assert(pcxt->nworkers > 0);
pvs->pcxt = pcxt;
@@ -327,7 +351,10 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
else
querylen = 0; /* keep compiler quiet */
- InitializeParallelDSM(pcxt);
+ if (pvs->is_autovacuum)
+ InitializeParallelAutoVacDSM(pvs->pcxt);
+ else
+ InitializeParallelDSM(pcxt);
/* Prepare index vacuum stats */
indstats = (PVIndStats *) shm_toc_allocate(pcxt->toc, est_indstats_len);
@@ -371,11 +398,18 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->relid = RelationGetRelid(rel);
shared->elevel = elevel;
shared->queryid = pgstat_get_my_query_id();
- shared->maintenance_work_mem_worker =
- (nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
- shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
+
+ if (pvs->is_autovacuum)
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
+ autovacuum_work_mem;
+ else
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+ maintenance_work_mem;
+ shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
dead_items = TidStoreCreateShared(shared->dead_items_info.max_bytes,
@@ -453,8 +487,13 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
TidStoreDestroy(pvs->dead_items);
- DestroyParallelContext(pvs->pcxt);
- ExitParallelMode();
+ if (pvs->is_autovacuum)
+ DestroyParallelAutoVacContext(pvs->pcxt);
+ else
+ {
+ DestroyParallelContext((ParallelContext *) pvs->pcxt);
+ ExitParallelMode();
+ }
pfree(pvs->will_parallel_vacuum);
pfree(pvs);
@@ -532,6 +571,144 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
}
+/*
+ * Short version of CreateParallelContext (parallel.c). Here we init only those
+ * fields that are needed for parallel index processing during autovacuum.
+ */
+static ParallelContext *
+CreateParallelAutoVacContext(int nworkers)
+{
+ ParallelContext *pcxt;
+ MemoryContext oldcontext;
+
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* Number of workers should be non-negative. */
+ Assert(nworkers >= 0);
+
+ /* We might be running in a short-lived memory context. */
+ oldcontext = MemoryContextSwitchTo(TopTransactionContext);
+
+ /* Initialize a new ParallelContext. */
+ pcxt = palloc0(sizeof(ParallelContext));
+ pcxt->nworkers = nworkers;
+ pcxt->nworkers_to_launch = nworkers;
+ shm_toc_initialize_estimator(&pcxt->estimator);
+
+ /* Restore previous memory context. */
+ MemoryContextSwitchTo(oldcontext);
+
+ return pcxt;
+}
+
+/*
+ * Short version of InitializeParallelDSM (parallel.c). Here we put into dsm
+ * only those data that are needed for parallel index processing during
+ * autovacuum.
+ */
+static void
+InitializeParallelAutoVacDSM(ParallelContext *pcxt)
+{
+ MemoryContext oldcontext;
+ Size tsnaplen = 0;
+ Size asnaplen = 0;
+ Size segsize = 0;
+ char *tsnapspace;
+ char *asnapspace;
+ Snapshot transaction_snapshot = GetTransactionSnapshot();
+ Snapshot active_snapshot = GetActiveSnapshot();
+
+ Assert(pcxt->nworkers >= 1);
+
+ /* We might be running in a very short-lived memory context. */
+ oldcontext = MemoryContextSwitchTo(TopTransactionContext);
+
+ if (IsolationUsesXactSnapshot())
+ {
+ tsnaplen = EstimateSnapshotSpace(transaction_snapshot);
+ shm_toc_estimate_chunk(&pcxt->estimator, tsnaplen);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+ }
+ asnaplen = EstimateSnapshotSpace(active_snapshot);
+ shm_toc_estimate_chunk(&pcxt->estimator, asnaplen);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+
+ /* Create DSM and initialize with new table of contents. */
+ segsize = shm_toc_estimate(&pcxt->estimator);
+ pcxt->seg = dsm_create(segsize, DSM_CREATE_NULL_IF_MAXSEGMENTS);
+
+ if (pcxt->seg == NULL)
+ {
+ pcxt->nworkers = 0;
+ pcxt->private_memory = MemoryContextAlloc(TopMemoryContext, segsize);
+ }
+
+ pcxt->toc = shm_toc_create(AV_PARALLEL_MAGIC,
+ pcxt->seg == NULL ? pcxt->private_memory :
+ dsm_segment_address(pcxt->seg),
+ segsize);
+
+ /* We can skip the rest of this if we're not budgeting for any workers. */
+ if (pcxt->nworkers > 0)
+ {
+ /*
+ * Serialize the transaction snapshot if the transaction isolation
+ * level uses a transaction snapshot.
+ */
+ if (IsolationUsesXactSnapshot())
+ {
+ tsnapspace = shm_toc_allocate(pcxt->toc, tsnaplen);
+ SerializeSnapshot(transaction_snapshot, tsnapspace);
+ shm_toc_insert(pcxt->toc, AV_PARALLEL_KEY_TRANSACTION_SNAPSHOT,
+ tsnapspace);
+ }
+
+ /* Serialize the active snapshot. */
+ asnapspace = shm_toc_allocate(pcxt->toc, asnaplen);
+ SerializeSnapshot(active_snapshot, asnapspace);
+ shm_toc_insert(pcxt->toc, AV_PARALLEL_KEY_ACTIVE_SNAPSHOT, asnapspace);
+ }
+
+ /* Update nworkers_to_launch, in case we changed nworkers above. */
+ pcxt->nworkers_to_launch = pcxt->nworkers;
+
+ /* Restore previous memory context. */
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Short version of DestroyParallelContext (parallel.c). Here we clean up only
+ * those data that were used during parallel index processing during autovacuum.
+ */
+static void
+DestroyParallelAutoVacContext(ParallelContext *pcxt)
+{
+ /*
+ * If we have allocated a shared memory segment, detach it. This will
+ * implicitly detach the error queues, and any other shared memory queues,
+ * stored there.
+ */
+ if (pcxt->seg != NULL)
+ {
+ dsm_detach(pcxt->seg);
+ pcxt->seg = NULL;
+ }
+
+ /*
+ * If this parallel context is actually in backend-private memory rather
+ * than shared memory, free that memory instead.
+ */
+ if (pcxt->private_memory != NULL)
+ {
+ pfree(pcxt->private_memory);
+ pcxt->private_memory = NULL;
+ }
+
+ AutoVacuumReleaseParallelWork(false);
+ pfree(pcxt);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -558,7 +735,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster ||
+ (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()) ||
+ (max_parallel_index_autovac_workers == 0 && AmAutoVacuumWorkerProcess()))
return 0;
/*
@@ -597,15 +776,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = AmAutoVacuumWorkerProcess() ?
+ Min(parallel_workers, max_parallel_index_autovac_workers) :
+ Min(parallel_workers, max_parallel_maintenance_workers);
return parallel_workers;
}
/*
* Perform index vacuum or index cleanup with parallel workers. This function
- * must be used by the parallel vacuum leader process.
+ * must be used by the parallel [auto]vacuum leader process.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -670,7 +851,7 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
if (nworkers > 0)
{
/* Reinitialize parallel context to relaunch parallel workers */
- if (num_index_scans > 0)
+ if (num_index_scans > 0 && !pvs->is_autovacuum)
ReinitializeParallelDSM(pvs->pcxt);
/*
@@ -686,9 +867,23 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
* The number of workers can vary between bulkdelete and cleanup
* phase.
*/
- ReinitializeParallelWorkers(pvs->pcxt, nworkers);
-
- LaunchParallelWorkers(pvs->pcxt);
+ if (pvs->is_autovacuum)
+ {
+ pvs->pcxt->nworkers_to_launch = Min(pvs->pcxt->nworkers, nworkers);
+ if (pvs->pcxt->nworkers > 0 && pvs->pcxt->nworkers_to_launch > 0)
+ {
+ pvs->pcxt->nworkers_launched =
+ LaunchParallelAutovacuumWorkers(pvs->heaprel->rd_id,
+ pvs->pcxt->nworkers_to_launch,
+ dsm_segment_handle(pvs->pcxt->seg),
+ MyProc, MyProcPid);
+ }
+ }
+ else
+ {
+ ReinitializeParallelWorkers(pvs->pcxt, nworkers);
+ LaunchParallelWorkers(pvs->pcxt);
+ }
if (pvs->pcxt->nworkers_launched > 0)
{
@@ -733,8 +928,14 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
if (nworkers > 0)
{
- /* Wait for all vacuum workers to finish */
- WaitForParallelWorkersToFinish(pvs->pcxt);
+ /*
+ * Wait for all [auto]vacuum workers (involved in parallel index
+ * processing) to finish.
+ */
+ if (pvs->is_autovacuum)
+ ParallelAutovacuumEndSyncPoint(false);
+ else
+ WaitForParallelWorkersToFinish(pvs->pcxt);
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
@@ -982,8 +1183,8 @@ parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
/*
* Perform work within a launched parallel process.
*
- * Since parallel vacuum workers perform only index vacuum or index cleanup,
- * we don't need to report progress information.
+ * Since parallel [auto]vacuum workers perform only index vacuum or index
+ * cleanup, we don't need to report progress information.
*/
void
parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
@@ -997,23 +1198,22 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
BufferUsage *buffer_usage;
WalUsage *wal_usage;
int nindexes;
+ int worker_number;
char *sharedquery;
ErrorContextCallback errcallback;
- /*
- * A parallel vacuum worker must have only PROC_IN_VACUUM flag since we
- * don't support parallel vacuum for autovacuum as of now.
- */
- Assert(MyProc->statusFlags == PROC_IN_VACUUM);
-
- elog(DEBUG1, "starting parallel vacuum worker");
+ Assert(MyProc->statusFlags == PROC_IN_VACUUM || AmAutoVacuumWorkerProcess());
+ elog(DEBUG1, "starting parallel [auto]vacuum worker");
shared = (PVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED, false);
/* Set debug_query_string for individual workers */
- sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
- debug_query_string = sharedquery;
- pgstat_report_activity(STATE_RUNNING, debug_query_string);
+ if (!AmAutoVacuumWorkerProcess())
+ {
+ sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
+ debug_query_string = sharedquery;
+ pgstat_report_activity(STATE_RUNNING, debug_query_string);
+ }
/* Track query ID */
pgstat_report_query_id(shared->queryid, false);
@@ -1091,8 +1291,12 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
/* Report buffer/WAL usage during parallel execution */
buffer_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_BUFFER_USAGE, false);
wal_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_WAL_USAGE, false);
- InstrEndParallelQuery(&buffer_usage[ParallelWorkerNumber],
- &wal_usage[ParallelWorkerNumber]);
+
+ worker_number = AmAutoVacuumWorkerProcess() ?
+ GetAutoVacuumParallelWorkerNumber() : ParallelWorkerNumber;
+
+ InstrEndParallelQuery(&buffer_usage[worker_number],
+ &wal_usage[worker_number]);
/* Report any remaining cost-based vacuum delay time */
if (track_cost_delay_timing)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 16756152b71..60192ecb8f5 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -90,6 +90,7 @@
#include "postmaster/postmaster.h"
#include "storage/aio_subsys.h"
#include "storage/bufmgr.h"
+#include "storage/condition_variable.h"
#include "storage/ipc.h"
#include "storage/latch.h"
#include "storage/lmgr.h"
@@ -102,6 +103,7 @@
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
#include "utils/injection_point.h"
+#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
@@ -129,6 +131,9 @@ int autovacuum_anl_thresh;
double autovacuum_anl_scale;
int autovacuum_freeze_max_age;
int autovacuum_multixact_freeze_max_age;
+int max_parallel_index_autovac_workers;
+int autovac_idx_parallel_min_rows;
+int autovac_idx_parallel_min_indexes;
double autovacuum_vac_cost_delay;
int autovacuum_vac_cost_limit;
@@ -164,6 +169,14 @@ static int default_freeze_table_age;
static int default_multixact_freeze_min_age;
static int default_multixact_freeze_table_age;
+/*
+ * Number of additional workers that were requested for parallel index processing
+ * during autovacuum.
+ */
+static int nworkers_for_idx_autovac = 0;
+
+static int nworkers_launched = 0;
+
/* Memory context for long-lived data */
static MemoryContext AutovacMemCxt;
@@ -222,6 +235,8 @@ typedef struct autovac_table
* wi_proc pointer to PGPROC of the running worker, NULL if not started
* wi_launchtime Time at which this worker was launched
* wi_dobalance Whether this worker should be included in balance calculations
+ * wi_pcleanup if (> 0), this worker must participate in parallel index
+ * vacuuming as a supportive worker. Must be (== 0) for the leader worker.
*
* All fields are protected by AutovacuumLock, except for wi_tableoid and
* wi_sharedrel which are protected by AutovacuumScheduleLock (note these
@@ -237,10 +252,17 @@ typedef struct WorkerInfoData
TimestampTz wi_launchtime;
pg_atomic_flag wi_dobalance;
bool wi_sharedrel;
+ int wi_pcleanup;
} WorkerInfoData;
typedef struct WorkerInfoData *WorkerInfo;
+#define AmParallelIdxAutoVacSupportive() \
+ (MyWorkerInfo != NULL && MyWorkerInfo->wi_pcleanup > 0)
+
+#define AmParallelIdxAutoVacLeader() \
+ (MyWorkerInfo != NULL && MyWorkerInfo->wi_pcleanup == 0)
+
/*
* Possible signals received by the launcher from remote processes. These are
* stored atomically in shared memory so that other processes can set them
@@ -250,9 +272,11 @@ typedef enum
{
AutoVacForkFailed, /* failed trying to start a worker */
AutoVacRebalance, /* rebalance the cost limits */
+ AutoVacParallelReq, /* request for parallel index vacuum */
+ AutoVacNumSignals, /* must be last */
} AutoVacuumSignal;
-#define AutoVacNumSignals (AutoVacRebalance + 1)
+#define AutoVacNumSignals (AutoVacParallelReq + 1)
/*
* Autovacuum workitem array, stored in AutoVacuumShmem->av_workItems. This
@@ -272,6 +296,49 @@ typedef struct AutoVacuumWorkItem
#define NUM_WORKITEMS 256
+typedef enum
+{
+ LAUNCHER = 0, /* autovacuum launcher must wake everyone up */
+ LEADER, /* leader must wake everyone up */
+ LAST_WORKER, /* the last initialized supportive worker must wake
+ everyone up */
+} SyncType;
+
+typedef enum
+{
+ STARTUP = 0, /* initial value - no sync points were passed */
+ START_SYNC_POINT_PASSED, /* start_sync_point was passed */
+ END_SYNC_POINT_PASSED, /* end_sync_point was passed */
+ SHUTDOWN, /* leader wants to shut down parallel index
+ vacuum due to an error */
+} Status;
+
+/*
+ * Structure stored in AutoVacuumShmem->pav_workItem. This is used for managing
+ * parallel index processing (within a single table).
+ */
+typedef struct ParallelAutoVacuumWorkItem
+{
+ Oid avw_database;
+ Oid avw_relation;
+ int nworkers_participating;
+ int nworkers_to_launch;
+ int nworkers_sleeping; /* leader doesn't count */
+ int nfinished; /* # of workers that already finished parallel
+ index processing (and are probably already dead) */
+
+ dsm_handle handl;
+ ProcNumber leader_proc_num;
+
+ PGPROC *leader_proc;
+ ConditionVariable cv;
+
+ bool active; /* being processed */
+ bool leader_sleeping;
+ SyncType sync_type;
+ Status status;
+} ParallelAutoVacuumWorkItem;
+
/*-------------
* The main autovacuum shmem struct. On shared memory we store this main
* struct and the array of WorkerInfo structs. This struct keeps:
@@ -283,6 +350,8 @@ typedef struct AutoVacuumWorkItem
* av_startingWorker pointer to WorkerInfo currently being started (cleared by
* the worker itself as soon as it's up and running)
* av_workItems work item array
+ * pav_workItem information needed for parallel index processing within
+ * a single table
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
*
@@ -298,6 +367,7 @@ typedef struct
dlist_head av_runningWorkers;
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+ ParallelAutoVacuumWorkItem pav_workItem;
pg_atomic_uint32 av_nworkersForBalance;
} AutoVacuumShmemStruct;
@@ -322,11 +392,17 @@ pg_noreturn static void AutoVacLauncherShutdown(void);
static void launcher_determine_sleep(bool canlaunch, bool recursing,
struct timeval *nap);
static void launch_worker(TimestampTz now);
+static void launch_worker_for_pcleanup(TimestampTz now);
+static void eliminate_lock_conflicts(ParallelAutoVacuumWorkItem *item,
+ bool all_launched);
static List *get_database_list(void);
static void rebuild_database_list(Oid newdb);
static int db_comparator(const void *a, const void *b);
static void autovac_recalculate_workers_for_balance(void);
+static int parallel_autovacuum_start_sync_point(bool keep_lock);
+static void handle_parallel_idx_autovac_errors(void);
+
static void do_autovacuum(void);
static void FreeWorkerInfo(int code, Datum arg);
@@ -583,7 +659,14 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
* wakening conditions.
*/
- launcher_determine_sleep(av_worker_available(), false, &nap);
+ if (nworkers_launched < nworkers_for_idx_autovac)
+ {
+ /* Take the smallest possible sleep interval. */
+ nap.tv_sec = 0;
+ nap.tv_usec = MIN_AUTOVAC_SLEEPTIME * 1000;
+ }
+ else
+ launcher_determine_sleep(av_worker_available(), false, &nap);
/*
* Wait until naptime expires or we get some type of signal (all the
@@ -614,6 +697,19 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
LWLockRelease(AutovacuumLock);
}
+ if (AutoVacuumShmem->av_signal[AutoVacParallelReq])
+ {
+ ParallelAutoVacuumWorkItem *item;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ AutoVacuumShmem->av_signal[AutoVacParallelReq] = false;
+
+ item = &AutoVacuumShmem->pav_workItem;
+ nworkers_for_idx_autovac = item->nworkers_to_launch;
+ nworkers_launched = 0;
+ LWLockRelease(AutovacuumLock);
+ }
+
if (AutoVacuumShmem->av_signal[AutoVacForkFailed])
{
/*
@@ -686,6 +782,7 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
worker->wi_sharedrel = false;
worker->wi_proc = NULL;
worker->wi_launchtime = 0;
+ worker->wi_pcleanup = -1;
dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
&worker->wi_links);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -698,9 +795,29 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
}
LWLockRelease(AutovacuumLock); /* either shared or exclusive */
- /* if we can't do anything, just go back to sleep */
if (!can_launch)
+ {
+ /*
+ * If the launcher cannot launch all workers requested for parallel
+ * index vacuuming, it must handle all possible lock conflicts and
+ * tell everyone that there will be no new supportive workers.
+ */
+ if (nworkers_launched < nworkers_for_idx_autovac)
+ {
+ ParallelAutoVacuumWorkItem *item;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ item = &AutoVacuumShmem->pav_workItem;
+ Assert(item->active);
+
+ eliminate_lock_conflicts(item, false);
+ nworkers_launched = nworkers_for_idx_autovac = 0;
+ LWLockRelease(AutovacuumLock);
+ }
+
+ /* if we can't do anything else, just go back to sleep */
continue;
+ }
/* We're OK to start a new worker */
@@ -716,6 +833,15 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
*/
launch_worker(current_time);
}
+ else if (nworkers_launched < nworkers_for_idx_autovac)
+ {
+ /*
+ * One of the active autovacuum workers sent us a request to launch
+ * participants for parallel index vacuuming. We check this case first
+ * because we need to start the participants as soon as possible.
+ */
+ launch_worker_for_pcleanup(current_time);
+ }
else
{
/*
@@ -1267,6 +1393,7 @@ do_start_worker(void)
worker->wi_dboid = avdb->adw_datid;
worker->wi_proc = NULL;
worker->wi_launchtime = GetCurrentTimestamp();
+ worker->wi_pcleanup = -1;
AutoVacuumShmem->av_startingWorker = worker;
@@ -1349,6 +1476,134 @@ launch_worker(TimestampTz now)
}
}
+/*
+ * launch_worker_for_pcleanup
+ *
+ * Wrapper for starting a worker (requested by leader of parallel index
+ * vacuuming) from the launcher.
+ */
+static void
+launch_worker_for_pcleanup(TimestampTz now)
+{
+ ParallelAutoVacuumWorkItem *item;
+ WorkerInfo worker;
+ dlist_node *wptr;
+
+ Assert(nworkers_launched < nworkers_for_idx_autovac);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Get a worker entry from the freelist. We checked above, so there
+ * really should be a free slot.
+ */
+ wptr = dclist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
+
+ worker = dlist_container(WorkerInfoData, wi_links, wptr);
+ worker->wi_dboid = InvalidOid;
+ worker->wi_proc = NULL;
+ worker->wi_launchtime = GetCurrentTimestamp();
+
+ /*
+ * Set an indicator that this worker must join the parallel index vacuum.
+ * This variable also plays the role of a unique id among parallel index
+ * vacuum workers. The first id is '1', because '0' is reserved for the leader.
+ */
+ worker->wi_pcleanup = (nworkers_launched + 1);
+
+ AutoVacuumShmem->av_startingWorker = worker;
+
+ SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_WORKER);
+
+ item = &AutoVacuumShmem->pav_workItem;
+ Assert(item->active);
+
+ nworkers_launched += 1;
+
+ if (nworkers_launched < nworkers_for_idx_autovac)
+ {
+ LWLockRelease(AutovacuumLock);
+ return;
+ }
+
+ Assert(item->sync_type == LAUNCHER &&
+ nworkers_launched == nworkers_for_idx_autovac);
+
+ /*
+ * If the launcher managed to launch all workers requested for parallel
+ * index vacuuming, it must handle all possible lock conflicts.
+ */
+ eliminate_lock_conflicts(item, true);
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Must be called from the autovacuum launcher when it has launched all
+ * requested workers for parallel index vacuum, or when it has realized that
+ * no more processes can be launched.
+ *
+ * In this function the launcher assigns roles in such a way as to avoid lock
+ * conflicts between the leader and supportive workers.
+ *
+ * AutovacuumLock must be held in exclusive mode before calling this function!
+ */
+static void
+eliminate_lock_conflicts(ParallelAutoVacuumWorkItem *item, bool all_launched)
+{
+ Assert(AmAutoVacuumLauncherProcess());
+ Assert(LWLockHeldByMe(AutovacuumLock));
+
+ /* Decide who is responsible for waking everyone up. */
+
+ if (item->leader_sleeping && item->nworkers_sleeping == nworkers_launched)
+ {
+ /*
+ * If both leader and all launched supportive workers are sleeping, then
+ * only we can wake everyone up.
+ */
+ LWLockRelease(AutovacuumLock);
+ ConditionVariableBroadcast(&item->cv);
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ }
+ else if (item->leader_sleeping &&
+ item->nworkers_sleeping < nworkers_launched)
+ {
+ /*
+ * If the leader is already sleeping, but several supportive workers are
+ * still initializing, we shift the responsibility for waking everyone
+ * to the worker that completes initialization last.
+ */
+ item->sync_type = LAST_WORKER;
+ }
+ else if (!item->leader_sleeping &&
+ item->nworkers_sleeping == nworkers_launched)
+ {
+ /*
+ * If only the leader is not sleeping, it must wake up all workers when
+ * it finishes all preparations.
+ */
+ item->sync_type = LEADER;
+ }
+ else
+ {
+ /*
+ * If nobody is sleeping, we assume that the leader has a higher chance
+ * of falling asleep first, so set the sync type to LAST_WORKER. But if
+ * the last worker sees that the leader is still not sleeping, it will
+ * change the sync type to LEADER and go to sleep.
+ */
+ item->sync_type = LAST_WORKER;
+ }
+
+ /*
+ * If we cannot launch all requested workers, refresh the
+ * nworkers_to_launch value, so that the last worker can find out
+ * that it is really the last.
+ */
+ if (!all_launched && item->sync_type == LAST_WORKER)
+ item->nworkers_to_launch = nworkers_launched;
+}
+
/*
* Called from postmaster to signal a failure to fork a process to become
* worker. The postmaster should kill(SIGUSR2) the launcher shortly
@@ -1360,6 +1615,38 @@ AutoVacWorkerFailed(void)
AutoVacuumShmem->av_signal[AutoVacForkFailed] = true;
}
+/*
+ * Called from an autovacuum worker to signal that it needs participants in
+ * parallel index vacuum. The function sends SIGUSR2 to the launcher and
+ * returns 'true' iff the signal was sent successfully.
+ */
+bool
+AutoVacParallelWorkRequest(void)
+{
+ if (AutoVacuumShmem->av_launcherpid == 0)
+ {
+ ereport(WARNING,
+ (errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg("autovacuum launcher is dead")));
+
+ return false;
+ }
+
+ AutoVacuumShmem->av_signal[AutoVacParallelReq] = true;
+
+ if (kill(AutoVacuumShmem->av_launcherpid, SIGUSR2) < 0)
+ {
+ ereport(WARNING,
+ (errcode(ERRCODE_SYSTEM_ERROR),
+ errmsg("failed to send signal to autovac launcher (pid %d): %m",
+ AutoVacuumShmem->av_launcherpid)));
+
+ return false;
+ }
+
+ return true;
+}
+
/* SIGUSR2: a worker is up and running, or just finished, or failed to fork */
static void
avl_sigusr2_handler(SIGNAL_ARGS)
@@ -1559,6 +1846,8 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
{
char dbname[NAMEDATALEN];
+ Assert(MyWorkerInfo->wi_pcleanup < 0);
+
/*
* Report autovac startup to the cumulative stats system. We
* deliberately do this before InitPostgres, so that the
@@ -1593,12 +1882,112 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
recentMulti = ReadNextMultiXactId();
do_autovacuum();
}
+ else if (AmParallelIdxAutoVacSupportive())
+ {
+ ParallelAutoVacuumWorkItem *item;
+ dsm_handle handle;
+ PGPROC *leader_proc;
+ ProcNumber leader_proc_number;
+ dsm_segment *seg;
+ shm_toc *toc;
+ char *asnapspace;
+ char *tsnapspace;
+ char dbname[NAMEDATALEN];
+ Snapshot tsnapshot;
+ Snapshot asnapshot;
+
+ /*
+ * We will abort parallel index vacuuming within the current process if
+ * something errors out.
+ */
+ PG_TRY();
+ {
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ item = &AutoVacuumShmem->pav_workItem;
+ dbid = item->avw_database;
+ handle = item->handl;
+ leader_proc = item->leader_proc;
+ leader_proc_number = item->leader_proc_num;
+ LWLockRelease(AutovacuumLock);
+
+ InitPostgres(NULL, dbid, NULL, InvalidOid,
+ INIT_PG_OVERRIDE_ALLOW_CONNS,
+ dbname);
+
+ set_ps_display(dbname);
+ if (PostAuthDelay)
+ pg_usleep(PostAuthDelay * 1000000L);
+
+ /* And do an appropriate amount of work */
+ recentXid = ReadNextTransactionId();
+ recentMulti = ReadNextMultiXactId();
+
+ if (parallel_autovacuum_start_sync_point(false) == -1)
+ {
+ /* We are not participating anymore */
+ MyWorkerInfo->wi_pcleanup = -1;
+ goto exit;
+ }
+
+ seg = dsm_attach(handle);
+ if (seg == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("could not map dynamic shared memory segment")));
+
+ toc = shm_toc_attach(AV_PARALLEL_MAGIC, dsm_segment_address(seg));
+ if (toc == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("invalid magic number in dynamic shared memory segment")));
+
+ if (!BecomeLockGroupMember(leader_proc, leader_proc_number))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("could not become lock group member")));
+ }
+
+ StartTransactionCommand();
+
+ asnapspace =
+ shm_toc_lookup(toc, AV_PARALLEL_KEY_ACTIVE_SNAPSHOT, false);
+ tsnapspace =
+ shm_toc_lookup(toc, AV_PARALLEL_KEY_TRANSACTION_SNAPSHOT, true);
+ asnapshot = RestoreSnapshot(asnapspace);
+ tsnapshot = tsnapspace ? RestoreSnapshot(tsnapspace) : asnapshot;
+ RestoreTransactionSnapshot(tsnapshot, leader_proc);
+ PushActiveSnapshot(asnapshot);
+
+ /*
+ * We've changed which tuples we can see, and must therefore
+ * invalidate system caches.
+ */
+ InvalidateSystemCaches();
+
+ parallel_vacuum_main(seg, toc);
+
+ /* Must pop active snapshot so snapmgr.c doesn't complain. */
+ PopActiveSnapshot();
+
+ dsm_detach(seg);
+ CommitTransactionCommand();
+ ParallelAutovacuumEndSyncPoint(false);
+ }
+ PG_CATCH();
+ {
+ EmitErrorReport();
+ handle_parallel_idx_autovac_errors();
+ }
+ PG_END_TRY();
+ }
/*
* The launcher will be notified of my death in ProcKill, *if* we managed
* to get a worker slot at all
*/
+exit:
/* All done, go away */
proc_exit(0);
}
@@ -2461,6 +2850,10 @@ do_autovacuum(void)
tab->at_datname, tab->at_nspname, tab->at_relname);
EmitErrorReport();
+ /* if we are the parallel index vacuuming leader, we must shut it down */
+ if (AmParallelIdxAutoVacLeader())
+ handle_parallel_idx_autovac_errors();
+
/* this resets ProcGlobal->statusFlags[i] too */
AbortOutOfAnyTransaction();
FlushErrorState();
@@ -3296,6 +3689,405 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * Release the work item used for managing parallel index vacuum. Must be
+ * called once, and only from the leader worker.
+ *
+ * If 'keep_lock' is true, then AutovacuumLock will not be released at the
+ * end of function execution.
+ */
+void
+AutoVacuumReleaseParallelWork(bool keep_lock)
+{
+ ParallelAutoVacuumWorkItem *workitem;
+
+ if (!LWLockHeldByMe(AutovacuumLock))
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ workitem = &AutoVacuumShmem->pav_workItem;
+
+ Assert(workitem->leader_proc_num == MyProcPid);
+
+ workitem->active = false;
+
+ if (!keep_lock)
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Waiting on the condition variable is a frequent operation, so it has been
+ * factored out into a separate function. The caller must hold AutovacuumLock
+ * before calling it.
+ */
+static inline void
+CVSleep(ConditionVariable *cv)
+{
+ ConditionVariablePrepareToSleep(cv);
+
+ LWLockRelease(AutovacuumLock);
+ ConditionVariableSleep(cv, PG_WAIT_IPC);
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ ConditionVariableCancelSleep();
+}
+
+/*
+ * This function is used to synchronize the leader with supportive workers
+ * during parallel index vacuuming. Each process will exit iff:
+ * Leader worker is ready to perform parallel vacuum &&
+ * All launched supportive workers are ready to perform parallel vacuum &&
+ * (Autovacuum launcher already launched all requested workers ||
+ * Autovacuum launcher cannot launch more workers)
+ *
+ * If 'keep_lock' is true, then AutovacuumLock will not be released at the
+ * end of function execution.
+ *
+ * NOTE: Some workers may call this function when the leader worker has
+ * decided to shut down parallel vacuuming. In this case '-1' will be returned.
+ */
+static int
+parallel_autovacuum_start_sync_point(bool keep_lock)
+{
+ ParallelAutoVacuumWorkItem *workitem;
+ SyncType sync_type;
+ int num_participants;
+
+ if (!LWLockHeldByMe(AutovacuumLock))
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ workitem = &AutoVacuumShmem->pav_workItem;
+ Assert(workitem->active);
+ sync_type = workitem->sync_type;
+
+ if (AmParallelIdxAutoVacLeader())
+ {
+ Assert(workitem->leader_proc_num == MyProcPid);
+
+ /* Wake up all sleeping supportive workers, if required ... */
+ if (sync_type == LEADER)
+ {
+ ConditionVariableBroadcast(&workitem->cv);
+
+ /*
+ * Advance status, because we are guaranteed to pass this
+ * sync point.
+ */
+ workitem->status = START_SYNC_POINT_PASSED;
+ }
+ /* ... otherwise, wait for somebody to wake us up */
+ else
+ {
+ workitem->leader_sleeping = true;
+ CVSleep(&workitem->cv);
+ workitem->leader_sleeping = false;
+
+ /*
+ * A priori, we believe that in the end everyone should be awakened
+ * by the leader.
+ */
+ workitem->sync_type = LEADER;
+ }
+ }
+ else
+ {
+ workitem->nworkers_participating += 1;
+
+ /*
+ * If we know that the launcher will no longer attempt to launch more
+ * supportive workers for this item, we are the LAST_WORKER for sure.
+ *
+ * Note that the launcher sets the LAST_WORKER sync type without knowing
+ * the current status of the leader. So we also check that the leader is
+ * sleeping before waking everyone up. Otherwise, we must wait for the
+ * leader (and ask it to wake everyone up).
+ */
+ if (workitem->nworkers_participating == workitem->nworkers_to_launch &&
+ sync_type == LAST_WORKER && workitem->leader_sleeping)
+ {
+ ConditionVariableBroadcast(&workitem->cv);
+
+ /*
+ * We must not advance status if leader wants to shut down parallel
+ * execution (see checks below).
+ */
+ if (workitem->status != SHUTDOWN)
+ workitem->status = START_SYNC_POINT_PASSED;
+ }
+ else
+ {
+ if (workitem->nworkers_participating == workitem->nworkers_to_launch &&
+ sync_type == LAST_WORKER)
+ {
+ workitem->sync_type = LEADER;
+ }
+
+ workitem->nworkers_sleeping += 1;
+ CVSleep(&workitem->cv);
+ workitem->nworkers_sleeping -= 1;
+ }
+ }
+
+ /* Tell the caller how many workers participate, or -1 on shutdown. */
+ if (workitem->status == SHUTDOWN)
+ num_participants = -1;
+ else
+ num_participants = workitem->nworkers_participating;
+
+ if (!keep_lock)
+ LWLockRelease(AutovacuumLock);
+
+ return num_participants;
+}
+
+/*
+ * Like the function above, but must be called by the leader and supportive
+ * workers when they have finished parallel index vacuum.
+ *
+ * If 'keep_lock' is true, then AutovacuumLock will not be released in the end
+ * of function execution.
+ */
+void
+ParallelAutovacuumEndSyncPoint(bool keep_lock)
+{
+ ParallelAutoVacuumWorkItem *workitem;
+
+ if (!LWLockHeldByMe(AutovacuumLock))
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ workitem = &AutoVacuumShmem->pav_workItem;
+ Assert(workitem->active);
+
+ /* Nothing to do if no supportive workers were launched */
+ if (workitem->nworkers_participating == 0)
+ {
+ Assert(AmParallelIdxAutoVacLeader());
+ workitem->status = END_SYNC_POINT_PASSED;
+
+ if (!keep_lock)
+ LWLockRelease(AutovacuumLock);
+
+ return;
+ }
+
+ if (AmParallelIdxAutoVacLeader())
+ {
+ Assert(workitem->sync_type == LEADER);
+
+ /* Wait for all workers to finish (only last worker will wake us up) */
+ if (workitem->nfinished != workitem->nworkers_participating)
+ {
+ workitem->sync_type = LAST_WORKER;
+ workitem->leader_sleeping = true;
+ CVSleep(&workitem->cv);
+ workitem->leader_sleeping = false;
+
+ Assert(workitem->nfinished == workitem->nworkers_participating);
+
+ /*
+ * Advance status, because we are guaranteed to pass this
+ * sync point.
+ */
+ workitem->status = END_SYNC_POINT_PASSED;
+ }
+ }
+ else
+ {
+ workitem->nfinished += 1;
+
+ /*
+ * If we are the last finished worker, wake up the leader. If not, just
+ * leave, because this supportive worker has already finished all its
+ * work and must die.
+ */
+ if (workitem->sync_type == LAST_WORKER &&
+ workitem->nfinished == workitem->nworkers_participating)
+ {
+ ConditionVariableBroadcast(&workitem->cv);
+
+ /*
+ * Don't need to check SHUTDOWN status here - all supportive workers
+ * are about to finish anyway.
+ */
+ workitem->status = END_SYNC_POINT_PASSED;
+ }
+
+ /* We are not participating anymore */
+ MyWorkerInfo->wi_pcleanup = -1;
+ }
+
+ if (!keep_lock)
+ LWLockRelease(AutovacuumLock);
+
+ return;
+}
+
+/*
+ * Get id of parallel index vacuum worker (counting from 0).
+ */
+int
+GetAutoVacuumParallelWorkerNumber(void)
+{
+ Assert(AmAutoVacuumWorkerProcess() && MyWorkerInfo->wi_pcleanup > 0);
+ return (MyWorkerInfo->wi_pcleanup - 1);
+}
+
+/*
+ * The leader autovacuum process can decide that it needs several helper
+ * workers to process a table in parallel mode. It must set up a parallel
+ * context and call LaunchParallelAutovacuumWorkers.
+ *
+ * In this function we do the following:
+ * 1) Send a signal to the autovacuum launcher, which creates 'supportive
+ * workers' during the launcher's standard work loop.
+ * 2) Wait for the supportive workers to start.
+ *
+ * The function returns the number of workers that the launcher was able to
+ * launch (may be less than 'nworkers_to_launch').
+ */
+int
+LaunchParallelAutovacuumWorkers(Oid rel_id, int nworkers_to_launch,
+ dsm_handle handle, PGPROC *leader_proc,
+ int leader_proc_pid)
+{
+ int nworkers_launched = 0;
+ ParallelAutoVacuumWorkItem *workitem;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ workitem = &AutoVacuumShmem->pav_workItem;
+
+ /*
+ * For now, there can be only one leader across the whole cluster.
+ * TODO: fix this in future versions
+ */
+ if (workitem->active && workitem->leader_proc_num != MyProcPid)
+ {
+ LWLockRelease(AutovacuumLock);
+ return 0;
+ }
+
+ /* OK, we can use this workitem entry. Init it. */
+ workitem->avw_database = MyDatabaseId;
+ workitem->avw_relation = rel_id;
+ workitem->handl = handle;
+ workitem->leader_proc = leader_proc;
+ workitem->leader_proc_num = leader_proc_pid;
+ workitem->nworkers_participating = 0;
+ workitem->nfinished = 0;
+ workitem->nworkers_to_launch = nworkers_to_launch;
+ workitem->active = true;
+ workitem->leader_sleeping = false;
+ workitem->nworkers_sleeping = 0;
+ workitem->nfinished = 0;
+ workitem->sync_type = LAUNCHER;
+ workitem->status = STARTUP;
+
+ LWLockRelease(AutovacuumLock);
+
+ /* Notify autovacuum launcher that we need supportive workers */
+ if (AutoVacParallelWorkRequest())
+ {
+ /* Become the leader */
+ MyWorkerInfo->wi_pcleanup = 0;
+
+ /* All created workers must get same locks as leader process */
+ BecomeLockGroupLeader();
+
+ /*
+ * Wait until all supportive workers are launched. Also retrieve the
+ * actual number of participants.
+ */
+
+ nworkers_launched = parallel_autovacuum_start_sync_point(false);
+ }
+ else
+ {
+ /*
+ * If we (for any reason) cannot send signal to the launcher, don't try
+ * to do index vacuuming in parallel
+ */
+ return 0;
+ }
+
+ return nworkers_launched;
+}
+
+/*
+ * During parallel index vacuuming, any worker (both supportive workers and
+ * the leader) can catch an error.
+ * In order to handle it in the right way, we must call this function.
+ */
+static void
+handle_parallel_idx_autovac_errors(void)
+{
+ ParallelAutoVacuumWorkItem *item;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ item = &AutoVacuumShmem->pav_workItem;
+
+ if (AmParallelIdxAutoVacLeader())
+ {
+ if (item->status == START_SYNC_POINT_PASSED)
+ {
+ /*
+ * If the start sync point has already been passed, just wait for all
+ * supportive workers to finish, then exit.
+ */
+ ParallelAutovacuumEndSyncPoint(true);
+ }
+ else if (item->status == STARTUP)
+ {
+ /*
+ * If no sync points have been passed, we can prevent supportive workers
+ * from performing their work: set the SHUTDOWN status and wait until
+ * all workers have seen it.
+ */
+ item->status = SHUTDOWN;
+ parallel_autovacuum_start_sync_point(true);
+ }
+
+ AutoVacuumReleaseParallelWork(true);
+ }
+ else
+ {
+ Assert(AmParallelIdxAutoVacSupportive());
+
+ if (item->status == STARTUP)
+ {
+ /*
+ * If no sync points have been passed, just exclude ourselves from the
+ * participants. Further parallel index vacuuming will take place
+ * as usual.
+ */
+ item->nworkers_to_launch -= 1;
+
+ if (item->nworkers_participating == item->nworkers_to_launch &&
+ item->sync_type == LAST_WORKER && item->leader_sleeping)
+ {
+ ConditionVariableBroadcast(&item->cv);
+ item->status = START_SYNC_POINT_PASSED;
+ }
+ }
+ else if (item->status == START_SYNC_POINT_PASSED)
+ {
+ /*
+ * If the start sync point has already been passed, we simulate the usual
+ * end of work (see ParallelAutovacuumEndSyncPoint).
+ */
+ item->nfinished += 1;
+
+ if (item->sync_type == LAST_WORKER &&
+ item->nfinished == item->nworkers_participating)
+ {
+ ConditionVariableBroadcast(&item->cv);
+ item->status = END_SYNC_POINT_PASSED;
+ }
+ }
+ }
+
+ LWLockRelease(AutovacuumLock);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3361,6 +4153,9 @@ AutoVacuumShmemInit(void)
AutoVacuumShmem->av_startingWorker = NULL;
memset(AutoVacuumShmem->av_workItems, 0,
sizeof(AutoVacuumWorkItem) * NUM_WORKITEMS);
+ memset(&AutoVacuumShmem->pav_workItem, 0,
+ sizeof(ParallelAutoVacuumWorkItem));
+ ConditionVariableInit(&AutoVacuumShmem->pav_workItem.cv);
worker = (WorkerInfo) ((char *) AutoVacuumShmem +
MAXALIGN(sizeof(AutoVacuumShmemStruct)));
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 60b12446a1c..c045a8d6eda 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3647,6 +3647,36 @@ struct config_int ConfigureNamesInt[] =
check_autovacuum_work_mem, NULL, NULL
},
+ {
+ {"max_parallel_index_autovac_workers", PGC_POSTMASTER, VACUUM_AUTOVACUUM,
+ gettext_noop("Sets the maximum number of parallel autovacuum worker processes during parallel index vacuuming of single table."),
+ NULL
+ },
+ &max_parallel_index_autovac_workers,
+ 0, 0, MAX_PARALLEL_WORKER_LIMIT,
+ NULL, NULL, NULL
+ },
+
+ {
+ {"autovac_idx_parallel_min_rows", PGC_POSTMASTER, VACUUM_AUTOVACUUM,
+ gettext_noop("Sets the minimum number of dead tuples in single table that requires parallel index processing during autovacuum."),
+ NULL
+ },
+ &autovac_idx_parallel_min_rows,
+ 0, 0, INT32_MAX,
+ NULL, NULL, NULL
+ },
+
+ {
+ {"autovac_idx_parallel_min_indexes", PGC_POSTMASTER, VACUUM_AUTOVACUUM,
+ gettext_noop("Sets the minimum number indexes created on single table that requires parallel index processing during autovacuum."),
+ NULL
+ },
+ &autovac_idx_parallel_min_indexes,
+ 2, 2, INT32_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"tcp_keepalives_idle", PGC_USERSET, CONN_AUTH_TCP,
gettext_noop("Time between issuing TCP keepalives."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 34826d01380..08869398039 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -146,6 +146,12 @@
#hash_mem_multiplier = 2.0 # 1-1000.0 multiplier on hash table work_mem
#maintenance_work_mem = 64MB # min 64kB
#autovacuum_work_mem = -1 # min 64kB, or -1 to use maintenance_work_mem
+#max_parallel_index_autovac_workers = 0 # the feature is disabled by default
+ # (change requires restart)
+#autovac_idx_parallel_min_rows = 0
+ # (change requires restart)
+#autovac_idx_parallel_min_indexes = 2
+ # (change requires restart)
#logical_decoding_work_mem = 64MB # min 64kB
#max_stack_depth = 2MB # min 100kB
#shared_memory_type = mmap # the default is the first option
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..81e6267cd7b 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -15,6 +15,8 @@
#define AUTOVACUUM_H
#include "storage/block.h"
+#include "storage/dsm_impl.h"
+#include "storage/lock.h"
/*
* Other processes can request specific work from autovacuum, identified by
@@ -25,12 +27,25 @@ typedef enum
AVW_BRINSummarizeRange,
} AutoVacuumWorkItemType;
+/*
+ * Magic number for parallel context TOC. Used for parallel index processing
+ * during autovacuum.
+ */
+#define AV_PARALLEL_MAGIC 0xaaaaaaaa
+
+/* Magic numbers for per-context parallel index processing state sharing. */
+#define AV_PARALLEL_KEY_TRANSACTION_SNAPSHOT UINT64CONST(0xFFF0000000000001)
+#define AV_PARALLEL_KEY_ACTIVE_SNAPSHOT UINT64CONST(0xFFF0000000000002)
+
/* GUC variables */
extern PGDLLIMPORT bool autovacuum_start_daemon;
extern PGDLLIMPORT int autovacuum_worker_slots;
extern PGDLLIMPORT int autovacuum_max_workers;
extern PGDLLIMPORT int autovacuum_work_mem;
+extern PGDLLIMPORT int max_parallel_index_autovac_workers;
+extern PGDLLIMPORT int autovac_idx_parallel_min_rows;
+extern PGDLLIMPORT int autovac_idx_parallel_min_indexes;
extern PGDLLIMPORT int autovacuum_naptime;
extern PGDLLIMPORT int autovacuum_vac_thresh;
extern PGDLLIMPORT int autovacuum_vac_max_thresh;
@@ -60,10 +75,20 @@ extern void AutoVacWorkerFailed(void);
pg_noreturn extern void AutoVacLauncherMain(const void *startup_data, size_t startup_data_len);
pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t startup_data_len);
+/* called from autovac worker when it needs participants in parallel index cleanup */
+extern bool AutoVacParallelWorkRequest(void);
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+extern void AutoVacuumReleaseParallelWork(bool keep_lock);
+extern int AutoVacuumParallelWorkWaitForStart(void);
+extern void ParallelAutovacuumEndSyncPoint(bool keep_lock);
+extern int GetAutoVacuumParallelWorkerNumber(void);
+extern int LaunchParallelAutovacuumWorkers(Oid rel_id, int nworkers_to_launch,
+ dsm_handle handle,
+ PGPROC *leader_proc,
+ int leader_proc_pid);
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/test/modules/autovacuum/.gitignore b/src/test/modules/autovacuum/.gitignore
new file mode 100644
index 00000000000..0b54641bceb
--- /dev/null
+++ b/src/test/modules/autovacuum/.gitignore
@@ -0,0 +1 @@
+/tmp_check/
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/Makefile b/src/test/modules/autovacuum/Makefile
new file mode 100644
index 00000000000..90c00ff350b
--- /dev/null
+++ b/src/test/modules/autovacuum/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/autovacuum/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/t/001_autovac_parallel.pl b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
new file mode 100644
index 00000000000..d8e22a06bac
--- /dev/null
+++ b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
@@ -0,0 +1,137 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+$node->append_conf('postgresql.conf', qq{
+ autovacuum = off
+ max_wal_size = 4096
+});
+$node->start;
+
+my $indexes_num = 80;
+my $initial_rows_num = 1_000_000;
+
+# Create big table and create specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER, col_5 INTEGER,
+ col_6 INTEGER, col_7 INTEGER, col_8 INTEGER, col_9 INTEGER, col_10 INTEGER,
+ col_11 INTEGER, col_12 INTEGER, col_13 INTEGER, col_14 INTEGER, col_15 INTEGER,
+ col_16 INTEGER, col_17 INTEGER, col_18 INTEGER, col_19 INTEGER, col_20 INTEGER,
+ col_21 INTEGER, col_22 INTEGER, col_23 INTEGER, col_24 INTEGER, col_25 INTEGER,
+ col_26 INTEGER, col_27 INTEGER, col_28 INTEGER, col_29 INTEGER, col_30 INTEGER,
+ col_31 INTEGER, col_32 INTEGER, col_33 INTEGER, col_34 INTEGER, col_35 INTEGER,
+ col_36 INTEGER, col_37 INTEGER, col_38 INTEGER, col_39 INTEGER, col_40 INTEGER,
+ col_41 INTEGER, col_42 INTEGER, col_43 INTEGER, col_44 INTEGER, col_45 INTEGER,
+ col_46 INTEGER, col_47 INTEGER, col_48 INTEGER, col_49 INTEGER, col_50 INTEGER,
+ col_51 INTEGER, col_52 INTEGER, col_53 INTEGER, col_54 INTEGER, col_55 INTEGER,
+ col_56 INTEGER, col_57 INTEGER, col_58 INTEGER, col_59 INTEGER, col_60 INTEGER,
+ col_61 INTEGER, col_62 INTEGER, col_63 INTEGER, col_64 INTEGER, col_65 INTEGER,
+ col_66 INTEGER, col_67 INTEGER, col_68 INTEGER, col_69 INTEGER, col_70 INTEGER,
+ col_71 INTEGER, col_72 INTEGER, col_73 INTEGER, col_74 INTEGER, col_75 INTEGER,
+ col_76 INTEGER, col_77 INTEGER, col_78 INTEGER, col_79 INTEGER, col_80 INTEGER,
+ col_81 INTEGER, col_82 INTEGER, col_83 INTEGER, col_84 INTEGER, col_85 INTEGER,
+ col_86 INTEGER, col_87 INTEGER, col_88 INTEGER, col_89 INTEGER, col_90 INTEGER,
+ col_91 INTEGER, col_92 INTEGER, col_93 INTEGER, col_94 INTEGER, col_95 INTEGER,
+ col_96 INTEGER, col_97 INTEGER, col_98 INTEGER, col_99 INTEGER, col_100 INTEGER
+ );
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM pg_index i
+ JOIN pg_class c ON c.oid = i.indrelid
+ WHERE c.relname = 'test_autovac';",
+ stdout => \$psql_out
+);
+is($psql_out, $indexes_num + 1, "All indexes created successfully");
+
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac (
+ col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8, col_9, col_10,
+ col_11, col_12, col_13, col_14, col_15, col_16, col_17, col_18, col_19, col_20,
+ col_21, col_22, col_23, col_24, col_25, col_26, col_27, col_28, col_29, col_30,
+ col_31, col_32, col_33, col_34, col_35, col_36, col_37, col_38, col_39, col_40,
+ col_41, col_42, col_43, col_44, col_45, col_46, col_47, col_48, col_49, col_50,
+ col_51, col_52, col_53, col_54, col_55, col_56, col_57, col_58, col_59, col_60,
+ col_61, col_62, col_63, col_64, col_65, col_66, col_67, col_68, col_69, col_70,
+ col_71, col_72, col_73, col_74, col_75, col_76, col_77, col_78, col_79, col_80,
+ col_81, col_82, col_83, col_84, col_85, col_86, col_87, col_88, col_89, col_90,
+ col_91, col_92, col_93, col_94, col_95, col_96, col_97, col_98, col_99, col_100
+ ) VALUES (
+ i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7, i + 8, i + 9,
+ i + 10, i + 11, i + 12, i + 13, i + 14, i + 15, i + 16, i + 17, i + 18, i + 19,
+ i + 20, i + 21, i + 22, i + 23, i + 24, i + 25, i + 26, i + 27, i + 28, i + 29,
+ i + 30, i + 31, i + 32, i + 33, i + 34, i + 35, i + 36, i + 37, i + 38, i + 39,
+ i + 40, i + 41, i + 42, i + 43, i + 44, i + 45, i + 46, i + 47, i + 48, i + 49,
+ i + 50, i + 51, i + 52, i + 53, i + 54, i + 55, i + 56, i + 57, i + 58, i + 59,
+ i + 60, i + 61, i + 62, i + 63, i + 64, i + 65, i + 66, i + 67, i + 68, i + 69,
+ i + 70, i + 71, i + 72, i + 73, i + 74, i + 75, i + 76, i + 77, i + 78, i + 79,
+ i + 80, i + 81, i + 82, i + 83, i + 84, i + 85, i + 86, i + 87, i + 88, i + 89,
+ i + 90, i + 91, i + 92, i + 93, i + 94, i + 95, i + 96, i + 97, i + 98, i + 99
+ );
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM test_autovac;",
+ stdout => \$psql_out
+);
+is($psql_out, $initial_rows_num, "All data inserted into table successfully");
+
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+my $dead_tuples_thresh = $initial_rows_num / 4;
+my $indexes_num_thresh = $indexes_num / 2;
+my $num_workers = 1;
+
+# Reduce autovacuum_work_mem, so the leader process will perform the parallel
+# index vacuum phase several times
+$node->append_conf('postgresql.conf', qq{
+ autovacuum_naptime = '1s'
+ autovacuum_work_mem = 2048
+ autovacuum_vacuum_threshold = 1
+ autovacuum_analyze_threshold = 1
+ autovacuum_vacuum_scale_factor = 0.1
+ autovacuum_analyze_scale_factor = 0.1
+ autovacuum_max_workers = 10
+ autovacuum = on
+ autovac_idx_parallel_min_rows = $dead_tuples_thresh
+ autovac_idx_parallel_min_indexes = $indexes_num_thresh
+ max_parallel_index_autovac_workers = $num_workers
+});
+
+$node->restart;
+
+# wait for autovacuum to reset datminmxid age to 0
+$node->poll_query_until('postgres', q{
+	SELECT count(*) = 0 FROM pg_database WHERE mxid_age(datminmxid) > 0
+}) or die "Timed out while waiting for autovacuum";
+
+ok(1, "There are no segfaults");
+
+$node->stop;
+done_testing();
--
2.43.0
Hi Maxim Orlov,
Thank you for working on this, I like your idea. But I have a
suggestion: autovacuum_max_workers no longer requires a restart to
change, so I think those GUCs could behave like autovacuum_max_workers
instead of being fixed at postmaster start:
+#max_parallel_index_autovac_workers = 0 # this feature disabled by default
+ # (change requires restart)
+#autovac_idx_parallel_min_rows = 0
+ # (change requires restart)
+#autovac_idx_parallel_min_indexes = 2
+ # (change requires restart)
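For example, here is a rough sketch of what this could look like in
guc_tables.c (the GUC name is taken from the patch; the group and the upper
limit here are illustrative, not verified): registering the variable with
PGC_SIGHUP context would make it changeable by a configuration reload
instead of a restart.

    /*
     * Rough sketch for src/backend/utils/misc/guc_tables.c: PGC_SIGHUP
     * instead of PGC_POSTMASTER makes the setting reloadable.
     */
    {
        {"max_parallel_index_autovac_workers", PGC_SIGHUP, AUTOVACUUM,
            gettext_noop("Maximum number of parallel autovacuum workers for index processing."),
            NULL
        },
        &max_parallel_index_autovac_workers,
        0, 0, 1024,
        NULL, NULL, NULL
    },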
Thanks
Hi,
On Wed, Apr 16, 2025 at 4:05 AM Maxim Orlov <orlovmg@gmail.com> wrote:
As I understand it, we initially disabled parallel vacuum for
autovacuum because their objectives are somewhat contradictory.
Parallel vacuum aims to accelerate the process by utilizing additional
resources, while autovacuum is designed to perform cleaning operations
with minimal impact on foreground transaction processing (e.g.,
through vacuum delay).
Nevertheless, I see your point about the potential benefits of using
parallel vacuum within autovacuum in specific scenarios. The crucial
consideration is determining appropriate criteria for triggering
parallel vacuum in autovacuum. Given that we currently support only
parallel index processing, suitable candidates might be autovacuum
operations on large tables that have a substantial number of
sufficiently large indexes and a high volume of garbage tuples.
Once we have parallel heap vacuum, as discussed in thread [1], it would
also likely be beneficial to incorporate it into autovacuum during
aggressive vacuum or failsafe mode.
Although the actual number of parallel workers ultimately depends on
the number of eligible indexes, it might be beneficial to introduce a
storage parameter, say parallel_vacuum_workers, that allows control
over the number of parallel vacuum workers on a per-table basis.
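For illustration only (this is not in the patch): such a storage parameter
could be a plain heap reloption, e.g. a hypothetical intRelOpts[] entry in
src/backend/access/common/reloptions.c, with -1 meaning "fall back to the
global setting":

    /* Hypothetical reloption sketch, not part of the proposed patch. */
    {
        {
            "parallel_vacuum_workers",
            "Number of parallel vacuum workers that autovacuum can use for this table.",
            RELOPT_KIND_HEAP | RELOPT_KIND_TOAST,
            ShareUpdateExclusiveLock
        },
        -1, -1, 1024
    },

Users could then tune it per table with something like
ALTER TABLE big_table SET (parallel_vacuum_workers = 4).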
Regarding implementation: I notice the WIP patch implements its own
parallel vacuum mechanism for autovacuum. Have you considered simply
setting at_params.nworkers to a value greater than zero?
Regards,
[1]: /messages/by-id/CAD21AoAEfCNv-GgaDheDJ+s-p_Lv1H24AiJeNoPGCmZNSwL1YA@mail.gmail.com
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Thanks for raising this idea!
I am generally -1 on the idea of autovacuum performing parallel
index vacuum, because I always felt that the parallel option should
be employed in a targeted manner for a specific table. If you have a bunch
of large tables, some more important than others, a/v may end
up using parallel resources on the least important tables and you
will have to adjust a/v settings per table, etc to get the right table
to be parallel index vacuumed by a/v.
Also, with the TIDStore improvements for index cleanup, and the practical
elimination of multi-pass index vacuums, I see this being even less
convincing as something to add to a/v.
Now, If I am going to allocate extra workers to run vacuum in parallel, why
not just provide more autovacuum workers instead so I can get more tables
vacuumed within a span of time?
Once we have parallel heap vacuum, as discussed in thread[1], it would
also likely be beneficial to incorporate it into autovacuum during
aggressive vacuum or failsafe mode.
IIRC, index cleanup is disabled by the failsafe, precisely so that the
vacuum can finish as quickly as possible.
--
Sami Imseih
Amazon Web Services (AWS)
On Thu, May 1, 2025 at 8:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
As I understand it, we initially disabled parallel vacuum for
autovacuum because their objectives are somewhat contradictory.
Parallel vacuum aims to accelerate the process by utilizing additional
resources, while autovacuum is designed to perform cleaning operations
with minimal impact on foreground transaction processing (e.g.,
through vacuum delay).
Yep, we also decided that we must not create more a/v workers for
index processing.
In the current implementation, the leader process sends a signal to the
a/v launcher, and the launcher tries to launch all requested workers.
But the number of workers never exceeds `autovacuum_max_workers`.
Thus, we will never have more a/v workers than in the standard case
(without this feature).
Nevertheless, I see your point about the potential benefits of using
parallel vacuum within autovacuum in specific scenarios. The crucial
consideration is determining appropriate criteria for triggering
parallel vacuum in autovacuum. Given that we currently support only
parallel index processing, suitable candidates might be autovacuum
operations on large tables that have a substantial number of
sufficiently large indexes and a high volume of garbage tuples.
Although the actual number of parallel workers ultimately depends on
the number of eligible indexes, it might be beneficial to introduce a
storage parameter, say parallel_vacuum_workers, that allows control
over the number of parallel vacuum workers on a per-table basis.
For now, we have three GUC variables for this purpose:
max_parallel_index_autovac_workers, autovac_idx_parallel_min_rows,
autovac_idx_parallel_min_indexes.
That is, everything is as you said. But we are still conducting
research on this issue. I would like to get rid of some of these
parameters.
Regarding implementation: I notice the WIP patch implements its own
parallel vacuum mechanism for autovacuum. Have you considered simply
setting at_params.nworkers to a value greater than zero?
About `at_params.nworkers = N` - that's exactly what we're doing (you
can see it in the `vacuum_rel` function). But we cannot fully reuse the
code of VACUUM PARALLEL, because it creates its own processes via the
dynamic bgworkers machinery.
As I said above - we don't want to consume additional resources. Also
we don't want to complicate communication between processes (the idea
is that a/v workers can only send signals to the a/v launcher).
As a result, we created our own implementation of parallel index
processing control - see changes in vacuumparallel.c and autovacuum.c.
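For reference, here is a minimal sketch (simplified from
LaunchParallelWorkers() in src/backend/access/transam/parallel.c; error
handling and DSM plumbing elided) of how the maintenance VACUUM path obtains
each of its workers. Every such worker occupies a dynamic background worker
slot from max_worker_processes, which is exactly the extra resource
consumption we want to avoid:

    #include "postgres.h"

    #include "miscadmin.h"
    #include "postmaster/bgworker.h"

    /* Start one parallel vacuum worker as a dynamic background worker. */
    static bool
    launch_one_maintenance_vacuum_worker(BackgroundWorkerHandle **handle)
    {
        BackgroundWorker worker;

        memset(&worker, 0, sizeof(worker));
        snprintf(worker.bgw_name, BGW_MAXLEN, "parallel worker for PID %d", MyProcPid);
        snprintf(worker.bgw_type, BGW_MAXLEN, "parallel worker");
        worker.bgw_flags = BGWORKER_SHMEM_ACCESS |
            BGWORKER_BACKEND_DATABASE_CONNECTION |
            BGWORKER_CLASS_PARALLEL;
        worker.bgw_start_time = BgWorkerStart_ConsistentState;
        worker.bgw_restart_time = BGW_NEVER_RESTART;
        strlcpy(worker.bgw_library_name, "postgres", sizeof(worker.bgw_library_name));
        strlcpy(worker.bgw_function_name, "ParallelWorkerMain", sizeof(worker.bgw_function_name));

        /* In the real code, the parallel context's DSM handle goes into bgw_main_arg. */
        return RegisterDynamicBackgroundWorker(&worker, handle);
    }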
--
Best regards,
Daniil Davydov
On Fri, May 2, 2025 at 11:58 PM Sami Imseih <samimseih@gmail.com> wrote:
I am generally -1 on the idea of autovacuum performing parallel
index vacuum, because I always felt that the parallel option should
be employed in a targeted manner for a specific table. If you have a bunch
of large tables, some more important than others, a/v may end
up using parallel resources on the least important tables and you
will have to adjust a/v settings per table, etc to get the right table
to be parallel index vacuumed by a/v.
Hm, this is a good point. I think I should clarify one thing - in
practice, there is a common situation where users have one huge table
among all databases (with 80+ indexes created on it). But, of course,
in general there may be only a few such tables.
But we can still adjust the autovac_idx_parallel_min_rows parameter.
If a table has a lot of dead tuples => it is actively used => table is
important (?).
Also, if the user can really determine the "importance" of each of the
tables, we can provide an appropriate table option. Tables with this
option set would be processed in parallel in priority order. What do
you think about such an idea?
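To make it concrete, the option could look something like this (purely
hypothetical syntax, just to illustrate the idea):
ALTER TABLE hot_table SET (parallel_idx_autovac_priority = true);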
Also, with the TIDStore improvements for index cleanup, and the practical
elimination of multi-pass index vacuums, I see this being even less
convincing as something to add to a/v.
If I understood correctly, we are talking about the fact that TIDStore
can store so many tuples that, in practice, a second pass is never
needed.
But the number of passes does not affect the presented optimization in
any way. What matters is the large number of indexes that must be
processed. Even within a single pass we can get a 40% increase in
speed.
Now, If I am going to allocate extra workers to run vacuum in parallel, why
not just provide more autovacuum workers instead so I can get more tables
vacuumed within a span of time?
For now, only one process can clean up indexes, so I don't see how
increasing the number of a/v workers will help in the situation that I
mentioned above.
Also, we don't consume additional resources during autovacuum in this
patch - the total number of a/v workers is always <= autovacuum_max_workers.
BTW, see the v2 patch attached to this message (bug fixes) :-)
--
Best regards,
Daniil Davydov
Attachments:
v2-0001-WIP-Allow-autovacuum-to-process-indexes-of-single.patchtext/x-patch; charset=US-ASCII; name=v2-0001-WIP-Allow-autovacuum-to-process-indexes-of-single.patchDownload
From 1c93a729b844a1dfe109e8d9e54d5cc0a941d061 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Sat, 3 May 2025 00:27:45 +0700
Subject: [PATCH v2] WIP Allow autovacuum to process indexes of single table in
parallel
---
src/backend/commands/vacuum.c | 27 +
src/backend/commands/vacuumparallel.c | 289 +++++-
src/backend/postmaster/autovacuum.c | 906 +++++++++++++++++-
src/backend/utils/misc/guc_tables.c | 30 +
src/backend/utils/misc/postgresql.conf.sample | 6 +
src/include/postmaster/autovacuum.h | 23 +
src/test/modules/autovacuum/.gitignore | 1 +
src/test/modules/autovacuum/Makefile | 14 +
.../autovacuum/t/001_autovac_parallel.pl | 137 +++
9 files changed, 1387 insertions(+), 46 deletions(-)
create mode 100644 src/test/modules/autovacuum/.gitignore
create mode 100644 src/test/modules/autovacuum/Makefile
create mode 100644 src/test/modules/autovacuum/t/001_autovac_parallel.pl
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 33a33bf6b1c..a5ef5319ccc 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2234,6 +2234,33 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params,
else
toast_relid = InvalidOid;
+ /*
+ * Decide whether we need to process table with given oid in parallel mode
+ * during autovacuum.
+ */
+ if (AmAutoVacuumWorkerProcess() &&
+ params->index_cleanup != VACOPTVALUE_DISABLED)
+ {
+ PgStat_StatTabEntry *tabentry;
+
+ /* fetch the pgstat table entry */
+ tabentry = pgstat_fetch_stat_tabentry_ext(rel->rd_rel->relisshared,
+ rel->rd_id);
+ if (tabentry && tabentry->dead_tuples >= autovac_idx_parallel_min_rows)
+ {
+ List *indexes = RelationGetIndexList(rel);
+ int num_indexes = list_length(indexes);
+
+ list_free(indexes);
+
+ if (num_indexes >= autovac_idx_parallel_min_indexes &&
+ max_parallel_index_autovac_workers > 0)
+ {
+ params->nworkers = max_parallel_index_autovac_workers;
+ }
+ }
+ }
+
/*
* Switch to the table owner's userid, so that any index functions are run
* as that user. Also lock down security-restricted operations and
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 2b9d548cdeb..cb4b7c23010 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,20 +1,23 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel [auto]vacuum execution.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
*
- * In a parallel vacuum, we perform both index bulk deletion and index cleanup
- * with parallel worker processes. Individual indexes are processed by one
- * vacuum process. ParallelVacuumState contains shared information as well as
- * the memory space for storing dead items allocated in the DSA area. We
+ * In a parallel [auto]vacuum, we perform both index bulk deletion and index
+ * cleanup with parallel worker processes. Individual indexes are processed by
+ * one vacuum process. ParallelVacuumState contains shared information as well
+ * as the memory space for storing dead items allocated in the DSA area. We
* launch parallel worker processes at the start of parallel index
* bulk-deletion and index cleanup and once all indexes are processed, the
* parallel worker processes exit. Each time we process indexes in parallel,
* the parallel context is re-initialized so that the same DSM can be used for
- * multiple passes of index bulk-deletion and index cleanup.
+ * multiple passes of index bulk-deletion and index cleanup. For maintenance
+ * vacuum, we launch workers manually (using dynamic bgworkers machinery), and
+ * for autovacuum we send signals to the autovacuum launcher (all logic for
+ * communication among parallel autovacuum processes is in autovacuum.c).
*
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -34,9 +37,11 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
+#include "utils/memutils.h"
#include "utils/rel.h"
/*
@@ -157,11 +162,20 @@ typedef struct PVIndStats
} PVIndStats;
/*
- * Struct for maintaining a parallel vacuum state. typedef appears in vacuum.h.
+ * Struct for maintaining a parallel [auto]vacuum state. typedef appears in
+ * vacuum.h.
*/
struct ParallelVacuumState
{
- /* NULL for worker processes */
+ /* Is this structure used for maintenance vacuum or autovacuum */
+ bool is_autovacuum;
+
+ /*
+ * NULL for worker processes.
+ *
+ * NOTE: Parallel autovacuum only needs a subset of the maintenance vacuum
+ * functionality.
+ */
ParallelContext *pcxt;
/* Parent Heap Relation */
@@ -221,6 +235,10 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static ParallelContext *CreateParallelAutoVacContext(int nworkers);
+static void InitializeParallelAutoVacDSM(ParallelContext *pcxt);
+static void DestroyParallelAutoVacContext(ParallelContext *pcxt);
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -280,15 +298,21 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
}
pvs = (ParallelVacuumState *) palloc0(sizeof(ParallelVacuumState));
+ pvs->is_autovacuum = AmAutoVacuumWorkerProcess();
pvs->indrels = indrels;
pvs->nindexes = nindexes;
pvs->will_parallel_vacuum = will_parallel_vacuum;
pvs->bstrategy = bstrategy;
pvs->heaprel = rel;
- EnterParallelMode();
- pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
- parallel_workers);
+ if (pvs->is_autovacuum)
+ pcxt = CreateParallelAutoVacContext(parallel_workers);
+ else
+ {
+ EnterParallelMode();
+ pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
+ parallel_workers);
+ }
Assert(pcxt->nworkers > 0);
pvs->pcxt = pcxt;
@@ -327,7 +351,10 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
else
querylen = 0; /* keep compiler quiet */
- InitializeParallelDSM(pcxt);
+ if (pvs->is_autovacuum)
+ InitializeParallelAutoVacDSM(pvs->pcxt);
+ else
+ InitializeParallelDSM(pcxt);
/* Prepare index vacuum stats */
indstats = (PVIndStats *) shm_toc_allocate(pcxt->toc, est_indstats_len);
@@ -371,11 +398,18 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->relid = RelationGetRelid(rel);
shared->elevel = elevel;
shared->queryid = pgstat_get_my_query_id();
- shared->maintenance_work_mem_worker =
- (nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
- shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
+
+ if (pvs->is_autovacuum)
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
+ autovacuum_work_mem;
+ else
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+ maintenance_work_mem;
+ shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
dead_items = TidStoreCreateShared(shared->dead_items_info.max_bytes,
@@ -453,8 +487,13 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
TidStoreDestroy(pvs->dead_items);
- DestroyParallelContext(pvs->pcxt);
- ExitParallelMode();
+ if (pvs->is_autovacuum)
+ DestroyParallelAutoVacContext(pvs->pcxt);
+ else
+ {
+ DestroyParallelContext((ParallelContext *) pvs->pcxt);
+ ExitParallelMode();
+ }
pfree(pvs->will_parallel_vacuum);
pfree(pvs);
@@ -532,6 +571,144 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
}
+/*
+ * Short version of CreateParallelContext (parallel.c). Here we init only those
+ * fields that are needed for parallel index processing during autovacuum.
+ */
+static ParallelContext *
+CreateParallelAutoVacContext(int nworkers)
+{
+ ParallelContext *pcxt;
+ MemoryContext oldcontext;
+
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* Number of workers should be non-negative. */
+ Assert(nworkers >= 0);
+
+ /* We might be running in a short-lived memory context. */
+ oldcontext = MemoryContextSwitchTo(TopTransactionContext);
+
+ /* Initialize a new ParallelContext. */
+ pcxt = palloc0(sizeof(ParallelContext));
+ pcxt->nworkers = nworkers;
+ pcxt->nworkers_to_launch = nworkers;
+ shm_toc_initialize_estimator(&pcxt->estimator);
+
+ /* Restore previous memory context. */
+ MemoryContextSwitchTo(oldcontext);
+
+ return pcxt;
+}
+
+/*
+ * Short version of InitializeParallelDSM (parallel.c). Here we put into dsm
+ * only those data that are needed for parallel index processing during
+ * autovacuum.
+ */
+static void
+InitializeParallelAutoVacDSM(ParallelContext *pcxt)
+{
+ MemoryContext oldcontext;
+ Size tsnaplen = 0;
+ Size asnaplen = 0;
+ Size segsize = 0;
+ char *tsnapspace;
+ char *asnapspace;
+ Snapshot transaction_snapshot = GetTransactionSnapshot();
+ Snapshot active_snapshot = GetActiveSnapshot();
+
+ Assert(pcxt->nworkers >= 1);
+
+ /* We might be running in a very short-lived memory context. */
+ oldcontext = MemoryContextSwitchTo(TopTransactionContext);
+
+ if (IsolationUsesXactSnapshot())
+ {
+ tsnaplen = EstimateSnapshotSpace(transaction_snapshot);
+ shm_toc_estimate_chunk(&pcxt->estimator, tsnaplen);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+ }
+ asnaplen = EstimateSnapshotSpace(active_snapshot);
+ shm_toc_estimate_chunk(&pcxt->estimator, asnaplen);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+
+ /* Create DSM and initialize with new table of contents. */
+ segsize = shm_toc_estimate(&pcxt->estimator);
+ pcxt->seg = dsm_create(segsize, DSM_CREATE_NULL_IF_MAXSEGMENTS);
+
+ if (pcxt->seg == NULL)
+ {
+ pcxt->nworkers = 0;
+ pcxt->private_memory = MemoryContextAlloc(TopMemoryContext, segsize);
+ }
+
+ pcxt->toc = shm_toc_create(AV_PARALLEL_MAGIC,
+ pcxt->seg == NULL ? pcxt->private_memory :
+ dsm_segment_address(pcxt->seg),
+ segsize);
+
+ /* We can skip the rest of this if we're not budgeting for any workers. */
+ if (pcxt->nworkers > 0)
+ {
+ /*
+ * Serialize the transaction snapshot if the transaction isolation
+ * level uses a transaction snapshot.
+ */
+ if (IsolationUsesXactSnapshot())
+ {
+ tsnapspace = shm_toc_allocate(pcxt->toc, tsnaplen);
+ SerializeSnapshot(transaction_snapshot, tsnapspace);
+ shm_toc_insert(pcxt->toc, AV_PARALLEL_KEY_TRANSACTION_SNAPSHOT,
+ tsnapspace);
+ }
+
+ /* Serialize the active snapshot. */
+ asnapspace = shm_toc_allocate(pcxt->toc, asnaplen);
+ SerializeSnapshot(active_snapshot, asnapspace);
+ shm_toc_insert(pcxt->toc, AV_PARALLEL_KEY_ACTIVE_SNAPSHOT, asnapspace);
+ }
+
+ /* Update nworkers_to_launch, in case we changed nworkers above. */
+ pcxt->nworkers_to_launch = pcxt->nworkers;
+
+ /* Restore previous memory context. */
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Short version of DestroyParallelContext (parallel.c). Here we clean up only
+ * those data that were used during parallel index processing during autovacuum.
+ */
+static void
+DestroyParallelAutoVacContext(ParallelContext *pcxt)
+{
+ /*
+ * If we have allocated a shared memory segment, detach it. This will
+ * implicitly detach the error queues, and any other shared memory queues,
+ * stored there.
+ */
+ if (pcxt->seg != NULL)
+ {
+ dsm_detach(pcxt->seg);
+ pcxt->seg = NULL;
+ }
+
+ /*
+ * If this parallel context is actually in backend-private memory rather
+ * than shared memory, free that memory instead.
+ */
+ if (pcxt->private_memory != NULL)
+ {
+ pfree(pcxt->private_memory);
+ pcxt->private_memory = NULL;
+ }
+
+ AutoVacuumReleaseParallelWork(false);
+ pfree(pcxt);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -558,7 +735,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster ||
+ (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()) ||
+ (max_parallel_index_autovac_workers == 0 && AmAutoVacuumWorkerProcess()))
return 0;
/*
@@ -597,15 +776,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = AmAutoVacuumWorkerProcess() ?
+ Min(parallel_workers, max_parallel_index_autovac_workers) :
+ Min(parallel_workers, max_parallel_maintenance_workers);
return parallel_workers;
}
/*
* Perform index vacuum or index cleanup with parallel workers. This function
- * must be used by the parallel vacuum leader process.
+ * must be used by the parallel [auto]vacuum leader process.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -670,7 +851,7 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
if (nworkers > 0)
{
/* Reinitialize parallel context to relaunch parallel workers */
- if (num_index_scans > 0)
+ if (num_index_scans > 0 && !pvs->is_autovacuum)
ReinitializeParallelDSM(pvs->pcxt);
/*
@@ -686,9 +867,22 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
* The number of workers can vary between bulkdelete and cleanup
* phase.
*/
- ReinitializeParallelWorkers(pvs->pcxt, nworkers);
-
- LaunchParallelWorkers(pvs->pcxt);
+ if (pvs->is_autovacuum)
+ {
+ pvs->pcxt->nworkers_to_launch = Min(pvs->pcxt->nworkers, nworkers);
+ if (pvs->pcxt->nworkers > 0 && pvs->pcxt->nworkers_to_launch > 0)
+ {
+ pvs->pcxt->nworkers_launched =
+ LaunchParallelAutovacuumWorkers(pvs->heaprel->rd_id,
+ pvs->pcxt->nworkers_to_launch,
+ dsm_segment_handle(pvs->pcxt->seg));
+ }
+ }
+ else
+ {
+ ReinitializeParallelWorkers(pvs->pcxt, nworkers);
+ LaunchParallelWorkers(pvs->pcxt);
+ }
if (pvs->pcxt->nworkers_launched > 0)
{
@@ -733,8 +927,14 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
if (nworkers > 0)
{
- /* Wait for all vacuum workers to finish */
- WaitForParallelWorkersToFinish(pvs->pcxt);
+ /*
+ * Wait for all [auto]vacuum workers involved in parallel index
+ * processing (if any) to finish and advance state machine.
+ */
+ if (pvs->is_autovacuum && pvs->pcxt->nworkers_launched >= 0)
+ ParallelAutovacuumEndSyncPoint(false);
+ else if (!pvs->is_autovacuum)
+ WaitForParallelWorkersToFinish(pvs->pcxt);
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
@@ -982,8 +1182,8 @@ parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
/*
* Perform work within a launched parallel process.
*
- * Since parallel vacuum workers perform only index vacuum or index cleanup,
- * we don't need to report progress information.
+ * Since parallel [auto]vacuum workers perform only index vacuum or index
+ * cleanup, we don't need to report progress information.
*/
void
parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
@@ -997,23 +1197,22 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
BufferUsage *buffer_usage;
WalUsage *wal_usage;
int nindexes;
+ int worker_number;
char *sharedquery;
ErrorContextCallback errcallback;
- /*
- * A parallel vacuum worker must have only PROC_IN_VACUUM flag since we
- * don't support parallel vacuum for autovacuum as of now.
- */
- Assert(MyProc->statusFlags == PROC_IN_VACUUM);
-
- elog(DEBUG1, "starting parallel vacuum worker");
+ Assert(MyProc->statusFlags == PROC_IN_VACUUM || AmAutoVacuumWorkerProcess());
+ elog(DEBUG1, "starting parallel [auto]vacuum worker");
shared = (PVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED, false);
/* Set debug_query_string for individual workers */
- sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
- debug_query_string = sharedquery;
- pgstat_report_activity(STATE_RUNNING, debug_query_string);
+ if (!AmAutoVacuumWorkerProcess())
+ {
+ sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
+ debug_query_string = sharedquery;
+ pgstat_report_activity(STATE_RUNNING, debug_query_string);
+ }
/* Track query ID */
pgstat_report_query_id(shared->queryid, false);
@@ -1091,8 +1290,12 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
/* Report buffer/WAL usage during parallel execution */
buffer_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_BUFFER_USAGE, false);
wal_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_WAL_USAGE, false);
- InstrEndParallelQuery(&buffer_usage[ParallelWorkerNumber],
- &wal_usage[ParallelWorkerNumber]);
+
+ worker_number = AmAutoVacuumWorkerProcess() ?
+ GetAutoVacuumParallelWorkerNumber() : ParallelWorkerNumber;
+
+ InstrEndParallelQuery(&buffer_usage[worker_number],
+ &wal_usage[worker_number]);
/* Report any remaining cost-based vacuum delay time */
if (track_cost_delay_timing)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 16756152b71..cb9c9f374bb 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -90,6 +90,7 @@
#include "postmaster/postmaster.h"
#include "storage/aio_subsys.h"
#include "storage/bufmgr.h"
+#include "storage/condition_variable.h"
#include "storage/ipc.h"
#include "storage/latch.h"
#include "storage/lmgr.h"
@@ -102,6 +103,7 @@
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
#include "utils/injection_point.h"
+#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
@@ -129,6 +131,9 @@ int autovacuum_anl_thresh;
double autovacuum_anl_scale;
int autovacuum_freeze_max_age;
int autovacuum_multixact_freeze_max_age;
+int max_parallel_index_autovac_workers;
+int autovac_idx_parallel_min_rows;
+int autovac_idx_parallel_min_indexes;
double autovacuum_vac_cost_delay;
int autovacuum_vac_cost_limit;
@@ -164,6 +169,14 @@ static int default_freeze_table_age;
static int default_multixact_freeze_min_age;
static int default_multixact_freeze_table_age;
+/*
+ * Number of additional workers that were requested for parallel index processing
+ * during autovacuum.
+ */
+static int nworkers_for_idx_autovac = 0;
+
+static int nworkers_launched = 0;
+
/* Memory context for long-lived data */
static MemoryContext AutovacMemCxt;
@@ -222,6 +235,8 @@ typedef struct autovac_table
* wi_proc pointer to PGPROC of the running worker, NULL if not started
* wi_launchtime Time at which this worker was launched
* wi_dobalance Whether this worker should be included in balance calculations
+ * wi_pcleanup if (> 0) => this worker must participate in parallel index
+ * vacuuming as a supportive worker. Must be (== 0) for the leader worker.
*
* All fields are protected by AutovacuumLock, except for wi_tableoid and
* wi_sharedrel which are protected by AutovacuumScheduleLock (note these
@@ -237,10 +252,17 @@ typedef struct WorkerInfoData
TimestampTz wi_launchtime;
pg_atomic_flag wi_dobalance;
bool wi_sharedrel;
+ int wi_pcleanup;
} WorkerInfoData;
typedef struct WorkerInfoData *WorkerInfo;
+#define AmParallelIdxAutoVacSupportive() \
+ (MyWorkerInfo != NULL && MyWorkerInfo->wi_pcleanup > 0)
+
+#define AmParallelIdxAutoVacLeader() \
+ (MyWorkerInfo != NULL && MyWorkerInfo->wi_pcleanup == 0)
+
/*
* Possible signals received by the launcher from remote processes. These are
* stored atomically in shared memory so that other processes can set them
@@ -250,9 +272,11 @@ typedef enum
{
AutoVacForkFailed, /* failed trying to start a worker */
AutoVacRebalance, /* rebalance the cost limits */
+ AutoVacParallelReq, /* request for parallel index vacuum */
+ AutoVacNumSignals, /* must be last */
} AutoVacuumSignal;
-#define AutoVacNumSignals (AutoVacRebalance + 1)
+#define AutoVacNumSignals (AutoVacParallelReq + 1)
/*
* Autovacuum workitem array, stored in AutoVacuumShmem->av_workItems. This
@@ -272,6 +296,50 @@ typedef struct AutoVacuumWorkItem
#define NUM_WORKITEMS 256
+typedef enum
+{
+ LAUNCHER = 0, /* autovacuum launcher must wake everyone up */
+ LEADER, /* leader must wake everyone up */
+ LAST_WORKER, /* the last initialized supportive worker must wake everyone
+ up */
+} SyncType;
+
+typedef enum
+{
+ STARTUP = 0, /* initial value - no sync points were passed */
+ START_SYNC_POINT_PASSED, /* start_sync_point was passed */
+ END_SYNC_POINT_PASSED, /* end_sync_point was passed */
+ SHUTDOWN, /* leader wants to shut down parallel index
+ vacuum due to an error that occurred */
+} Status;
+
+/*
+ * Structure stored in AutoVacuumShmem->pav_workItem. This is used for managing
+ * parallel index processing (within a single table).
+ */
+typedef struct ParallelAutoVacuumWorkItem
+{
+ Oid avw_database;
+ Oid avw_relation;
+ int nworkers_participating;
+ int nworkers_to_launch;
+ int nworkers_sleeping; /* leader doesn't count */
+ int nfinished; /* # of workers that have already finished parallel
+ index processing (and are probably already dead) */
+
+ dsm_handle handl;
+ int leader_proc_pid;
+
+ PGPROC *leader_proc;
+ ConditionVariable cv;
+
+ bool active; /* being processed */
+ bool leader_sleeping_on_ssp; /* sleeping on start sync point */
+ bool leader_sleeping_on_esp; /* sleeping on end sync point */
+ SyncType sync_type;
+ Status status;
+} ParallelAutoVacuumWorkItem;
+
/*-------------
* The main autovacuum shmem struct. On shared memory we store this main
* struct and the array of WorkerInfo structs. This struct keeps:
@@ -283,6 +351,8 @@ typedef struct AutoVacuumWorkItem
* av_startingWorker pointer to WorkerInfo currently being started (cleared by
* the worker itself as soon as it's up and running)
* av_workItems work item array
+ * pav_workItem information needed for parallel index processing within a
+ * single table
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
*
@@ -298,6 +368,7 @@ typedef struct
dlist_head av_runningWorkers;
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+ ParallelAutoVacuumWorkItem pav_workItem;
pg_atomic_uint32 av_nworkersForBalance;
} AutoVacuumShmemStruct;
@@ -322,11 +393,17 @@ pg_noreturn static void AutoVacLauncherShutdown(void);
static void launcher_determine_sleep(bool canlaunch, bool recursing,
struct timeval *nap);
static void launch_worker(TimestampTz now);
+static void launch_worker_for_pcleanup(TimestampTz now);
+static void eliminate_lock_conflicts(ParallelAutoVacuumWorkItem *item,
+ bool all_launched);
static List *get_database_list(void);
static void rebuild_database_list(Oid newdb);
static int db_comparator(const void *a, const void *b);
static void autovac_recalculate_workers_for_balance(void);
+static int parallel_autovacuum_start_sync_point(bool keep_lock);
+static void handle_parallel_idx_autovac_errors(void);
+
static void do_autovacuum(void);
static void FreeWorkerInfo(int code, Datum arg);
@@ -355,6 +432,10 @@ static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+typedef bool (*wakeup_condition) (ParallelAutoVacuumWorkItem *item);
+static bool start_sync_point_wakeup_cond(ParallelAutoVacuumWorkItem *item);
+static bool end_sync_point_wakeup_cond(ParallelAutoVacuumWorkItem *item);
+static void CVSleep(ParallelAutoVacuumWorkItem *item, wakeup_condition wakeup_cond);
/********************************************************************
@@ -583,7 +664,14 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
* wakening conditions.
*/
- launcher_determine_sleep(av_worker_available(), false, &nap);
+ if (nworkers_launched < nworkers_for_idx_autovac)
+ {
+ /* Take the smallest possible sleep interval. */
+ nap.tv_sec = 0;
+ nap.tv_usec = MIN_AUTOVAC_SLEEPTIME * 1000;
+ }
+ else
+ launcher_determine_sleep(av_worker_available(), false, &nap);
/*
* Wait until naptime expires or we get some type of signal (all the
@@ -614,6 +702,19 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
LWLockRelease(AutovacuumLock);
}
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ if (AutoVacuumShmem->av_signal[AutoVacParallelReq])
+ {
+ ParallelAutoVacuumWorkItem *item;
+
+ AutoVacuumShmem->av_signal[AutoVacParallelReq] = false;
+
+ item = &AutoVacuumShmem->pav_workItem;
+ nworkers_for_idx_autovac = item->nworkers_to_launch;
+ nworkers_launched = 0;
+ }
+ LWLockRelease(AutovacuumLock);
+
if (AutoVacuumShmem->av_signal[AutoVacForkFailed])
{
/*
@@ -686,6 +787,7 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
worker->wi_sharedrel = false;
worker->wi_proc = NULL;
worker->wi_launchtime = 0;
+ worker->wi_pcleanup = -1;
dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
&worker->wi_links);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -698,9 +800,29 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
}
LWLockRelease(AutovacuumLock); /* either shared or exclusive */
- /* if we can't do anything, just go back to sleep */
if (!can_launch)
+ {
+ /*
+ * If the launcher cannot launch all workers requested for parallel index
+ * vacuum, it must handle all possible lock conflicts and tell everyone
+ * that there will be no new supportive workers.
+ */
+ if (nworkers_launched < nworkers_for_idx_autovac)
+ {
+ ParallelAutoVacuumWorkItem *item;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ item = &AutoVacuumShmem->pav_workItem;
+ Assert(item->active);
+
+ eliminate_lock_conflicts(item, false);
+ nworkers_launched = nworkers_for_idx_autovac = 0;
+ LWLockRelease(AutovacuumLock);
+ }
+
+ /* if we can't do anything else, just go back to sleep */
continue;
+ }
/* We're OK to start a new worker */
@@ -716,6 +838,15 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
*/
launch_worker(current_time);
}
+ else if (nworkers_launched < nworkers_for_idx_autovac)
+ {
+ /*
+ * One of the active autovacuum workers sent us a request to launch
+ * participants for parallel index vacuum. We check this case first
+ * because we need to start participants as soon as possible.
+ */
+ launch_worker_for_pcleanup(current_time);
+ }
else
{
/*
@@ -1267,6 +1398,7 @@ do_start_worker(void)
worker->wi_dboid = avdb->adw_datid;
worker->wi_proc = NULL;
worker->wi_launchtime = GetCurrentTimestamp();
+ worker->wi_pcleanup = -1;
AutoVacuumShmem->av_startingWorker = worker;
@@ -1349,6 +1481,136 @@ launch_worker(TimestampTz now)
}
}
+/*
+ * launch_worker_for_pcleanup
+ *
+ * Wrapper for starting a worker (requested by leader of parallel index
+ * vacuuming) from the launcher.
+ */
+static void
+launch_worker_for_pcleanup(TimestampTz now)
+{
+ ParallelAutoVacuumWorkItem *item;
+ WorkerInfo worker;
+ dlist_node *wptr;
+
+ Assert(nworkers_launched < nworkers_for_idx_autovac);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Get a worker entry from the freelist. We checked above, so there
+ * really should be a free slot.
+ */
+ wptr = dclist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
+
+ worker = dlist_container(WorkerInfoData, wi_links, wptr);
+ worker->wi_dboid = InvalidOid;
+ worker->wi_proc = NULL;
+ worker->wi_launchtime = GetCurrentTimestamp();
+
+ /*
+ * Set an indicator that this worker must join the parallel index vacuum.
+ * This variable also plays the role of a unique id among parallel index
+ * vacuum workers. The first id is '1', because '0' is reserved for the leader.
+ */
+ worker->wi_pcleanup = (nworkers_launched + 1);
+
+ AutoVacuumShmem->av_startingWorker = worker;
+
+ SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_WORKER);
+
+ item = &AutoVacuumShmem->pav_workItem;
+ Assert(item->active);
+
+ nworkers_launched += 1;
+
+ if (nworkers_launched < nworkers_for_idx_autovac)
+ {
+ LWLockRelease(AutovacuumLock);
+ return;
+ }
+
+ Assert(item->sync_type == LAUNCHER &&
+ nworkers_launched == nworkers_for_idx_autovac);
+
+ /*
+ * If the launcher managed to launch all workers requested for parallel
+ * index vacuum, it must handle all possible lock conflicts.
+ */
+ eliminate_lock_conflicts(item, true);
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Must be called from the autovacuum launcher when it has launched all
+ * requested workers for parallel index vacuum, or when it realizes that no
+ * more processes can be launched.
+ *
+ * In this function launcher will assign roles in such a way as to avoid lock
+ * conflicts between leader and supportive workers.
+ *
+ * AutovacuumLock must be held in exclusive mode before calling this function!
+ */
+static void
+eliminate_lock_conflicts(ParallelAutoVacuumWorkItem *item, bool all_launched)
+{
+ Assert(AmAutoVacuumLauncherProcess());
+ Assert(LWLockHeldByMe(AutovacuumLock));
+
+ /* So, let's start... */
+
+ if (item->leader_sleeping_on_ssp &&
+ item->nworkers_sleeping == nworkers_launched)
+ {
+ /*
+ * If both leader and all launched supportive workers are sleeping, then
+ * only we can wake everyone up.
+ */
+ ConditionVariableBroadcast(&item->cv);
+
+ /* Advance status. */
+ item->status = START_SYNC_POINT_PASSED;
+ }
+ else if (item->leader_sleeping_on_ssp &&
+ item->nworkers_sleeping < nworkers_launched)
+ {
+ /*
+ * If the leader is already sleeping, but several supportive workers are
+ * still initializing, we shift the responsibility for waking everyone to
+ * the worker that completes initialization last.
+ */
+ item->sync_type = LAST_WORKER;
+ }
+ else if (!item->leader_sleeping_on_ssp &&
+ item->nworkers_sleeping == nworkers_launched)
+ {
+ /*
+ * If only the leader is not sleeping, it must wake up all workers when it
+ * finishes all preparations.
+ */
+ item->sync_type = LEADER;
+ }
+ else
+ {
+ /*
+ * If nobody is sleeping, we assume that the leader has a higher chance
+ * of falling asleep first, so set the sync type to LAST_WORKER. But if
+ * the last worker sees that the leader is still not sleeping, it will
+ * change the sync type to LEADER and go to sleep itself.
+ */
+ item->sync_type = LAST_WORKER;
+ }
+
+ /*
+ * If we cannot launch all requested workers, refresh
+ * the nworkers_to_launch value, so that the last worker can find out
+ * that it is really the last.
+ */
+ if (!all_launched && item->sync_type == LAST_WORKER)
+ item->nworkers_to_launch = nworkers_launched;
+}
+
/*
* Called from postmaster to signal a failure to fork a process to become
* worker. The postmaster should kill(SIGUSR2) the launcher shortly
@@ -1360,6 +1622,37 @@ AutoVacWorkerFailed(void)
AutoVacuumShmem->av_signal[AutoVacForkFailed] = true;
}
+/*
+ * Called from an autovacuum worker to signal that it needs participants in
+ * parallel index vacuum. The function sends SIGUSR2 to the launcher and
+ * returns 'true' iff the signal was sent successfully.
+ */
+bool
+AutoVacParallelWorkRequest(void)
+{
+ if (AutoVacuumShmem->av_launcherpid == 0)
+ {
+ ereport(WARNING,
+ (errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg("autovacuum launcher is dead")));
+
+ return false;
+ }
+
+ if (kill(AutoVacuumShmem->av_launcherpid, SIGUSR2) < 0)
+ {
+ ereport(WARNING,
+ (errcode(ERRCODE_SYSTEM_ERROR),
+ errmsg("failed to send signal to autovac launcher (pid %d): %m",
+ AutoVacuumShmem->av_launcherpid)));
+
+ return false;
+ }
+
+ AutoVacuumShmem->av_signal[AutoVacParallelReq] = true;
+ return true;
+}
+
/* SIGUSR2: a worker is up and running, or just finished, or failed to fork */
static void
avl_sigusr2_handler(SIGNAL_ARGS)
@@ -1559,6 +1852,8 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
{
char dbname[NAMEDATALEN];
+ Assert(MyWorkerInfo->wi_pcleanup < 0);
+
/*
* Report autovac startup to the cumulative stats system. We
* deliberately do this before InitPostgres, so that the
@@ -1593,12 +1888,113 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
recentMulti = ReadNextMultiXactId();
do_autovacuum();
}
+ else if (AmParallelIdxAutoVacSupportive())
+ {
+ ParallelAutoVacuumWorkItem *item;
+ dsm_handle handle;
+ PGPROC *leader_proc;
+ int leader_proc_pid;
+ dsm_segment *seg;
+ shm_toc *toc;
+ char *asnapspace;
+ char *tsnapspace;
+ char dbname[NAMEDATALEN];
+ Snapshot tsnapshot;
+ Snapshot asnapshot;
+
+ /*
+ * We will abort parallel index vacuuming whithin current process if
+ * something errors out
+ */
+ PG_TRY();
+ {
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ item = &AutoVacuumShmem->pav_workItem;
+ dbid = item->avw_database;
+ handle = item->handl;
+ leader_proc = item->leader_proc;
+ leader_proc_pid = item->leader_proc_pid;
+ LWLockRelease(AutovacuumLock);
+
+ InitPostgres(NULL, dbid, NULL, InvalidOid,
+ INIT_PG_OVERRIDE_ALLOW_CONNS,
+ dbname);
+
+ set_ps_display(dbname);
+ if (PostAuthDelay)
+ pg_usleep(PostAuthDelay * 1000000L);
+
+ /* And do an appropriate amount of work */
+ recentXid = ReadNextTransactionId();
+ recentMulti = ReadNextMultiXactId();
+
+ if (parallel_autovacuum_start_sync_point(false) == -1)
+ {
+ /* We are not participating anymore */
+ MyWorkerInfo->wi_pcleanup = -1;
+ goto exit;
+ }
+
+ seg = dsm_attach(handle);
+ if (seg == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("could not map dynamic shared memory segment")));
+
+ toc = shm_toc_attach(AV_PARALLEL_MAGIC, dsm_segment_address(seg));
+ if (toc == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("invalid magic number in dynamic shared memory segment")));
+
+ if (!BecomeLockGroupMember(leader_proc, leader_proc_pid))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("could not become lock group member")));
+ }
+
+ StartTransactionCommand();
+
+ asnapspace =
+ shm_toc_lookup(toc, AV_PARALLEL_KEY_ACTIVE_SNAPSHOT, false);
+ tsnapspace =
+ shm_toc_lookup(toc, AV_PARALLEL_KEY_TRANSACTION_SNAPSHOT, true);
+ asnapshot = RestoreSnapshot(asnapspace);
+ tsnapshot = tsnapspace ? RestoreSnapshot(tsnapspace) : asnapshot;
+ RestoreTransactionSnapshot(tsnapshot, leader_proc);
+ PushActiveSnapshot(asnapshot);
+
+ /*
+ * We've changed which tuples we can see, and must therefore
+ * invalidate system caches.
+ */
+ InvalidateSystemCaches();
+
+ parallel_vacuum_main(seg, toc);
+
+ /* Must pop active snapshot so snapmgr.c doesn't complain. */
+ PopActiveSnapshot();
+
+ dsm_detach(seg);
+ CommitTransactionCommand();
+ ParallelAutovacuumEndSyncPoint(false);
+ }
+ PG_CATCH();
+ {
+ EmitErrorReport();
+ if (AmParallelIdxAutoVacSupportive())
+ handle_parallel_idx_autovac_errors();
+ }
+ PG_END_TRY();
+ }
/*
* The launcher will be notified of my death in ProcKill, *if* we managed
* to get a worker slot at all
*/
+exit:
/* All done, go away */
proc_exit(0);
}
@@ -2461,6 +2857,10 @@ do_autovacuum(void)
tab->at_datname, tab->at_nspname, tab->at_relname);
EmitErrorReport();
+ /* if we are parallel index vacuuming leader, we must shut it down */
+ if (AmParallelIdxAutoVacLeader())
+ handle_parallel_idx_autovac_errors();
+
/* this resets ProcGlobal->statusFlags[i] too */
AbortOutOfAnyTransaction();
FlushErrorState();
@@ -3296,6 +3696,503 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * Release work item, used for managing parallel index vacuum. Must be called
+ * once and only from leader worker.
+ *
+ * If 'keep_lock' is true, then AutovacuumLock will not be released in the end
+ * of function execution.
+ */
+void
+AutoVacuumReleaseParallelWork(bool keep_lock)
+{
+ ParallelAutoVacuumWorkItem *workitem;
+
+ /*
+ * We might not get the workitem from launcher (we must not be considered
+ * as leader in this case), so just leave.
+ */
+ if (!AmParallelIdxAutoVacLeader())
+ return;
+
+ if (!LWLockHeldByMe(AutovacuumLock))
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ workitem = &AutoVacuumShmem->pav_workItem;
+
+ Assert(AmParallelIdxAutoVacLeader() &&
+ workitem->leader_proc_pid == MyProcPid);
+
+ workitem->leader_proc = NULL;
+ workitem->leader_proc_pid = 0;
+ workitem->active = false;
+
+ /* We are not leader anymore. */
+ MyWorkerInfo->wi_pcleanup = -1;
+
+ if (!keep_lock)
+ LWLockRelease(AutovacuumLock);
+}
+
+static bool
+start_sync_point_wakeup_cond(ParallelAutoVacuumWorkItem *item)
+{
+ bool need_wakeup = false;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ if (AmParallelIdxAutoVacLeader())
+ {
+ /*
+ * In normal case we should exit sleep loop after last launched
+ * supportive worker passed sync point (status == START_SYNC_POINT_PASSED).
+ * But if we are in SHUTDOWN mode, all launched workers will just exit
+ * sync point whithout status advancing. We can handle such case if we
+ * check that n_participating == n_to_launch.
+ */
+ if (item->status == SHUTDOWN)
+ need_wakeup = (item->nworkers_participating == item->nworkers_to_launch);
+ else
+ need_wakeup = item->status == START_SYNC_POINT_PASSED;
+ }
+ else
+ need_wakeup = (item->status == START_SYNC_POINT_PASSED ||
+ item->status == SHUTDOWN);
+
+ LWLockRelease(AutovacuumLock);
+ return need_wakeup;
+}
+
+static bool
+end_sync_point_wakeup_cond(ParallelAutoVacuumWorkItem *item)
+{
+ bool need_wakeup = false;
+
+ Assert(AmParallelIdxAutoVacLeader());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ need_wakeup = item->status == END_SYNC_POINT_PASSED;
+ LWLockRelease(AutovacuumLock);
+ return need_wakeup;
+}
+
+/*
+ * Waiting on condition variable is frequent operation, so it has beed taken
+ * out with a separate function. Caller must acquire hold AutovacuumLock before
+ * calling it.
+ */
+static void
+CVSleep(ParallelAutoVacuumWorkItem *item, wakeup_condition wakeup_cond)
+{
+ ConditionVariablePrepareToSleep(&item->cv);
+
+ LWLockRelease(AutovacuumLock);
+ PG_TRY();
+ {
+ do
+ {
+ ConditionVariableSleep(&item->cv, PG_WAIT_IPC);
+ } while (!wakeup_cond(item));
+ }
+ PG_CATCH();
+ {
+ ConditionVariableCancelSleep();
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
+
+ ConditionVariableCancelSleep();
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+}
+
+/*
+ * This function is used to synchronize the leader with supportive workers during
+ * parallel index vacuuming. Each process will exit iff:
+ * Leader worker is ready to perform parallel vacuum &&
+ * All launched supportive workers are ready to perform parallel vacuum &&
+ * (Autovacuum launcher already launched all requested workers ||
+ * Autovacuum launcher cannot launch more workers)
+ *
+ * If 'keep_lock' is true, then AutovacuumLock will not be released in the end
+ * of function execution.
+ *
+ * NOTE: Some workers may call this function after the leader has decided to
+ * shut down parallel vacuuming. In this case '-1' will be returned.
+ */
+static int
+parallel_autovacuum_start_sync_point(bool keep_lock)
+{
+ ParallelAutoVacuumWorkItem *workitem;
+ SyncType sync_type;
+ int num_participants;
+
+ if (!LWLockHeldByMe(AutovacuumLock))
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ workitem = &AutoVacuumShmem->pav_workItem;
+ Assert(workitem->active);
+ sync_type = workitem->sync_type;
+
+ if (AmParallelIdxAutoVacLeader())
+ {
+ Assert(workitem->leader_proc_pid == MyProcPid);
+
+ /* Wake up all sleeping supportive workers, if required ... */
+ if (sync_type == LEADER)
+ {
+ ConditionVariableBroadcast(&workitem->cv);
+
+ /*
+ * Advance status, because we are guaranteed to pass this
+ * sync point.
+ * Don't advance it if we call this function from the error handling path
+ * (status == SHUTDOWN).
+ */
+ if (workitem->status != SHUTDOWN)
+ workitem->status = START_SYNC_POINT_PASSED;
+ }
+ /* ... otherwise, wait for somebody to wake us up */
+ else
+ {
+ workitem->leader_sleeping_on_ssp = true;
+ CVSleep(workitem, start_sync_point_wakeup_cond);
+ workitem->leader_sleeping_on_ssp = false;
+
+ /*
+ * A priori, we believe that in the end everyone should be awakened
+ * by the leader.
+ */
+ workitem->sync_type = LEADER;
+ }
+ }
+ else
+ {
+ workitem->nworkers_participating += 1;
+
+ /*
+ * If we know that the launcher will no longer attempt to launch more
+ * supportive workers for this item => we are the LAST_WORKER for sure.
+ *
+ * Note that the launcher sets the LAST_WORKER sync type without knowing
+ * the current status of the leader. So we also check that the leader is
+ * sleeping before waking everyone up. Otherwise, we must wait for the
+ * leader (and ask it to wake everyone up).
+ */
+ if (workitem->nworkers_participating == workitem->nworkers_to_launch &&
+ sync_type == LAST_WORKER && workitem->leader_sleeping_on_ssp)
+ {
+ ConditionVariableBroadcast(&workitem->cv);
+
+ /*
+ * We must not advance the status if the leader wants to shut down parallel
+ * execution (see checks below).
+ */
+ if (workitem->status != SHUTDOWN)
+ workitem->status = START_SYNC_POINT_PASSED;
+ }
+ else
+ {
+ if (workitem->nworkers_participating == workitem->nworkers_to_launch &&
+ sync_type == LAST_WORKER)
+ {
+ workitem->sync_type = LEADER;
+ }
+
+ workitem->nworkers_sleeping += 1;
+ CVSleep(workitem, start_sync_point_wakeup_cond);
+ workitem->nworkers_sleeping -= 1;
+ }
+ }
+
+ /* Tell caller that it must not participate in parallel index cleanup. */
+ if (workitem->status == SHUTDOWN)
+ num_participants = -1;
+ else
+ num_participants = workitem->nworkers_participating;
+
+ if (!keep_lock)
+ LWLockRelease(AutovacuumLock);
+
+ return num_participants;
+}
+
+/*
+ * Like function above, but must be called by leader and supportive workers
+ * when they finished parallel index vacuum.
+ *
+ * If 'keep_lock' is true, then AutovacuumLock will not be released in the end
+ * of function execution.
+ */
+void
+ParallelAutovacuumEndSyncPoint(bool keep_lock)
+{
+ ParallelAutoVacuumWorkItem *workitem;
+
+ if (!LWLockHeldByMe(AutovacuumLock))
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ workitem = &AutoVacuumShmem->pav_workItem;
+ Assert(workitem->active);
+
+ if (workitem->nworkers_participating == 0)
+ {
+ Assert(!AmParallelIdxAutoVacSupportive());
+
+ /*
+ * We have two cases when no supportive workers were launched:
+ * 1) The leader got the workitem, but the launcher didn't launch any
+ * workers => just advance the status, because we don't need to wait
+ * for anybody.
+ * 2) The leader didn't get the workitem, because it was already in use =>
+ * we must not touch it. Just leave.
+ */
+ if (AmParallelIdxAutoVacLeader())
+ {
+ Assert(workitem->leader_proc_pid == MyProcPid);
+ workitem->status = END_SYNC_POINT_PASSED;
+ }
+ else
+ Assert(workitem->leader_proc_pid != MyProcPid);
+
+ if (!keep_lock)
+ LWLockRelease(AutovacuumLock);
+
+ return;
+ }
+
+ if (AmParallelIdxAutoVacLeader())
+ {
+ Assert(workitem->leader_proc_pid == MyProcPid);
+ Assert(workitem->sync_type == LEADER);
+
+ /* Wait for all workers to finish (only last worker will wake us up) */
+ if (workitem->nfinished != workitem->nworkers_participating)
+ {
+ workitem->sync_type = LAST_WORKER;
+ workitem->leader_sleeping_on_esp = true;
+ CVSleep(workitem, end_sync_point_wakeup_cond);
+ workitem->leader_sleeping_on_esp = false;
+
+ Assert(workitem->nfinished == workitem->nworkers_participating);
+
+ /*
+ * Advance status, because we are guaranteed to pass this
+ * sync point.
+ */
+ workitem->status = END_SYNC_POINT_PASSED;
+ }
+ }
+ else
+ {
+ workitem->nfinished += 1;
+
+ /*
+ * If we are the last finished worker - wake up the leader.
+ *
+ * If not - just leave, because this supportive worker has already
+ * finished all its work and must die.
+ */
+ if (workitem->sync_type == LAST_WORKER &&
+ workitem->nfinished == workitem->nworkers_participating &&
+ workitem->leader_sleeping_on_esp)
+ {
+ ConditionVariableBroadcast(&workitem->cv);
+
+ /*
+ * Don't need to check SHUTDOWN status here - all supportive workers
+ * are about to finish anyway.
+ */
+ workitem->status = END_SYNC_POINT_PASSED;
+ }
+
+ /* We do not participate anymore */
+ MyWorkerInfo->wi_pcleanup = -1;
+ }
+
+ if (!keep_lock)
+ LWLockRelease(AutovacuumLock);
+
+ return;
+}
+
+/*
+ * Get id of parallel index vacuum worker (counting from 0).
+ */
+int
+GetAutoVacuumParallelWorkerNumber(void)
+{
+ Assert(AmAutoVacuumWorkerProcess() && MyWorkerInfo->wi_pcleanup > 0);
+ return (MyWorkerInfo->wi_pcleanup - 1);
+}
+
+/*
+ * The leader autovacuum process can decide that it needs several helper
+ * workers to process a table in parallel mode. It must set up the parallel
+ * context and call LaunchParallelAutovacuumWorkers.
+ *
+ * In this function we do the following:
+ * 1) Send a signal to the autovacuum launcher, which creates 'supportive
+ * workers' during the launcher's standard work loop.
+ * 2) Wait for the supportive workers to start.
+ *
+ * The function returns the number of workers that the launcher was able to
+ * launch (may be less than 'nworkers_to_launch').
+ */
+int
+LaunchParallelAutovacuumWorkers(Oid rel_id, int nworkers_to_launch,
+ dsm_handle handle)
+{
+ int nworkers_launched = 0;
+ ParallelAutoVacuumWorkItem *workitem;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ workitem = &AutoVacuumShmem->pav_workItem;
+
+ /*
+ * For now, there can be only one leader across the whole cluster.
+ * TODO: fix it in future versions
+ */
+ if (workitem->active && workitem->leader_proc_pid != MyProcPid)
+ {
+ LWLockRelease(AutovacuumLock);
+ return -1;
+ }
+
+ /* Notify autovacuum launcher that we need supportive workers */
+ if (AutoVacParallelWorkRequest())
+ {
+ /* OK, we can use this workitem entry. Init it. */
+ workitem->avw_database = MyDatabaseId;
+ workitem->avw_relation = rel_id;
+ workitem->handl = handle;
+ workitem->leader_proc = MyProc;
+ workitem->leader_proc_pid = MyProcPid;
+ workitem->nworkers_participating = 0;
+ workitem->nworkers_to_launch = nworkers_to_launch;
+ workitem->leader_sleeping_on_ssp = false;
+ workitem->leader_sleeping_on_esp = false;
+ workitem->nworkers_sleeping = 0;
+ workitem->nfinished = 0;
+ workitem->sync_type = LAUNCHER;
+ workitem->status = STARTUP;
+
+ workitem->active = true;
+ LWLockRelease(AutovacuumLock);
+
+ /* Become the leader */
+ MyWorkerInfo->wi_pcleanup = 0;
+
+ /* All created workers must get same locks as leader process */
+ BecomeLockGroupLeader();
+
+ /*
+ * Wait until all supportive workers are launched. Also retrieve the
+ * actual number of participants.
+ */
+
+ nworkers_launched = parallel_autovacuum_start_sync_point(false);
+ Assert(nworkers_launched >= 0);
+ }
+ else
+ {
+ /*
+ * If we (for any reason) cannot send a signal to the launcher, don't
+ * try to do index vacuuming in parallel.
+ */
+ LWLockRelease(AutovacuumLock);
+ return 0;
+ }
+
+ return nworkers_launched;
+}
+
+/*
+ * During parallel index vacuuming any worker (both supportive workers and the
+ * leader) can catch an error.
+ * In order to handle it in the right way, this function must be called.
+ */
+static void
+handle_parallel_idx_autovac_errors(void)
+{
+ ParallelAutoVacuumWorkItem *item;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ item = &AutoVacuumShmem->pav_workItem;
+
+ if (AmParallelIdxAutoVacLeader())
+ {
+ if (item->status == START_SYNC_POINT_PASSED)
+ {
+ /*
+ * If the start sync point has already been passed - just wait for all
+ * supportive workers to finish, then exit.
+ */
+ ParallelAutovacuumEndSyncPoint(true);
+ }
+ else if (item->status == STARTUP)
+ {
+ /*
+ * If no sync point has been passed, we can prevent supportive
+ * workers from performing their work - set the SHUTDOWN status and
+ * wait until all workers have seen it.
+ */
+ item->status = SHUTDOWN;
+ parallel_autovacuum_start_sync_point(true);
+ }
+
+ AutoVacuumReleaseParallelWork(true);
+ }
+ else
+ {
+ Assert(AmParallelIdxAutoVacSupportive());
+
+ if (item->status == STARTUP || item->status == SHUTDOWN)
+ {
+ /*
+ * If no sync point has been passed - just exclude ourselves from
+ * the participants. Further parallel index vacuuming will take
+ * place as usual.
+ */
+ item->nworkers_to_launch -= 1;
+
+ if (item->nworkers_participating == item->nworkers_to_launch &&
+ item->sync_type == LAST_WORKER && item->leader_sleeping_on_ssp)
+ {
+ ConditionVariableBroadcast(&item->cv);
+
+ if (item->status != SHUTDOWN)
+ item->status = START_SYNC_POINT_PASSED;
+ }
+ }
+ else if (item->status == START_SYNC_POINT_PASSED)
+ {
+ /*
+ * If the start sync point has already been passed, we simulate the
+ * usual end of work (see ParallelAutovacuumEndSyncPoint).
+ */
+ item->nfinished += 1;
+
+ /*
+ * We check "!item->leader_sleeping_on_ssp" in order to handle an
+ * almost impossible situation, when the leader didn't have time to
+ * wake up after the start sync point (but the last worker had
+ * already advanced the status to START_SYNC_POINT_PASSED). In this
+ * case we should not advance the status to END_SYNC_POINT_PASSED,
+ * so the leader can continue processing.
+ */
+ if (item->sync_type == LAST_WORKER &&
+ item->nfinished == item->nworkers_participating &&
+ !item->leader_sleeping_on_ssp)
+ {
+ ConditionVariableBroadcast(&item->cv);
+ item->status = END_SYNC_POINT_PASSED;
+ }
+ }
+ }
+
+ LWLockRelease(AutovacuumLock);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3361,6 +4258,9 @@ AutoVacuumShmemInit(void)
AutoVacuumShmem->av_startingWorker = NULL;
memset(AutoVacuumShmem->av_workItems, 0,
sizeof(AutoVacuumWorkItem) * NUM_WORKITEMS);
+ memset(&AutoVacuumShmem->pav_workItem, 0,
+ sizeof(ParallelAutoVacuumWorkItem));
+ ConditionVariableInit(&AutoVacuumShmem->pav_workItem.cv);
worker = (WorkerInfo) ((char *) AutoVacuumShmem +
MAXALIGN(sizeof(AutoVacuumShmemStruct)));
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 2f8cbd86759..2e36921097a 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3647,6 +3647,36 @@ struct config_int ConfigureNamesInt[] =
check_autovacuum_work_mem, NULL, NULL
},
+ {
+ {"max_parallel_index_autovac_workers", PGC_POSTMASTER, VACUUM_AUTOVACUUM,
+ gettext_noop("Sets the maximum number of parallel autovacuum worker processes during parallel index vacuuming of single table."),
+ NULL
+ },
+ &max_parallel_index_autovac_workers,
+ 0, 0, MAX_PARALLEL_WORKER_LIMIT,
+ NULL, NULL, NULL
+ },
+
+ {
+ {"autovac_idx_parallel_min_rows", PGC_POSTMASTER, VACUUM_AUTOVACUUM,
+ gettext_noop("Sets the minimum number of dead tuples in single table that requires parallel index processing during autovacuum."),
+ NULL
+ },
+ &autovac_idx_parallel_min_rows,
+ 0, 0, INT32_MAX,
+ NULL, NULL, NULL
+ },
+
+ {
+ {"autovac_idx_parallel_min_indexes", PGC_POSTMASTER, VACUUM_AUTOVACUUM,
+ gettext_noop("Sets the minimum number indexes created on single table that requires parallel index processing during autovacuum."),
+ NULL
+ },
+ &autovac_idx_parallel_min_indexes,
+ 2, 2, INT32_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"tcp_keepalives_idle", PGC_USERSET, CONN_AUTH_TCP,
gettext_noop("Time between issuing TCP keepalives."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 34826d01380..08869398039 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -146,6 +146,12 @@
#hash_mem_multiplier = 2.0 # 1-1000.0 multiplier on hash table work_mem
#maintenance_work_mem = 64MB # min 64kB
#autovacuum_work_mem = -1 # min 64kB, or -1 to use maintenance_work_mem
+#max_parallel_index_autovac_workers = 0 # the feature is disabled by default
+ # (change requires restart)
+#autovac_idx_parallel_min_rows = 0
+ # (change requires restart)
+#autovac_idx_parallel_min_indexes = 2
+ # (change requires restart)
#logical_decoding_work_mem = 64MB # min 64kB
#max_stack_depth = 2MB # min 100kB
#shared_memory_type = mmap # the default is the first option
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..8647154437b 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -15,6 +15,8 @@
#define AUTOVACUUM_H
#include "storage/block.h"
+#include "storage/dsm_impl.h"
+#include "storage/lock.h"
/*
* Other processes can request specific work from autovacuum, identified by
@@ -25,12 +27,25 @@ typedef enum
AVW_BRINSummarizeRange,
} AutoVacuumWorkItemType;
+/*
+ * Magic number for parallel context TOC. Used for parallel index processing
+ * during autovacuum.
+ */
+#define AV_PARALLEL_MAGIC 0xaaaaaaaa
+
+/* Magic numbers for per-context parallel index processing state sharing. */
+#define AV_PARALLEL_KEY_TRANSACTION_SNAPSHOT UINT64CONST(0xFFF0000000000001)
+#define AV_PARALLEL_KEY_ACTIVE_SNAPSHOT UINT64CONST(0xFFF0000000000002)
+
/* GUC variables */
extern PGDLLIMPORT bool autovacuum_start_daemon;
extern PGDLLIMPORT int autovacuum_worker_slots;
extern PGDLLIMPORT int autovacuum_max_workers;
extern PGDLLIMPORT int autovacuum_work_mem;
+extern PGDLLIMPORT int max_parallel_index_autovac_workers;
+extern PGDLLIMPORT int autovac_idx_parallel_min_rows;
+extern PGDLLIMPORT int autovac_idx_parallel_min_indexes;
extern PGDLLIMPORT int autovacuum_naptime;
extern PGDLLIMPORT int autovacuum_vac_thresh;
extern PGDLLIMPORT int autovacuum_vac_max_thresh;
@@ -60,10 +75,18 @@ extern void AutoVacWorkerFailed(void);
pg_noreturn extern void AutoVacLauncherMain(const void *startup_data, size_t startup_data_len);
pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t startup_data_len);
+/* called from autovac worker when it needs participants in parallel index cleanup */
+extern bool AutoVacParallelWorkRequest(void);
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+extern void AutoVacuumReleaseParallelWork(bool keep_lock);
+extern int AutoVacuumParallelWorkWaitForStart(void);
extern void ParallelAutovacuumEndSyncPoint(bool keep_lock);
+extern int GetAutoVacuumParallelWorkerNumber(void);
+extern int LaunchParallelAutovacuumWorkers(Oid rel_id, int nworkers_to_launch,
+ dsm_handle handle);
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/test/modules/autovacuum/.gitignore b/src/test/modules/autovacuum/.gitignore
new file mode 100644
index 00000000000..0b54641bceb
--- /dev/null
+++ b/src/test/modules/autovacuum/.gitignore
@@ -0,0 +1 @@
+/tmp_check/
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/Makefile b/src/test/modules/autovacuum/Makefile
new file mode 100644
index 00000000000..90c00ff350b
--- /dev/null
+++ b/src/test/modules/autovacuum/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/autovacuum/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/t/001_autovac_parallel.pl b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
new file mode 100644
index 00000000000..ff07c33d867
--- /dev/null
+++ b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
@@ -0,0 +1,137 @@
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+$node->append_conf('postgresql.conf', qq{
+ autovacuum = off
+ max_wal_size = 4096
+});
+$node->start;
+
+my $indexes_num = 80;
+my $initial_rows_num = 1_000_000;
+
+# Create big table and create specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER, col_5 INTEGER,
+ col_6 INTEGER, col_7 INTEGER, col_8 INTEGER, col_9 INTEGER, col_10 INTEGER,
+ col_11 INTEGER, col_12 INTEGER, col_13 INTEGER, col_14 INTEGER, col_15 INTEGER,
+ col_16 INTEGER, col_17 INTEGER, col_18 INTEGER, col_19 INTEGER, col_20 INTEGER,
+ col_21 INTEGER, col_22 INTEGER, col_23 INTEGER, col_24 INTEGER, col_25 INTEGER,
+ col_26 INTEGER, col_27 INTEGER, col_28 INTEGER, col_29 INTEGER, col_30 INTEGER,
+ col_31 INTEGER, col_32 INTEGER, col_33 INTEGER, col_34 INTEGER, col_35 INTEGER,
+ col_36 INTEGER, col_37 INTEGER, col_38 INTEGER, col_39 INTEGER, col_40 INTEGER,
+ col_41 INTEGER, col_42 INTEGER, col_43 INTEGER, col_44 INTEGER, col_45 INTEGER,
+ col_46 INTEGER, col_47 INTEGER, col_48 INTEGER, col_49 INTEGER, col_50 INTEGER,
+ col_51 INTEGER, col_52 INTEGER, col_53 INTEGER, col_54 INTEGER, col_55 INTEGER,
+ col_56 INTEGER, col_57 INTEGER, col_58 INTEGER, col_59 INTEGER, col_60 INTEGER,
+ col_61 INTEGER, col_62 INTEGER, col_63 INTEGER, col_64 INTEGER, col_65 INTEGER,
+ col_66 INTEGER, col_67 INTEGER, col_68 INTEGER, col_69 INTEGER, col_70 INTEGER,
+ col_71 INTEGER, col_72 INTEGER, col_73 INTEGER, col_74 INTEGER, col_75 INTEGER,
+ col_76 INTEGER, col_77 INTEGER, col_78 INTEGER, col_79 INTEGER, col_80 INTEGER,
+ col_81 INTEGER, col_82 INTEGER, col_83 INTEGER, col_84 INTEGER, col_85 INTEGER,
+ col_86 INTEGER, col_87 INTEGER, col_88 INTEGER, col_89 INTEGER, col_90 INTEGER,
+ col_91 INTEGER, col_92 INTEGER, col_93 INTEGER, col_94 INTEGER, col_95 INTEGER,
+ col_96 INTEGER, col_97 INTEGER, col_98 INTEGER, col_99 INTEGER, col_100 INTEGER
+ );
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM pg_index i
+ JOIN pg_class c ON c.oid = i.indrelid
+ WHERE c.relname = 'test_autovac';",
+ stdout => \$psql_out
+);
+is($psql_out, $indexes_num + 1, "All indexes created successfully");
+
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac (
+ col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8, col_9, col_10,
+ col_11, col_12, col_13, col_14, col_15, col_16, col_17, col_18, col_19, col_20,
+ col_21, col_22, col_23, col_24, col_25, col_26, col_27, col_28, col_29, col_30,
+ col_31, col_32, col_33, col_34, col_35, col_36, col_37, col_38, col_39, col_40,
+ col_41, col_42, col_43, col_44, col_45, col_46, col_47, col_48, col_49, col_50,
+ col_51, col_52, col_53, col_54, col_55, col_56, col_57, col_58, col_59, col_60,
+ col_61, col_62, col_63, col_64, col_65, col_66, col_67, col_68, col_69, col_70,
+ col_71, col_72, col_73, col_74, col_75, col_76, col_77, col_78, col_79, col_80,
+ col_81, col_82, col_83, col_84, col_85, col_86, col_87, col_88, col_89, col_90,
+ col_91, col_92, col_93, col_94, col_95, col_96, col_97, col_98, col_99, col_100
+ ) VALUES (
+ i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7, i + 8, i + 9,
+ i + 10, i + 11, i + 12, i + 13, i + 14, i + 15, i + 16, i + 17, i + 18, i + 19,
+ i + 20, i + 21, i + 22, i + 23, i + 24, i + 25, i + 26, i + 27, i + 28, i + 29,
+ i + 30, i + 31, i + 32, i + 33, i + 34, i + 35, i + 36, i + 37, i + 38, i + 39,
+ i + 40, i + 41, i + 42, i + 43, i + 44, i + 45, i + 46, i + 47, i + 48, i + 49,
+ i + 50, i + 51, i + 52, i + 53, i + 54, i + 55, i + 56, i + 57, i + 58, i + 59,
+ i + 60, i + 61, i + 62, i + 63, i + 64, i + 65, i + 66, i + 67, i + 68, i + 69,
+ i + 70, i + 71, i + 72, i + 73, i + 74, i + 75, i + 76, i + 77, i + 78, i + 79,
+ i + 80, i + 81, i + 82, i + 83, i + 84, i + 85, i + 86, i + 87, i + 88, i + 89,
+ i + 90, i + 91, i + 92, i + 93, i + 94, i + 95, i + 96, i + 97, i + 98, i + 99
+ );
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM test_autovac;",
+ stdout => \$psql_out
+);
+is($psql_out, $initial_rows_num, "All data inserted into table successfully");
+
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+my $dead_tuples_thresh = $initial_rows_num / 4;
+my $indexes_num_thresh = $indexes_num / 2;
+my $num_workers = 3;
+
+# Reduce autovacuum_work_mem, so the leader process will perform the parallel
+# index vacuum phase several times
+$node->append_conf('postgresql.conf', qq{
+ autovacuum_naptime = '1s'
+ autovacuum_work_mem = 2048
+ autovacuum_vacuum_threshold = 1
+ autovacuum_analyze_threshold = 1
+ autovacuum_vacuum_scale_factor = 0.1
+ autovacuum_analyze_scale_factor = 0.1
+ autovacuum_max_workers = 10
+ autovacuum = on
+ autovac_idx_parallel_min_rows = $dead_tuples_thresh
+ autovac_idx_parallel_min_indexes = $indexes_num_thresh
+ max_parallel_index_autovac_workers = $num_workers
+});
+
+$node->restart;
+
+# wait for autovacuum to reset datfrozenxid age to 0
+$node->poll_query_until('postgres', q{
+ SELECT count(*) = 0 FROM pg_database WHERE mxid_age(datfrozenxid) > 0
+}) or die "Timed out while waiting for autovacuum";
+
+ok(1, "There are no segfaults");
+
+$node->stop;
+done_testing();
--
2.43.0
On Fri, May 2, 2025 at 11:58 PM Sami Imseih <samimseih@gmail.com> wrote:
I am generally -1 on the idea of autovacuum performing parallel
index vacuum, because I always felt that the parallel option should
be employed in a targeted manner for a specific table. if you have a bunch
of large tables, some more important than others, a/v may end
up using parallel resources on the least important tables and you
will have to adjust a/v settings per table, etc to get the right table
to be parallel index vacuumed by a/v.
Hm, this is a good point. I think I should clarify one thing - in
practice, there is a common situation when users have one huge table
among all databases (with 80+ indexes created on it). But, of course,
in general there may be few such tables.
But we can still adjust the autovac_idx_parallel_min_rows parameter.
If a table has a lot of dead tuples => it is actively used => table is
important (?).
Also, if the user can really determine the "importance" of each of the
tables - we can provide an appropriate table option. Tables with this
option set will be processed in parallel in priority order. What do
you think about such an idea?
I think in most cases, the user will want to determine the priority of
a table getting parallel vacuum cycles rather than having the autovacuum
determine the priority. I also see users wanting to stagger
vacuums of large tables with many indexes through some time period,
and give the
tables the full amount of parallel workers they can afford at these
specific periods
of time. A/V currently does not really allow for this type of
scheduling, and if we
give some kind of GUC to prioritize tables, I think users will constantly have
to be modifying this priority.
I am basing my comments on the scenarios I have seen on the field, and others
may have a different opinion.
Also, with the TIDStore improvements for index cleanup, and the practical
elimination of multi-pass index vacuums, I see this being even less
convincing as something to add to a/v.
If I understood correctly, we are talking about the fact that
TIDStore can store so many tuples that in fact a second pass is never
needed.
But the number of passes does not affect the presented optimization in
any way. We must think about a large number of indexes that must be
processed. Even within a single pass we can have a 40% increase in
speed.
I am not discounting that a single table vacuum with many indexes will
maybe perform better with parallel index scan, I am merely saying that
the TIDStore optimization now makes index vacuums better and perhaps
there is less of an incentive to use parallel.
Now, If I am going to allocate extra workers to run vacuum in parallel, why
not just provide more autovacuum workers instead so I can get more tables
vacuumed within a span of time?
For now, only one process can clean up indexes, so I don't see how
increasing the number of a/v workers will help in the situation that I
mentioned above.
Also, we don't consume additional resources during autovacuum in this
patch - total number of a/v workers always <= autovacuum_max_workers.
Increasing a/v workers will not help speed up a specific table; what I
am suggesting is that instead of speeding up one table, let's just allow
other tables to not be starved of a/v cycles due to lack of a/v workers.
--
Sami
On Fri, May 2, 2025 at 9:58 AM Sami Imseih <samimseih@gmail.com> wrote:
Once we have parallel heap vacuum, as discussed in thread[1], it would
also likely be beneficial to incorporate it into autovacuum during
aggressive vacuum or failsafe mode.
IIRC, index cleanup is disabled by failsafe.
Yes. My idea is to use parallel *heap* vacuum in autovacuum during
failsafe mode. I think it would make sense as users want to complete
freezing tables as soon as possible in this situation.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Fri, May 2, 2025 at 11:13 AM Daniil Davydov <3danissimo@gmail.com> wrote:
On Thu, May 1, 2025 at 8:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
As I understand it, we initially disabled parallel vacuum for
autovacuum because their objectives are somewhat contradictory.
Parallel vacuum aims to accelerate the process by utilizing additional
resources, while autovacuum is designed to perform cleaning operations
with minimal impact on foreground transaction processing (e.g.,
through vacuum delay).
Yep, we also decided that we must not create more a/v workers for
index processing.
In current implementation, the leader process sends a signal to the
a/v launcher, and the launcher tries to launch all requested workers.
But the number of workers never exceeds `autovacuum_max_workers`.
Thus, we will never have more a/v workers than in the standard case
(without this feature).
I have concerns about this design. When autovacuuming on a single
table consumes all available autovacuum_max_workers slots with
parallel vacuum workers, the system becomes incapable of processing
other tables. This means that when determining the appropriate
autovacuum_max_workers value, users must consider not only the number
of tables to be processed concurrently but also the potential number
of parallel workers that might be launched. I think it would make more
sense to maintain the existing autovacuum_max_workers parameter while
introducing a new parameter that would either control the maximum
number of parallel vacuum workers per autovacuum worker or set a
system-wide cap on the total number of parallel vacuum workers.
Regarding implementation: I notice the WIP patch implements its own
parallel vacuum mechanism for autovacuum. Have you considered simply
setting at_params.nworkers to a value greater than zero?
About `at_params.nworkers = N` - that's exactly what we're doing (you
can see it in the `vacuum_rel` function). But we cannot fully reuse
code of VACUUM PARALLEL, because it creates its own processes via
dynamic bgworkers machinery.
As I said above - we don't want to consume additional resources. Also
we don't want to complicate communication between processes (the idea
is that a/v workers can only send signals to the a/v launcher).
Could you elaborate on the reasons why you don't want to use
background workers and avoid complicated communication between
processes? I'm not sure whether these concerns provide sufficient
justification for implementing its own parallel index processing.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
I think it would make more
sense to maintain the existing autovacuum_max_workers parameter while
introducing a new parameter that would either control the maximum
number of parallel vacuum workers per autovacuum worker or set a
system-wide cap on the total number of parallel vacuum workers.
+1, and would it make sense for parallel workers to come from
max_parallel_maintenance_workers? This is capped by
max_parallel_workers and max_worker_processes, so increasing
the defaults for all 3 will be needed as well.
--
Sami
On Sat, May 3, 2025 at 3:17 AM Sami Imseih <samimseih@gmail.com> wrote:
I think in most cases, the user will want to determine the priority of
a table getting parallel vacuum cycles rather than having the autovacuum
determine the priority. I also see users wanting to stagger
vacuums of large tables with many indexes through some time period,
and give the
tables the full amount of parallel workers they can afford at these
specific periods
of time. A/V currently does not really allow for this type of
scheduling, and if we
give some kind of GUC to prioritize tables, I think users will constantly have
to be modifying this priority.
If the user wants to determine priority himself, we anyway need to
introduce some parameter (GUC or table option) that will give us a
hint about how we should schedule a/v work.
Do you think that we should consider a more comprehensive behavior for
such a parameter (so that the user doesn't have to change it often)? I
will be glad to know your thoughts.
If I understood correctly, then we are talking about the fact that
TIDStore can store so many tuples that in fact a second pass is never
needed.
But the number of passes does not affect the presented optimization in
any way. We must think about a large number of indexes that must be
processed. Even within a single pass we can have a 40% increase in
speed.
I am not discounting that a single table vacuum with many indexes will
maybe perform better with parallel index scan, I am merely saying that
the TIDStore optimization now makes index vacuums better and perhaps
there is less of an incentive to use parallel.
I still insist that this does not affect the parallel index vacuum,
because we don't get an advantage in repeated passes. We get the same
speed increase whether we have this optimization or not.
Although it's even possible that the opposite is true - the situation
will be better with the new TIDStore, but I can't say for sure.
Now, If I am going to allocate extra workers to run vacuum in parallel, why
not just provide more autovacuum workers instead so I can get more tables
vacuumed within a span of time?For now, only one process can clean up indexes, so I don't see how
increasing the number of a/v workers will help in the situation that I
mentioned above.
Also, we don't consume additional resources during autovacuum in this
patch - total number of a/v workers always <= autovacuum_max_workers.
Increasing a/v workers will not help speed up a specific table; what I
am suggesting is that instead of speeding up one table, let's just allow
other tables to not be starved of a/v cycles due to lack of a/v workers.
OK, I got it. But what if vacuuming a single table takes (for
example) 60% of all the time? This is still a possible situation, and the
fast vacuum of all other tables will not help us.
--
Best regards,
Daniil Davydov
On Sat, May 3, 2025 at 5:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
In current implementation, the leader process sends a signal to the
a/v launcher, and the launcher tries to launch all requested workers.
But the number of workers never exceeds `autovacuum_max_workers`.
Thus, we will never have more a/v workers than in the standard case
(without this feature).
I have concerns about this design. When autovacuuming on a single
table consumes all available autovacuum_max_workers slots with
parallel vacuum workers, the system becomes incapable of processing
other tables. This means that when determining the appropriate
autovacuum_max_workers value, users must consider not only the number
of tables to be processed concurrently but also the potential number
of parallel workers that might be launched. I think it would make more
sense to maintain the existing autovacuum_max_workers parameter while
introducing a new parameter that would either control the maximum
number of parallel vacuum workers per autovacuum worker or set a
system-wide cap on the total number of parallel vacuum workers.
For now we have max_parallel_index_autovac_workers - this GUC limits
the number of parallel a/v workers that can process a single table. I
agree that the scenario you provided is problematic.
The proposal to limit the total number of supportive a/v workers seems
attractive to me (I'll implement it as an experiment).
It seems to me that this question is becoming a key one. First we need
to determine the role of the user in the whole scheduling mechanism.
Should we allow users to determine priority? Will this priority affect
only within a single vacuuming cycle, or it will be more 'global'?
I guess I don't have enough expertise to determine this alone. I will
be glad to receive any suggestions.
About `at_params.nworkers = N` - that's exactly what we're doing (you
can see it in the `vacuum_rel` function). But we cannot fully reuse
code of VACUUM PARALLEL, because it creates its own processes via
dynamic bgworkers machinery.
As I said above - we don't want to consume additional resources. Also
we don't want to complicate communication between processes (the idea
is that a/v workers can only send signals to the a/v launcher).
Could you elaborate on the reasons why you don't want to use
background workers and avoid complicated communication between
processes? I'm not sure whether these concerns provide sufficient
justification for implementing its own parallel index processing.
Here are my thoughts on this. An a/v worker has a very simple role - it
is born after the launcher's request and must do exactly one 'task' -
vacuum a table or participate in a parallel index vacuum.
We also have a dedicated 'launcher' role, meaning the whole design
implies that only the launcher is able to launch processes.
If we allow a/v workers to use bgworkers, then:
1) The a/v worker will go far beyond its responsibility.
2) Its functionality will overlap with the functionality of the launcher.
3) Resource consumption can jump dramatically, which is unexpected for
the user. Autovacuum will also be dependent on other resources
(bgworkers pool). The current design does not imply this.
I wanted to create a patch that would fit into the existing mechanism
without drastic innovations. But if you think that the above is not so
important, then we can reuse VACUUM PARALLEL code and it would
simplify the final implementation)
--
Best regards,
Daniil Davydov
On Sat, May 3, 2025 at 5:59 AM Sami Imseih <samimseih@gmail.com> wrote:
I think it would make more
sense to maintain the existing autovacuum_max_workers parameter while
introducing a new parameter that would either control the maximum
number of parallel vacuum workers per autovacuum worker or set a
system-wide cap on the total number of parallel vacuum workers.
+1, and would it make sense for parallel workers to come from
max_parallel_maintenance_workers? This is capped by
max_parallel_workers and max_worker_processes, so increasing
the defaults for all 3 will be needed as well.
I may be wrong, but the `max_parallel_maintenance_workers` parameter
is only used for commands that are explicitly run by the user. We
already have `autovacuum_max_workers` and I think the code will be
more consistent if we adapt this particular parameter (perhaps with
the addition of a new one, as I wrote in the previous letter).
--
Best regards,
Daniil Davydov
On Sat, May 3, 2025 at 1:10 AM Daniil Davydov <3danissimo@gmail.com> wrote:
On Sat, May 3, 2025 at 5:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
In current implementation, the leader process sends a signal to the
a/v launcher, and the launcher tries to launch all requested workers.
But the number of workers never exceeds `autovacuum_max_workers`.
Thus, we will never have more a/v workers than in the standard case
(without this feature).I have concerns about this design. When autovacuuming on a single
table consumes all available autovacuum_max_workers slots with
parallel vacuum workers, the system becomes incapable of processing
other tables. This means that when determining the appropriate
autovacuum_max_workers value, users must consider not only the number
of tables to be processed concurrently but also the potential number
of parallel workers that might be launched. I think it would make more
sense to maintain the existing autovacuum_max_workers parameter while
introducing a new parameter that would either control the maximum
number of parallel vacuum workers per autovacuum worker or set a
system-wide cap on the total number of parallel vacuum workers.
For now we have max_parallel_index_autovac_workers - this GUC limits
the number of parallel a/v workers that can process a single table. I
agree that the scenario you provided is problematic.
The proposal to limit the total number of supportive a/v workers seems
attractive to me (I'll implement it as an experiment).
It seems to me that this question is becoming a key one. First we need
to determine the role of the user in the whole scheduling mechanism.
Should we allow users to determine priority? Will this priority affect
only within a single vacuuming cycle, or it will be more 'global'?
I guess I don't have enough expertise to determine this alone. I will
be glad to receive any suggestions.
What I roughly imagined is that we don't need to change the entire
autovacuum scheduling, but would like autovacuum workers to decide
whether or not to use parallel vacuum during its vacuum operation
based on GUC parameters (having a global effect) or storage parameters
(having an effect on the particular table). The criteria of triggering
parallel vacuum in autovacuum might need to be somewhat pessimistic so
that we don't unnecessarily use parallel vacuum on many tables.
About `at_params.nworkers = N` - that's exactly what we're doing (you
can see it in the `vacuum_rel` function). But we cannot fully reuse
code of VACUUM PARALLEL, because it creates its own processes via
dynamic bgworkers machinery.
As I said above - we don't want to consume additional resources. Also
we don't want to complicate communication between processes (the idea
is that a/v workers can only send signals to the a/v launcher).Could you elaborate on the reasons why you don't want to use
background workers and avoid complicated communication between
processes? I'm not sure whether these concerns provide sufficient
justification for implementing its own parallel index processing.
Here are my thoughts on this. An a/v worker has a very simple role - it
is born after the launcher's request and must do exactly one 'task' -
vacuum table or participate in parallel index vacuum.
We also have a dedicated 'launcher' role, meaning the whole design
implies that only the launcher is able to launch processes.If we allow a/v worker to use bgworkers, then :
1) A/v worker will go far beyond his responsibility.
2) Its functionality will overlap with the functionality of the launcher.
While I agree that the launcher process is responsible for launching
autovacuum worker processes, I'm not sure it should be responsible for
launching everything related to autovacuum. It's quite possible that we
have parallel heap vacuum and processing the particular index with
parallel workers in the future. The code could get more complex if we
have the autovacuum launcher process launch such parallel workers too.
I believe it's more straightforward to divide the responsibility in such
a way that the autovacuum launcher is responsible for launching
autovacuum workers, and autovacuum workers are responsible for
vacuuming tables, no matter how they do that.
3) Resource consumption can jump dramatically, which is unexpected for
the user.
What extra resources could be used if we use background workers
instead of autovacuum workers?
Autovacuum will also be dependent on other resources
(bgworkers pool). The current design does not imply this.
I see your point, but I think we don't necessarily need to reflect it
at the infrastructure layer. For example, we can internally allocate
extra background worker slots for parallel vacuum workers based on
max_parallel_index_autovac_workers in addition to
max_worker_processes. Anyway we might need something to check or
validate max_worker_processes value to make sure that every autovacuum
worker can use the specified number of parallel workers for parallel
vacuum.
I wanted to create a patch that would fit into the existing mechanism
without drastic innovations. But if you think that the above is not so
important, then we can reuse VACUUM PARALLEL code and it would
simplify the final implementation)
I'd suggest using the existing infrastructure if we can achieve the
goal with it. If we find out there are some technical difficulties to
implement it without new infrastructure, we can revisit this approach.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Tue, May 6, 2025 at 6:57 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
What I roughly imagined is that we don't need to change the entire
autovacuum scheduling, but would like autovacuum workers to decide
whether or not to use parallel vacuum during its vacuum operation
based on GUC parameters (having a global effect) or storage parameters
(having an effect on the particular table). The criteria of triggering
parallel vacuum in autovacuum might need to be somewhat pessimistic so
that we don't unnecessarily use parallel vacuum on many tables.
Perhaps we should only provide a reloption, so that only tables specified
by the user via the reloption can be autovacuumed in parallel?
This gives a targeted approach. Of course, if multiple of these allowed
tables are to be autovacuumed at the same time, some may not get all the
workers, but that's no different from manually vacuuming those tables in
parallel at the same time.
What do you think?
—
Sami
On Tue, May 6, 2025 at 6:57 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
What I roughly imagined is that we don't need to change the entire
autovacuum scheduling, but would like autovacuum workers to decide
whether or not to use parallel vacuum during its vacuum operation
based on GUC parameters (having a global effect) or storage parameters
(having an effect on the particular table). The criteria of triggering
parallel vacuum in autovacuum might need to be somewhat pessimistic so
that we don't unnecessarily use parallel vacuum on many tables.
+1, I think about it in the same way. I will expand on this topic in
more detail in response to Sami's letter [1], so as not to repeat
myself.
Here are my thoughts on this. A/v worker has a very simple role - it
is born after the launcher's request and must do exactly one 'task' -
vacuum table or participate in parallel index vacuum.
We also have a dedicated 'launcher' role, meaning the whole design
implies that only the launcher is able to launch processes.
If we allow a/v workers to use bgworkers, then:
1) The a/v worker will go far beyond its responsibility.
2) Its functionality will overlap with the functionality of the launcher.
While I agree that the launcher process is responsible for launching
autovacuum worker processes, I'm not sure it should be responsible for
launching everything related to autovacuum. It's quite possible that we
have parallel heap vacuum and processing the particular index with
parallel workers in the future. The code could get more complex if we
have the autovacuum launcher process launch such parallel workers too.
I believe it's more straightforward to divide the responsibility in such
a way that the autovacuum launcher is responsible for launching
autovacuum workers, and autovacuum workers are responsible for
vacuuming tables, no matter how they do that.
It sounds very tempting. At the very beginning I did exactly that (to
make sure that nothing would break in a parallel autovacuum). Only
later it was decided to abandon the use of bgworkers.
For now both approaches look fair to me. What do you think - will
others agree that we can provide more responsibility to a/v workers?
3) Resource consumption can jump dramatically, which is unexpected for
the user.
What extra resources could be used if we use background workers
instead of autovacuum workers?
I meant that more processes are starting to participate in the
autovacuum than indicated in autovacuum_max_workers. And if a/v workers
use additional bgworkers => other operations cannot get these
resources.
Autovacuum will also be dependent on other resources
(bgworkers pool). The current design does not imply this.
I see your point, but I think we don't necessarily need to reflect it
at the infrastructure layer. For example, we can internally allocate
extra background worker slots for parallel vacuum workers based on
max_parallel_index_autovac_workers in addition to
max_worker_processes. Anyway we might need something to check or
validate max_worker_processes value to make sure that every autovacuum
worker can use the specified number of parallel workers for parallel
vacuum.
I don't think that we can provide all supportive workers for each
parallel index vacuuming request. But I got your point - always keep
several bgworkers that only a/v workers can use if needed, and the size
of this additional pool (depending on max_worker_processes) must be
user-configurable.
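For illustration, the guard on the bgworkers side could look roughly like
this (just a sketch; 'n_free_slots' and the GUC name are placeholders, not
code from the patch):

    /*
     * Sketch: deny a regular bgworker slot request if granting it would
     * eat into the slots reserved for autovacuum workers.
     */
    if (!AmAutoVacuumWorkerProcess() &&
        n_free_slots <= autovacuum_reserved_workers_num)
        return false;   /* keep the reserved slots for autovacuum */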
I wanted to create a patch that would fit into the existing mechanism
without drastic innovations. But if you think that the above is not so
important, then we can reuse VACUUM PARALLEL code and it would
simplify the final implementation)
I'd suggest using the existing infrastructure if we can achieve the
goal with it. If we find out there are some technical difficulties to
implement it without new infrastructure, we can revisit this approach.
OK, in the near future I'll implement it and send a new patch to this
thread. I'll be glad if you take a look at it)
[1]: /messages/by-id/CAA5RZ0vfBg=c_0Sa1Tpxv8tueeBk8C5qTf9TrxKBbXUqPc99Ag@mail.gmail.com
--
Best regards,
Daniil Davydov
On Mon, May 5, 2025 at 5:21 PM Sami Imseih <samimseih@gmail.com> wrote:
Perhaps we should only provide a reloption, so that only tables specified
by the user via the reloption can be autovacuumed in parallel?
This gives a targeted approach. Of course, if multiple of these allowed
tables are to be autovacuumed at the same time, some may not get all the
workers, but that's no different from manually vacuuming those tables in
parallel at the same time.
What do you think?
+1. I think that's a good starting point. We can later introduce a new
GUC parameter that globally controls the maximum number of parallel
vacuum workers used in autovacuum, if necessary.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Tue, May 6, 2025 at 7:21 AM Sami Imseih <samimseih@gmail.com> wrote:
Perhaps we should only provide a reloption, so that only tables specified
by the user via the reloption can be autovacuumed in parallel?
After your comments (earlier in this thread) I decided to do just
that. For now we have reloption, so the user can decide which tables
are "important" for parallel index vacuuming.
We also set lower bounds (hardcoded) on the number of indexes and the
number of dead tuples. For example, there is no need to use a parallel
vacuum if the table has only one index.
The situation is more complicated with the number of dead tuples - we
need tests that would show the optimal minimum value. This issue is
still being worked out.
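For example, with the reloption from the current patch, the per-table
opt-in looks like this ('huge_table' is just a placeholder name):

    -- allow autovacuum to process this table's indexes in parallel
    ALTER TABLE huge_table SET (parallel_idx_autovac_enabled = true);
    -- and opt out again
    ALTER TABLE huge_table RESET (parallel_idx_autovac_enabled);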
This gives a targeted approach. Of course, if multiple of these allowed
tables are to be autovacuumed at the same time, some may not get all the
workers, but that's no different from manually vacuuming those tables in
parallel at the same time.
I fully agree. Recently the v2 patch has been supplemented with a new
feature [1] - multiple tables in a cluster can be processed in
parallel during autovacuum. And of course, not every a/v worker can
get enough supportive processes, but this is considered normal
behavior.
Maximum number of supportive workers is limited by the GUC variable.
[1]: I guess that I'll send it within the v3 patch, which will also
contain logic that was discussed in the letter above - using bgworkers
instead of additional a/v workers. BTW, what do you think about this
idea?
--
Best regards,
Daniil Davydov
Masahiko Sawada <sawada.mshk@gmail.com> wrote:
+1. I think that's a good starting point. We can later introduce a new
GUC parameter that globally controls the maximum number of parallel
vacuum workers used in autovacuum, if necessary.
and I think this reloption should also apply to parallel heap vacuum in
non-failsafe scenarios.
In the failsafe case, however, all tables will be eligible for parallel
vacuum. Anyhow, that discussion could be taken up in that thread, but I
wanted to point that out.
--
Sami Imseih
Amazon Web Services (AWS)
Hi,
As I promised - meet parallel index autovacuum with bgworkers
(Parallel-index-autovacuum-with-bgworkers.patch). This is a pretty
simple implementation:
1) Added a new table option `parallel_idx_autovac_enabled` that must be
set to `true` if the user wants autovacuum to process a table in parallel.
2) Added a new GUC variable `autovacuum_reserved_workers_num`. This is
the number of parallel workers from the bgworkers pool that can be used
only by autovacuum workers. The `autovacuum_reserved_workers_num`
parameter actually reserves the requested share of the processes, whose
total number is equal to `max_worker_processes`.
3) When an autovacuum worker decides to process some table in
parallel, it just sets `VacuumParams->nworkers` to an appropriate value
(> 0) and then the code is executed as if it were a regular VACUUM
PARALLEL (see the sketch after this list).
4) I kept test/modules/autovacuum as sandbox where you can play with
parallel index autovacuum a bit.
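To try it out, the setup is roughly the following (a sketch; the table
name and the value are placeholders):

    # postgresql.conf
    autovacuum_reserved_workers_num = 4

    -- per-table opt-in
    ALTER TABLE huge_table SET (parallel_idx_autovac_enabled = true);

After that, when an autovacuum worker picks up such a table and the
dead-tuple threshold is reached, it computes `VacuumParams->nworkers` in
vacuum_rel() and the rest runs as a regular VACUUM PARALLEL.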
What do you think about this implementation?
P.S.
I also improved the "self-managed" parallel autovacuum implementation
(Self-managed-parallel-index-autovacuum.patch). For now it needs a lot
of refactoring, but all features are working well.
Both patches target the master branch
(bc35adee8d7ad38e7bef40052f196be55decddec)
--
Best regards,
Daniil Davydov
Attachments:
v1-0001-Parallel-index-autovacuum-with-bgworkers.patchtext/x-patch; charset=US-ASCII; name=v1-0001-Parallel-index-autovacuum-with-bgworkers.patchDownload
From cfb7e675d9a1b05aef0cdaeeca5f6edd4bcd3b70 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Sat, 10 May 2025 01:07:42 +0700
Subject: [PATCH v1] Parallel index autovacuum with bgworkers
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuum.c | 55 ++++++++
src/backend/commands/vacuumparallel.c | 46 ++++---
src/backend/postmaster/autovacuum.c | 9 ++
src/backend/postmaster/bgworker.c | 33 ++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc_tables.c | 13 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/miscadmin.h | 1 +
src/include/utils/guc_hooks.h | 2 +
src/include/utils/rel.h | 10 ++
src/test/modules/autovacuum/.gitignore | 1 +
src/test/modules/autovacuum/Makefile | 14 ++
.../autovacuum/t/001_autovac_parallel.pl | 129 ++++++++++++++++++
14 files changed, 307 insertions(+), 19 deletions(-)
create mode 100644 src/test/modules/autovacuum/.gitignore
create mode 100644 src/test/modules/autovacuum/Makefile
create mode 100644 src/test/modules/autovacuum/t/001_autovac_parallel.pl
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 46c1dce222d..ccf59208783 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -166,6 +166,15 @@ static relopt_bool boolRelOpts[] =
},
true
},
+ {
+ {
+ "parallel_idx_autovac_enabled",
+ "Allows autovacuum to process indexes of this table in parallel mode",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ false
+ },
/* list terminator */
{{NULL}}
};
@@ -1863,6 +1872,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"parallel_idx_autovac_enabled", RELOPT_TYPE_BOOL,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, parallel_idx_autovac_enabled)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 33a33bf6b1c..f7667f14147 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -57,9 +57,21 @@
#include "utils/guc.h"
#include "utils/guc_hooks.h"
#include "utils/memutils.h"
+#include "utils/rel.h"
#include "utils/snapmgr.h"
#include "utils/syscache.h"
+/*
+ * Minimum number of dead tuples required for the table's indexes to be
+ * processed in parallel during autovacuum.
+ */
+#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
+
+/*
+ * How many indexes each parallel worker should process during autovacuum.
+ */
+#define NUM_INDEXES_PER_PARALLEL_WORKER 30
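+
+/*
+ * For example: with the formula used in vacuum_rel() below, a table with
+ * 80 indexes requests Min(80 / 30 + 1, av_reserved_workers_num) parallel
+ * workers, i.e. at most 3.
+ */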
+
/*
* Minimum interval for cost-based vacuum delay reports from a parallel worker.
* This aims to avoid sending too many messages and waking up the leader too
@@ -2234,6 +2246,49 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params,
else
toast_relid = InvalidOid;
+ /*
+ * If we are running autovacuum - decide whether we need to process the
+ * indexes of the table with the given oid in parallel.
+ */
+ if (AmAutoVacuumWorkerProcess() &&
+ params->index_cleanup != VACOPTVALUE_DISABLED &&
+ RelationAllowsParallelIdxAutovac(rel))
+ {
+ PgStat_StatTabEntry *tabentry;
+
+ /* fetch the pgstat table entry */
+ tabentry = pgstat_fetch_stat_tabentry_ext(rel->rd_rel->relisshared,
+ rel->rd_id);
+ if (tabentry && tabentry->dead_tuples >= AV_PARALLEL_DEADTUP_THRESHOLD)
+ {
+ List *indexes = RelationGetIndexList(rel);
+ int num_indexes = list_length(indexes);
+
+ list_free(indexes);
+
+ if (av_reserved_workers_num > 0)
+ {
+ /*
+ * We request at least one parallel worker if the user has set the
+ * 'parallel_idx_autovac_enabled' option. The total number of
+ * additional parallel workers depends on how many indexes the
+ * table has. For now we assume that each parallel worker should
+ * process NUM_INDEXES_PER_PARALLEL_WORKER indexes.
+ */
+ params->nworkers =
+ Min((num_indexes / NUM_INDEXES_PER_PARALLEL_WORKER) + 1,
+ av_reserved_workers_num);
+ }
+ else
+ ereport(WARNING,
+ (errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED),
+ errmsg("Cannot launch any supportive workers for parallel index cleanup of rel %s",
+ RelationGetRelationName(rel)),
+ errhint("You might need to set parameter \"av_reserved_workers_num\" to a value > 0")));
+
+ }
+ }
+
/*
* Switch to the table owner's userid, so that any index functions are run
* as that user. Also lock down security-restricted operations and
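
To make the sizing rule above concrete, here is a minimal standalone sketch (not part of the patch) of the nworkers computation, using the constant from the hunk and an illustrative value for av_reserved_workers_num:

#include <stdio.h>

#define NUM_INDEXES_PER_PARALLEL_WORKER 30	/* value from the hunk above */
#define Min(x, y) ((x) < (y) ? (x) : (y))

/* Mirror of the params->nworkers computation in vacuum_rel(). */
static int
requested_workers(int num_indexes, int av_reserved_workers_num)
{
	return Min((num_indexes / NUM_INDEXES_PER_PARALLEL_WORKER) + 1,
			   av_reserved_workers_num);
}

int
main(void)
{
	/* 80 indexes, 4 reserved slots: 80/30 + 1 = 3 workers requested */
	printf("%d\n", requested_workers(80, 4));
	/* 200 indexes, 4 reserved slots: capped by the reserved pool size */
	printf("%d\n", requested_workers(200, 4));
	return 0;
}

Note that a table always requests at least one additional worker once the dead-tuple threshold is met, however few indexes it has.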
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 2b9d548cdeb..e2b3e5b343c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,15 +1,15 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel [auto]vacuum execution.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
*
- * In a parallel vacuum, we perform both index bulk deletion and index cleanup
- * with parallel worker processes. Individual indexes are processed by one
- * vacuum process. ParallelVacuumState contains shared information as well as
- * the memory space for storing dead items allocated in the DSA area. We
+ * In a parallel [auto]vacuum, we perform both index bulk deletion and index
+ * cleanup with parallel worker processes. Individual indexes are processed by
+ * one vacuum process. ParallelVacuumState contains shared information as well
+ * as the memory space for storing dead items allocated in the DSA area. We
* launch parallel worker processes at the start of parallel index
* bulk-deletion and index cleanup and once all indexes are processed, the
* parallel worker processes exit. Each time we process indexes in parallel,
@@ -34,6 +34,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -157,7 +158,8 @@ typedef struct PVIndStats
} PVIndStats;
/*
- * Struct for maintaining a parallel vacuum state. typedef appears in vacuum.h.
+ * Struct for maintaining a parallel [auto]vacuum state. typedef appears in
+ * vacuum.h.
*/
struct ParallelVacuumState
{
@@ -371,10 +373,18 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->relid = RelationGetRelid(rel);
shared->elevel = elevel;
shared->queryid = pgstat_get_my_query_id();
- shared->maintenance_work_mem_worker =
- (nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+
+ if (AmAutoVacuumWorkerProcess())
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
+ autovacuum_work_mem;
+ else
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+ maintenance_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -558,7 +568,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster ||
+ (av_reserved_workers_num == 0 && AmAutoVacuumWorkerProcess()) ||
+ (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()))
return 0;
/*
@@ -597,15 +609,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = AmAutoVacuumWorkerProcess() ?
+ Min(parallel_workers, av_reserved_workers_num) :
+ Min(parallel_workers, max_parallel_maintenance_workers);
return parallel_workers;
}
/*
* Perform index vacuum or index cleanup with parallel workers. This function
- * must be used by the parallel vacuum leader process.
+ * must be used by the parallel [auto]vacuum leader process.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -982,8 +996,8 @@ parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
/*
* Perform work within a launched parallel process.
*
- * Since parallel vacuum workers perform only index vacuum or index cleanup,
- * we don't need to report progress information.
+ * Since parallel [auto]vacuum workers perform only index vacuum or index
+ * cleanup, we don't need to report progress information.
*/
void
parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
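
The maintenance_work_mem_worker hunk above splits the memory budget only when some index actually consumes maintenance_work_mem (nindexes_mwm > 0); the autovacuum branch merely swaps in autovacuum_work_mem as the budget. A standalone sketch (not part of the patch) of that arithmetic:

#include <stdio.h>

#define Min(x, y) ((x) < (y) ? (x) : (y))

/* Mirror of the shared->maintenance_work_mem_worker computation. */
static int
work_mem_per_worker(int budget_kb, int parallel_workers, int nindexes_mwm)
{
	return (nindexes_mwm > 0) ?
		budget_kb / Min(parallel_workers, nindexes_mwm) :
		budget_kb;
}

int
main(void)
{
	/* budget 65536 kB, 3 workers, 2 mwm-consuming indexes: 32768 kB each */
	printf("%d kB\n", work_mem_per_worker(65536, 3, 2));
	/* no mwm-consuming indexes: the full budget per worker */
	printf("%d kB\n", work_mem_per_worker(65536, 3, 0));
	return 0;
}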
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 16756152b71..725d3231f77 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3406,6 +3406,15 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+bool
+check_autovacuum_reserved_workers_num(int *newval, void **extra,
+ GucSource source)
+{
+ if (*newval > (max_worker_processes - 8))
+ return false;
+ return true;
+}
+
/*
* Returns whether there is a free autovacuum worker slot available.
*/
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 116ddf7b835..cb86db99da9 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -1046,6 +1046,8 @@ RegisterDynamicBackgroundWorker(BackgroundWorker *worker,
BackgroundWorkerHandle **handle)
{
int slotno;
+ int from;
+ int upto;
bool success = false;
bool parallel;
uint64 generation = 0;
@@ -1088,10 +1090,23 @@ RegisterDynamicBackgroundWorker(BackgroundWorker *worker,
return false;
}
+ /*
+ * Determine the range of worker slots we can use (the last
+ * 'av_reserved_workers_num' slots are reserved for autovacuum workers).
+ */
+
+ from = AmAutoVacuumWorkerProcess() ?
+ BackgroundWorkerData->total_slots - av_reserved_workers_num :
+ 0;
+
+ upto = AmAutoVacuumWorkerProcess() ?
+ BackgroundWorkerData->total_slots :
+ BackgroundWorkerData->total_slots - av_reserved_workers_num;
+
/*
* Look for an unused slot. If we find one, grab it.
*/
- for (slotno = 0; slotno < BackgroundWorkerData->total_slots; ++slotno)
+ for (slotno = from; slotno < upto; ++slotno)
{
BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
@@ -1159,7 +1174,13 @@ GetBackgroundWorkerPid(BackgroundWorkerHandle *handle, pid_t *pidp)
BackgroundWorkerSlot *slot;
pid_t pid;
- Assert(handle->slot < max_worker_processes);
+ /* Only autovacuum can use the last 'av_reserved_workers_num' slots in the pool. */
+ if (!AmAutoVacuumWorkerProcess())
+ Assert(handle->slot < max_worker_processes - av_reserved_workers_num);
+ else
+ Assert(handle->slot < max_worker_processes &&
+ handle->slot >= max_worker_processes - av_reserved_workers_num);
+
slot = &BackgroundWorkerData->slot[handle->slot];
/*
@@ -1298,7 +1319,13 @@ TerminateBackgroundWorker(BackgroundWorkerHandle *handle)
BackgroundWorkerSlot *slot;
bool signal_postmaster = false;
- Assert(handle->slot < max_worker_processes);
+ /* Only autovacuum can use the last 'av_reserved_workers_num' slots in the pool. */
+ if (!AmAutoVacuumWorkerProcess())
+ Assert(handle->slot < max_worker_processes - av_reserved_workers_num);
+ else
+ Assert(handle->slot < max_worker_processes &&
+ handle->slot >= max_worker_processes - av_reserved_workers_num);
+
slot = &BackgroundWorkerData->slot[handle->slot];
/* Set terminate flag in shared memory, unless slot has been reused. */
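
The bgworker.c hunks above partition the slot array so that regular dynamic workers and autovacuum-requested workers can never collide. A standalone sketch (not part of the patch) of the half-open [from, upto) range each side scans, assuming total_slots = max_worker_processes:

#include <stdio.h>
#include <stdbool.h>

/* Compute the half-open slot range [from, upto) a process may scan. */
static void
slot_range(bool am_autovacuum, int total_slots, int reserved,
		   int *from, int *upto)
{
	if (am_autovacuum)
	{
		*from = total_slots - reserved;	/* the reserved tail only */
		*upto = total_slots;
	}
	else
	{
		*from = 0;						/* everything except the tail */
		*upto = total_slots - reserved;
	}
}

int
main(void)
{
	int from, upto;

	/* 20 slots, 4 reserved: regular backends scan [0, 16) */
	slot_range(false, 20, 4, &from, &upto);
	printf("regular: [%d, %d)\n", from, upto);

	/* autovacuum workers scan [16, 20) */
	slot_range(true, 20, 4, &from, &upto);
	printf("autovacuum: [%d, %d)\n", from, upto);
	return 0;
}

The asserts in GetBackgroundWorkerPid() and TerminateBackgroundWorker() then simply re-check that a handle's slot falls inside the range its process type is allowed to use.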
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 92b0446b80c..cff13ef6bd7 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -144,6 +144,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int av_reserved_workers_num = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 2f8cbd86759..87cd4e20786 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3604,6 +3604,19 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"autovacuum_reserved_workers_num", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
+ gettext_noop("Number of worker processes, reserved for participation in parallel index processing during autovacuum."),
+ gettext_noop("This parameter is depending on \"max_worker_processes\" (not on \"autovacuum_max_workers\"). "
+ "*Only* autovacuum workers can use these additional processes. "
+ "Also, these processes are taken into account in \"max_parallel_workers\"."),
+ NULL,
+ },
+ &av_reserved_workers_num,
+ 0, 0, MAX_BACKENDS,
+ check_autovacuum_reserved_workers_num, NULL, NULL
+ },
+
{
{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
gettext_noop("Sets the maximum number of parallel processes per maintenance operation."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 34826d01380..2e38bada2b0 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -223,6 +223,7 @@
#max_parallel_maintenance_workers = 2 # limited by max_parallel_workers
#max_parallel_workers = 8 # number of max_worker_processes that
# can be used in parallel operations
+#autovacuum_reserved_workers_num = 0 # disabled by default and limited by max_parallel_workers
#parallel_leader_participation = on
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1e59a7f910f..992c6b63226 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int NBuffers;
extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
+extern PGDLLIMPORT int av_reserved_workers_num;
extern PGDLLIMPORT int max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 799fa7ace68..9913c6e4681 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -31,6 +31,8 @@ extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
+extern bool check_autovacuum_reserved_workers_num(int *newval, void **extra,
+ GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
GucSource source);
extern bool check_backtrace_functions(char **newval, void **extra,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..55aa5c45be1 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,7 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+ bool parallel_idx_autovac_enabled;
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
@@ -409,6 +410,15 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * RelationAllowsParallelIdxAutovac
+ * Returns whether the relation's indexes can be processed in parallel
+ * during autovacuum. Note multiple eval of argument!
+ */
+#define RelationAllowsParallelIdxAutovac(relation) \
+ ((relation)->rd_options ? \
+ ((StdRdOptions *) (relation)->rd_options)->autovacuum.parallel_idx_autovac_enabled : false)
+
/* ViewOptions->check_option values */
typedef enum ViewOptCheckOption
{
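
The "Note multiple eval of argument!" warning on the new macro matters because the argument is expanded twice, so an argument expression with side effects runs twice. A standalone illustration (not part of the patch; the struct and get_rel() helper are hypothetical):

#include <stdio.h>
#include <stdbool.h>

struct rel { bool *rd_options; bool opt; };

/* Same shape as RelationAllowsParallelIdxAutovac: the argument appears twice. */
#define OPT_OR_FALSE(r) ((r)->rd_options ? (r)->opt : false)

static int ncalls = 0;

/* Hypothetical helper with a visible side effect. */
static struct rel *
get_rel(struct rel *r)
{
	ncalls++;
	return r;
}

int
main(void)
{
	bool some_options = true;
	struct rel r = { &some_options, true };

	(void) OPT_OR_FALSE(get_rel(&r));	/* get_rel() is expanded twice */
	printf("get_rel() called %d time(s)\n", ncalls);	/* prints 2 */
	return 0;
}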
diff --git a/src/test/modules/autovacuum/.gitignore b/src/test/modules/autovacuum/.gitignore
new file mode 100644
index 00000000000..0b54641bceb
--- /dev/null
+++ b/src/test/modules/autovacuum/.gitignore
@@ -0,0 +1 @@
+/tmp_check/
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/Makefile b/src/test/modules/autovacuum/Makefile
new file mode 100644
index 00000000000..90c00ff350b
--- /dev/null
+++ b/src/test/modules/autovacuum/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/autovacuum/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/t/001_autovac_parallel.pl b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
new file mode 100644
index 00000000000..a37aaf720f2
--- /dev/null
+++ b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
@@ -0,0 +1,129 @@
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+$node->append_conf('postgresql.conf', qq{
+ autovacuum = off
+ max_wal_size = 4096
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_reserved_workers_num = 1
+});
+$node->start;
+
+my $indexes_num = 80;
+my $initial_rows_num = 100_000;
+
+# Create big table and create specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER, col_5 INTEGER,
+ col_6 INTEGER, col_7 INTEGER, col_8 INTEGER, col_9 INTEGER, col_10 INTEGER,
+ col_11 INTEGER, col_12 INTEGER, col_13 INTEGER, col_14 INTEGER, col_15 INTEGER,
+ col_16 INTEGER, col_17 INTEGER, col_18 INTEGER, col_19 INTEGER, col_20 INTEGER,
+ col_21 INTEGER, col_22 INTEGER, col_23 INTEGER, col_24 INTEGER, col_25 INTEGER,
+ col_26 INTEGER, col_27 INTEGER, col_28 INTEGER, col_29 INTEGER, col_30 INTEGER,
+ col_31 INTEGER, col_32 INTEGER, col_33 INTEGER, col_34 INTEGER, col_35 INTEGER,
+ col_36 INTEGER, col_37 INTEGER, col_38 INTEGER, col_39 INTEGER, col_40 INTEGER,
+ col_41 INTEGER, col_42 INTEGER, col_43 INTEGER, col_44 INTEGER, col_45 INTEGER,
+ col_46 INTEGER, col_47 INTEGER, col_48 INTEGER, col_49 INTEGER, col_50 INTEGER,
+ col_51 INTEGER, col_52 INTEGER, col_53 INTEGER, col_54 INTEGER, col_55 INTEGER,
+ col_56 INTEGER, col_57 INTEGER, col_58 INTEGER, col_59 INTEGER, col_60 INTEGER,
+ col_61 INTEGER, col_62 INTEGER, col_63 INTEGER, col_64 INTEGER, col_65 INTEGER,
+ col_66 INTEGER, col_67 INTEGER, col_68 INTEGER, col_69 INTEGER, col_70 INTEGER,
+ col_71 INTEGER, col_72 INTEGER, col_73 INTEGER, col_74 INTEGER, col_75 INTEGER,
+ col_76 INTEGER, col_77 INTEGER, col_78 INTEGER, col_79 INTEGER, col_80 INTEGER,
+ col_81 INTEGER, col_82 INTEGER, col_83 INTEGER, col_84 INTEGER, col_85 INTEGER,
+ col_86 INTEGER, col_87 INTEGER, col_88 INTEGER, col_89 INTEGER, col_90 INTEGER,
+ col_91 INTEGER, col_92 INTEGER, col_93 INTEGER, col_94 INTEGER, col_95 INTEGER,
+ col_96 INTEGER, col_97 INTEGER, col_98 INTEGER, col_99 INTEGER, col_100 INTEGER
+ ) WITH (parallel_idx_autovac_enabled = true);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM pg_index i
+ JOIN pg_class c ON c.oid = i.indrelid
+ WHERE c.relname = 'test_autovac';",
+ stdout => \$psql_out
+);
+is($psql_out, $indexes_num + 1, "All indexes created successfully");
+
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac (
+ col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8, col_9, col_10,
+ col_11, col_12, col_13, col_14, col_15, col_16, col_17, col_18, col_19, col_20,
+ col_21, col_22, col_23, col_24, col_25, col_26, col_27, col_28, col_29, col_30,
+ col_31, col_32, col_33, col_34, col_35, col_36, col_37, col_38, col_39, col_40,
+ col_41, col_42, col_43, col_44, col_45, col_46, col_47, col_48, col_49, col_50,
+ col_51, col_52, col_53, col_54, col_55, col_56, col_57, col_58, col_59, col_60,
+ col_61, col_62, col_63, col_64, col_65, col_66, col_67, col_68, col_69, col_70,
+ col_71, col_72, col_73, col_74, col_75, col_76, col_77, col_78, col_79, col_80,
+ col_81, col_82, col_83, col_84, col_85, col_86, col_87, col_88, col_89, col_90,
+ col_91, col_92, col_93, col_94, col_95, col_96, col_97, col_98, col_99, col_100
+ ) VALUES (
+ i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7, i + 8, i + 9,
+ i + 10, i + 11, i + 12, i + 13, i + 14, i + 15, i + 16, i + 17, i + 18, i + 19,
+ i + 20, i + 21, i + 22, i + 23, i + 24, i + 25, i + 26, i + 27, i + 28, i + 29,
+ i + 30, i + 31, i + 32, i + 33, i + 34, i + 35, i + 36, i + 37, i + 38, i + 39,
+ i + 40, i + 41, i + 42, i + 43, i + 44, i + 45, i + 46, i + 47, i + 48, i + 49,
+ i + 50, i + 51, i + 52, i + 53, i + 54, i + 55, i + 56, i + 57, i + 58, i + 59,
+ i + 60, i + 61, i + 62, i + 63, i + 64, i + 65, i + 66, i + 67, i + 68, i + 69,
+ i + 70, i + 71, i + 72, i + 73, i + 74, i + 75, i + 76, i + 77, i + 78, i + 79,
+ i + 80, i + 81, i + 82, i + 83, i + 84, i + 85, i + 86, i + 87, i + 88, i + 89,
+ i + 90, i + 91, i + 92, i + 93, i + 94, i + 95, i + 96, i + 97, i + 98, i + 99
+ );
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM test_autovac;",
+ stdout => \$psql_out
+);
+is($psql_out, $initial_rows_num, "All data inserted into table successfully");
+
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Reduce autovacuum_work_mem so that the leader process performs the parallel
+# index vacuum phase several times.
+$node->append_conf('postgresql.conf', qq{
+ autovacuum_naptime = '1s'
+ autovacuum_vacuum_threshold = 1
+ autovacuum_analyze_threshold = 1
+ autovacuum_vacuum_scale_factor = 0.1
+ autovacuum_analyze_scale_factor = 0.1
+ autovacuum = on
+});
+
+$node->restart;
+
+# sleep(3600);
+
+ok(1, "There are no segfaults");
+
+$node->stop;
+done_testing();
--
2.43.0
Attachment: v3-0001-Self-managed-parallel-index-autovacuum.patch (text/x-patch)
From 96ab66f2bfe1146e20703b725b2aa8c91f6a237f Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Fri, 9 May 2025 17:14:06 +0700
Subject: [PATCH v3] Meet parallel index autovacuum
---
src/backend/access/common/reloptions.c | 11 +
src/backend/commands/vacuum.c | 36 +
src/backend/commands/vacuumparallel.c | 286 ++++-
src/backend/postmaster/autovacuum.c | 1026 ++++++++++++++++-
src/backend/utils/misc/guc_tables.c | 10 +
src/backend/utils/misc/postgresql.conf.sample | 2 +
src/include/postmaster/autovacuum.h | 27 +
src/include/utils/guc_hooks.h | 2 +
src/include/utils/rel.h | 13 +-
src/test/modules/autovacuum/.gitignore | 1 +
src/test/modules/autovacuum/Makefile | 14 +
.../autovacuum/t/001_autovac_parallel.pl | 135 +++
12 files changed, 1517 insertions(+), 46 deletions(-)
create mode 100644 src/test/modules/autovacuum/.gitignore
create mode 100644 src/test/modules/autovacuum/Makefile
create mode 100644 src/test/modules/autovacuum/t/001_autovac_parallel.pl
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 46c1dce222d..b9d642a7a45 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -166,6 +166,15 @@ static relopt_bool boolRelOpts[] =
},
true
},
+ {
+ {
+ "parallel_idx_autovac",
+ "Enables autovacuum to process indexes of this table in parallel mode",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ false
+ },
/* list terminator */
{{NULL}}
};
@@ -1905,6 +1914,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
offsetof(StdRdOptions, vacuum_index_cleanup)},
{"vacuum_truncate", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, vacuum_truncate), offsetof(StdRdOptions, vacuum_truncate_set)},
+ {"parallel_idx_autovac", RELOPT_TYPE_BOOL,
+ offsetof(StdRdOptions, parallel_idx_autovac)},
{"vacuum_max_eager_freeze_failure_rate", RELOPT_TYPE_REAL,
offsetof(StdRdOptions, vacuum_max_eager_freeze_failure_rate)}
};
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 33a33bf6b1c..ab6706743a9 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -57,9 +57,16 @@
#include "utils/guc.h"
#include "utils/guc_hooks.h"
#include "utils/memutils.h"
+#include "utils/rel.h"
#include "utils/snapmgr.h"
#include "utils/syscache.h"
+/*
+ * Minimum number of dead tuples required for the table to be processed in
+ * parallel during autovacuum
+ */
+#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
+
/*
* Minimum interval for cost-based vacuum delay reports from a parallel worker.
* This aims to avoid sending too many messages and waking up the leader too
@@ -2234,6 +2241,35 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params,
else
toast_relid = InvalidOid;
+ /*
+ * Decide whether the table with the given OID needs to be processed in
+ * parallel mode during autovacuum.
+ */
+ if (AmAutoVacuumWorkerProcess() &&
+ params->index_cleanup != VACOPTVALUE_DISABLED &&
+ CanUseParallelIdxAutovacForRelation(rel))
+ {
+ PgStat_StatTabEntry *tabentry;
+
+ /* fetch the pgstat table entry */
+ tabentry = pgstat_fetch_stat_tabentry_ext(rel->rd_rel->relisshared,
+ rel->rd_id);
+ if (tabentry && tabentry->dead_tuples >= AV_PARALLEL_DEADTUP_THRESHOLD)
+ {
+ List *indexes = RelationGetIndexList(rel);
+ int num_indexes = list_length(indexes);
+
+ list_free(indexes);
+
+ if (max_parallel_index_autovac_workers > 0)
+ {
+ params->nworkers =
+ Min((num_indexes / AV_PARALLEL_INDEXES_PER_WORKER) + 1,
+ max_parallel_index_autovac_workers);
+ }
+ }
+ }
+
/*
* Switch to the table owner's userid, so that any index functions are run
* as that user. Also lock down security-restricted operations and
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 2b9d548cdeb..077f7a8ff6a 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,20 +1,23 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel [auto]vacuum execution.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
*
- * In a parallel vacuum, we perform both index bulk deletion and index cleanup
- * with parallel worker processes. Individual indexes are processed by one
- * vacuum process. ParallelVacuumState contains shared information as well as
- * the memory space for storing dead items allocated in the DSA area. We
+ * In a parallel [auto]vacuum, we perform both index bulk deletion and index
+ * cleanup with parallel worker processes. Individual indexes are processed by
+ * one vacuum process. ParallelVacuumState contains shared information as well
+ * as the memory space for storing dead items allocated in the DSA area. We
* launch parallel worker processes at the start of parallel index
* bulk-deletion and index cleanup and once all indexes are processed, the
* parallel worker processes exit. Each time we process indexes in parallel,
* the parallel context is re-initialized so that the same DSM can be used for
- * multiple passes of index bulk-deletion and index cleanup.
+ * multiple passes of index bulk-deletion and index cleanup. For maintenance
+ * vacuum, we launch workers manually (using the dynamic bgworker machinery), and
+ * for autovacuum we send signals to the autovacuum launcher (all logic for
+ * communication among parallel autovacuum processes is in autovacuum.c).
*
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
@@ -34,9 +37,11 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
+#include "utils/memutils.h"
#include "utils/rel.h"
/*
@@ -157,11 +162,20 @@ typedef struct PVIndStats
} PVIndStats;
/*
- * Struct for maintaining a parallel vacuum state. typedef appears in vacuum.h.
+ * Struct for maintaining a parallel [auto]vacuum state. typedef appears in
+ * vacuum.h.
*/
struct ParallelVacuumState
{
- /* NULL for worker processes */
+ /* Is this structure used for maintenance vacuum or autovacuum? */
+ bool is_autovacuum;
+
+ /*
+ * NULL for worker processes.
+ *
+ * NOTE: Parallel autovacuum only needs a subset of the maintenance vacuum
+ * functionality.
+ */
ParallelContext *pcxt;
/* Parent Heap Relation */
@@ -221,6 +235,10 @@ struct ParallelVacuumState
PVIndVacStatus status;
};
+static ParallelContext *CreateParallelAutoVacContext(int nworkers);
+static void InitializeParallelAutoVacDSM(ParallelContext *pcxt);
+static void DestroyParallelAutoVacContext(ParallelContext *pcxt);
+
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -280,15 +298,21 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
}
pvs = (ParallelVacuumState *) palloc0(sizeof(ParallelVacuumState));
+ pvs->is_autovacuum = AmAutoVacuumWorkerProcess();
pvs->indrels = indrels;
pvs->nindexes = nindexes;
pvs->will_parallel_vacuum = will_parallel_vacuum;
pvs->bstrategy = bstrategy;
pvs->heaprel = rel;
- EnterParallelMode();
- pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
- parallel_workers);
+ if (pvs->is_autovacuum)
+ pcxt = CreateParallelAutoVacContext(parallel_workers);
+ else
+ {
+ EnterParallelMode();
+ pcxt = CreateParallelContext("postgres", "parallel_vacuum_main",
+ parallel_workers);
+ }
Assert(pcxt->nworkers > 0);
pvs->pcxt = pcxt;
@@ -327,7 +351,10 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
else
querylen = 0; /* keep compiler quiet */
- InitializeParallelDSM(pcxt);
+ if (pvs->is_autovacuum)
+ InitializeParallelAutoVacDSM(pvs->pcxt);
+ else
+ InitializeParallelDSM(pcxt);
/* Prepare index vacuum stats */
indstats = (PVIndStats *) shm_toc_allocate(pcxt->toc, est_indstats_len);
@@ -371,10 +398,16 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->relid = RelationGetRelid(rel);
shared->elevel = elevel;
shared->queryid = pgstat_get_my_query_id();
- shared->maintenance_work_mem_worker =
- (nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ if (pvs->is_autovacuum)
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
+ autovacuum_work_mem;
+ else
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+ maintenance_work_mem;
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -453,8 +486,13 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
TidStoreDestroy(pvs->dead_items);
- DestroyParallelContext(pvs->pcxt);
- ExitParallelMode();
+ if (pvs->is_autovacuum)
+ DestroyParallelAutoVacContext(pvs->pcxt);
+ else
+ {
+ DestroyParallelContext((ParallelContext *) pvs->pcxt);
+ ExitParallelMode();
+ }
pfree(pvs->will_parallel_vacuum);
pfree(pvs);
@@ -532,6 +570,144 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
}
+/*
+ * Short version of CreateParallelContext (parallel.c). Here we initialize only
+ * the fields needed for parallel index processing during autovacuum.
+ */
+static ParallelContext *
+CreateParallelAutoVacContext(int nworkers)
+{
+ ParallelContext *pcxt;
+ MemoryContext oldcontext;
+
+ Assert(AmAutoVacuumWorkerProcess());
+
+ /* Number of workers should be non-negative. */
+ Assert(nworkers >= 0);
+
+ /* We might be running in a short-lived memory context. */
+ oldcontext = MemoryContextSwitchTo(TopTransactionContext);
+
+ /* Initialize a new ParallelContext. */
+ pcxt = palloc0(sizeof(ParallelContext));
+ pcxt->nworkers = nworkers;
+ pcxt->nworkers_to_launch = nworkers;
+ shm_toc_initialize_estimator(&pcxt->estimator);
+
+ /* Restore previous memory context. */
+ MemoryContextSwitchTo(oldcontext);
+
+ return pcxt;
+}
+
+/*
+ * Short version of InitializeParallelDSM (parallel.c). Here we put into the
+ * DSM segment only the data needed for parallel index processing during
+ * autovacuum.
+ */
+static void
+InitializeParallelAutoVacDSM(ParallelContext *pcxt)
+{
+ MemoryContext oldcontext;
+ Size tsnaplen = 0;
+ Size asnaplen = 0;
+ Size segsize = 0;
+ char *tsnapspace;
+ char *asnapspace;
+ Snapshot transaction_snapshot = GetTransactionSnapshot();
+ Snapshot active_snapshot = GetActiveSnapshot();
+
+ Assert(pcxt->nworkers >= 1);
+
+ /* We might be running in a very short-lived memory context. */
+ oldcontext = MemoryContextSwitchTo(TopTransactionContext);
+
+ if (IsolationUsesXactSnapshot())
+ {
+ tsnaplen = EstimateSnapshotSpace(transaction_snapshot);
+ shm_toc_estimate_chunk(&pcxt->estimator, tsnaplen);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+ }
+ asnaplen = EstimateSnapshotSpace(active_snapshot);
+ shm_toc_estimate_chunk(&pcxt->estimator, asnaplen);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+ /* Create DSM and initialize with new table of contents. */
+ segsize = shm_toc_estimate(&pcxt->estimator);
+ pcxt->seg = dsm_create(segsize, DSM_CREATE_NULL_IF_MAXSEGMENTS);
+
+ if (pcxt->seg == NULL)
+ {
+ pcxt->nworkers = 0;
+ pcxt->private_memory = MemoryContextAlloc(TopMemoryContext, segsize);
+ }
+
+ pcxt->toc = shm_toc_create(AV_PARALLEL_MAGIC,
+ pcxt->seg == NULL ? pcxt->private_memory :
+ dsm_segment_address(pcxt->seg),
+ segsize);
+
+ /* We can skip the rest of this if we're not budgeting for any workers. */
+ if (pcxt->nworkers > 0)
+ {
+ /*
+ * Serialize the transaction snapshot if the transaction isolation
+ * level uses a transaction snapshot.
+ */
+ if (IsolationUsesXactSnapshot())
+ {
+ tsnapspace = shm_toc_allocate(pcxt->toc, tsnaplen);
+ SerializeSnapshot(transaction_snapshot, tsnapspace);
+ shm_toc_insert(pcxt->toc, AV_PARALLEL_KEY_TRANSACTION_SNAPSHOT,
+ tsnapspace);
+ }
+
+ /* Serialize the active snapshot. */
+ asnapspace = shm_toc_allocate(pcxt->toc, asnaplen);
+ SerializeSnapshot(active_snapshot, asnapspace);
+ shm_toc_insert(pcxt->toc, AV_PARALLEL_KEY_ACTIVE_SNAPSHOT, asnapspace);
+ }
+
+ /* Update nworkers_to_launch, in case we changed nworkers above. */
+ pcxt->nworkers_to_launch = pcxt->nworkers;
+
+ /* Restore previous memory context. */
+ MemoryContextSwitchTo(oldcontext);
+}
+
+/*
+ * Short version of DestroyParallelContext (parallel.c). Here we clean up only
+ * the data that was used for parallel index processing during autovacuum.
+ */
+static void
+DestroyParallelAutoVacContext(ParallelContext *pcxt)
+{
+ /*
+ * If we have allocated a shared memory segment, detach it. This will
+ * implicitly detach the error queues, and any other shared memory queues,
+ * stored there.
+ */
+ if (pcxt->seg != NULL)
+ {
+ dsm_detach(pcxt->seg);
+ pcxt->seg = NULL;
+ }
+
+ /*
+ * If this parallel context is actually in backend-private memory rather
+ * than shared memory, free that memory instead.
+ */
+ if (pcxt->private_memory != NULL)
+ {
+ pfree(pcxt->private_memory);
+ pcxt->private_memory = NULL;
+ }
+
+ AutoVacuumReleaseParallelWork(false);
+ pfree(pcxt);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -558,7 +734,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster ||
+ (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()) ||
+ (max_parallel_index_autovac_workers == 0 && AmAutoVacuumWorkerProcess()))
return 0;
/*
@@ -597,15 +775,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = AmAutoVacuumWorkerProcess() ?
+ Min(parallel_workers, max_parallel_index_autovac_workers) :
+ Min(parallel_workers, max_parallel_maintenance_workers);
return parallel_workers;
}
/*
* Perform index vacuum or index cleanup with parallel workers. This function
- * must be used by the parallel vacuum leader process.
+ * must be used by the parallel [auto]vacuum leader process.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -670,7 +850,7 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
if (nworkers > 0)
{
/* Reinitialize parallel context to relaunch parallel workers */
- if (num_index_scans > 0)
+ if (num_index_scans > 0 && !pvs->is_autovacuum)
ReinitializeParallelDSM(pvs->pcxt);
/*
@@ -686,9 +866,22 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
* The number of workers can vary between bulkdelete and cleanup
* phase.
*/
- ReinitializeParallelWorkers(pvs->pcxt, nworkers);
-
- LaunchParallelWorkers(pvs->pcxt);
+ if (pvs->is_autovacuum)
+ {
+ pvs->pcxt->nworkers_to_launch = Min(pvs->pcxt->nworkers, nworkers);
+ if (pvs->pcxt->nworkers > 0 && pvs->pcxt->nworkers_to_launch > 0)
+ {
+ pvs->pcxt->nworkers_launched =
+ LaunchParallelAutovacuumWorkers(pvs->heaprel->rd_id,
+ pvs->pcxt->nworkers_to_launch,
+ dsm_segment_handle(pvs->pcxt->seg));
+ }
+ }
+ else
+ {
+ ReinitializeParallelWorkers(pvs->pcxt, nworkers);
+ LaunchParallelWorkers(pvs->pcxt);
+ }
if (pvs->pcxt->nworkers_launched > 0)
{
@@ -733,8 +926,14 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
if (nworkers > 0)
{
- /* Wait for all vacuum workers to finish */
- WaitForParallelWorkersToFinish(pvs->pcxt);
+ /*
+ * Wait for all [auto]vacuum workers involved in parallel index
+ * processing (if any) to finish and advance state machine.
+ */
+ if (pvs->is_autovacuum && pvs->pcxt->nworkers_launched >= 0)
+ ParallelAutovacuumEndSyncPoint(false);
+ else if (!pvs->is_autovacuum)
+ WaitForParallelWorkersToFinish(pvs->pcxt);
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
@@ -982,8 +1181,8 @@ parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
/*
* Perform work within a launched parallel process.
*
- * Since parallel vacuum workers perform only index vacuum or index cleanup,
- * we don't need to report progress information.
+ * Since parallel [auto]vacuum workers perform only index vacuum or index
+ * cleanup, we don't need to report progress information.
*/
void
parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
@@ -997,23 +1196,22 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
BufferUsage *buffer_usage;
WalUsage *wal_usage;
int nindexes;
+ int worker_number;
char *sharedquery;
ErrorContextCallback errcallback;
- /*
- * A parallel vacuum worker must have only PROC_IN_VACUUM flag since we
- * don't support parallel vacuum for autovacuum as of now.
- */
- Assert(MyProc->statusFlags == PROC_IN_VACUUM);
-
- elog(DEBUG1, "starting parallel vacuum worker");
+ Assert(MyProc->statusFlags == PROC_IN_VACUUM || AmAutoVacuumWorkerProcess());
+ elog(DEBUG1, "starting parallel [auto]vacuum worker");
shared = (PVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED, false);
/* Set debug_query_string for individual workers */
- sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
- debug_query_string = sharedquery;
- pgstat_report_activity(STATE_RUNNING, debug_query_string);
+ if (!AmAutoVacuumWorkerProcess())
+ {
+ sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
+ debug_query_string = sharedquery;
+ pgstat_report_activity(STATE_RUNNING, debug_query_string);
+ }
/* Track query ID */
pgstat_report_query_id(shared->queryid, false);
@@ -1091,8 +1289,12 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
/* Report buffer/WAL usage during parallel execution */
buffer_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_BUFFER_USAGE, false);
wal_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_WAL_USAGE, false);
- InstrEndParallelQuery(&buffer_usage[ParallelWorkerNumber],
- &wal_usage[ParallelWorkerNumber]);
+
+ worker_number = AmAutoVacuumWorkerProcess() ?
+ GetAutoVacuumParallelWorkerNumber() : ParallelWorkerNumber;
+
+ InstrEndParallelQuery(&buffer_usage[worker_number],
+ &wal_usage[worker_number]);
/* Report any remaining cost-based vacuum delay time */
if (track_cost_delay_timing)
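
The parallel_vacuum_compute_workers() changes above switch both the "is parallelism enabled" check and the final cap between two GUCs, depending on whether the caller is an autovacuum worker. A standalone sketch (not part of the patch) of the resulting rule:

#include <stdio.h>
#include <stdbool.h>

#define Min(x, y) ((x) < (y) ? (x) : (y))

/*
 * Mirror of the amended arithmetic: pick the requested or parallel-safe
 * count, then cap it by the GUC that applies to this process type.
 */
static int
compute_workers(bool am_autovacuum, int nrequested, int nindexes_parallel,
				int max_parallel_maintenance_workers,
				int max_parallel_index_autovac_workers)
{
	int cap = am_autovacuum ? max_parallel_index_autovac_workers :
		max_parallel_maintenance_workers;
	int workers;

	if (cap == 0)				/* parallelism disabled for this path */
		return 0;

	workers = (nrequested > 0) ? Min(nrequested, nindexes_parallel) :
		nindexes_parallel;
	return Min(workers, cap);
}

int
main(void)
{
	/* autovacuum: 3 requested, 8 parallel-safe indexes, GUC cap 2 => 2 */
	printf("%d\n", compute_workers(true, 3, 8, 2, 2));
	/* manual VACUUM: no explicit request, capped by the maintenance GUC */
	printf("%d\n", compute_workers(false, 0, 8, 2, 4));
	return 0;
}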
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 16756152b71..040af5ebc14 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -90,6 +90,7 @@
#include "postmaster/postmaster.h"
#include "storage/aio_subsys.h"
#include "storage/bufmgr.h"
+#include "storage/condition_variable.h"
#include "storage/ipc.h"
#include "storage/latch.h"
#include "storage/lmgr.h"
@@ -101,6 +102,7 @@
#include "utils/fmgroids.h"
#include "utils/fmgrprotos.h"
#include "utils/guc_hooks.h"
+#include "utils/inval.h"
#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
@@ -129,6 +131,7 @@ int autovacuum_anl_thresh;
double autovacuum_anl_scale;
int autovacuum_freeze_max_age;
int autovacuum_multixact_freeze_max_age;
+int max_parallel_index_autovac_workers;
double autovacuum_vac_cost_delay;
int autovacuum_vac_cost_limit;
@@ -164,6 +167,14 @@ static int default_freeze_table_age;
static int default_multixact_freeze_min_age;
static int default_multixact_freeze_table_age;
+/*
+ * Number of additional workers requested for parallel index processing
+ * during autovacuum.
+ */
+static int nworkers_for_idx_autovac = 0;
+
+static int nworkers_launched = 0;
+
/* Memory context for long-lived data */
static MemoryContext AutovacMemCxt;
@@ -210,6 +221,9 @@ typedef struct autovac_table
char *at_datname;
} autovac_table;
+/* Forward declaration */
+typedef struct ParallelAutoVacuumWorkItem ParallelAutoVacuumWorkItem;
+
/*-------------
* This struct holds information about a single worker's whereabouts. We keep
* an array of these in shared memory, sized according to
@@ -222,6 +236,10 @@ typedef struct autovac_table
* wi_proc pointer to PGPROC of the running worker, NULL if not started
* wi_launchtime Time at which this worker was launched
* wi_dobalance Whether this worker should be included in balance calculations
+ * wi_pcleanup if > 0, this worker must participate in parallel index
+ * vacuuming as a supportive worker; it is 0 for the leader worker
+ * wi_target_item used only by parallel index vacuum supportive workers; points
+ * to the work item this worker must process
*
* All fields are protected by AutovacuumLock, except for wi_tableoid and
* wi_sharedrel which are protected by AutovacuumScheduleLock (note these
@@ -237,10 +255,22 @@ typedef struct WorkerInfoData
TimestampTz wi_launchtime;
pg_atomic_flag wi_dobalance;
bool wi_sharedrel;
+ int wi_pcleanup;
+ struct ParallelAutoVacuumWorkItem *wi_target_item;
} WorkerInfoData;
typedef struct WorkerInfoData *WorkerInfo;
+#define AmParallelIdxAutoVacSupportive() \
+ (MyWorkerInfo != NULL && \
+ MyWorkItem != NULL && \
+ MyWorkerInfo->wi_pcleanup > 0)
+
+#define AmParallelIdxAutoVacLeader() \
+ (MyWorkerInfo != NULL && \
+ MyWorkItem != NULL && \
+ MyWorkerInfo->wi_pcleanup == 0)
+
/*
* Possible signals received by the launcher from remote processes. These are
* stored atomically in shared memory so that other processes can set them
@@ -250,9 +280,10 @@ typedef enum
{
AutoVacForkFailed, /* failed trying to start a worker */
AutoVacRebalance, /* rebalance the cost limits */
+ AutoVacParallelReq, /* request for parallel index vacuum */
} AutoVacuumSignal;
-#define AutoVacNumSignals (AutoVacRebalance + 1)
+#define AutoVacNumSignals (AutoVacParallelReq + 1)
/*
* Autovacuum workitem array, stored in AutoVacuumShmem->av_workItems. This
@@ -272,6 +303,55 @@ typedef struct AutoVacuumWorkItem
#define NUM_WORKITEMS 256
+typedef enum
+{
+ LAUNCHER = 0, /* autovacuum launcher must wake everyone up */
+ LEADER, /* leader must wake everyone up */
+ LAST_WORKER, /* the last initialized supportive worker must wake
+ everyone up */
+} SyncType;
+
+typedef enum
+{
+ STARTUP = 0, /* initial value - no sync points were passed */
+ START_SYNC_POINT_PASSED, /* start_sync_point was passed */
+ END_SYNC_POINT_PASSED, /* end_sync_point was passed */
+ SHUTDOWN, /* leader wants to shut down the parallel index
+ vacuum due to an error that occurred */
+} Status;
+
+/*
+ * Structure stored in AutoVacuumShmem->pav_workItems. This is used for managing
+ * parallel index processing (within a single table).
+ */
+struct ParallelAutoVacuumWorkItem
+{
+ Oid avw_database;
+ Oid avw_relation;
+ int nworkers_participating;
+ int nworkers_to_launch;
+ int nworkers_sleeping; /* leader doesn't count */
+ int nfinished; /* # of workers that have already finished parallel
+ index processing (and are probably already gone) */
+
+ dsm_handle handl;
+ int leader_proc_pid;
+
+ PGPROC *leader_proc;
+ ConditionVariable cv;
+
+ bool active; /* being processed */
+ bool leader_sleeping_on_ssp; /* sleeping on start sync point */
+ bool leader_sleeping_on_esp; /* sleeping on end sync point */
+ SyncType sync_type;
+ Status status;
+
+ bool needs_launcher;
+ TimestampTz birthtime;
+};
+
+static ParallelAutoVacuumWorkItem *MyWorkItem = NULL;
+
/*-------------
* The main autovacuum shmem struct. On shared memory we store this main
* struct and the array of WorkerInfo structs. This struct keeps:
@@ -283,6 +363,10 @@ typedef struct AutoVacuumWorkItem
* av_startingWorker pointer to WorkerInfo currently being started (cleared by
* the worker itself as soon as it's up and running)
* av_workItems work item array
+ * pav_workItems array of control structures needed for parallel index
+ * processing
+ * pav_workers_left how many workers we can still launch for parallel index
+ * processing (must always be <= max_parallel_index_autovac_workers)
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
*
@@ -298,6 +382,8 @@ typedef struct
dlist_head av_runningWorkers;
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+ ParallelAutoVacuumWorkItem pav_workItems[NUM_WORKITEMS];
+ int pav_workers_left;
pg_atomic_uint32 av_nworkersForBalance;
} AutoVacuumShmemStruct;
@@ -322,11 +408,17 @@ pg_noreturn static void AutoVacLauncherShutdown(void);
static void launcher_determine_sleep(bool canlaunch, bool recursing,
struct timeval *nap);
static void launch_worker(TimestampTz now);
+static void launch_worker_for_pcleanup(TimestampTz now);
+static void eliminate_lock_conflicts(ParallelAutoVacuumWorkItem *item,
+ bool all_launched);
static List *get_database_list(void);
static void rebuild_database_list(Oid newdb);
static int db_comparator(const void *a, const void *b);
static void autovac_recalculate_workers_for_balance(void);
+static int parallel_autovacuum_start_sync_point(bool keep_lock);
+static void handle_parallel_idx_autovac_errors(void);
+
static void do_autovacuum(void);
static void FreeWorkerInfo(int code, Datum arg);
@@ -355,7 +447,61 @@ static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+typedef bool (*wakeup_condition) (ParallelAutoVacuumWorkItem *item);
+static bool start_sync_point_wakeup_cond(ParallelAutoVacuumWorkItem *item);
+static bool end_sync_point_wakeup_cond(ParallelAutoVacuumWorkItem *item);
+static void CVSleep(ParallelAutoVacuumWorkItem *item, wakeup_condition wakeup_cond);
+/*
+ * Returns pointer to free work item, that can be used for parallel index
+ * vacuuming, or NULL if there is no such work items.
+ */
+static ParallelAutoVacuumWorkItem *
+get_free_workitem_for_leader(void)
+{
+ Assert(LWLockHeldByMe(AutovacuumLock));
+
+ for (int i = 0; i < NUM_WORKITEMS; i++)
+ {
+ ParallelAutoVacuumWorkItem *item = &AutoVacuumShmem->pav_workItems[i];
+
+ if (item->active && item->leader_proc_pid != MyProcPid)
+ continue;
+
+ return item;
+ }
+
+ return NULL;
+}
+
+/*
+ * Returns a pointer to the work item that must be processed by the autovacuum
+ * launcher, or NULL if there is no such work item.
+ */
+static ParallelAutoVacuumWorkItem *
+get_free_workitem_for_launcher(void)
+{
+ TimestampTz latest = GetCurrentTimestamp();
+ ParallelAutoVacuumWorkItem *item = NULL;
+
+ Assert(LWLockHeldByMe(AutovacuumLock));
+
+ for (int i = 0; i < NUM_WORKITEMS; i++)
+ {
+ ParallelAutoVacuumWorkItem *tmp = &AutoVacuumShmem->pav_workItems[i];
+
+ if (!tmp->needs_launcher)
+ continue;
+
+ if (latest > tmp->birthtime)
+ {
+ latest = tmp->birthtime;
+ item = tmp;
+ }
+ }
+
+ return item;
+}
/********************************************************************
* AUTOVACUUM LAUNCHER CODE
@@ -583,7 +729,15 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
* wakening conditions.
*/
- launcher_determine_sleep(av_worker_available(), false, &nap);
+ if (nworkers_launched < nworkers_for_idx_autovac)
+ {
+ /* Take the smallest possible sleep interval. */
+ nap.tv_sec = 0;
+ nap.tv_usec = MIN_AUTOVAC_SLEEPTIME * 1000;
+ }
+ else
+ launcher_determine_sleep(!dclist_is_empty(&AutoVacuumShmem->av_freeWorkers),
+ false, &nap);
/*
* Wait until naptime expires or we get some type of signal (all the
@@ -598,6 +752,23 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
ProcessAutoVacLauncherInterrupts();
+ if (MyWorkItem == NULL && max_parallel_index_autovac_workers > 0)
+ {
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ if (AutoVacuumShmem->pav_workers_left > 0)
+ {
+ MyWorkItem = get_free_workitem_for_launcher();
+ if (MyWorkItem != NULL)
+ {
+ Assert(MyWorkItem->active == true);
+ MyWorkItem->needs_launcher = false;
+ nworkers_for_idx_autovac = MyWorkItem->nworkers_to_launch;
+ nworkers_launched = 0;
+ }
+ }
+ LWLockRelease(AutovacuumLock);
+ }
+
/*
* a worker finished, or postmaster signaled failure to start a worker
*/
@@ -614,6 +785,22 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
LWLockRelease(AutovacuumLock);
}
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ if (AutoVacuumShmem->av_signal[AutoVacParallelReq])
+ {
+ AutoVacuumShmem->av_signal[AutoVacParallelReq] = false;
+
+ if (MyWorkItem == NULL)
+ {
+ MyWorkItem = get_free_workitem_for_launcher();
+ Assert(MyWorkItem != NULL && MyWorkItem->active == true);
+ MyWorkItem->needs_launcher = false;
+ nworkers_for_idx_autovac = MyWorkItem->nworkers_to_launch;
+ nworkers_launched = 0;
+ }
+ }
+ LWLockRelease(AutovacuumLock);
+
if (AutoVacuumShmem->av_signal[AutoVacForkFailed])
{
/*
@@ -686,6 +873,8 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
worker->wi_sharedrel = false;
worker->wi_proc = NULL;
worker->wi_launchtime = 0;
+ worker->wi_pcleanup = -1;
+ worker->wi_target_item = NULL;
dclist_push_head(&AutoVacuumShmem->av_freeWorkers,
&worker->wi_links);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -698,9 +887,27 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
}
LWLockRelease(AutovacuumLock); /* either shared or exclusive */
- /* if we can't do anything, just go back to sleep */
if (!can_launch)
+ {
+ /*
+ * If the launcher cannot launch all workers requested for the parallel
+ * index vacuum, it must handle all possible lock conflicts and tell
+ * everyone that there will be no new supportive workers.
+ */
+ if (nworkers_launched < nworkers_for_idx_autovac)
+ {
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ Assert(MyWorkItem->active);
+
+ eliminate_lock_conflicts(MyWorkItem, false);
+ nworkers_launched = nworkers_for_idx_autovac = 0;
+ MyWorkItem = NULL;
+ LWLockRelease(AutovacuumLock);
+ }
+
+ /* if we can't do anything else, just go back to sleep */
continue;
+ }
/* We're OK to start a new worker */
@@ -716,6 +923,38 @@ AutoVacLauncherMain(const void *startup_data, size_t startup_data_len)
*/
launch_worker(current_time);
}
+ else if (nworkers_launched < nworkers_for_idx_autovac)
+ {
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Check whether we have reached the limit of supportive workers.
+ */
+ if (AutoVacuumShmem->pav_workers_left == 0)
+ {
+ ereport(WARNING,
+ (errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED),
+ errmsg("cannot launch more a/v workers for parallel index cleanup of rel %d in database %d",
+ MyWorkItem->avw_relation, MyWorkItem->avw_database),
+ errhint("You might need to increase \"max_parallel_index_autovac_workers\" parameter")));
+ eliminate_lock_conflicts(MyWorkItem, false);
+ nworkers_launched = nworkers_for_idx_autovac = 0;
+ MyWorkItem = NULL;
+ }
+ else
+ {
+ /*
+ * One of the active autovacuum workers sent us a request to launch
+ * participants for a parallel index vacuum. We check this case first
+ * because we need to start participants as soon as possible.
+ */
+ launch_worker_for_pcleanup(current_time);
+ AutoVacuumShmem->pav_workers_left -= 1;
+ }
+
+ LWLockRelease(AutovacuumLock);
+ }
else
{
/*
@@ -1267,6 +1506,8 @@ do_start_worker(void)
worker->wi_dboid = avdb->adw_datid;
worker->wi_proc = NULL;
worker->wi_launchtime = GetCurrentTimestamp();
+ worker->wi_pcleanup = -1;
+ worker->wi_target_item = NULL;
AutoVacuumShmem->av_startingWorker = worker;
@@ -1349,6 +1590,132 @@ launch_worker(TimestampTz now)
}
}
+/*
+ * launch_worker_for_pcleanup
+ *
+ * Wrapper for starting a worker (requested by leader of parallel index
+ * vacuuming) from the launcher.
+ */
+static void
+launch_worker_for_pcleanup(TimestampTz now)
+{
+ WorkerInfo worker;
+ dlist_node *wptr;
+
+ Assert(MyWorkItem != NULL);
+ Assert(nworkers_launched < nworkers_for_idx_autovac);
+ Assert(LWLockHeldByMe(AutovacuumLock));
+
+ /*
+ * Get a worker entry from the freelist. We checked above, so there
+ * really should be a free slot.
+ */
+ wptr = dclist_pop_head_node(&AutoVacuumShmem->av_freeWorkers);
+
+ worker = dlist_container(WorkerInfoData, wi_links, wptr);
+ worker->wi_dboid = InvalidOid;
+ worker->wi_proc = NULL;
+ worker->wi_launchtime = GetCurrentTimestamp();
+ worker->wi_target_item = MyWorkItem;
+
+ /*
+ * Set the indicator that this worker must join the parallel index vacuum.
+ * This variable also serves as a unique ID among parallel index vacuum
+ * workers. The first ID is '1', because '0' is reserved for the leader.
+ */
+ worker->wi_pcleanup = (nworkers_launched + 1);
+
+ AutoVacuumShmem->av_startingWorker = worker;
+
+ SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_WORKER);
+
+ Assert(MyWorkItem->active);
+
+ nworkers_launched += 1;
+
+ if (nworkers_launched < nworkers_for_idx_autovac)
+ return;
+
+ Assert(MyWorkItem->sync_type == LAUNCHER &&
+ nworkers_launched == nworkers_for_idx_autovac);
+
+ /*
+ * If the launcher managed to launch all workers requested for the parallel
+ * index vacuum, it must handle all possible lock conflicts.
+ */
+ eliminate_lock_conflicts(MyWorkItem, true);
+ MyWorkItem = NULL;
+}
+
+/*
+ * Must be called from the autovacuum launcher when it has launched all
+ * requested workers for a parallel index vacuum, or when it realizes that no
+ * more processes can be launched.
+ *
+ * In this function the launcher assigns roles in such a way as to avoid lock
+ * conflicts between the leader and supportive workers.
+ *
+ * AutovacuumLock must be held in exclusive mode before calling this function!
+ */
+static void
+eliminate_lock_conflicts(ParallelAutoVacuumWorkItem *item, bool all_launched)
+{
+ Assert(AmAutoVacuumLauncherProcess());
+ Assert(LWLockHeldByMe(AutovacuumLock));
+
+ /* So, let's start... */
+
+ if (item->leader_sleeping_on_ssp &&
+ item->nworkers_sleeping == nworkers_launched)
+ {
+ /*
+ * If both leader and all launched supportive workers are sleeping, then
+ * only we can wake everyone up.
+ */
+ ConditionVariableBroadcast(&item->cv);
+
+ /* Advance status. */
+ item->status = START_SYNC_POINT_PASSED;
+ }
+ else if (item->leader_sleeping_on_ssp &&
+ item->nworkers_sleeping < nworkers_launched)
+ {
+ /*
+ * If the leader is already sleeping but some supportive workers are
+ * still initializing, we shift the responsibility for waking everyone
+ * to the worker that completes initialization last.
+ */
+ item->sync_type = LAST_WORKER;
+ }
+ else if (!item->leader_sleeping_on_ssp &&
+ item->nworkers_sleeping == nworkers_launched)
+ {
+ /*
+ * If only the leader is not sleeping, it must wake up all workers when
+ * it finishes its preparations.
+ */
+ item->sync_type = LEADER;
+ }
+ else
+ {
+ /*
+ * If nobody is sleeping, we assume the leader is more likely to fall
+ * asleep first, so we set the sync type to LAST_WORKER; but if the last
+ * worker sees that the leader is still not sleeping, it will change the
+ * sync type to LEADER and go to sleep itself.
+ */
+ item->sync_type = LAST_WORKER;
+ }
+
+ /*
+ * If we could not launch all requested workers, refresh the
+ * nworkers_to_launch value so that the last worker can determine that it
+ * really is the last.
+ */
+ if (!all_launched && item->sync_type == LAST_WORKER)
+ item->nworkers_to_launch = nworkers_launched;
+}
+
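
The branching in eliminate_lock_conflicts() above reduces to a small decision table keyed on whether the leader is already sleeping and whether all launched workers are. A standalone sketch (not part of the patch) of that table:

#include <stdio.h>
#include <stdbool.h>

typedef enum { LAUNCHER, LEADER, LAST_WORKER, BROADCAST_NOW } WakeRole;

/*
 * Who is responsible for waking everyone at the start sync point, given
 * the state the launcher observes while holding AutovacuumLock.
 */
static WakeRole
who_wakes(bool leader_sleeping, int nworkers_sleeping, int nworkers_launched)
{
	bool all_workers_sleep = (nworkers_sleeping == nworkers_launched);

	if (leader_sleeping && all_workers_sleep)
		return BROADCAST_NOW;	/* everyone is asleep: launcher broadcasts */
	if (leader_sleeping)
		return LAST_WORKER;		/* a still-initializing worker finishes the job */
	if (all_workers_sleep)
		return LEADER;			/* only the leader is still busy */
	return LAST_WORKER;			/* nobody sleeps yet; re-checked at the sync point */
}

int
main(void)
{
	printf("%d\n", who_wakes(true, 3, 3));	/* BROADCAST_NOW */
	printf("%d\n", who_wakes(true, 1, 3));	/* LAST_WORKER */
	printf("%d\n", who_wakes(false, 3, 3));	/* LEADER */
	printf("%d\n", who_wakes(false, 0, 3));	/* LAST_WORKER */
	return 0;
}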
/*
* Called from postmaster to signal a failure to fork a process to become
* worker. The postmaster should kill(SIGUSR2) the launcher shortly
@@ -1360,6 +1727,38 @@ AutoVacWorkerFailed(void)
AutoVacuumShmem->av_signal[AutoVacForkFailed] = true;
}
+/*
+ * Called from an autovacuum worker to signal that it needs participants for a
+ * parallel index vacuum. The function sends SIGUSR2 to the launcher and returns
+ * 'true' iff the signal was sent successfully.
+ */
+bool
+AutoVacParallelWorkRequest(void)
+{
+ if (AutoVacuumShmem->av_launcherpid == 0)
+ {
+ ereport(WARNING,
+ (errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg("autovacuum launcher is dead")));
+
+ return false;
+ }
+
+ if (kill(AutoVacuumShmem->av_launcherpid, SIGUSR2) < 0)
+ {
+ ereport(WARNING,
+ (errcode(ERRCODE_SYSTEM_ERROR),
+ errmsg("failed to send signal to autovac launcher (pid %d): %m",
+ AutoVacuumShmem->av_launcherpid)));
+
+ return false;
+ }
+
+ AutoVacuumShmem->av_signal[AutoVacParallelReq] = true;
+ return true;
+}
+
+
/* SIGUSR2: a worker is up and running, or just finished, or failed to fork */
static void
avl_sigusr2_handler(SIGNAL_ARGS)
@@ -1559,6 +1958,8 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
{
char dbname[NAMEDATALEN];
+ Assert(MyWorkerInfo->wi_pcleanup < 0);
+
/*
* Report autovac startup to the cumulative stats system. We
* deliberately do this before InitPostgres, so that the
@@ -1593,12 +1994,122 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
recentMulti = ReadNextMultiXactId();
do_autovacuum();
}
+ else if (MyWorkerInfo->wi_target_item != NULL)
+ {
+ dsm_handle handle;
+ PGPROC *leader_proc;
+ int leader_proc_pid;
+ dsm_segment *seg;
+ shm_toc *toc;
+ char *asnapspace;
+ char *tsnapspace;
+ char dbname[NAMEDATALEN];
+ Snapshot tsnapshot;
+ Snapshot asnapshot;
+
+ /*
+ * We will abort parallel index vacuuming within the current process if
+ * something errors out.
+ */
+ PG_TRY();
+ {
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ MyWorkItem = MyWorkerInfo->wi_target_item;
+ dbid = MyWorkItem->avw_database;
+ handle = MyWorkItem->handl;
+ leader_proc = MyWorkItem->leader_proc;
+ leader_proc_pid = MyWorkItem->leader_proc_pid;
+ LWLockRelease(AutovacuumLock);
+
+ InitPostgres(NULL, dbid, NULL, InvalidOid,
+ INIT_PG_OVERRIDE_ALLOW_CONNS,
+ dbname);
+
+ set_ps_display(dbname);
+ if (PostAuthDelay)
+ pg_usleep(PostAuthDelay * 1000000L);
+
+ /* And do an appropriate amount of work */
+ recentXid = ReadNextTransactionId();
+ recentMulti = ReadNextMultiXactId();
+
+ if (parallel_autovacuum_start_sync_point(false) == -1)
+ {
+ /* We are not participating anymore */
+ MyWorkItem = NULL;
+ MyWorkerInfo->wi_pcleanup = -1;
+ goto exit;
+ }
+
+ seg = dsm_attach(handle);
+ if (seg == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("could not map dynamic shared memory segment")));
+
+ toc = shm_toc_attach(AV_PARALLEL_MAGIC, dsm_segment_address(seg));
+ if (toc == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("invalid magic number in dynamic shared memory segment")));
+
+ if (!BecomeLockGroupMember(leader_proc, leader_proc_pid))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("could not become lock group member")));
+ }
+
+ StartTransactionCommand();
+
+ asnapspace =
+ shm_toc_lookup(toc, AV_PARALLEL_KEY_ACTIVE_SNAPSHOT, false);
+ tsnapspace =
+ shm_toc_lookup(toc, AV_PARALLEL_KEY_TRANSACTION_SNAPSHOT, true);
+ asnapshot = RestoreSnapshot(asnapspace);
+ tsnapshot = tsnapspace ? RestoreSnapshot(tsnapspace) : asnapshot;
+ RestoreTransactionSnapshot(tsnapshot, leader_proc);
+ PushActiveSnapshot(asnapshot);
+
+ /*
+ * We've changed which tuples we can see, and must therefore
+ * invalidate system caches.
+ */
+ InvalidateSystemCaches();
+
+ parallel_vacuum_main(seg, toc);
+
+ /* Must pop active snapshot so snapmgr.c doesn't complain. */
+ PopActiveSnapshot();
+
+ dsm_detach(seg);
+ CommitTransactionCommand();
+ ParallelAutovacuumEndSyncPoint(false);
+ }
+ PG_CATCH();
+ {
+ EmitErrorReport();
+ if (AmParallelIdxAutoVacSupportive())
+ handle_parallel_idx_autovac_errors();
+ }
+ PG_END_TRY();
+ }
/*
* The launcher will be notified of my death in ProcKill, *if* we managed
* to get a worker slot at all
*/
+exit:
+
+ if (MyWorkerInfo->wi_target_item != NULL)
+ {
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ AutoVacuumShmem->pav_workers_left += 1;
+ Assert(AutoVacuumShmem->pav_workers_left <= max_parallel_index_autovac_workers);
+ LWLockRelease(AutovacuumLock);
+ }
+
/* All done, go away */
proc_exit(0);
}
@@ -2461,6 +2972,10 @@ do_autovacuum(void)
tab->at_datname, tab->at_nspname, tab->at_relname);
EmitErrorReport();
+ /* if we are the parallel index vacuuming leader, we must shut it down */
+ if (AmParallelIdxAutoVacLeader())
+ handle_parallel_idx_autovac_errors();
+
/* this resets ProcGlobal->statusFlags[i] too */
AbortOutOfAnyTransaction();
FlushErrorState();
@@ -3296,6 +3811,492 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * Release the work item used for managing parallel index vacuum. Must be
+ * called exactly once, and only from the leader worker.
+ *
+ * If 'keep_lock' is true, then AutovacuumLock will not be released in the end
+ * of function execution.
+ */
+void
+AutoVacuumReleaseParallelWork(bool keep_lock)
+{
+ /*
+ * We might not have gotten the workitem from the launcher (we must not be
+ * considered the leader in this case), so just leave.
+ */
+ if (!AmParallelIdxAutoVacLeader())
+ return;
+
+ if (!LWLockHeldByMe(AutovacuumLock))
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ Assert(AmParallelIdxAutoVacLeader() &&
+ MyWorkItem->leader_proc_pid == MyProcPid);
+
+ MyWorkItem->leader_proc = NULL;
+ MyWorkItem->leader_proc_pid = 0;
+ MyWorkItem->active = false;
+ MyWorkItem = NULL;
+
+ /* We are not leader anymore. */
+ MyWorkerInfo->wi_pcleanup = -1;
+
+ if (!keep_lock)
+ LWLockRelease(AutovacuumLock);
+}
+
+static bool
+start_sync_point_wakeup_cond(ParallelAutoVacuumWorkItem *item)
+{
+ bool need_wakeup = false;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ if (AmParallelIdxAutoVacLeader())
+ {
+ /*
+ * In normal case we should exit sleep loop after last launched
+ * supportive worker passed sync point (status == START_SYNC_POINT_PASSED).
+ * But if we are in SHUTDOWN mode, all launched workers will just exit
+ * sync point whithout status advancing. We can handle such case if we
+ * check that n_participating == n_to_launch.
+ */
+ if (item->status == SHUTDOWN)
+ need_wakeup = (item->nworkers_participating == item->nworkers_to_launch);
+ else
+ need_wakeup = item->status == START_SYNC_POINT_PASSED;
+ }
+ else
+ need_wakeup = (item->status == START_SYNC_POINT_PASSED ||
+ item->status == SHUTDOWN);
+
+ LWLockRelease(AutovacuumLock);
+ return need_wakeup;
+}
+
+static bool
+end_sync_point_wakeup_cond(ParallelAutoVacuumWorkItem *item)
+{
+ bool need_wakeup = false;
+
+ Assert(AmParallelIdxAutoVacLeader());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ need_wakeup = item->status == END_SYNC_POINT_PASSED;
+ LWLockRelease(AutovacuumLock);
+ return need_wakeup;
+}
+
+/*
+ * Waiting on the condition variable is a frequent operation, so it has been
+ * factored out into a separate function. The caller must hold AutovacuumLock
+ * before calling it.
+ */
+static void
+CVSleep(ParallelAutoVacuumWorkItem *item, wakeup_condition wakeup_cond)
+{
+ ConditionVariablePrepareToSleep(&item->cv);
+ LWLockRelease(AutovacuumLock);
+
+ PG_TRY();
+ {
+ do
+ {
+ ConditionVariableSleep(&item->cv, PG_WAIT_IPC);
+ } while (!wakeup_cond(item));
+ }
+ PG_CATCH();
+ {
+ ConditionVariableCancelSleep();
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
+
+ ConditionVariableCancelSleep();
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+}
+
+/*
+ * This function is used to synchronize the leader with supportive workers
+ * during parallel index vacuuming. Each process will exit iff:
+ * Leader worker is ready to perform parallel vacuum &&
+ * All launched supportive workers are ready to perform parallel vacuum &&
+ * (Autovacuum launcher already launched all requested workers ||
+ * Autovacuum launcher cannot launch more workers)
+ *
+ * If 'keep_lock' is true, then AutovacuumLock will not be released in the end
+ * of function execution.
+ *
+ * NOTE: Some workers may call this function when the leader worker has decided
+ * to shut down parallel vacuuming. In this case -1 will be returned.
+ */
+static int
+parallel_autovacuum_start_sync_point(bool keep_lock)
+{
+ SyncType sync_type;
+ int num_participants;
+
+ if (!LWLockHeldByMe(AutovacuumLock))
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ Assert(MyWorkItem->active);
+ sync_type = MyWorkItem->sync_type;
+
+ if (AmParallelIdxAutoVacLeader())
+ {
+ Assert(MyWorkItem->leader_proc_pid == MyProcPid);
+
+ /* Wake up all sleeping supportive workers, if required ... */
+ if (sync_type == LEADER)
+ {
+ ConditionVariableBroadcast(&MyWorkItem->cv);
+
+ /*
+ * Advance status, because we are guaranteed to pass this
+ * sync point.
+ * Don't advance if we call this function from error handle function
+ * (status == SHUTDOWN).
+ */
+ if (MyWorkItem->status != SHUTDOWN)
+ MyWorkItem->status = START_SYNC_POINT_PASSED;
+ }
+ /* ... otherwise, wait for somebody to wake us up */
+ else
+ {
+ MyWorkItem->leader_sleeping_on_ssp = true;
+ CVSleep(MyWorkItem, start_sync_point_wakeup_cond);
+ MyWorkItem->leader_sleeping_on_ssp = false;
+
+ /*
+ * A priori, we believe that in the end everyone should be awakened
+ * by the leader.
+ */
+ MyWorkItem->sync_type = LEADER;
+ }
+ }
+ else
+ {
+ MyWorkItem->nworkers_participating += 1;
+
+ /*
+ * If we know that the launcher will no longer attempt to launch more
+ * supportive workers for this item => we are the LAST_WORKER for sure.
+ *
+ * Note that the launcher sets the LAST_WORKER sync type without knowing
+ * the current status of the leader. So we also check that the leader is
+ * sleeping before waking everyone up. Otherwise, we must wait for the
+ * leader (and ask it to wake everyone up).
+ */
+ if (MyWorkItem->nworkers_participating == MyWorkItem->nworkers_to_launch &&
+ sync_type == LAST_WORKER && MyWorkItem->leader_sleeping_on_ssp)
+ {
+ ConditionVariableBroadcast(&MyWorkItem->cv);
+
+ /*
+ * We must not advance status if leader wants to shut down parallel
+ * execution (see checks below).
+ */
+ if (MyWorkItem->status != SHUTDOWN)
+ MyWorkItem->status = START_SYNC_POINT_PASSED;
+ }
+ else
+ {
+ if (MyWorkItem->nworkers_participating == MyWorkItem->nworkers_to_launch &&
+ sync_type == LAST_WORKER)
+ {
+ MyWorkItem->sync_type = LEADER;
+ }
+
+ MyWorkItem->nworkers_sleeping += 1;
+ CVSleep(MyWorkItem, start_sync_point_wakeup_cond);
+ MyWorkItem->nworkers_sleeping -= 1;
+ }
+ }
+
+ /* Tell caller that it must not participate in parallel index cleanup. */
+ if (MyWorkItem->status == SHUTDOWN)
+ num_participants = -1;
+ else
+ num_participants = MyWorkItem->nworkers_participating;
+
+ if (!keep_lock)
+ LWLockRelease(AutovacuumLock);
+
+ return num_participants;
+}
+
+/*
+ * Like the function above, but must be called by the leader and supportive
+ * workers when they have finished parallel index vacuum.
+ *
+ * If 'keep_lock' is true, then AutovacuumLock will not be released in the end
+ * of function execution.
+ */
+void
+ParallelAutovacuumEndSyncPoint(bool keep_lock)
+{
+ if (!LWLockHeldByMe(AutovacuumLock))
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ Assert(MyWorkItem->active);
+
+ if (MyWorkItem->nworkers_participating == 0)
+ {
+ Assert(!AmParallelIdxAutoVacSupportive());
+
+ /*
+ * We have two cases when no supportive workers were launched:
+ * 1) Leader got MyWorkItem, but launcher didn't launch any
+ * workers => just advance status, because we don't need to wait
+ * for anybody.
+ * 2) Leader didn't get MyWorkItem, because it was already in use =>
+ * we must not touch it. Just leave.
+ */
+ if (AmParallelIdxAutoVacLeader())
+ {
+ Assert(MyWorkItem->leader_proc_pid == MyProcPid);
+ MyWorkItem->status = END_SYNC_POINT_PASSED;
+ }
+ else
+ Assert(MyWorkItem->leader_proc_pid != MyProcPid);
+
+ if (!keep_lock)
+ LWLockRelease(AutovacuumLock);
+
+ return;
+ }
+
+ if (AmParallelIdxAutoVacLeader())
+ {
+ Assert(MyWorkItem->leader_proc_pid == MyProcPid);
+ Assert(MyWorkItem->sync_type == LEADER);
+
+ /* Wait for all workers to finish (only last worker will wake us up) */
+ if (MyWorkItem->nfinished != MyWorkItem->nworkers_participating)
+ {
+ MyWorkItem->sync_type = LAST_WORKER;
+ MyWorkItem->leader_sleeping_on_esp = true;
+ CVSleep(MyWorkItem, end_sync_point_wakeup_cond);
+ MyWorkItem->leader_sleeping_on_esp = false;
+
+ Assert(MyWorkItem->nfinished == MyWorkItem->nworkers_participating);
+
+ /*
+ * Advance status, because we are guaranteed to pass this
+ * sync point.
+ */
+ MyWorkItem->status = END_SYNC_POINT_PASSED;
+ }
+ }
+ else
+ {
+ MyWorkItem->nfinished += 1;
+
+ /*
+ * If we are the last worker to finish, wake up the leader.
+ *
+ * If not, just leave: this supportive worker has finished all its work and must exit.
+ */
+ if (MyWorkItem->sync_type == LAST_WORKER &&
+ MyWorkItem->nfinished == MyWorkItem->nworkers_participating &&
+ MyWorkItem->leader_sleeping_on_esp)
+ {
+ ConditionVariableBroadcast(&MyWorkItem->cv);
+
+ /*
+ * Don't need to check SHUTDOWN status here - all supportive workers
+ * are about to finish anyway.
+ */
+ MyWorkItem->status = END_SYNC_POINT_PASSED;
+ }
+
+ /* We are not participating anymore */
+ MyWorkerInfo->wi_pcleanup = -1;
+ MyWorkItem = NULL;
+ }
+
+ if (!keep_lock)
+ LWLockRelease(AutovacuumLock);
+
+ return;
+}
+
+/*
+ * Get id of parallel index vacuum worker (counting from 0).
+ */
+int
+GetAutoVacuumParallelWorkerNumber(void)
+{
+ Assert(AmAutoVacuumWorkerProcess() && MyWorkerInfo->wi_pcleanup > 0);
+ return (MyWorkerInfo->wi_pcleanup - 1);
+}
+
+/*
+ * The leader autovacuum process can decide that it needs several helper
+ * workers to process a table in parallel mode. It must set up the parallel
+ * context and call LaunchParallelAutovacuumWorkers.
+ *
+ * In this function we do the following:
+ * 1) Send a signal to the autovacuum launcher, which creates 'supportive
+ * workers' during its standard work loop.
+ * 2) Wait for the supportive workers to start.
+ *
+ * The function returns the number of workers that the launcher was able to
+ * launch (may be less than 'nworkers_to_launch').
+ */
+int
+LaunchParallelAutovacuumWorkers(Oid rel_id, int nworkers_to_launch,
+ dsm_handle handle)
+{
+ int nworkers_launched = 0;
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ if (MyWorkItem == NULL)
+ MyWorkItem = get_free_workitem_for_leader();
+
+ if (MyWorkItem == NULL)
+ {
+ LWLockRelease(AutovacuumLock);
+ return -1;
+ }
+
+ /* Notify autovacuum launcher that we need supportive workers */
+ if (AutoVacParallelWorkRequest())
+ {
+ /* OK, we can use this workitem entry. Init it. */
+ MyWorkItem->avw_database = MyDatabaseId;
+ MyWorkItem->avw_relation = rel_id;
+ MyWorkItem->handl = handle;
+ MyWorkItem->leader_proc = MyProc;
+ MyWorkItem->leader_proc_pid = MyProcPid;
+ MyWorkItem->nworkers_participating = 0;
+ MyWorkItem->nworkers_to_launch = nworkers_to_launch;
+ MyWorkItem->leader_sleeping_on_ssp = false;
+ MyWorkItem->leader_sleeping_on_esp = false;
+ MyWorkItem->nworkers_sleeping = 0;
+ MyWorkItem->nfinished = 0;
+ MyWorkItem->sync_type = LAUNCHER;
+ MyWorkItem->status = STARTUP;
+
+ MyWorkItem->active = true;
+ MyWorkItem->needs_launcher = true;
+ MyWorkItem->birthtime = GetCurrentTimestamp();
+ LWLockRelease(AutovacuumLock);
+
+ /* Become the leader */
+ MyWorkerInfo->wi_pcleanup = 0;
+
+ /* All created workers must get same locks as leader process */
+ BecomeLockGroupLeader();
+
+ /*
+ * Wait until all supportive workers are launched. Also retrieve the
+ * actual number of participants.
+ */
+
+ nworkers_launched = parallel_autovacuum_start_sync_point(false);
+ Assert(nworkers_launched >= 0);
+ }
+ else
+ {
+ /*
+ * If we (for any reason) cannot send signal to the launcher, don't try
+ * to do index vacuuming in parallel
+ */
+ MyWorkItem = NULL;
+ LWLockRelease(AutovacuumLock);
+ return 0;
+ }
+
+ return nworkers_launched;
+}
+
+/*
+ * During parallel index vacuuming any worker (both supportive workers and
+ * the leader) can catch an error.
+ * In order to handle it correctly, this function must be called.
+ */
+static void
+handle_parallel_idx_autovac_errors(void)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ if (AmParallelIdxAutoVacLeader())
+ {
+ if (MyWorkItem->status == START_SYNC_POINT_PASSED)
+ {
+ /*
+ * If the start sync point has already been passed, just wait for all
+ * supportive workers to finish, then exit.
+ */
+ ParallelAutovacuumEndSyncPoint(true);
+ }
+ else if (MyWorkItem->status == STARTUP)
+ {
+ /*
+ * If no sync point has been passed yet, we can prevent supportive
+ * workers from performing their work: set the SHUTDOWN status and
+ * wait until all workers have seen it.
+ */
+ MyWorkItem->status = SHUTDOWN;
+ parallel_autovacuum_start_sync_point(true);
+ }
+
+ AutoVacuumReleaseParallelWork(true);
+ }
+ else
+ {
+ Assert(AmParallelIdxAutoVacSupportive());
+
+ if (MyWorkItem->status == STARTUP || MyWorkItem->status == SHUTDOWN)
+ {
+ /*
+ * If no sync point has been passed yet, just exclude ourselves from
+ * the participants. Further parallel index vacuuming will take place
+ * as usual.
+ */
+ MyWorkItem->nworkers_to_launch -= 1;
+
+ if (MyWorkItem->nworkers_participating == MyWorkItem->nworkers_to_launch &&
+ MyWorkItem->sync_type == LAST_WORKER && MyWorkItem->leader_sleeping_on_ssp)
+ {
+ ConditionVariableBroadcast(&MyWorkItem->cv);
+
+ if (MyWorkItem->status != SHUTDOWN)
+ MyWorkItem->status = START_SYNC_POINT_PASSED;
+ }
+ }
+ else if (MyWorkItem->status == START_SYNC_POINT_PASSED)
+ {
+ /*
+ * If the start sync point has already been passed, simulate the usual
+ * end of work (see ParallelAutovacuumEndSyncPoint).
+ */
+ MyWorkItem->nfinished += 1;
+
+ /*
+ * We check "!MyWorkItem->leader_sleeping_on_ssp" in order to handle an
+ * almost impossible situation, where the leader hasn't had time to wake
+ * up after the start sync point (but the last worker has already advanced
+ * the status to START_SYNC_POINT_PASSED). In this case we should not
+ * advance the status to END_SYNC_POINT_PASSED, so the leader can continue
+ * processing.
+ */
+ if (MyWorkItem->sync_type == LAST_WORKER &&
+ MyWorkItem->nfinished == MyWorkItem->nworkers_participating &&
+ !MyWorkItem->leader_sleeping_on_ssp)
+ {
+ ConditionVariableBroadcast(&MyWorkItem->cv);
+ MyWorkItem->status = END_SYNC_POINT_PASSED;
+ }
+ }
+ }
+
+ LWLockRelease(AutovacuumLock);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3361,6 +4362,12 @@ AutoVacuumShmemInit(void)
AutoVacuumShmem->av_startingWorker = NULL;
memset(AutoVacuumShmem->av_workItems, 0,
sizeof(AutoVacuumWorkItem) * NUM_WORKITEMS);
+ memset(&AutoVacuumShmem->pav_workItems, 0,
+ sizeof(ParallelAutoVacuumWorkItem) * NUM_WORKITEMS);
+ for (int j = 0; j < NUM_WORKITEMS; j++)
+ ConditionVariableInit(&AutoVacuumShmem->pav_workItems[j].cv);
+
+ AutoVacuumShmem->pav_workers_left = max_parallel_index_autovac_workers;
worker = (WorkerInfo) ((char *) AutoVacuumShmem +
MAXALIGN(sizeof(AutoVacuumShmemStruct)));
@@ -3406,6 +4413,19 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+/*
+ * GUC check_hook for max_parallel_index_autovac_workers
+ */
+bool
+check_max_parallel_index_autovac_workers(int *newval, void **extra,
+ GucSource source)
+{
+ if (*newval >= autovacuum_max_workers)
+ return false;
+ return true;
+}
+
+
/*
* Returns whether there is a free autovacuum worker slot available.
*/
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 2f8cbd86759..00c746bf853 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3647,6 +3647,16 @@ struct config_int ConfigureNamesInt[] =
check_autovacuum_work_mem, NULL, NULL
},
+ {
+ {"max_parallel_index_autovac_workers", PGC_POSTMASTER, VACUUM_AUTOVACUUM,
+ gettext_noop("Sets the maximum number of autovacuum workers that can be launched for parallel index processing during autovacuum."),
+ gettext_noop("This parameter limits the total number of such processes per cluster and must be < autovacuum_max_workers"),
+ },
+ &max_parallel_index_autovac_workers,
+ 0, 0, MAX_PARALLEL_WORKER_LIMIT,
+ check_max_parallel_index_autovac_workers, NULL, NULL
+ },
+
{
{"tcp_keepalives_idle", PGC_USERSET, CONN_AUTH_TCP,
gettext_noop("Time between issuing TCP keepalives."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 34826d01380..25c3c4fb258 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -146,6 +146,8 @@
#hash_mem_multiplier = 2.0 # 1-1000.0 multiplier on hash table work_mem
#maintenance_work_mem = 64MB # min 64kB
#autovacuum_work_mem = -1 # min 64kB, or -1 to use maintenance_work_mem
+#max_parallel_index_autovac_workers = 0 # this feature disabled by default
+ # (change requires restart)
#logical_decoding_work_mem = 64MB # min 64kB
#max_stack_depth = 2MB # min 100kB
#shared_memory_type = mmap # the default is the first option
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..bc3e3625a61 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -15,6 +15,8 @@
#define AUTOVACUUM_H
#include "storage/block.h"
+#include "storage/dsm_impl.h"
+#include "storage/lock.h"
/*
* Other processes can request specific work from autovacuum, identified by
@@ -25,12 +27,28 @@ typedef enum
AVW_BRINSummarizeRange,
} AutoVacuumWorkItemType;
+/*
+ * Magic number for parallel context TOC. Used for parallel index processing
+ * during autovacuum.
+ */
+#define AV_PARALLEL_MAGIC 0xaaaaaaaa
+
+/* Magic numbers for per-context parallel index processing state sharing. */
+#define AV_PARALLEL_KEY_TRANSACTION_SNAPSHOT UINT64CONST(0xFFF0000000000001)
+#define AV_PARALLEL_KEY_ACTIVE_SNAPSHOT UINT64CONST(0xFFF0000000000002)
+
+/*
+ * During parallel index processing we want to launch one a/v worker for every
+ * 30 indexes of the table.
+ */
+#define AV_PARALLEL_INDEXES_PER_WORKER 30
/* GUC variables */
extern PGDLLIMPORT bool autovacuum_start_daemon;
extern PGDLLIMPORT int autovacuum_worker_slots;
extern PGDLLIMPORT int autovacuum_max_workers;
extern PGDLLIMPORT int autovacuum_work_mem;
+extern PGDLLIMPORT int max_parallel_index_autovac_workers;
extern PGDLLIMPORT int autovacuum_naptime;
extern PGDLLIMPORT int autovacuum_vac_thresh;
extern PGDLLIMPORT int autovacuum_vac_max_thresh;
@@ -58,12 +76,21 @@ extern void autovac_init(void);
/* called from postmaster when a worker could not be forked */
extern void AutoVacWorkerFailed(void);
+/* called from autovac worker when it needs participants in parallel index cleanup */
+extern bool AutoVacParallelWorkRequest(void);
+
pg_noreturn extern void AutoVacLauncherMain(const void *startup_data, size_t startup_data_len);
pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t startup_data_len);
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+extern void AutoVacuumReleaseParallelWork(bool keep_lock);
+extern int AutoVacuumParallelWorkWaitForStart(void);
+extern void ParallelAutovacuumEndSyncPoint(bool keep_lock);
+extern int GetAutoVacuumParallelWorkerNumber(void);
+extern int LaunchParallelAutovacuumWorkers(Oid rel_id, int nworkers_to_launch,
+ dsm_handle handle);
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 799fa7ace68..fb1b52a0ee4 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -31,6 +31,8 @@ extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
+extern bool check_max_parallel_index_autovac_workers(int *newval, void **extra,
+ GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
GucSource source);
extern bool check_backtrace_functions(char **newval, void **extra,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..c4d378917a2 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -348,7 +348,8 @@ typedef struct StdRdOptions
StdRdOptIndexCleanup vacuum_index_cleanup; /* controls index vacuuming */
bool vacuum_truncate; /* enables vacuum to truncate a relation */
bool vacuum_truncate_set; /* whether vacuum_truncate is set */
-
+ bool parallel_idx_autovac; /* enables autovacuum to process indexes
+ of this table in parallel mode */
/*
* Fraction of pages in a relation that vacuum can eagerly scan and fail
* to freeze. 0 if disabled, -1 if unspecified.
@@ -400,6 +401,16 @@ typedef struct StdRdOptions
(relation)->rd_rel->relkind == RELKIND_MATVIEW) ? \
((StdRdOptions *) (relation)->rd_options)->user_catalog_table : false)
+/*
+ * CanUseParallelIdxAutovacForRelation
+ * Check whether we can process indexes of this relation in parallel mode
+ * during autovacuum.
+ */
+#define CanUseParallelIdxAutovacForRelation(relation) \
+ (AssertMacro(RelationIsValid(relation)), \
+ (relation)->rd_options ? \
+ ((StdRdOptions *) (relation)->rd_options)->parallel_idx_autovac : false)
+
/*
* RelationGetParallelWorkers
* Returns the relation's parallel_workers reloption setting.
diff --git a/src/test/modules/autovacuum/.gitignore b/src/test/modules/autovacuum/.gitignore
new file mode 100644
index 00000000000..0b54641bceb
--- /dev/null
+++ b/src/test/modules/autovacuum/.gitignore
@@ -0,0 +1 @@
+/tmp_check/
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/Makefile b/src/test/modules/autovacuum/Makefile
new file mode 100644
index 00000000000..9b3f52c4879
--- /dev/null
+++ b/src/test/modules/autovacuum/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/autovacuum/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/autovacuum/t/001_autovac_parallel.pl b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
new file mode 100644
index 00000000000..5da4226f0d6
--- /dev/null
+++ b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
@@ -0,0 +1,135 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+$node->append_conf('postgresql.conf', qq{
+ autovacuum = off
+ max_wal_size = 4096
+ autovacuum_max_workers = 10
+});
+$node->start;
+
+my $indexes_num = 80;
+my $initial_rows_num = 1_000_000;
+
+# Create big table and create specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER, col_5 INTEGER,
+ col_6 INTEGER, col_7 INTEGER, col_8 INTEGER, col_9 INTEGER, col_10 INTEGER,
+ col_11 INTEGER, col_12 INTEGER, col_13 INTEGER, col_14 INTEGER, col_15 INTEGER,
+ col_16 INTEGER, col_17 INTEGER, col_18 INTEGER, col_19 INTEGER, col_20 INTEGER,
+ col_21 INTEGER, col_22 INTEGER, col_23 INTEGER, col_24 INTEGER, col_25 INTEGER,
+ col_26 INTEGER, col_27 INTEGER, col_28 INTEGER, col_29 INTEGER, col_30 INTEGER,
+ col_31 INTEGER, col_32 INTEGER, col_33 INTEGER, col_34 INTEGER, col_35 INTEGER,
+ col_36 INTEGER, col_37 INTEGER, col_38 INTEGER, col_39 INTEGER, col_40 INTEGER,
+ col_41 INTEGER, col_42 INTEGER, col_43 INTEGER, col_44 INTEGER, col_45 INTEGER,
+ col_46 INTEGER, col_47 INTEGER, col_48 INTEGER, col_49 INTEGER, col_50 INTEGER,
+ col_51 INTEGER, col_52 INTEGER, col_53 INTEGER, col_54 INTEGER, col_55 INTEGER,
+ col_56 INTEGER, col_57 INTEGER, col_58 INTEGER, col_59 INTEGER, col_60 INTEGER,
+ col_61 INTEGER, col_62 INTEGER, col_63 INTEGER, col_64 INTEGER, col_65 INTEGER,
+ col_66 INTEGER, col_67 INTEGER, col_68 INTEGER, col_69 INTEGER, col_70 INTEGER,
+ col_71 INTEGER, col_72 INTEGER, col_73 INTEGER, col_74 INTEGER, col_75 INTEGER,
+ col_76 INTEGER, col_77 INTEGER, col_78 INTEGER, col_79 INTEGER, col_80 INTEGER,
+ col_81 INTEGER, col_82 INTEGER, col_83 INTEGER, col_84 INTEGER, col_85 INTEGER,
+ col_86 INTEGER, col_87 INTEGER, col_88 INTEGER, col_89 INTEGER, col_90 INTEGER,
+ col_91 INTEGER, col_92 INTEGER, col_93 INTEGER, col_94 INTEGER, col_95 INTEGER,
+ col_96 INTEGER, col_97 INTEGER, col_98 INTEGER, col_99 INTEGER, col_100 INTEGER
+ ) WITH (parallel_idx_autovac = true);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM pg_index i
+ JOIN pg_class c ON c.oid = i.indrelid
+ WHERE c.relname = 'test_autovac';",
+ stdout => \$psql_out
+);
+is($psql_out, $indexes_num + 1, "All indexes created successfully");
+
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac (
+ col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8, col_9, col_10,
+ col_11, col_12, col_13, col_14, col_15, col_16, col_17, col_18, col_19, col_20,
+ col_21, col_22, col_23, col_24, col_25, col_26, col_27, col_28, col_29, col_30,
+ col_31, col_32, col_33, col_34, col_35, col_36, col_37, col_38, col_39, col_40,
+ col_41, col_42, col_43, col_44, col_45, col_46, col_47, col_48, col_49, col_50,
+ col_51, col_52, col_53, col_54, col_55, col_56, col_57, col_58, col_59, col_60,
+ col_61, col_62, col_63, col_64, col_65, col_66, col_67, col_68, col_69, col_70,
+ col_71, col_72, col_73, col_74, col_75, col_76, col_77, col_78, col_79, col_80,
+ col_81, col_82, col_83, col_84, col_85, col_86, col_87, col_88, col_89, col_90,
+ col_91, col_92, col_93, col_94, col_95, col_96, col_97, col_98, col_99, col_100
+ ) VALUES (
+ i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7, i + 8, i + 9,
+ i + 10, i + 11, i + 12, i + 13, i + 14, i + 15, i + 16, i + 17, i + 18, i + 19,
+ i + 20, i + 21, i + 22, i + 23, i + 24, i + 25, i + 26, i + 27, i + 28, i + 29,
+ i + 30, i + 31, i + 32, i + 33, i + 34, i + 35, i + 36, i + 37, i + 38, i + 39,
+ i + 40, i + 41, i + 42, i + 43, i + 44, i + 45, i + 46, i + 47, i + 48, i + 49,
+ i + 50, i + 51, i + 52, i + 53, i + 54, i + 55, i + 56, i + 57, i + 58, i + 59,
+ i + 60, i + 61, i + 62, i + 63, i + 64, i + 65, i + 66, i + 67, i + 68, i + 69,
+ i + 70, i + 71, i + 72, i + 73, i + 74, i + 75, i + 76, i + 77, i + 78, i + 79,
+ i + 80, i + 81, i + 82, i + 83, i + 84, i + 85, i + 86, i + 87, i + 88, i + 89,
+ i + 90, i + 91, i + 92, i + 93, i + 94, i + 95, i + 96, i + 97, i + 98, i + 99
+ );
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM test_autovac;",
+ stdout => \$psql_out
+);
+is($psql_out, $initial_rows_num, "All data inserted into table successfully");
+
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+my $dead_tuples_thresh = $initial_rows_num / 4;
+my $indexes_num_thresh = $indexes_num / 2;
+my $num_workers = 2;
+
+# Reduce autovacuum_work_mem so the leader process will perform the parallel
+# index vacuum phase several times
+$node->append_conf('postgresql.conf', qq{
+ autovacuum_naptime = '1s'
+ autovacuum_work_mem = 2048
+ autovacuum_vacuum_threshold = 1
+ autovacuum_analyze_threshold = 1
+ autovacuum_vacuum_scale_factor = 0.1
+ autovacuum_analyze_scale_factor = 0.1
+ max_parallel_index_autovac_workers = $num_workers
+ autovacuum = on
+});
+
+$node->restart;
+
+# wait for autovacuum to reset datfrozenxid age to 0
+$node->poll_query_until('postgres', q{
+ SELECT count(*) = 0 FROM pg_database WHERE age(datfrozenxid) > 0
+}) or die "Timed out while waiting for autovacuum";
+
+ok(1, "There are no segfaults");
+
+$node->stop;
+done_testing();
--
2.43.0
Hi,
On 09/05/25 15:33, Daniil Davydov wrote:
Hi,
As I promised - meet parallel index autovacuum with bgworkers
(Parallel-index-autovacuum-with-bgworkers.patch). This is a pretty
simple implementation:
1) Added a new table option `parallel_idx_autovac_enabled` that must be
set to `true` if the user wants autovacuum to process the table in parallel.
2) Added a new GUC variable `autovacuum_reserved_workers_num`. This is the
number of parallel workers from the bgworkers pool that can be used only
by autovacuum workers. The `autovacuum_reserved_workers_num` parameter
actually reserves the requested share of the processes, whose total number
is equal to `max_worker_processes`.
3) When an autovacuum worker decides to process some table in
parallel, it just sets `VacuumParams->nworkers` to an appropriate value
(> 0) and then the code is executed as if it were a regular VACUUM
PARALLEL.
4) I kept test/modules/autovacuum as a sandbox where you can play with
parallel index autovacuum a bit.
What do you think about this implementation?
I've reviewed the v1-0001 patch, the build on MacOS using meson+ninja is
failing:
❯❯❯ ninja -C build install
ninja: Entering directory `build'
[1/126] Compiling C object
src/backend/postgres_lib.a.p/utils_misc_guc_tables.c.o
FAILED: src/backend/postgres_lib.a.p/utils_misc_guc_tables.c.o
../src/backend/utils/misc/guc_tables.c:3613:4: error: incompatible
pointer to integer conversion initializing 'int' with an expression of
type 'void *' [-Wint-conversion]
3613 | NULL,
| ^~~~
It seems that the "autovacuum_reserved_workers_num" declaration on
guc_tables.c has an extra gettext_noop() call?
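For reference, a config_int entry in guc_tables.c is initialized
positionally: the inner braces take the name, context, group and the two
gettext_noop() description strings (optionally followed by a flags value),
after which come the variable pointer, the boot/min/max values and the
three hooks. An extra gettext_noop() shifts the remaining initializers by
one slot, so an initializer such as a trailing NULL can land in an int
slot, which matches the -Wint-conversion above. A minimal sketch of the
expected shape (field values taken from your patch, descriptions
abbreviated):

    {
        {"autovacuum_reserved_workers_num", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
            gettext_noop("Short description."),
            gettext_noop("Long description."),
        },
        &av_reserved_workers_num,
        0, 0, MAX_BACKENDS, /* boot_val, min, max */
        check_autovacuum_reserved_workers_num, NULL, NULL
    },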
One other point is that as you've added TAP tests for the autovacuum I
think you also need to create a meson.build file as you already create
the Makefile.
You also need to update the src/test/modules/meson.build and
src/test/modules/Makefile to include the new test/modules/autovacuum
path.
--
Matheus Alcantara
Hi,
On Fri, May 16, 2025 at 4:06 AM Matheus Alcantara
<matheusssilv97@gmail.com> wrote:
I've reviewed the v1-0001 patch, the build on MacOS using meson+ninja is
failing:
❯❯❯ ninja -C build install
ninja: Entering directory `build'
[1/126] Compiling C object
src/backend/postgres_lib.a.p/utils_misc_guc_tables.c.o
FAILED: src/backend/postgres_lib.a.p/utils_misc_guc_tables.c.o
../src/backend/utils/misc/guc_tables.c:3613:4: error: incompatible
pointer to integer conversion initializing 'int' with an expression of
type 'void *' [-Wint-conversion]
3613 | NULL,
| ^~~~
Thank you for reviewing this patch!
It seems that the "autovacuum_reserved_workers_num" declaration on
guc_tables.c has an extra gettext_noop() call?
Good catch, I fixed this warning in the v2 version.
One other point is that as you've added TAP tests for the autovacuum I
think you also need to create a meson.build file as you already create
the Makefile.You also need to update the src/test/modules/meson.build and
src/test/modules/Makefile to include the new test/modules/autovacuum
path.
OK, I should clarify this point: modules/autovacuum is not a normal
test but a sandbox - just an example of how we can trigger parallel
index autovacuum. It may also be used for debugging purposes.
In fact, 001_autovac_parallel.pl does not verify anything.
I'll do as you asked (add all the meson and Make stuff), but please don't
focus on it. The creation of the real test is still in progress. (I'll
try to complete it as soon as possible.)
In this letter I will divide the patch into two parts: implementation
and sandbox. What do you think about the implementation?
--
Best regards,
Daniil Davydov
Attachments:
v2-0001-Parallel-index-autovacuum-with-bgworkers.patchapplication/x-patch; name=v2-0001-Parallel-index-autovacuum-with-bgworkers.patchDownload
From c518f1226f8961fdef88600a6d388674e184cff7 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Fri, 16 May 2025 11:58:40 +0700
Subject: [PATCH v2 1/2] Parallel index autovacuum with bgworkers
---
src/backend/access/common/reloptions.c | 11 ++++
src/backend/commands/vacuum.c | 55 +++++++++++++++++++
src/backend/commands/vacuumparallel.c | 46 ++++++++++------
src/backend/postmaster/autovacuum.c | 9 +++
src/backend/postmaster/bgworker.c | 33 ++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc_tables.c | 12 ++++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/miscadmin.h | 1 +
src/include/utils/guc_hooks.h | 2 +
src/include/utils/rel.h | 10 ++++
11 files changed, 162 insertions(+), 19 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 46c1dce222d..ccf59208783 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -166,6 +166,15 @@ static relopt_bool boolRelOpts[] =
},
true
},
+ {
+ {
+ "parallel_idx_autovac_enabled",
+ "Allows autovacuum to process indexes of this table in parallel mode",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ false
+ },
/* list terminator */
{{NULL}}
};
@@ -1863,6 +1872,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"parallel_idx_autovac_enabled", RELOPT_TYPE_BOOL,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, parallel_idx_autovac_enabled)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 33a33bf6b1c..f7667f14147 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -57,9 +57,21 @@
#include "utils/guc.h"
#include "utils/guc_hooks.h"
#include "utils/memutils.h"
+#include "utils/rel.h"
#include "utils/snapmgr.h"
#include "utils/syscache.h"
+/*
+ * Minimum number of dead tuples required for the table's indexes to be
+ * processed in parallel during autovacuum.
+ */
+#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
+
+/*
+ * How many indexes each parallel worker should process during autovacuum.
+ */
+#define NUM_INDEXES_PER_PARALLEL_WORKER 30
+
/*
* Minimum interval for cost-based vacuum delay reports from a parallel worker.
* This aims to avoid sending too many messages and waking up the leader too
@@ -2234,6 +2246,49 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params,
else
toast_relid = InvalidOid;
+ /*
+ * If we are running autovacuum, decide whether we need to process the
+ * indexes of the table with the given oid in parallel.
+ */
+ if (AmAutoVacuumWorkerProcess() &&
+ params->index_cleanup != VACOPTVALUE_DISABLED &&
+ RelationAllowsParallelIdxAutovac(rel))
+ {
+ PgStat_StatTabEntry *tabentry;
+
+ /* fetch the pgstat table entry */
+ tabentry = pgstat_fetch_stat_tabentry_ext(rel->rd_rel->relisshared,
+ rel->rd_id);
+ if (tabentry && tabentry->dead_tuples >= AV_PARALLEL_DEADTUP_THRESHOLD)
+ {
+ List *indexes = RelationGetIndexList(rel);
+ int num_indexes = list_length(indexes);
+
+ list_free(indexes);
+
+ if (av_reserved_workers_num > 0)
+ {
+ /*
+ * We request at least one parallel worker if the user set the
+ * 'parallel_idx_autovac_enabled' option. The total number of
+ * additional parallel workers depends on how many indexes the
+ * table has. For now we assume that each parallel worker should
+ * process NUM_INDEXES_PER_PARALLEL_WORKER indexes.
+ */
+ params->nworkers =
+ Min((num_indexes / NUM_INDEXES_PER_PARALLEL_WORKER) + 1,
+ av_reserved_workers_num);
+ }
+ else
+ ereport(WARNING,
+ (errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED),
+ errmsg("Cannot launch any supportive workers for parallel index cleanup of rel %s",
+ RelationGetRelationName(rel)),
+ errhint("You might need to set parameter \"av_reserved_workers_num\" to a value > 0")));
+
+ }
+ }
+
/*
* Switch to the table owner's userid, so that any index functions are run
* as that user. Also lock down security-restricted operations and
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 2b9d548cdeb..e2b3e5b343c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,15 +1,15 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel [auto]vacuum execution.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
*
- * In a parallel vacuum, we perform both index bulk deletion and index cleanup
- * with parallel worker processes. Individual indexes are processed by one
- * vacuum process. ParallelVacuumState contains shared information as well as
- * the memory space for storing dead items allocated in the DSA area. We
+ * In a parallel [auto]vacuum, we perform both index bulk deletion and index
+ * cleanup with parallel worker processes. Individual indexes are processed by
+ * one vacuum process. ParallelVacuumState contains shared information as well
+ * as the memory space for storing dead items allocated in the DSA area. We
* launch parallel worker processes at the start of parallel index
* bulk-deletion and index cleanup and once all indexes are processed, the
* parallel worker processes exit. Each time we process indexes in parallel,
@@ -34,6 +34,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -157,7 +158,8 @@ typedef struct PVIndStats
} PVIndStats;
/*
- * Struct for maintaining a parallel vacuum state. typedef appears in vacuum.h.
+ * Struct for maintaining a parallel [auto]vacuum state. typedef appears in
+ * vacuum.h.
*/
struct ParallelVacuumState
{
@@ -371,10 +373,18 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->relid = RelationGetRelid(rel);
shared->elevel = elevel;
shared->queryid = pgstat_get_my_query_id();
- shared->maintenance_work_mem_worker =
- (nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+
+ if (AmAutoVacuumWorkerProcess())
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
+ autovacuum_work_mem;
+ else
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+ maintenance_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -558,7 +568,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster ||
+ (av_reserved_workers_num == 0 && AmAutoVacuumWorkerProcess()) ||
+ (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()))
return 0;
/*
@@ -597,15 +609,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = AmAutoVacuumWorkerProcess() ?
+ Min(parallel_workers, av_reserved_workers_num) :
+ Min(parallel_workers, max_parallel_maintenance_workers);
return parallel_workers;
}
/*
* Perform index vacuum or index cleanup with parallel workers. This function
- * must be used by the parallel vacuum leader process.
+ * must be used by the parallel [auto]vacuum leader process.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -982,8 +996,8 @@ parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
/*
* Perform work within a launched parallel process.
*
- * Since parallel vacuum workers perform only index vacuum or index cleanup,
- * we don't need to report progress information.
+ * Since parallel [auto]vacuum workers perform only index vacuum or index
+ * cleanup, we don't need to report progress information.
*/
void
parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 4d4a1a3197e..e7e340c4e7c 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3406,6 +3406,15 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+bool
+check_autovacuum_reserved_workers_num(int *newval, void **extra,
+ GucSource source)
+{
+ if (*newval > (max_worker_processes - 8))
+ return false;
+ return true;
+}
+
/*
* Returns whether there is a free autovacuum worker slot available.
*/
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 116ddf7b835..cb86db99da9 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -1046,6 +1046,8 @@ RegisterDynamicBackgroundWorker(BackgroundWorker *worker,
BackgroundWorkerHandle **handle)
{
int slotno;
+ int from;
+ int upto;
bool success = false;
bool parallel;
uint64 generation = 0;
@@ -1088,10 +1090,23 @@ RegisterDynamicBackgroundWorker(BackgroundWorker *worker,
return false;
}
+ /*
+ * Determine the range of workers in the pool that we can use (the last
+ * 'av_reserved_workers_num' slots are reserved for autovacuum workers).
+ */
+
+ from = AmAutoVacuumWorkerProcess() ?
+ BackgroundWorkerData->total_slots - av_reserved_workers_num :
+ 0;
+
+ upto = AmAutoVacuumWorkerProcess() ?
+ BackgroundWorkerData->total_slots :
+ BackgroundWorkerData->total_slots - av_reserved_workers_num;
+
/*
* Look for an unused slot. If we find one, grab it.
*/
- for (slotno = 0; slotno < BackgroundWorkerData->total_slots; ++slotno)
+ for (slotno = from; slotno < upto; ++slotno)
{
BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
@@ -1159,7 +1174,13 @@ GetBackgroundWorkerPid(BackgroundWorkerHandle *handle, pid_t *pidp)
BackgroundWorkerSlot *slot;
pid_t pid;
- Assert(handle->slot < max_worker_processes);
+ /* Only autovacuum can use last 'av_reserved_workers_num' workers in pool. */
+ if (!AmAutoVacuumWorkerProcess())
+ Assert(handle->slot < max_worker_processes - av_reserved_workers_num);
+ else
+ Assert(handle->slot < max_worker_processes &&
+ handle->slot >= max_worker_processes - av_reserved_workers_num);
+
slot = &BackgroundWorkerData->slot[handle->slot];
/*
@@ -1298,7 +1319,13 @@ TerminateBackgroundWorker(BackgroundWorkerHandle *handle)
BackgroundWorkerSlot *slot;
bool signal_postmaster = false;
- Assert(handle->slot < max_worker_processes);
+ /* Only autovacuum can use last 'av_reserved_workers_num' workers in pool. */
+ if (!AmAutoVacuumWorkerProcess())
+ Assert(handle->slot < max_worker_processes - av_reserved_workers_num);
+ else
+ Assert(handle->slot < max_worker_processes &&
+ handle->slot >= max_worker_processes - av_reserved_workers_num);
+
slot = &BackgroundWorkerData->slot[handle->slot];
/* Set terminate flag in shared memory, unless slot has been reused. */
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 92b0446b80c..cff13ef6bd7 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -144,6 +144,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int av_reserved_workers_num = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 2f8cbd86759..90b4e9570cf 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3604,6 +3604,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"autovacuum_reserved_workers_num", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
+ gettext_noop("Number of worker processes, reserved for participation in parallel index processing during autovacuum."),
+ gettext_noop("This parameter is depending on \"max_worker_processes\" (not on \"autovacuum_max_workers\"). "
+ "*Only* autovacuum workers can use these additional processes. "
+ "Also, these processes are taken into account in \"max_parallel_workers\"."),
+ },
+ &av_reserved_workers_num,
+ 0, 0, MAX_BACKENDS,
+ check_autovacuum_reserved_workers_num, NULL, NULL
+ },
+
{
{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
gettext_noop("Sets the maximum number of parallel processes per maintenance operation."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 34826d01380..2e38bada2b0 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -223,6 +223,7 @@
#max_parallel_maintenance_workers = 2 # limited by max_parallel_workers
#max_parallel_workers = 8 # number of max_worker_processes that
# can be used in parallel operations
+#autovacuum_reserved_workers_num = 0 # disabled by default and limited by max_parallel_workers
#parallel_leader_participation = on
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1e59a7f910f..992c6b63226 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int NBuffers;
extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
+extern PGDLLIMPORT int av_reserved_workers_num;
extern PGDLLIMPORT int max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 799fa7ace68..9913c6e4681 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -31,6 +31,8 @@ extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
+extern bool check_autovacuum_reserved_workers_num(int *newval, void **extra,
+ GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
GucSource source);
extern bool check_backtrace_functions(char **newval, void **extra,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..55aa5c45be1 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,7 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+ bool parallel_idx_autovac_enabled;
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
@@ -409,6 +410,15 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * RelationAllowsParallelIdxAutovac
+ * Returns whether the relation's indexes can be processed in parallel
+ * during autovacuum. Note multiple eval of argument!
+ */
+#define RelationAllowsParallelIdxAutovac(relation) \
+ ((relation)->rd_options ? \
+ ((StdRdOptions *) (relation)->rd_options)->autovacuum.parallel_idx_autovac_enabled : false)
+
/* ViewOptions->check_option values */
typedef enum ViewOptCheckOption
{
--
2.43.0
v2-0002-Sandbox-for-parallel-index-autovacuum.patchapplication/x-patch; name=v2-0002-Sandbox-for-parallel-index-autovacuum.patchDownload
From 5a25535f5f4212ca756b9c67bcecf3a271ceb215 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Fri, 16 May 2025 11:59:03 +0700
Subject: [PATCH v2 2/2] Sandbox for parallel index autovacuum
---
src/test/modules/Makefile | 1 +
src/test/modules/autovacuum/.gitignore | 1 +
src/test/modules/autovacuum/Makefile | 14 ++
src/test/modules/autovacuum/meson.build | 12 ++
.../autovacuum/t/001_autovac_parallel.pl | 129 ++++++++++++++++++
src/test/modules/meson.build | 1 +
6 files changed, 158 insertions(+)
create mode 100644 src/test/modules/autovacuum/.gitignore
create mode 100644 src/test/modules/autovacuum/Makefile
create mode 100644 src/test/modules/autovacuum/meson.build
create mode 100644 src/test/modules/autovacuum/t/001_autovac_parallel.pl
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index aa1d27bbed3..b7f3e342e82 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -5,6 +5,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
SUBDIRS = \
+ autovacuum \
brin \
commit_ts \
delay_execution \
diff --git a/src/test/modules/autovacuum/.gitignore b/src/test/modules/autovacuum/.gitignore
new file mode 100644
index 00000000000..0b54641bceb
--- /dev/null
+++ b/src/test/modules/autovacuum/.gitignore
@@ -0,0 +1 @@
+/tmp_check/
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/Makefile b/src/test/modules/autovacuum/Makefile
new file mode 100644
index 00000000000..90c00ff350b
--- /dev/null
+++ b/src/test/modules/autovacuum/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/autovacuum/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/meson.build b/src/test/modules/autovacuum/meson.build
new file mode 100644
index 00000000000..f91c1a14d2b
--- /dev/null
+++ b/src/test/modules/autovacuum/meson.build
@@ -0,0 +1,12 @@
+# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_autovac_parallel.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/autovacuum/t/001_autovac_parallel.pl b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
new file mode 100644
index 00000000000..a44cbebe0fd
--- /dev/null
+++ b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
@@ -0,0 +1,129 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+$node->append_conf('postgresql.conf', qq{
+ autovacuum = off
+ max_wal_size = 4096
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_reserved_workers_num = 1
+});
+$node->start;
+
+my $indexes_num = 80;
+my $initial_rows_num = 100_000;
+
+# Create big table and create specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER, col_5 INTEGER,
+ col_6 INTEGER, col_7 INTEGER, col_8 INTEGER, col_9 INTEGER, col_10 INTEGER,
+ col_11 INTEGER, col_12 INTEGER, col_13 INTEGER, col_14 INTEGER, col_15 INTEGER,
+ col_16 INTEGER, col_17 INTEGER, col_18 INTEGER, col_19 INTEGER, col_20 INTEGER,
+ col_21 INTEGER, col_22 INTEGER, col_23 INTEGER, col_24 INTEGER, col_25 INTEGER,
+ col_26 INTEGER, col_27 INTEGER, col_28 INTEGER, col_29 INTEGER, col_30 INTEGER,
+ col_31 INTEGER, col_32 INTEGER, col_33 INTEGER, col_34 INTEGER, col_35 INTEGER,
+ col_36 INTEGER, col_37 INTEGER, col_38 INTEGER, col_39 INTEGER, col_40 INTEGER,
+ col_41 INTEGER, col_42 INTEGER, col_43 INTEGER, col_44 INTEGER, col_45 INTEGER,
+ col_46 INTEGER, col_47 INTEGER, col_48 INTEGER, col_49 INTEGER, col_50 INTEGER,
+ col_51 INTEGER, col_52 INTEGER, col_53 INTEGER, col_54 INTEGER, col_55 INTEGER,
+ col_56 INTEGER, col_57 INTEGER, col_58 INTEGER, col_59 INTEGER, col_60 INTEGER,
+ col_61 INTEGER, col_62 INTEGER, col_63 INTEGER, col_64 INTEGER, col_65 INTEGER,
+ col_66 INTEGER, col_67 INTEGER, col_68 INTEGER, col_69 INTEGER, col_70 INTEGER,
+ col_71 INTEGER, col_72 INTEGER, col_73 INTEGER, col_74 INTEGER, col_75 INTEGER,
+ col_76 INTEGER, col_77 INTEGER, col_78 INTEGER, col_79 INTEGER, col_80 INTEGER,
+ col_81 INTEGER, col_82 INTEGER, col_83 INTEGER, col_84 INTEGER, col_85 INTEGER,
+ col_86 INTEGER, col_87 INTEGER, col_88 INTEGER, col_89 INTEGER, col_90 INTEGER,
+ col_91 INTEGER, col_92 INTEGER, col_93 INTEGER, col_94 INTEGER, col_95 INTEGER,
+ col_96 INTEGER, col_97 INTEGER, col_98 INTEGER, col_99 INTEGER, col_100 INTEGER
+ ) WITH (parallel_idx_autovac_enabled = true);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM pg_index i
+ JOIN pg_class c ON c.oid = i.indrelid
+ WHERE c.relname = 'test_autovac';",
+ stdout => \$psql_out
+);
+is($psql_out, $indexes_num + 1, "All indexes created successfully");
+
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac (
+ col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8, col_9, col_10,
+ col_11, col_12, col_13, col_14, col_15, col_16, col_17, col_18, col_19, col_20,
+ col_21, col_22, col_23, col_24, col_25, col_26, col_27, col_28, col_29, col_30,
+ col_31, col_32, col_33, col_34, col_35, col_36, col_37, col_38, col_39, col_40,
+ col_41, col_42, col_43, col_44, col_45, col_46, col_47, col_48, col_49, col_50,
+ col_51, col_52, col_53, col_54, col_55, col_56, col_57, col_58, col_59, col_60,
+ col_61, col_62, col_63, col_64, col_65, col_66, col_67, col_68, col_69, col_70,
+ col_71, col_72, col_73, col_74, col_75, col_76, col_77, col_78, col_79, col_80,
+ col_81, col_82, col_83, col_84, col_85, col_86, col_87, col_88, col_89, col_90,
+ col_91, col_92, col_93, col_94, col_95, col_96, col_97, col_98, col_99, col_100
+ ) VALUES (
+ i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7, i + 8, i + 9,
+ i + 10, i + 11, i + 12, i + 13, i + 14, i + 15, i + 16, i + 17, i + 18, i + 19,
+ i + 20, i + 21, i + 22, i + 23, i + 24, i + 25, i + 26, i + 27, i + 28, i + 29,
+ i + 30, i + 31, i + 32, i + 33, i + 34, i + 35, i + 36, i + 37, i + 38, i + 39,
+ i + 40, i + 41, i + 42, i + 43, i + 44, i + 45, i + 46, i + 47, i + 48, i + 49,
+ i + 50, i + 51, i + 52, i + 53, i + 54, i + 55, i + 56, i + 57, i + 58, i + 59,
+ i + 60, i + 61, i + 62, i + 63, i + 64, i + 65, i + 66, i + 67, i + 68, i + 69,
+ i + 70, i + 71, i + 72, i + 73, i + 74, i + 75, i + 76, i + 77, i + 78, i + 79,
+ i + 80, i + 81, i + 82, i + 83, i + 84, i + 85, i + 86, i + 87, i + 88, i + 89,
+ i + 90, i + 91, i + 92, i + 93, i + 94, i + 95, i + 96, i + 97, i + 98, i + 99
+ );
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM test_autovac;",
+ stdout => \$psql_out
+);
+is($psql_out, $initial_rows_num, "All data inserted into table successfully");
+
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Reduce autovacuum_work_mem so the leader process will perform the parallel
+# index vacuum phase several times
+$node->append_conf('postgresql.conf', qq{
+ autovacuum_naptime = '1s'
+ autovacuum_vacuum_threshold = 1
+ autovacuum_analyze_threshold = 1
+ autovacuum_vacuum_scale_factor = 0.1
+ autovacuum_analyze_scale_factor = 0.1
+ autovacuum = on
+});
+
+$node->restart;
+
+# sleep(3600);
+
+ok(1, "There are no segfaults");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 9de0057bd1d..7f2ad810ca0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -1,5 +1,6 @@
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+subdir('autovacuum')
subdir('brin')
subdir('commit_ts')
subdir('delay_execution')
--
2.43.0
On Thu, May 15, 2025 at 10:10 PM Daniil Davydov <3danissimo@gmail.com> wrote:
Hi,
On Fri, May 16, 2025 at 4:06 AM Matheus Alcantara
<matheusssilv97@gmail.com> wrote:
I've reviewed the v1-0001 patch, the build on MacOS using meson+ninja is
failing:
❯❯❯ ninja -C build install
ninja: Entering directory `build'
[1/126] Compiling C object
src/backend/postgres_lib.a.p/utils_misc_guc_tables.c.o
FAILED: src/backend/postgres_lib.a.p/utils_misc_guc_tables.c.o
../src/backend/utils/misc/guc_tables.c:3613:4: error: incompatible
pointer to integer conversion initializing 'int' with an expression of
type 'void *' [-Wint-conversion]
3613 | NULL,
| ^~~~
Thank you for reviewing this patch!
It seems that the "autovacuum_reserved_workers_num" declaration on
guc_tables.c has an extra gettext_noop() call?
Good catch, I fixed this warning in the v2 version.
One other point is that as you've added TAP tests for the autovacuum I
think you also need to create a meson.build file as you already create
the Makefile.
You also need to update the src/test/modules/meson.build and
src/test/modules/Makefile to include the new test/modules/autovacuum
path.
OK, I should clarify this point: modules/autovacuum is not a normal
test but a sandbox - just an example of how we can trigger parallel
index autovacuum. It may also be used for debugging purposes.
In fact, 001_autovac_parallel.pl does not verify anything.
I'll do as you asked (add all the meson and Make stuff), but please don't
focus on it. The creation of the real test is still in progress. (I'll
try to complete it as soon as possible.)
In this letter I will divide the patch into two parts: implementation
and sandbox. What do you think about the implementation?
Thank you for updating the patches. I have some comments on the v2-0001 patch:
+ {
+ {"autovacuum_reserved_workers_num", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
+ gettext_noop("Number of worker processes, reserved for participation in parallel index processing during autovacuum."),
+ gettext_noop("This parameter is depending on \"max_worker_processes\" (not on \"autovacuum_max_workers\"). "
+ "*Only* autovacuum workers can use these additional processes. "
+ "Also, these processes are taken into account in \"max_parallel_workers\"."),
+ },
+ &av_reserved_workers_num,
+ 0, 0, MAX_BACKENDS,
+ check_autovacuum_reserved_workers_num, NULL, NULL
+ },
I find that the name "autovacuum_reserved_workers_num" is generic. It
would be better to have a more specific name for parallel vacuum such
as autovacuum_max_parallel_workers. This parameter is related to
neither autovacuum_worker_slots nor autovacuum_max_workers, which
seems fine to me. Also, max_parallel_maintenance_workers doesn't
affect this parameter.
Which number does this parameter mean to specify: the maximum number
of parallel vacuum workers that can be used during autovacuum or the
maximum number of parallel vacuum workers that each autovacuum can
use?
---
The patch includes the changes to bgworker.c so that we can reserve
some slots for autovacuums. I guess that this change is not strictly
necessary because if the user sets the related GUC parameters
correctly, the autovacuum workers can use parallel vacuum as expected.
Even if we need this change, I would suggest implementing it as a
separate patch.
---
+ {
+ {
+ "parallel_idx_autovac_enabled",
+ "Allows autovacuum to process indexes of this table in parallel mode",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ false
+ },
The proposed reloption name doesn't align with our naming conventions.
Looking at our existing reloptions, we typically write out full words
rather than using abbreviations like 'autovac' or 'idx'.
I guess we can implement this parameter as an integer parameter so
that the user can specify the number of parallel vacuum workers for
the table. For example, we can have a reloption
autovacuum_parallel_workers. Setting 0 (by default) means to disable
parallel vacuum during autovacuum, and setting special value -1 means
to let PostgreSQL calculate the parallel degree for the table (same as
the default VACUUM command behavior).
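For illustration, usage under that proposal might look like the
following (a sketch only; "autovacuum_parallel_workers" is the
still-hypothetical reloption name being discussed here):

    -- hypothetical reloption: up to 4 parallel vacuum workers
    -- when autovacuum processes this table
    ALTER TABLE orders SET (autovacuum_parallel_workers = 4);

    -- -1: let the server pick the degree, as plain VACUUM does
    ALTER TABLE orders SET (autovacuum_parallel_workers = -1);

    -- back to the default (0): no parallel vacuum during autovacuum
    ALTER TABLE orders RESET (autovacuum_parallel_workers);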
I've also considered some alternative names. If we were to use
parallel_maintenance_workers, it sounds like it controls the parallel
degree for all operations using max_parallel_maintenance_workers,
including CREATE INDEX. Similarly, vacuum_parallel_workers could be
interpreted as affecting both autovacuum and manual VACUUM commands,
suggesting that when users run "VACUUM (PARALLEL) t", the system would
use their specified value for the parallel degree. I prefer
autovacuum_parallel_workers or vacuum_parallel_workers.
---
+ /*
+ * If we are running autovacuum - decide whether we need to process indexes
+ * of table with given oid in parallel.
+ */
+ if (AmAutoVacuumWorkerProcess() &&
+ params->index_cleanup != VACOPTVALUE_DISABLED &&
+ RelationAllowsParallelIdxAutovac(rel))
I think that this should be done in autovacuum code.
---
+/*
+ * Minimum number of dead tuples required for the table's indexes to be
+ * processed in parallel during autovacuum.
+ */
+#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
+
+/*
+ * How many indexes each parallel worker should process during autovacuum.
+ */
+#define NUM_INDEXES_PER_PARALLEL_WORKER 30
Are these fixed values really useful in common cases? I think we already
have an optimization where we skip vacuum indexes if the table has
fewer dead tuples (see BYPASS_THRESHOLD_PAGES). Given that we rely on
users' heuristics which table needs to use parallel vacuum during
autovacuum, I think we don't need to apply these conditions.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Hi,
On Wed, May 21, 2025 at 5:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I have some comments on v2-0001 patch
Thank you for reviewing this patch!
+ { + {"autovacuum_reserved_workers_num", PGC_USERSET, RESOURCES_WORKER_PROCESSES, + gettext_noop("Number of worker processes, reserved for participation in parallel index processing during autovacuum."), + gettext_noop("This parameter is depending on \"max_worker_processes\" (not on \"autovacuum_max_workers\"). " + "*Only* autovacuum workers can use these additional processes. " + "Also, these processes are taken into account in \"max_parallel_workers\"."), + }, + &av_reserved_workers_num, + 0, 0, MAX_BACKENDS, + check_autovacuum_reserved_workers_num, NULL, NULL + },I find that the name "autovacuum_reserved_workers_num" is generic. It
would be better to have a more specific name for parallel vacuum such
as autovacuum_max_parallel_workers. This parameter is related to
neither autovacuum_worker_slots nor autovacuum_max_workers, which
seems fine to me. Also, max_parallel_maintenance_workers doesn't
affect this parameter.
.......
I've also considered some alternative names. If we were to use
parallel_maintenance_workers, it sounds like it controls the parallel
degree for all operations using max_parallel_maintenance_workers,
including CREATE INDEX. Similarly, vacuum_parallel_workers could be
interpreted as affecting both autovacuum and manual VACUUM commands,
suggesting that when users run "VACUUM (PARALLEL) t", the system would
use their specified value for the parallel degree. I prefer
autovacuum_parallel_workers or vacuum_parallel_workers.
This was my headache when I created names for variables. Autovacuum
initially implies parallelism, because we have several parallel a/v
workers. So I think that parameter like
`autovacuum_max_parallel_workers` will confuse somebody.
If we want to have a more specific name, I would prefer
`max_parallel_index_autovacuum_workers`.
Which number does this parameter mean to specify: the maximum number
of parallel vacuum workers that can be used during autovacuum or the
maximum number of parallel vacuum workers that each autovacuum can
use?
The first variant. I will make this concrete in the variable's description.
+ { + { + "parallel_idx_autovac_enabled", + "Allows autovacuum to process indexes of this table in parallel mode", + RELOPT_KIND_HEAP, + ShareUpdateExclusiveLock + }, + false + },The proposed reloption name doesn't align with our naming conventions.
Looking at our existing reloptions, we typically write out full words
rather than using abbreviations like 'autovac' or 'idx'.The new reloption name seems not to follow the conventional naming
style for existing reloption. For instance, we don't use abbreviations
such as 'autovac' and 'idx'.
OK, I'll fix it.
+ /*
+ * If we are running autovacuum - decide whether we need to process indexes
+ * of table with given oid in parallel.
+ */
+ if (AmAutoVacuumWorkerProcess() &&
+ params->index_cleanup != VACOPTVALUE_DISABLED &&
+ RelationAllowsParallelIdxAutovac(rel))
I think that this should be done in autovacuum code.
We need the params->index_cleanup variable to decide whether we need to
use parallel index a/v. In autovacuum.c we have this code:
***
/*
* index_cleanup and truncate are unspecified at first in autovacuum.
* They will be filled in with usable values using their reloptions
* (or reloption defaults) later.
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
***
This variable is filled in inside the `vacuum_rel` function, so I
think we should keep the above logic in vacuum.c.
+#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
Are these fixed values really useful in common cases? I think we already
have an optimization where we skip vacuum indexes if the table has
fewer dead tuples (see BYPASS_THRESHOLD_PAGES).
When we allocate dead items (and optionally init parallel autovacuum) we
don't have a sane value for `vacrel->lpdead_item_pages` (which should be
compared with BYPASS_THRESHOLD_PAGES).
The only criterion that we can focus on is the number of dead tuples
indicated in the PgStat_StatTabEntry.
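For reference, that per-table counter is the one the cumulative
statistics views expose as n_dead_tup, so the value being compared
against AV_PARALLEL_DEADTUP_THRESHOLD can be inspected from SQL:

    -- dead-tuple count as tracked by pgstat for a given table
    SELECT relname, n_dead_tup
    FROM pg_stat_user_tables
    WHERE relname = 'test_autovac';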
----
I guess we can implement this parameter as an integer parameter so
that the user can specify the number of parallel vacuum workers for
the table. For example, we can have a reloption
autovacuum_parallel_workers. Setting 0 (by default) means to disable
parallel vacuum during autovacuum, and setting special value -1 means
to let PostgreSQL calculate the parallel degree for the table (same as
the default VACUUM command behavior).
...........
The patch includes the changes to bgworker.c so that we can reserve
some slots for autovacuums. I guess that this change is not strictly
necessary because if the user sets the related GUC parameters
correctly, the autovacuum workers can use parallel vacuum as expected.
Even if we need this change, I would suggest implementing it as a
separate patch.
..........
+#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
+#define NUM_INDEXES_PER_PARALLEL_WORKER 30
Are these fixed values really useful in common cases? Given that we rely on
users' heuristics which table needs to use parallel vacuum during
autovacuum, I think we don't need to apply these conditions.
..........
I grouped these comments together because they all relate to a single
question: how much freedom will we give to the user?
Your opinion (as far as I understand) is that we allow users to
specify any number of parallel workers for tables, and it is the
user's responsibility to configure appropriate GUC variables, so that
autovacuum can always process indexes in parallel.
And we don't need to think about thresholds. Even if the table has a
small number of indexes and dead rows - if the user specified table
option, we must do a parallel index a/v with requested number of
parallel workers.
Please correct me if I messed something up.
I think that this logic is well suited for the `VACUUM (PARALLEL)` sql
command, which is manually called by the user.
But autovacuum (as I think) should work as stably as possible and go
`unnoticed` by other processes. Thus, we must:
1) Compute resources (such as the number of parallel workers for a
single table's indexes vacuuming) as efficiently as possible.
2) Provide a guarantee that as many tables as possible (among
requested) will be processed in parallel.
(1) can be achieved by calculating the parameters on the fly.
NUM_INDEXES_PER_PARALLEL_WORKER is a rough mock. I can provide a more
accurate value in the near future.
(2) can be achieved by workers reserving - we know that N workers
(from bgworkers pool) are *always* at our disposal. And when we use
such workers we are not dependent on other operations in the cluster
and we don't interfere with other operations by taking resources away
from them.
If we give the user too much freedom in parallel index a/v tuning, all
these requirements may be violated.
This is only my opinion, and I can agree with yours. Maybe we need
another person to judge us?
Please see the v3 patches that contain changes related to the GUC
parameter and table option (no changes to the global logic for now).
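For anyone who wants to try the v3 patches, a minimal setup using the
knobs as they are named in v3 (both are still subject to renaming, and
whether a reload is enough here is an assumption) could look like:

    -- reserve 2 bgworker slots for parallel index autovacuum (v3 GUC)
    ALTER SYSTEM SET parallel_index_autovacuum_reserved_workers = 2;
    SELECT pg_reload_conf();

    -- opt a table in via the v3 reloption
    ALTER TABLE test_autovac SET (parallel_index_autovacuum_enabled = true);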
--
Best regards,
Daniil Davydov
Attachments:
v3-0001-Parallel-index-autovacuum-with-bgworkers.patchtext/x-patch; charset=US-ASCII; name=v3-0001-Parallel-index-autovacuum-with-bgworkers.patchDownload
From 2223da7a9b2ef8c8d71780ad72b24eaf6d6c1141 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Fri, 16 May 2025 11:58:40 +0700
Subject: [PATCH v3 1/2] Parallel index autovacuum with bgworkers
---
src/backend/access/common/reloptions.c | 11 ++++
src/backend/commands/vacuum.c | 55 +++++++++++++++++++
src/backend/commands/vacuumparallel.c | 46 ++++++++++------
src/backend/postmaster/autovacuum.c | 14 ++++-
src/backend/postmaster/bgworker.c | 33 ++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc_tables.c | 12 ++++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/miscadmin.h | 1 +
src/include/utils/guc_hooks.h | 2 +
src/include/utils/rel.h | 10 ++++
11 files changed, 166 insertions(+), 20 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 46c1dce222d..730096002b1 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -166,6 +166,15 @@ static relopt_bool boolRelOpts[] =
},
true
},
+ {
+ {
+ "parallel_index_autovacuum_enabled",
+ "Allows autovacuum to process indexes of this table in parallel mode",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ false
+ },
/* list terminator */
{{NULL}}
};
@@ -1863,6 +1872,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"parallel_index_autovacuum_enabled", RELOPT_TYPE_BOOL,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, parallel_index_autovacuum_enabled)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 33a33bf6b1c..6c2f49f203f 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -57,9 +57,21 @@
#include "utils/guc.h"
#include "utils/guc_hooks.h"
#include "utils/memutils.h"
+#include "utils/rel.h"
#include "utils/snapmgr.h"
#include "utils/syscache.h"
+/*
+ * Minimum number of dead tuples required for the table's indexes to be
+ * processed in parallel during autovacuum.
+ */
+#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
+
+/*
+ * How many indexes each parallel worker should process during autovacuum.
+ */
+#define NUM_INDEXES_PER_PARALLEL_WORKER 30
+
/*
* Minimum interval for cost-based vacuum delay reports from a parallel worker.
* This aims to avoid sending too many messages and waking up the leader too
@@ -2234,6 +2246,49 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params,
else
toast_relid = InvalidOid;
+ /*
+ * If we are running autovacuum - decide whether we need to process indexes
+ * of table with given oid in parallel.
+ */
+ if (AmAutoVacuumWorkerProcess() &&
+ params->index_cleanup != VACOPTVALUE_DISABLED &&
+ RelationAllowsParallelIdxAutovac(rel))
+ {
+ PgStat_StatTabEntry *tabentry;
+
+ /* fetch the pgstat table entry */
+ tabentry = pgstat_fetch_stat_tabentry_ext(rel->rd_rel->relisshared,
+ rel->rd_id);
+ if (tabentry && tabentry->dead_tuples >= AV_PARALLEL_DEADTUP_THRESHOLD)
+ {
+ List *indexes = RelationGetIndexList(rel);
+ int num_indexes = list_length(indexes);
+
+ list_free(indexes);
+
+ if (pia_reserved_workers > 0)
+ {
+ /*
+ * We request at least one parallel worker, if the user set the
+ * 'parallel_index_autovacuum_enabled' option. The total number of
+ * additional parallel workers depends on how many indexes the
+ * table has. For now we assume that each parallel worker should
+ * process NUM_INDEXES_PER_PARALLEL_WORKER indexes.
+ */
+ params->nworkers =
+ Min((num_indexes / NUM_INDEXES_PER_PARALLEL_WORKER) + 1,
+ pia_reserved_workers);
+ }
+ else
+ ereport(WARNING,
+ (errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED),
+ errmsg("Cannot launch any supportive workers for parallel index cleanup of rel %s",
+ RelationGetRelationName(rel)),
+ errhint("You might need to set parameter \"pia_reserved_workers\" to a value > 0")));
+
+ }
+ }
+
/*
* Switch to the table owner's userid, so that any index functions are run
* as that user. Also lock down security-restricted operations and
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 2b9d548cdeb..5c48a1e740e 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,15 +1,15 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel [auto]vacuum execution.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
*
- * In a parallel vacuum, we perform both index bulk deletion and index cleanup
- * with parallel worker processes. Individual indexes are processed by one
- * vacuum process. ParallelVacuumState contains shared information as well as
- * the memory space for storing dead items allocated in the DSA area. We
+ * In a parallel [auto]vacuum, we perform both index bulk deletion and index
+ * cleanup with parallel worker processes. Individual indexes are processed by
+ * one vacuum process. ParallelVacuumState contains shared information as well
+ * as the memory space for storing dead items allocated in the DSA area. We
* launch parallel worker processes at the start of parallel index
* bulk-deletion and index cleanup and once all indexes are processed, the
* parallel worker processes exit. Each time we process indexes in parallel,
@@ -34,6 +34,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -157,7 +158,8 @@ typedef struct PVIndStats
} PVIndStats;
/*
- * Struct for maintaining a parallel vacuum state. typedef appears in vacuum.h.
+ * Struct for maintaining a parallel [auto]vacuum state. typedef appears in
+ * vacuum.h.
*/
struct ParallelVacuumState
{
@@ -371,10 +373,18 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->relid = RelationGetRelid(rel);
shared->elevel = elevel;
shared->queryid = pgstat_get_my_query_id();
- shared->maintenance_work_mem_worker =
- (nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+
+ if (AmAutoVacuumWorkerProcess())
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
+ autovacuum_work_mem;
+ else
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+ maintenance_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -558,7 +568,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster ||
+ (pia_reserved_workers == 0 && AmAutoVacuumWorkerProcess()) ||
+ (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()))
return 0;
/*
@@ -597,15 +609,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = AmAutoVacuumWorkerProcess() ?
+ Min(parallel_workers, pia_reserved_workers) :
+ Min(parallel_workers, max_parallel_maintenance_workers);
return parallel_workers;
}
/*
* Perform index vacuum or index cleanup with parallel workers. This function
- * must be used by the parallel vacuum leader process.
+ * must be used by the parallel [auto]vacuum leader process.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -982,8 +996,8 @@ parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
/*
* Perform work within a launched parallel process.
*
- * Since parallel vacuum workers perform only index vacuum or index cleanup,
- * we don't need to report progress information.
+ * Since parallel [auto]vacuum workers perform only index vacuum or index
+ * cleanup, we don't need to report progress information.
*/
void
parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 4d4a1a3197e..59fb52aa443 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2824,7 +2824,11 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
+ /*
+ * Don't request parallel mode for now. nworkers might be set to a
+ * positive value if we encounter a table that is suitable for
+ * parallel index processing.
+ */
tab->at_params.nworkers = -1;
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
@@ -3406,6 +3410,14 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+bool
+check_pia_reserved_workers(int *newval, void **extra, GucSource source)
+{
+ if (*newval > (max_worker_processes - 8))
+ return false;
+ return true;
+}
+
/*
* Returns whether there is a free autovacuum worker slot available.
*/
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 116ddf7b835..e62076939ec 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -1046,6 +1046,8 @@ RegisterDynamicBackgroundWorker(BackgroundWorker *worker,
BackgroundWorkerHandle **handle)
{
int slotno;
+ int from;
+ int upto;
bool success = false;
bool parallel;
uint64 generation = 0;
@@ -1088,10 +1090,23 @@ RegisterDynamicBackgroundWorker(BackgroundWorker *worker,
return false;
}
+ /*
+ * Determine the range of workers in the pool that we can use (the last
+ * 'pia_reserved_workers' slots are reserved for autovacuum workers).
+ */
+
+ from = AmAutoVacuumWorkerProcess() ?
+ BackgroundWorkerData->total_slots - pia_reserved_workers :
+ 0;
+
+ upto = AmAutoVacuumWorkerProcess() ?
+ BackgroundWorkerData->total_slots :
+ BackgroundWorkerData->total_slots - pia_reserved_workers;
+
/*
* Look for an unused slot. If we find one, grab it.
*/
- for (slotno = 0; slotno < BackgroundWorkerData->total_slots; ++slotno)
+ for (slotno = from; slotno < upto; ++slotno)
{
BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
@@ -1159,7 +1174,13 @@ GetBackgroundWorkerPid(BackgroundWorkerHandle *handle, pid_t *pidp)
BackgroundWorkerSlot *slot;
pid_t pid;
- Assert(handle->slot < max_worker_processes);
+ /* Only autovacuum can use last 'pia_reserved_workers' workers in pool. */
+ if (!AmAutoVacuumWorkerProcess())
+ Assert(handle->slot < max_worker_processes - pia_reserved_workers);
+ else
+ Assert(handle->slot < max_worker_processes &&
+ handle->slot >= max_worker_processes - pia_reserved_workers);
+
slot = &BackgroundWorkerData->slot[handle->slot];
/*
@@ -1298,7 +1319,13 @@ TerminateBackgroundWorker(BackgroundWorkerHandle *handle)
BackgroundWorkerSlot *slot;
bool signal_postmaster = false;
- Assert(handle->slot < max_worker_processes);
+ /* Only autovacuum can use last 'pia_reserved_workers' workers in pool. */
+ if (!AmAutoVacuumWorkerProcess())
+ Assert(handle->slot < max_worker_processes - pia_reserved_workers);
+ else
+ Assert(handle->slot < max_worker_processes &&
+ handle->slot >= max_worker_processes - pia_reserved_workers);
+
slot = &BackgroundWorkerData->slot[handle->slot];
/* Set terminate flag in shared memory, unless slot has been reused. */
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 92b0446b80c..a6fdcd2de5b 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -144,6 +144,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int pia_reserved_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 2f8cbd86759..dfc18095d7b 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3604,6 +3604,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"parallel_index_autovacuum_reserved_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
+ gettext_noop("Maximum number of worker processes (from bgworkers pool), reserved for participation in parallel index autovacuum."),
+ gettext_noop("This parameter is depending on \"max_worker_processes\" (not on \"autovacuum_max_workers\"). "
+ "*Only* autovacuum workers can use these supportive processes. "
+ "Also, these processes are taken into account in \"max_parallel_workers\"."),
+ },
+ &pia_reserved_workers,
+ 0, 0, MAX_BACKENDS,
+ check_pia_reserved_workers, NULL, NULL
+ },
+
{
{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
gettext_noop("Sets the maximum number of parallel processes per maintenance operation."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 34826d01380..3d96af1547f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -223,6 +223,7 @@
#max_parallel_maintenance_workers = 2 # limited by max_parallel_workers
#max_parallel_workers = 8 # number of max_worker_processes that
# can be used in parallel operations
+#parallel_index_autovacuum_reserved_workers = 0 # disabled by default and limited by max_parallel_workers
#parallel_leader_participation = on
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1e59a7f910f..465dfe25009 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int NBuffers;
extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
+extern PGDLLIMPORT int pia_reserved_workers;
extern PGDLLIMPORT int max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 799fa7ace68..8507f95b2ea 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -31,6 +31,8 @@ extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
+extern bool check_pia_reserved_workers(int *newval, void **extra,
+ GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
GucSource source);
extern bool check_backtrace_functions(char **newval, void **extra,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..980c3459469 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,7 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+ bool parallel_index_autovacuum_enabled;
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
@@ -409,6 +410,15 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * RelationAllowsParallelIdxAutovac
+ * Returns whether the relation's indexes can be processed in parallel
+ * during autovacuum. Note multiple eval of argument!
+ */
+#define RelationAllowsParallelIdxAutovac(relation) \
+ ((relation)->rd_options ? \
+ ((StdRdOptions *) (relation)->rd_options)->autovacuum.parallel_index_autovacuum_enabled : false)
+
/* ViewOptions->check_option values */
typedef enum ViewOptCheckOption
{
--
2.43.0
v3-0002-Sandbox-for-parallel-index-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v3-0002-Sandbox-for-parallel-index-autovacuum.patchDownload
From d17a01ef2ace5fc6cfd1d22930454d90cfbe63dd Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Fri, 16 May 2025 11:59:03 +0700
Subject: [PATCH v3 2/2] Sandbox for parallel index autovacuum
---
src/test/modules/Makefile | 1 +
src/test/modules/autovacuum/.gitignore | 1 +
src/test/modules/autovacuum/Makefile | 14 ++
src/test/modules/autovacuum/meson.build | 12 ++
.../autovacuum/t/001_autovac_parallel.pl | 129 ++++++++++++++++++
src/test/modules/meson.build | 1 +
6 files changed, 158 insertions(+)
create mode 100644 src/test/modules/autovacuum/.gitignore
create mode 100644 src/test/modules/autovacuum/Makefile
create mode 100644 src/test/modules/autovacuum/meson.build
create mode 100644 src/test/modules/autovacuum/t/001_autovac_parallel.pl
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index aa1d27bbed3..b7f3e342e82 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -5,6 +5,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
SUBDIRS = \
+ autovacuum \
brin \
commit_ts \
delay_execution \
diff --git a/src/test/modules/autovacuum/.gitignore b/src/test/modules/autovacuum/.gitignore
new file mode 100644
index 00000000000..0b54641bceb
--- /dev/null
+++ b/src/test/modules/autovacuum/.gitignore
@@ -0,0 +1 @@
+/tmp_check/
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/Makefile b/src/test/modules/autovacuum/Makefile
new file mode 100644
index 00000000000..90c00ff350b
--- /dev/null
+++ b/src/test/modules/autovacuum/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/autovacuum/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/meson.build b/src/test/modules/autovacuum/meson.build
new file mode 100644
index 00000000000..f91c1a14d2b
--- /dev/null
+++ b/src/test/modules/autovacuum/meson.build
@@ -0,0 +1,12 @@
+# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_autovac_parallel.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/autovacuum/t/001_autovac_parallel.pl b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
new file mode 100644
index 00000000000..5aea3f10e38
--- /dev/null
+++ b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
@@ -0,0 +1,129 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+$node->append_conf('postgresql.conf', qq{
+ autovacuum = off
+ max_wal_size = 4096
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ parallel_index_autovacuum_reserved_workers = 1
+});
+$node->start;
+
+my $indexes_num = 80;
+my $initial_rows_num = 100_000;
+
+# Create big table and create specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER, col_5 INTEGER,
+ col_6 INTEGER, col_7 INTEGER, col_8 INTEGER, col_9 INTEGER, col_10 INTEGER,
+ col_11 INTEGER, col_12 INTEGER, col_13 INTEGER, col_14 INTEGER, col_15 INTEGER,
+ col_16 INTEGER, col_17 INTEGER, col_18 INTEGER, col_19 INTEGER, col_20 INTEGER,
+ col_21 INTEGER, col_22 INTEGER, col_23 INTEGER, col_24 INTEGER, col_25 INTEGER,
+ col_26 INTEGER, col_27 INTEGER, col_28 INTEGER, col_29 INTEGER, col_30 INTEGER,
+ col_31 INTEGER, col_32 INTEGER, col_33 INTEGER, col_34 INTEGER, col_35 INTEGER,
+ col_36 INTEGER, col_37 INTEGER, col_38 INTEGER, col_39 INTEGER, col_40 INTEGER,
+ col_41 INTEGER, col_42 INTEGER, col_43 INTEGER, col_44 INTEGER, col_45 INTEGER,
+ col_46 INTEGER, col_47 INTEGER, col_48 INTEGER, col_49 INTEGER, col_50 INTEGER,
+ col_51 INTEGER, col_52 INTEGER, col_53 INTEGER, col_54 INTEGER, col_55 INTEGER,
+ col_56 INTEGER, col_57 INTEGER, col_58 INTEGER, col_59 INTEGER, col_60 INTEGER,
+ col_61 INTEGER, col_62 INTEGER, col_63 INTEGER, col_64 INTEGER, col_65 INTEGER,
+ col_66 INTEGER, col_67 INTEGER, col_68 INTEGER, col_69 INTEGER, col_70 INTEGER,
+ col_71 INTEGER, col_72 INTEGER, col_73 INTEGER, col_74 INTEGER, col_75 INTEGER,
+ col_76 INTEGER, col_77 INTEGER, col_78 INTEGER, col_79 INTEGER, col_80 INTEGER,
+ col_81 INTEGER, col_82 INTEGER, col_83 INTEGER, col_84 INTEGER, col_85 INTEGER,
+ col_86 INTEGER, col_87 INTEGER, col_88 INTEGER, col_89 INTEGER, col_90 INTEGER,
+ col_91 INTEGER, col_92 INTEGER, col_93 INTEGER, col_94 INTEGER, col_95 INTEGER,
+ col_96 INTEGER, col_97 INTEGER, col_98 INTEGER, col_99 INTEGER, col_100 INTEGER
+ ) WITH (parallel_index_autovacuum_enabled = true);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM pg_index i
+ JOIN pg_class c ON c.oid = i.indrelid
+ WHERE c.relname = 'test_autovac';",
+ stdout => \$psql_out
+);
+is($psql_out, $indexes_num + 1, "All indexes created successfully");
+
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac (
+ col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8, col_9, col_10,
+ col_11, col_12, col_13, col_14, col_15, col_16, col_17, col_18, col_19, col_20,
+ col_21, col_22, col_23, col_24, col_25, col_26, col_27, col_28, col_29, col_30,
+ col_31, col_32, col_33, col_34, col_35, col_36, col_37, col_38, col_39, col_40,
+ col_41, col_42, col_43, col_44, col_45, col_46, col_47, col_48, col_49, col_50,
+ col_51, col_52, col_53, col_54, col_55, col_56, col_57, col_58, col_59, col_60,
+ col_61, col_62, col_63, col_64, col_65, col_66, col_67, col_68, col_69, col_70,
+ col_71, col_72, col_73, col_74, col_75, col_76, col_77, col_78, col_79, col_80,
+ col_81, col_82, col_83, col_84, col_85, col_86, col_87, col_88, col_89, col_90,
+ col_91, col_92, col_93, col_94, col_95, col_96, col_97, col_98, col_99, col_100
+ ) VALUES (
+ i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7, i + 8, i + 9,
+ i + 10, i + 11, i + 12, i + 13, i + 14, i + 15, i + 16, i + 17, i + 18, i + 19,
+ i + 20, i + 21, i + 22, i + 23, i + 24, i + 25, i + 26, i + 27, i + 28, i + 29,
+ i + 30, i + 31, i + 32, i + 33, i + 34, i + 35, i + 36, i + 37, i + 38, i + 39,
+ i + 40, i + 41, i + 42, i + 43, i + 44, i + 45, i + 46, i + 47, i + 48, i + 49,
+ i + 50, i + 51, i + 52, i + 53, i + 54, i + 55, i + 56, i + 57, i + 58, i + 59,
+ i + 60, i + 61, i + 62, i + 63, i + 64, i + 65, i + 66, i + 67, i + 68, i + 69,
+ i + 70, i + 71, i + 72, i + 73, i + 74, i + 75, i + 76, i + 77, i + 78, i + 79,
+ i + 80, i + 81, i + 82, i + 83, i + 84, i + 85, i + 86, i + 87, i + 88, i + 89,
+ i + 90, i + 91, i + 92, i + 93, i + 94, i + 95, i + 96, i + 97, i + 98, i + 99
+ );
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM test_autovac;",
+ stdout => \$psql_out
+);
+is($psql_out, $initial_rows_num, "All data inserted into table successfully");
+
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Reduce autovacuum_work_mem, so the leader process will perform the parallel
+# index vacuum phase several times
+$node->append_conf('postgresql.conf', qq{
+ autovacuum_naptime = '1s'
+ autovacuum_vacuum_threshold = 1
+ autovacuum_analyze_threshold = 1
+ autovacuum_vacuum_scale_factor = 0.1
+ autovacuum_analyze_scale_factor = 0.1
+ autovacuum = on
+});
+
+$node->restart;
+
+# sleep(3600);
+
+ok(1, "There are no segfaults");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 9de0057bd1d..7f2ad810ca0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -1,5 +1,6 @@
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+subdir('autovacuum')
subdir('brin')
subdir('commit_ts')
subdir('delay_execution')
--
2.43.0
I started looking at the patch but I have some high level thoughts I would
like to share before looking further.
I find that the name "autovacuum_reserved_workers_num" is generic. It
would be better to have a more specific name for parallel vacuum such
as autovacuum_max_parallel_workers. This parameter is related to
neither autovacuum_worker_slots nor autovacuum_max_workers, which
seems fine to me. Also, max_parallel_maintenance_workers doesn't
affect this parameter.
.......
I've also considered some alternative names. If we were to use
parallel_maintenance_workers, it sounds like it controls the parallel
degree for all operations using max_parallel_maintenance_workers,
including CREATE INDEX. Similarly, vacuum_parallel_workers could be
interpreted as affecting both autovacuum and manual VACUUM commands,
suggesting that when users run "VACUUM (PARALLEL) t", the system would
use their specified value for the parallel degree. I prefer
autovacuum_parallel_workers or vacuum_parallel_workers.
This was my headache when I created names for variables. Autovacuum
initially implies parallelism, because we have several parallel a/v
workers. So I think that parameter like
`autovacuum_max_parallel_workers` will confuse somebody.
If we want to have a more specific name, I would prefer
`max_parallel_index_autovacuum_workers`.
I don't think we should have a separate pool of parallel workers for those
that are used to support parallel autovacuum. At the end of the day, these
are parallel workers and they should be capped by max_parallel_workers. I think
it will be confusing if we claim these are parallel workers, but they
are coming from a different pool.
I envision we have another GUC such as "max_parallel_autovacuum_workers"
(which I think is a better name) that matches the behavior of
"max_parallel_maintenance_workers". Meaning that the autovacuum workers
still maintain their existing behavior (launching a worker per table), and
if they do need to vacuum in parallel, they can draw from a pool of
parallel workers.
With the above said, I therefore think the reloption should actually be a
number of parallel workers rather than a boolean. Let's take an example of
a user that has 3 tables they wish (auto)vacuum to process in parallel,
and, if available, they wish each of these tables could be autovacuumed
with 4 parallel workers. However, so as not to overload the system, they
cap 'max_parallel_maintenance_workers' to something like 8. If it so
happens that all 3 tables are auto-vacuumed at the same time, there may
not be enough parallel workers, so one table will be a loser and be
vacuumed in serial. That is acceptable, and a/v logging (and perhaps other
stat views) should display this behavior: workers planned vs workers
launched.
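Spelled out as configuration, that scenario would be something like
this (a sketch; "autovacuum_parallel_workers" stands in for whatever
the reloption ends up being called):

    -- global cap on parallel workers for maintenance operations
    ALTER SYSTEM SET max_parallel_maintenance_workers = 8;
    SELECT pg_reload_conf();

    -- each table may use up to 4 parallel workers when autovacuumed
    ALTER TABLE t1 SET (autovacuum_parallel_workers = 4);
    ALTER TABLE t2 SET (autovacuum_parallel_workers = 4);
    ALTER TABLE t3 SET (autovacuum_parallel_workers = 4);

    -- if all three are autovacuumed at once, 4 + 4 = 8 workers are
    -- taken and the third table falls back to serial vacuuming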
Thoughts?
--
Sami Imseih
Amazon Web Services (AWS)
On Thu, May 22, 2025 at 12:44 AM Daniil Davydov <3danissimo@gmail.com> wrote:
Hi,
On Wed, May 21, 2025 at 5:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I have some comments on v2-0001 patch
Thank you for reviewing this patch!
+ { + {"autovacuum_reserved_workers_num", PGC_USERSET, RESOURCES_WORKER_PROCESSES, + gettext_noop("Number of worker processes, reserved for participation in parallel index processing during autovacuum."), + gettext_noop("This parameter is depending on \"max_worker_processes\" (not on \"autovacuum_max_workers\"). " + "*Only* autovacuum workers can use these additional processes. " + "Also, these processes are taken into account in \"max_parallel_workers\"."), + }, + &av_reserved_workers_num, + 0, 0, MAX_BACKENDS, + check_autovacuum_reserved_workers_num, NULL, NULL + },I find that the name "autovacuum_reserved_workers_num" is generic. It
would be better to have a more specific name for parallel vacuum such
as autovacuum_max_parallel_workers. This parameter is related to
neither autovacuum_worker_slots nor autovacuum_max_workers, which
seems fine to me. Also, max_parallel_maintenance_workers doesn't
affect this parameter.
.......
I've also considered some alternative names. If we were to use
parallel_maintenance_workers, it sounds like it controls the parallel
degree for all operations using max_parallel_maintenance_workers,
including CREATE INDEX. Similarly, vacuum_parallel_workers could be
interpreted as affecting both autovacuum and manual VACUUM commands,
suggesting that when users run "VACUUM (PARALLEL) t", the system would
use their specified value for the parallel degree. I prefer
autovacuum_parallel_workers or vacuum_parallel_workers.
This was my headache when I created names for variables. Autovacuum
initially implies parallelism, because we have several parallel a/v
workers.
I'm not sure if it's parallelism. We can have multiple autovacuum
workers simultaneously working on different tables, which does not
seem like parallelism to me.
So I think that parameter like
`autovacuum_max_parallel_workers` will confuse somebody.
If we want to have a more specific name, I would prefer
`max_parallel_index_autovacuum_workers`.
It's better not to use 'index' as we're trying to extend parallel
vacuum to heap scanning/vacuuming as well[1].
+ /*
+ * If we are running autovacuum - decide whether we need to process indexes
+ * of table with given oid in parallel.
+ */
+ if (AmAutoVacuumWorkerProcess() &&
+ params->index_cleanup != VACOPTVALUE_DISABLED &&
+ RelationAllowsParallelIdxAutovac(rel))
I think that this should be done in autovacuum code.
We need the params->index_cleanup variable to decide whether we need to
use parallel index a/v. In autovacuum.c we have this code:
***
/*
* index_cleanup and truncate are unspecified at first in autovacuum.
* They will be filled in with usable values using their reloptions
* (or reloption defaults) later.
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
***
This variable is filled in inside the `vacuum_rel` function, so I
think we should keep the above logic in vacuum.c.
I guess that we can specify the parallel degree even if index_cleanup
is still UNSPECIFIED. vacuum_rel() would then decide whether to use
index vacuuming and vacuumlazy.c would decide whether to use parallel
vacuum based on the specified parallel degree and index_cleanup value.
+#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
Are these fixed values really useful in common cases? I think we already
have an optimization where we skip vacuum indexes if the table has
fewer dead tuples (see BYPASS_THRESHOLD_PAGES).
When we allocate dead items (and optionally init parallel autovacuum) we
don't have a sane value for `vacrel->lpdead_item_pages` (which should be
compared with BYPASS_THRESHOLD_PAGES).
The only criterion that we can focus on is the number of dead tuples
indicated in the PgStat_StatTabEntry.
My point is that this criterion might not be useful. We have the
bypass optimization for index vacuuming and having many dead tuples
doesn't necessarily mean index vacuuming taking a long time. For
example, even if the table has a few dead tuples, index vacuuming
could take a very long time and parallel index vacuuming would help
the situation, if the table is very large and has many indexes.
----
I guess we can implement this parameter as an integer parameter so
that the user can specify the number of parallel vacuum workers for
the table. For example, we can have a reloption
autovacuum_parallel_workers. Setting 0 (by default) means to disable
parallel vacuum during autovacuum, and setting special value -1 means
to let PostgreSQL calculate the parallel degree for the table (same as
the default VACUUM command behavior).
...........
The patch includes the changes to bgworker.c so that we can reserve
some slots for autovacuums. I guess that this change is not strictly
necessary because if the user sets the related GUC parameters
correctly, the autovacuum workers can use parallel vacuum as expected.
Even if we need this change, I would suggest implementing it as a
separate patch.
..........
+#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
+#define NUM_INDEXES_PER_PARALLEL_WORKER 30
Are these fixed values really useful in common cases? Given that we rely on
users' heuristics which table needs to use parallel vacuum during
autovacuum, I think we don't need to apply these conditions.
..........
I grouped these comments together because they all relate to a single
question : how much freedom will we give to the user?
Your opinion (as far as I understand) is that we allow users to
specify any number of parallel workers for tables, and it is the
user's responsibility to configure appropriate GUC variables, so that
autovacuum can always process indexes in parallel.
And we don't need to think about thresholds. Even if the table has a
small number of indexes and dead rows - if the user specified table
option, we must do a parallel index a/v with requested number of
parallel workers.
Please correct me if I messed something up.
I think that this logic is well suited for the `VACUUM (PARALLEL)` SQL
command, which is manually called by the user.
The current idea that users can use parallel vacuum on particular
tables based on their heuristic makes sense to me as the first
implementation.
But autovacuum (as I think) should work as stably as possible and go
`unnoticed` by other processes. Thus, we must:
1) Compute resources (such as the number of parallel workers for a
single table's indexes vacuuming) as efficiently as possible.
2) Provide a guarantee that as many tables as possible (among
requested) will be processed in parallel.
I think these ideas could be implemented on top of the current idea.
(1) can be achieved by calculating the parameters on the fly.
NUM_INDEXES_PER_PARALLEL_WORKER is a rough mock. I can provide a more
accurate value in the near future.
I think it requires more things than the number of indexes on the
I think it requires more things than the number of indexes on the
table to achieve (1). Suppose that there is a very large table that
gets updates heavily and has a few indexes. If users want to avoid the
table from being bloated, it would be a reasonable idea to use
parallel vacuum during autovacuum and it would not be a good idea to
disallow using parallel vacuum solely because it doesn't have more
than 30 indexes. On the other hand, if the table had got many updates
but not so now, users might want to use resources for autovacuums on
other tables. We might need to consider autovacuum frequencies per
table, the statistics of the previous autovacuum, or system loads etc.
So I think that in order to achieve (1) we might need more statistics
and using only NUM_INDEXES_PER_PARALLEL_WORKER would not work fine.
(2) can be achieved by workers reserving - we know that N workers
(from bgworkers pool) are *always* at our disposal. And when we use
such workers we are not dependent on other operations in the cluster
and we don't interfere with other operations by taking resources away
from them.
Reserving some bgworkers for autovacuum could make sense. But I think
it's better to implement it in a general way as it could be useful in
other use cases too. That is, it might be a good to implement
infrastructure so that any PostgreSQL code (possibly including
extensions) can request allocating a pool of bgworkers for specific
usage and use bgworkers from them.
Regards,
[1]: /messages/by-id/CAD21AoAEfCNv-GgaDheDJ+s-p_Lv1H24AiJeNoPGCmZNSwL1YA@mail.gmail.com
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Thu, May 22, 2025 at 10:48 AM Sami Imseih <samimseih@gmail.com> wrote:
I started looking at the patch but I have some high level thoughts I would
like to share before looking further.
I find that the name "autovacuum_reserved_workers_num" is generic. It
would be better to have a more specific name for parallel vacuum such
as autovacuum_max_parallel_workers. This parameter is related to
neither autovacuum_worker_slots nor autovacuum_max_workers, which
seems fine to me. Also, max_parallel_maintenance_workers doesn't
affect this parameter.
.......
I've also considered some alternative names. If we were to use
parallel_maintenance_workers, it sounds like it controls the parallel
degree for all operations using max_parallel_maintenance_workers,
including CREATE INDEX. Similarly, vacuum_parallel_workers could be
interpreted as affecting both autovacuum and manual VACUUM commands,
suggesting that when users run "VACUUM (PARALLEL) t", the system would
use their specified value for the parallel degree. I prefer
autovacuum_parallel_workers or vacuum_parallel_workers.
This was my headache when I created names for variables. Autovacuum
initially implies parallelism, because we have several parallel a/v
workers. So I think that parameter like
`autovacuum_max_parallel_workers` will confuse somebody.
If we want to have a more specific name, I would prefer
`max_parallel_index_autovacuum_workers`.
I don't think we should have a separate pool of parallel workers for those
that are used to support parallel autovacuum. At the end of the day, these
are parallel workers and they should be capped by max_parallel_workers. I think
it will be confusing if we claim these are parallel workers, but they
are coming from a different pool.
I agree that parallel vacuum workers used during autovacuum should be
capped by the max_parallel_workers.
I envision we have another GUC such as "max_parallel_autovacuum_workers"
(which I think is a better name) that matches the behavior of
"max_parallel_maintenance_workers". Meaning that the autovacuum workers
still maintain their existing behavior (launching a worker per table), and
if they do need to vacuum in parallel, they can draw from a pool of
parallel workers.
With the above said, I therefore think the reloption should actually be a number
of parallel workers rather than a boolean. Let's take an example of a
user that has 3 tables they wish (auto)vacuum to process in parallel,
and, if available, they wish each of these tables could be autovacuumed
with 4 parallel workers. However, so as not to overload the system, they
cap 'max_parallel_maintenance_workers' to something like 8. If it so
happens that all 3 tables are auto-vacuumed at the same time, there may
not be enough parallel workers, so one table will be a loser and be
vacuumed in serial.
+1 for the reloption having a number of parallel workers, leaving
aside the name competition.
That is acceptable, and a/v logging (and perhaps other stat views) should
display this behavior: workers planned vs workers launched.
Agreed. The workers planned vs. launched is reported only with the
VERBOSE option, so we need to change it so that autovacuum can also
log it.
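For comparison, a manual parallel vacuum already reports this when run
with VERBOSE; the INFO output contains a line along the lines of:

    VACUUM (PARALLEL 4, VERBOSE) test_autovac;
    -- INFO:  launched 3 parallel vacuum workers for index vacuuming
    --        (planned: 4)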
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Hi,
On Fri, May 23, 2025 at 6:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, May 22, 2025 at 12:44 AM Daniil Davydov <3danissimo@gmail.com> wrote:
On Wed, May 21, 2025 at 5:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I find that the name "autovacuum_reserved_workers_num" is generic. It
would be better to have a more specific name for parallel vacuum such
as autovacuum_max_parallel_workers. This parameter is related to
neither autovacuum_worker_slots nor autovacuum_max_workers, which
seems fine to me. Also, max_parallel_maintenance_workers doesn't
affect this parameter.
This was my headache when I created names for variables. Autovacuum
initially implies parallelism, because we have several parallel a/v
workers.
I'm not sure if it's parallelism. We can have multiple autovacuum
workers simultaneously working on different tables, which does not
seem like parallelism to me.
Hm, I hadn't thought about the definition of 'parallelism' in this way.
But I see your point - the next v4 patch will contain the naming that
you suggest.
So I think that parameter like
`autovacuum_max_parallel_workers` will confuse somebody.
If we want to have a more specific name, I would prefer
`max_parallel_index_autovacuum_workers`.
It's better not to use 'index' as we're trying to extend parallel
vacuum to heap scanning/vacuuming as well[1].
OK, I'll fix it.
+ /*
+ * If we are running autovacuum - decide whether we need to process indexes
+ * of table with given oid in parallel.
+ */
+ if (AmAutoVacuumWorkerProcess() &&
+ params->index_cleanup != VACOPTVALUE_DISABLED &&
+ RelationAllowsParallelIdxAutovac(rel))
I think that this should be done in autovacuum code.
We need the params->index_cleanup variable to decide whether we need to
use parallel index a/v. In autovacuum.c we have this code :
***
/*
* index_cleanup and truncate are unspecified at first in autovacuum.
* They will be filled in with usable values using their reloptions
* (or reloption defaults) later.
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
***
This variable is filled in inside the `vacuum_rel` function, so I
think we should keep the above logic in vacuum.c.
I guess that we can specify the parallel degree even if index_cleanup
is still UNSPECIFIED. vacuum_rel() would then decide whether to use
index vacuuming and vacuumlazy.c would decide whether to use parallel
vacuum based on the specified parallel degree and index_cleanup value.
+#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
Are these fixed values really useful in common cases? I think we already
have an optimization where we skip vacuum indexes if the table has
fewer dead tuples (see BYPASS_THRESHOLD_PAGES).
When we allocate dead items (and optionally init parallel autovacuum) we
don't have a sane value for `vacrel->lpdead_item_pages` (which should be
compared with BYPASS_THRESHOLD_PAGES).
The only criterion that we can focus on is the number of dead tuples
indicated in the PgStat_StatTabEntry.
My point is that this criterion might not be useful. We have the
bypass optimization for index vacuuming and having many dead tuples
doesn't necessarily mean index vacuuming taking a long time. For
example, even if the table has a few dead tuples, index vacuuming
could take a very long time and parallel index vacuuming would help
the situation, if the table is very large and has many indexes.
That sounds reasonable. I'll fix it.
But autovacuum (as I think) should work as stably as possible and go
`unnoticed` by other processes. Thus, we must:
1) Compute resources (such as the number of parallel workers for a
single table's indexes vacuuming) as efficiently as possible.
2) Provide a guarantee that as many tables as possible (among
requested) will be processed in parallel.
(1) can be achieved by calculating the parameters on the fly.
NUM_INDEXES_PER_PARALLEL_WORKER is a rough mock. I can provide a more
accurate value in the near future.
I think it requires more things than the number of indexes on the
table to achieve (1). Suppose that there is a very large table that
gets updates heavily and has a few indexes. If users want to avoid the
table from being bloated, it would be a reasonable idea to use
parallel vacuum during autovacuum and it would not be a good idea to
disallow using parallel vacuum solely because it doesn't have more
than 30 indexes. On the other hand, if the table used to get many updates
but no longer does, users might want to use resources for autovacuums on
other tables. We might need to consider autovacuum frequencies per
table, the statistics of the previous autovacuum, or system loads etc.
So I think that in order to achieve (1) we might need more statistics
and using only NUM_INDEXES_PER_PARALLEL_WORKER would not work fine.
It's hard for me to imagine exactly how extended statistics will help
us track such situations.
It seems that for any of our heuristics, it will be possible to come
up with a counterexample.
Maybe we can give advice (via logs) to the user? But for such an
idea, tests should be conducted so that we can understand when
resource consumption becomes ineffective.
I guess that we need to agree on an implementation before conducting such tests.
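To make this concrete, here is a minimal sketch of what such an
on-the-fly computation could look like if it were based solely on the
index count. The function name and the clamping are my assumptions for
illustration; NUM_INDEXES_PER_PARALLEL_WORKER stands in for the rough
mock mentioned above, using the 30-indexes figure from the example:
***
/* A sketch only, not code from the patch: one parallel worker per
 * NUM_INDEXES_PER_PARALLEL_WORKER indexes, capped by the relevant GUC. */
#define NUM_INDEXES_PER_PARALLEL_WORKER 30

static int
autovac_parallel_workers_from_nindexes(int nindexes, int max_workers)
{
    /* With fewer than two indexes, the leader can do all the work itself. */
    if (nindexes < 2)
        return 0;

    return Min(nindexes / NUM_INDEXES_PER_PARALLEL_WORKER, max_workers);
}
***
As the exchange above shows, any such single-input rule admits
counterexamples, which is one reason the v4 patch below drops these
thresholds entirely.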
(2) can be achieved by workers reserving - we know that N workers
(from bgworkers pool) are *always* at our disposal. And when we use
such workers we are not dependent on other operations in the cluster
and we don't interfere with other operations by taking resources away
from them.

Reserving some bgworkers for autovacuum could make sense. But I think
it's better to implement it in a general way as it could be useful in
other use cases too. That is, it might be good to implement
infrastructure so that any PostgreSQL code (possibly including
extensions) can request allocating a pool of bgworkers for specific
usage and use bgworkers from them.
Reserving infrastructure is an ambitious idea. I am not sure that we
should implement it within this thread and feature.
Maybe we should create a separate thread for it and as a
justification, refer to parallel autovacuum?
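For concreteness, a general reservation facility along these lines might
expose an API like the sketch below. Every name and signature here is
hypothetical - nothing like this exists in the attached patches:
***
/* Hypothetical sketch of generic bgworker-pool reservation. A subsystem
 * registers a named pool of fixed size during shared-memory init, then
 * leaders draw workers from it instead of competing for the global
 * bgworkers pool. */
typedef struct BgWorkerPool
{
    char    name[64];       /* pool identifier, e.g. "autovacuum" */
    int     size;           /* workers permanently dedicated to the pool */
    int     navailable;     /* currently unreserved; lock-protected */
} BgWorkerPool;

/* Register a pool during shared-memory initialization. */
extern BgWorkerPool *BgWorkerPoolCreate(const char *name, int size);

/* Reserve up to nworkers; returns how many were actually granted. */
extern int  BgWorkerPoolReserve(BgWorkerPool *pool, int nworkers);

/* Give reserved workers back so other leaders can reuse them. */
extern void BgWorkerPoolRelease(BgWorkerPool *pool, int nworkers);
***
A parallel autovacuum leader would then reserve before
LaunchParallelWorkers() and release whatever it failed to launch, which
is essentially the reserve/release protocol the v4 patch implements
privately via ParallelAutoVacuumReserveWorkers() and
ParallelAutoVacuumReleaseWorkers().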
-----
Thanks everybody for feedback! I attach a v4 patch to this letter.
Main features:
1) 'parallel_autovacuum_workers' reloption - an integer value that sets
the maximum number of parallel a/v workers that can be taken from the
bgworkers pool in order to process this table.
2) 'max_parallel_autovacuum_workers' - a GUC variable that sets the
maximum total number of parallel a/v workers that can be taken from the
bgworkers pool.
3) Parallel autovacuum does not try to use thresholds like
NUM_INDEXES_PER_PARALLEL_WORKER and AV_PARALLEL_DEADTUP_THRESHOLD.
4) Parallel autovacuum can now report statistics like "planned vs. launched".
5) For now I got rid of the 'reserving' idea, so now autovacuum
leaders are competing with everyone for parallel workers from the
bgworkers pool.
What do you think about this implementation?
--
Best regards,
Daniil Davydov
Attachments:
v4-0001-Parallel-index-autovacuum-with-bgworkers.patchtext/x-patch; charset=US-ASCII; name=v4-0001-Parallel-index-autovacuum-with-bgworkers.patchDownload
From afa3f4c3d8993b775837cd04e5d170012b9d2691 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Fri, 16 May 2025 11:58:40 +0700
Subject: [PATCH v4 1/2] Parallel index autovacuum with bgworkers
---
src/backend/access/common/reloptions.c | 12 +++
src/backend/access/heap/vacuumlazy.c | 6 +-
src/backend/access/transam/parallel.c | 11 +++
src/backend/commands/vacuumparallel.c | 76 +++++++++++++------
src/backend/postmaster/autovacuum.c | 76 ++++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc_tables.c | 10 +++
src/backend/utils/misc/postgresql.conf.sample | 2 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 4 +
src/include/utils/guc_hooks.h | 2 +
src/include/utils/rel.h | 12 +++
12 files changed, 186 insertions(+), 27 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 46c1dce222d..6ba8da62546 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,16 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "parallel_autovacuum_workers",
+ "Maximum number of parallel autovacuum workers that can be taken from bgworkers pool for processing this table. "
+ "If value is 0 then parallel degree will computed based on number of indexes.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1863,6 +1873,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"parallel_autovacuum_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, parallel_autovacuum_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index f28326bad09..2614ceba139 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -3487,6 +3487,10 @@ dead_items_alloc(LVRelState *vacrel, int nworkers)
autovacuum_work_mem != -1 ?
autovacuum_work_mem : maintenance_work_mem;
+ int elevel = AmAutoVacuumWorkerProcess() ||
+ vacrel->verbose ?
+ INFO : DEBUG2;
+
/*
* Initialize state for a parallel vacuum. As of now, only one worker can
* be used for an index, so we invoke parallelism only if there are at
@@ -3513,7 +3517,7 @@ dead_items_alloc(LVRelState *vacrel, int nworkers)
vacrel->pvs = parallel_vacuum_init(vacrel->rel, vacrel->indrels,
vacrel->nindexes, nworkers,
vac_work_mem,
- vacrel->verbose ? INFO : DEBUG2,
+ elevel,
vacrel->bstrategy);
/*
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 94db1ec3012..d3313774a4b 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -34,6 +34,7 @@
#include "miscadmin.h"
#include "optimizer/optimizer.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/ipc.h"
#include "storage/predicate.h"
#include "storage/spin.h"
@@ -514,6 +515,11 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
{
WaitForParallelWorkersToFinish(pcxt);
WaitForParallelWorkersToExit(pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(pcxt->nworkers_launched);
+
pcxt->nworkers_launched = 0;
if (pcxt->known_attached_workers)
{
@@ -1002,6 +1008,11 @@ DestroyParallelContext(ParallelContext *pcxt)
*/
HOLD_INTERRUPTS();
WaitForParallelWorkersToExit(pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(pcxt->nworkers_launched);
+
RESUME_INTERRUPTS();
/* Free the worker array itself. */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 2b9d548cdeb..c63830fd2a5 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,16 +1,16 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel [auto]vacuum execution.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
*
- * In a parallel vacuum, we perform both index bulk deletion and index cleanup
- * with parallel worker processes. Individual indexes are processed by one
- * vacuum process. ParallelVacuumState contains shared information as well as
- * the memory space for storing dead items allocated in the DSA area. We
- * launch parallel worker processes at the start of parallel index
+ * In a parallel [auto]vacuum, we perform both index bulk deletion and index
+ * cleanup with parallel worker processes. Individual indexes are processed by
+ * one [auto]vacuum process. ParallelVacuumState contains shared information
+ * as well as the memory space for storing dead items allocated in the DSA area.
+ * We launch parallel worker processes at the start of parallel index
* bulk-deletion and index cleanup and once all indexes are processed, the
* parallel worker processes exit. Each time we process indexes in parallel,
* the parallel context is re-initialized so that the same DSM can be used for
@@ -34,6 +34,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -157,7 +158,8 @@ typedef struct PVIndStats
} PVIndStats;
/*
- * Struct for maintaining a parallel vacuum state. typedef appears in vacuum.h.
+ * Struct for maintaining a parallel [auto]vacuum state. typedef appears in
+ * vacuum.h.
*/
struct ParallelVacuumState
{
@@ -371,10 +373,18 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->relid = RelationGetRelid(rel);
shared->elevel = elevel;
shared->queryid = pgstat_get_my_query_id();
- shared->maintenance_work_mem_worker =
- (nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+
+ if (AmAutoVacuumWorkerProcess())
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
+ autovacuum_work_mem;
+ else
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+ maintenance_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -541,7 +551,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*
* nrequested is the number of parallel workers that user requested. If
* nrequested is 0, we compute the parallel degree based on nindexes, that is
- * the number of indexes that support parallel vacuum. This function also
+ * the number of indexes that support parallel [auto]vacuum. This function also
* sets will_parallel_vacuum to remember indexes that participate in parallel
* vacuum.
*/
@@ -558,7 +568,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster ||
+ (max_parallel_autovacuum_workers == 0 && AmAutoVacuumWorkerProcess()) ||
+ (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()))
return 0;
/*
@@ -597,15 +609,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = AmAutoVacuumWorkerProcess() ?
+ Min(parallel_workers, max_parallel_autovacuum_workers) :
+ Min(parallel_workers, max_parallel_maintenance_workers);
return parallel_workers;
}
/*
* Perform index vacuum or index cleanup with parallel workers. This function
- * must be used by the parallel vacuum leader process.
+ * must be used by the parallel [auto]vacuum leader process.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -666,6 +680,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
/* Reset the parallel index processing and progress counters */
pg_atomic_write_u32(&(pvs->shared->idx), 0);
+ /* Check how many workers the autovacuum subsystem can provide. */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = ParallelAutoVacuumReserveWorkers(nworkers);
+
/* Setup the shared cost-based vacuum delay and launch workers */
if (nworkers > 0)
{
@@ -690,6 +708,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ ParallelAutoVacuumReleaseWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -706,16 +734,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
if (vacuum)
ereport(pvs->shared->elevel,
- (errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
- "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+ (errmsg(ngettext("launched %d parallel %svacuum worker for index vacuuming (planned: %d)",
+ "launched %d parallel %svacuum workers for index vacuuming (planned: %d)",
pvs->pcxt->nworkers_launched),
- pvs->pcxt->nworkers_launched, nworkers)));
+ pvs->pcxt->nworkers_launched, AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers)));
else
ereport(pvs->shared->elevel,
- (errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
- "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+ (errmsg(ngettext("launched %d parallel %svacuum worker for index cleanup (planned: %d)",
+ "launched %d parallel %svacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
- pvs->pcxt->nworkers_launched, nworkers)));
+ pvs->pcxt->nworkers_launched, AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers)));
}
/* Vacuum the indexes that can be processed by only leader process */
@@ -982,8 +1010,8 @@ parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
/*
* Perform work within a launched parallel process.
*
- * Since parallel vacuum workers perform only index vacuum or index cleanup,
- * we don't need to report progress information.
+ * Since parallel [auto]vacuum workers perform only index vacuum or index
+ * cleanup, we don't need to report progress information.
*/
void
parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 981be42e3af..7f34e202589 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -285,6 +285,7 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_active_parallel_workers the number of active parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +300,7 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_active_parallel_workers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -2840,8 +2842,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->parallel_autovacuum_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3322,6 +3328,61 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'max_parallel_autovacuum_workers' limit, the leader
+ * worker must call this function. It returns the number of parallel workers
+ * that can actually be launched and reserves (if any) these workers in the
+ * global autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if the
+ * caller will occupy all available workers.
+ */
+int
+ParallelAutoVacuumReserveWorkers(int nworkers)
+{
+ int can_launch;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ if (AutoVacuumShmem->av_active_parallel_workers < nworkers)
+ {
+ /* Provide as many workers as we can. */
+ can_launch = AutoVacuumShmem->av_active_parallel_workers;
+ AutoVacuumShmem->av_active_parallel_workers = 0;
+ }
+ else
+ {
+ /* OK, we can provide all requested workers. */
+ can_launch = nworkers;
+ AutoVacuumShmem->av_active_parallel_workers -= nworkers;
+ }
+ LWLockRelease(AutovacuumLock);
+
+ return can_launch;
+}
+
+/*
+ * When parallel autovacuum workers die, the leader worker must call this
+ * function in order to refresh the global autovacuum state. Thus, other
+ * leaders will be able to use these workers.
+ *
+ * 'nworkers' - how many workers caller wants to release.
+ */
+void
+ParallelAutoVacuumReleaseWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ AutoVacuumShmem->av_active_parallel_workers += nworkers;
+ Assert(AutoVacuumShmem->av_active_parallel_workers <=
+ max_parallel_autovacuum_workers);
+ LWLockRelease(AutovacuumLock);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3382,6 +3443,8 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_active_parallel_workers =
+ max_parallel_autovacuum_workers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3432,6 +3495,15 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+bool
+check_max_parallel_autovacuum_workers(int *newval, void **extra,
+ GucSource source)
+{
+ if (*newval >= max_worker_processes)
+ return false;
+ return true;
+}
+
/*
* Returns whether there is a free autovacuum worker slot available.
*/
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..40a92ceecd5 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int max_parallel_autovacuum_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 2f8cbd86759..950b4300100 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3604,6 +3604,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_parallel_autovacuum_workers", PGC_POSTMASTER, RESOURCES_WORKER_PROCESSES,
+ gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."),
+ gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."),
+ },
+ &max_parallel_autovacuum_workers,
+ 0, 0, MAX_BACKENDS,
+ check_max_parallel_autovacuum_workers, NULL, NULL
+ },
+
{
{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
gettext_noop("Sets the maximum number of parallel processes per maintenance operation."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 63f991c4f93..23f5c890f78 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -221,6 +221,8 @@
#max_parallel_maintenance_workers = 2 # limited by max_parallel_workers
#max_parallel_workers = 8 # number of max_worker_processes that
# can be used in parallel operations
+#max_parallel_autovacuum_workers = 0 # disabled by default and limited by max_worker_processes
+ # (change requires restart)
#parallel_leader_participation = on
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..7c3575b6849 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int max_parallel_autovacuum_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..b5763e6ac36 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,6 +64,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int ParallelAutoVacuumReserveWorkers(int nworkers);
+extern void ParallelAutoVacuumReleaseWorkers(int nworkers);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 799fa7ace68..d4e6170d45c 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -31,6 +31,8 @@ extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
+extern bool check_max_parallel_autovacuum_workers(int *newval, void **extra,
+ GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
GucSource source);
extern bool check_backtrace_functions(char **newval, void **extra,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..16091e6a773 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,8 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+ int parallel_autovacuum_workers; /* max number of parallel
+ autovacuum workers */
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
@@ -409,6 +411,16 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * RelationGetParallelAutovacuumWorkers
+ * Returns the relation's parallel_autovacuum_workers reloption setting.
+ * Note multiple eval of argument!
+ */
+#define RelationGetParallelAutovacuumWorkers(relation, defaultpw) \
+ ((relation)->rd_options ? \
+ ((StdRdOptions *) (relation)->rd_options)->autovacuum.parallel_autovacuum_workers : \
+ (defaultpw))
+
/* ViewOptions->check_option values */
typedef enum ViewOptCheckOption
{
--
2.43.0
v4-0002-Sandbox-for-parallel-index-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v4-0002-Sandbox-for-parallel-index-autovacuum.patchDownload
From 4a027ce082b0b0964fc2f2f1e7c341adff14f43b Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Fri, 16 May 2025 11:59:03 +0700
Subject: [PATCH v4 2/2] Sandbox for parallel index autovacuum
---
src/test/modules/Makefile | 1 +
src/test/modules/autovacuum/.gitignore | 1 +
src/test/modules/autovacuum/Makefile | 14 ++
src/test/modules/autovacuum/meson.build | 12 ++
.../autovacuum/t/001_autovac_parallel.pl | 131 ++++++++++++++++++
src/test/modules/meson.build | 1 +
6 files changed, 160 insertions(+)
create mode 100644 src/test/modules/autovacuum/.gitignore
create mode 100644 src/test/modules/autovacuum/Makefile
create mode 100644 src/test/modules/autovacuum/meson.build
create mode 100644 src/test/modules/autovacuum/t/001_autovac_parallel.pl
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index aa1d27bbed3..b7f3e342e82 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -5,6 +5,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
SUBDIRS = \
+ autovacuum \
brin \
commit_ts \
delay_execution \
diff --git a/src/test/modules/autovacuum/.gitignore b/src/test/modules/autovacuum/.gitignore
new file mode 100644
index 00000000000..0b54641bceb
--- /dev/null
+++ b/src/test/modules/autovacuum/.gitignore
@@ -0,0 +1 @@
+/tmp_check/
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/Makefile b/src/test/modules/autovacuum/Makefile
new file mode 100644
index 00000000000..90c00ff350b
--- /dev/null
+++ b/src/test/modules/autovacuum/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/autovacuum/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/meson.build b/src/test/modules/autovacuum/meson.build
new file mode 100644
index 00000000000..f91c1a14d2b
--- /dev/null
+++ b/src/test/modules/autovacuum/meson.build
@@ -0,0 +1,12 @@
+# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_autovac_parallel.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/autovacuum/t/001_autovac_parallel.pl b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
new file mode 100644
index 00000000000..b4022f23948
--- /dev/null
+++ b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
@@ -0,0 +1,131 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+$node->append_conf('postgresql.conf', qq{
+ autovacuum = off
+ max_wal_size = 4096
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ max_parallel_autovacuum_workers = 10
+ log_min_messages = info
+});
+$node->start;
+
+my $indexes_num = 80;
+my $initial_rows_num = 100_000;
+my $parallel_autovacuum_workers = 5;
+
+# Create big table and create specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER, col_5 INTEGER,
+ col_6 INTEGER, col_7 INTEGER, col_8 INTEGER, col_9 INTEGER, col_10 INTEGER,
+ col_11 INTEGER, col_12 INTEGER, col_13 INTEGER, col_14 INTEGER, col_15 INTEGER,
+ col_16 INTEGER, col_17 INTEGER, col_18 INTEGER, col_19 INTEGER, col_20 INTEGER,
+ col_21 INTEGER, col_22 INTEGER, col_23 INTEGER, col_24 INTEGER, col_25 INTEGER,
+ col_26 INTEGER, col_27 INTEGER, col_28 INTEGER, col_29 INTEGER, col_30 INTEGER,
+ col_31 INTEGER, col_32 INTEGER, col_33 INTEGER, col_34 INTEGER, col_35 INTEGER,
+ col_36 INTEGER, col_37 INTEGER, col_38 INTEGER, col_39 INTEGER, col_40 INTEGER,
+ col_41 INTEGER, col_42 INTEGER, col_43 INTEGER, col_44 INTEGER, col_45 INTEGER,
+ col_46 INTEGER, col_47 INTEGER, col_48 INTEGER, col_49 INTEGER, col_50 INTEGER,
+ col_51 INTEGER, col_52 INTEGER, col_53 INTEGER, col_54 INTEGER, col_55 INTEGER,
+ col_56 INTEGER, col_57 INTEGER, col_58 INTEGER, col_59 INTEGER, col_60 INTEGER,
+ col_61 INTEGER, col_62 INTEGER, col_63 INTEGER, col_64 INTEGER, col_65 INTEGER,
+ col_66 INTEGER, col_67 INTEGER, col_68 INTEGER, col_69 INTEGER, col_70 INTEGER,
+ col_71 INTEGER, col_72 INTEGER, col_73 INTEGER, col_74 INTEGER, col_75 INTEGER,
+ col_76 INTEGER, col_77 INTEGER, col_78 INTEGER, col_79 INTEGER, col_80 INTEGER,
+ col_81 INTEGER, col_82 INTEGER, col_83 INTEGER, col_84 INTEGER, col_85 INTEGER,
+ col_86 INTEGER, col_87 INTEGER, col_88 INTEGER, col_89 INTEGER, col_90 INTEGER,
+ col_91 INTEGER, col_92 INTEGER, col_93 INTEGER, col_94 INTEGER, col_95 INTEGER,
+ col_96 INTEGER, col_97 INTEGER, col_98 INTEGER, col_99 INTEGER, col_100 INTEGER
+ ) WITH (parallel_autovacuum_workers = $parallel_autovacuum_workers);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM pg_index i
+ JOIN pg_class c ON c.oid = i.indrelid
+ WHERE c.relname = 'test_autovac';",
+ stdout => \$psql_out
+);
+is($psql_out, $indexes_num + 1, "All indexes created successfully");
+
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac (
+ col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8, col_9, col_10,
+ col_11, col_12, col_13, col_14, col_15, col_16, col_17, col_18, col_19, col_20,
+ col_21, col_22, col_23, col_24, col_25, col_26, col_27, col_28, col_29, col_30,
+ col_31, col_32, col_33, col_34, col_35, col_36, col_37, col_38, col_39, col_40,
+ col_41, col_42, col_43, col_44, col_45, col_46, col_47, col_48, col_49, col_50,
+ col_51, col_52, col_53, col_54, col_55, col_56, col_57, col_58, col_59, col_60,
+ col_61, col_62, col_63, col_64, col_65, col_66, col_67, col_68, col_69, col_70,
+ col_71, col_72, col_73, col_74, col_75, col_76, col_77, col_78, col_79, col_80,
+ col_81, col_82, col_83, col_84, col_85, col_86, col_87, col_88, col_89, col_90,
+ col_91, col_92, col_93, col_94, col_95, col_96, col_97, col_98, col_99, col_100
+ ) VALUES (
+ i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7, i + 8, i + 9,
+ i + 10, i + 11, i + 12, i + 13, i + 14, i + 15, i + 16, i + 17, i + 18, i + 19,
+ i + 20, i + 21, i + 22, i + 23, i + 24, i + 25, i + 26, i + 27, i + 28, i + 29,
+ i + 30, i + 31, i + 32, i + 33, i + 34, i + 35, i + 36, i + 37, i + 38, i + 39,
+ i + 40, i + 41, i + 42, i + 43, i + 44, i + 45, i + 46, i + 47, i + 48, i + 49,
+ i + 50, i + 51, i + 52, i + 53, i + 54, i + 55, i + 56, i + 57, i + 58, i + 59,
+ i + 60, i + 61, i + 62, i + 63, i + 64, i + 65, i + 66, i + 67, i + 68, i + 69,
+ i + 70, i + 71, i + 72, i + 73, i + 74, i + 75, i + 76, i + 77, i + 78, i + 79,
+ i + 80, i + 81, i + 82, i + 83, i + 84, i + 85, i + 86, i + 87, i + 88, i + 89,
+ i + 90, i + 91, i + 92, i + 93, i + 94, i + 95, i + 96, i + 97, i + 98, i + 99
+ );
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM test_autovac;",
+ stdout => \$psql_out
+);
+is($psql_out, $initial_rows_num, "All data inserted into table successfully");
+
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Reduce autovacuum_work_mem, so the leader process will perform the parallel
+# index vacuum phase several times
+$node->append_conf('postgresql.conf', qq{
+ autovacuum_naptime = '1s'
+ autovacuum_vacuum_threshold = 1
+ autovacuum_analyze_threshold = 1
+ autovacuum_vacuum_scale_factor = 0.1
+ autovacuum_analyze_scale_factor = 0.1
+ autovacuum = on
+});
+
+$node->restart;
+
+# sleep(3600);
+
+ok(1, "There are no segfaults");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 9de0057bd1d..7f2ad810ca0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -1,5 +1,6 @@
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+subdir('autovacuum')
subdir('brin')
subdir('commit_ts')
subdir('delay_execution')
--
2.43.0
On Sun, May 25, 2025 at 10:22 AM Daniil Davydov <3danissimo@gmail.com> wrote:
Hi,
On Fri, May 23, 2025 at 6:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, May 22, 2025 at 12:44 AM Daniil Davydov <3danissimo@gmail.com> wrote:
On Wed, May 21, 2025 at 5:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I find that the name "autovacuum_reserved_workers_num" is generic. It
would be better to have a more specific name for parallel vacuum such
as autovacuum_max_parallel_workers. This parameter is related to
neither autovacuum_worker_slots nor autovacuum_max_workers, which
seems fine to me. Also, max_parallel_maintenance_workers doesn't
affect this parameter.

This was my headache when I created names for variables. Autovacuum
initially implies parallelism, because we have several parallel a/v
workers.

I'm not sure if it's parallelism. We can have multiple autovacuum
workers simultaneously working on different tables, which seems not
parallelism to me.

Hm, I didn't think about the 'parallelism' definition in this way.
But I see your point - the next v4 patch will contain the naming that
you suggest.

So I think that parameter like
`autovacuum_max_parallel_workers` will confuse somebody.
If we want to have a more specific name, I would prefer
`max_parallel_index_autovacuum_workers`.

It's better not to use 'index' as we're trying to extend parallel
vacuum to heap scanning/vacuuming as well[1].

OK, I'll fix it.
+ /*
+  * If we are running autovacuum - decide whether we need to process indexes
+  * of table with given oid in parallel.
+  */
+ if (AmAutoVacuumWorkerProcess() &&
+     params->index_cleanup != VACOPTVALUE_DISABLED &&
+     RelationAllowsParallelIdxAutovac(rel))

I think that this should be done in autovacuum code.
We need the params->index_cleanup variable to decide whether we need to
use parallel index a/v. In autovacuum.c we have this code :
***
/*
* index_cleanup and truncate are unspecified at first in autovacuum.
* They will be filled in with usable values using their reloptions
* (or reloption defaults) later.
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
***
This variable is filled in inside the `vacuum_rel` function, so I
think we should keep the above logic in vacuum.c.

I guess that we can specify the parallel degree even if index_cleanup
is still UNSPECIFIED. vacuum_rel() would then decide whether to use
index vacuuming and vacuumlazy.c would decide whether to use parallel
vacuum based on the specified parallel degree and index_cleanup value.

+#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
Are these fixed values really useful in common cases? I think we already
have an optimization where we skip vacuuming indexes if the table has
fewer dead tuples (see BYPASS_THRESHOLD_PAGES).

When we allocate dead items (and optionally init parallel autovacuum) we
don't have sane value for `vacrel->lpdead_item_pages` (which should be
compared with BYPASS_THRESHOLD_PAGES).
The only criterion that we can focus on is the number of dead tuples
indicated in the PgStat_StatTabEntry.

My point is that this criterion might not be useful. We have the
bypass optimization for index vacuuming and having many dead tuples
doesn't necessarily mean that index vacuuming takes a long time. For
example, even if the table has a few dead tuples, index vacuuming
could take a very long time and parallel index vacuuming would help
the situation, if the table is very large and has many indexes.

That sounds reasonable. I'll fix it.
But autovacuum (as I think) should work as stably as possible and
`unnoticed` by other processes. Thus, we must:
1) Compute resources (such as the number of parallel workers for a
single table's indexes vacuuming) as efficiently as possible.
2) Provide a guarantee that as many tables as possible (among
requested) will be processed in parallel.

(1) can be achieved by calculating the parameters on the fly.
NUM_INDEXES_PER_PARALLEL_WORKER is a rough mock. I can provide more
accurate value in the near future.

I think it requires more things than the number of indexes on the
table to achieve (1). Suppose that there is a very large table that
gets updates heavily and has a few indexes. If users want to keep the
table from becoming bloated, it would be a reasonable idea to use
parallel vacuum during autovacuum and it would not be a good idea to
disallow using parallel vacuum solely because it doesn't have more
than 30 indexes. On the other hand, if the table used to get many updates
but no longer does, users might want to use resources for autovacuums on
other tables. We might need to consider autovacuum frequencies per
table, the statistics of the previous autovacuum, or system loads etc.
So I think that in order to achieve (1) we might need more statistics
and using only NUM_INDEXES_PER_PARALLEL_WORKER would not work fine.

It's hard for me to imagine exactly how extended statistics will help
us track such situations.
It seems that for any of our heuristics, it will be possible to come
up with a counterexample.
Maybe we can give advice (via logs) to the user? But for such an
idea, tests should be conducted so that we can understand when
resource consumption becomes ineffective.
I guess that we need to agree on an implementation before conducting such tests.

(2) can be achieved by workers reserving - we know that N workers
(from bgworkers pool) are *always* at our disposal. And when we use
such workers we are not dependent on other operations in the cluster
and we don't interfere with other operations by taking resources away
from them.

Reserving some bgworkers for autovacuum could make sense. But I think
it's better to implement it in a general way as it could be useful in
other use cases too. That is, it might be good to implement
infrastructure so that any PostgreSQL code (possibly including
extensions) can request allocating a pool of bgworkers for specific
usage and use bgworkers from them.

Reserving infrastructure is an ambitious idea. I am not sure that we
should implement it within this thread and feature.
Maybe we should create a separate thread for it and as a
justification, refer to parallel autovacuum?

-----
Thanks everybody for feedback! I attach a v4 patch to this letter.
Main features:
1) 'parallel_autovacuum_workers' reloption - an integer value that sets
the maximum number of parallel a/v workers that can be taken from the
bgworkers pool in order to process this table.
2) 'max_parallel_autovacuum_workers' - a GUC variable that sets the
maximum total number of parallel a/v workers that can be taken from the
bgworkers pool.
3) Parallel autovacuum does not try to use thresholds like
NUM_INDEXES_PER_PARALLEL_WORKER and AV_PARALLEL_DEADTUP_THRESHOLD.
4) Parallel autovacuum can now report statistics like "planned vs. launched".
5) For now I got rid of the 'reserving' idea, so now autovacuum
leaders are competing with everyone for parallel workers from the
bgworkers pool.

What do you think about this implementation?
I think it basically makes sense to me. A few comments:
---
The patch implements max_parallel_autovacuum_workers as a
PGC_POSTMASTER parameter but can we make it PGC_SIGHUP? I think we
don't necessarily need to make it a PGC_POSTMASTER since it actually
doesn't affect how much shared memory we need to allocate.
---
I think it's better to have the prefix "autovacuum" for the new GUC
parameter for better consistency with other autovacuum-related GUC
parameters.
---
#include "storage/spin.h"
@@ -514,6 +515,11 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
{
WaitForParallelWorkersToFinish(pcxt);
WaitForParallelWorkersToExit(pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(pcxt->nworkers_launched);
+
pcxt->nworkers_launched = 0;
if (pcxt->known_attached_workers)
{
@@ -1002,6 +1008,11 @@ DestroyParallelContext(ParallelContext *pcxt)
*/
HOLD_INTERRUPTS();
WaitForParallelWorkersToExit(pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(pcxt->nworkers_launched);
+
RESUME_INTERRUPTS();
I think that it's better to release workers in vacuumparallel.c rather
than parallel.c.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Hi,
On Wed, Jun 18, 2025 at 5:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sun, May 25, 2025 at 10:22 AM Daniil Davydov <3danissimo@gmail.com> wrote:
Thanks everybody for feedback! I attach a v4 patch to this letter.
Main features:
1) 'parallel_autovacuum_workers' reloption - an integer value that sets
the maximum number of parallel a/v workers that can be taken from the
bgworkers pool in order to process this table.
2) 'max_parallel_autovacuum_workers' - a GUC variable that sets the
maximum total number of parallel a/v workers that can be taken from the
bgworkers pool.
3) Parallel autovacuum does not try to use thresholds like
NUM_INDEXES_PER_PARALLEL_WORKER and AV_PARALLEL_DEADTUP_THRESHOLD.
4) Parallel autovacuum can now report statistics like "planned vs. launched".
5) For now I got rid of the 'reserving' idea, so now autovacuum
leaders are competing with everyone for parallel workers from the
bgworkers pool.

What do you think about this implementation?
I think it basically makes sense to me. A few comments:
---
The patch implements max_parallel_autovacuum_workers as a
PGC_POSTMASTER parameter but can we make it PGC_SIGHUP? I think we
don't necessarily need to make it a PGC_POSTMASTER since it actually
doesn't affect how much shared memory we need to allocate.
Yep, there's nothing stopping us from doing that. This is a useful
feature; I'll implement it in the v5 patch.
---
I think it's better to have the prefix "autovacuum" for the new GUC
parameter for better consistency with other autovacuum-related GUC
parameters.

---
#include "storage/spin.h"
@@ -514,6 +515,11 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
{
WaitForParallelWorkersToFinish(pcxt);
WaitForParallelWorkersToExit(pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(pcxt->nworkers_launched);
+
pcxt->nworkers_launched = 0;
if (pcxt->known_attached_workers)
{
@@ -1002,6 +1008,11 @@ DestroyParallelContext(ParallelContext *pcxt)
*/
HOLD_INTERRUPTS();
WaitForParallelWorkersToExit(pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(pcxt->nworkers_launched);
+
RESUME_INTERRUPTS();

I think that it's better to release workers in vacuumparallel.c rather
than parallel.c.
Agree with both comments.
Thanks for the review! Please see the v5 patch:
1) GUC variable and field in autovacuum shmem are renamed
2) ParallelAutoVacuumReleaseWorkers call moved from parallel.c to
vacuumparallel.c
3) max_parallel_autovacuum_workers is now a PGC_SIGHUP parameter
4) Fixed a little bug (ParallelAutoVacuumReleaseWorkers in autovacuum.c:735)
--
Best regards,
Daniil Davydov
Attachments:
v5-0002-Sandbox-for-parallel-index-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v5-0002-Sandbox-for-parallel-index-autovacuum.patchDownload
From 144c2dfda58103638435bccc55e8fe8d27dd1fad Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Fri, 16 May 2025 11:59:03 +0700
Subject: [PATCH v5 2/2] Sandbox for parallel index autovacuum
---
src/test/modules/Makefile | 1 +
src/test/modules/autovacuum/.gitignore | 1 +
src/test/modules/autovacuum/Makefile | 14 ++
src/test/modules/autovacuum/meson.build | 12 ++
.../autovacuum/t/001_autovac_parallel.pl | 131 ++++++++++++++++++
src/test/modules/meson.build | 1 +
6 files changed, 160 insertions(+)
create mode 100644 src/test/modules/autovacuum/.gitignore
create mode 100644 src/test/modules/autovacuum/Makefile
create mode 100644 src/test/modules/autovacuum/meson.build
create mode 100644 src/test/modules/autovacuum/t/001_autovac_parallel.pl
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index aa1d27bbed3..b7f3e342e82 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -5,6 +5,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
SUBDIRS = \
+ autovacuum \
brin \
commit_ts \
delay_execution \
diff --git a/src/test/modules/autovacuum/.gitignore b/src/test/modules/autovacuum/.gitignore
new file mode 100644
index 00000000000..0b54641bceb
--- /dev/null
+++ b/src/test/modules/autovacuum/.gitignore
@@ -0,0 +1 @@
+/tmp_check/
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/Makefile b/src/test/modules/autovacuum/Makefile
new file mode 100644
index 00000000000..90c00ff350b
--- /dev/null
+++ b/src/test/modules/autovacuum/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/autovacuum/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/meson.build b/src/test/modules/autovacuum/meson.build
new file mode 100644
index 00000000000..f91c1a14d2b
--- /dev/null
+++ b/src/test/modules/autovacuum/meson.build
@@ -0,0 +1,12 @@
+# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_autovac_parallel.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/autovacuum/t/001_autovac_parallel.pl b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
new file mode 100644
index 00000000000..ae892e5b4de
--- /dev/null
+++ b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
@@ -0,0 +1,131 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+$node->append_conf('postgresql.conf', qq{
+ autovacuum = off
+ max_wal_size = 4096
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 10
+ log_min_messages = info
+});
+$node->start;
+
+my $indexes_num = 80;
+my $initial_rows_num = 100_000;
+my $parallel_autovacuum_workers = 5;
+
+# Create big table and create specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER, col_5 INTEGER,
+ col_6 INTEGER, col_7 INTEGER, col_8 INTEGER, col_9 INTEGER, col_10 INTEGER,
+ col_11 INTEGER, col_12 INTEGER, col_13 INTEGER, col_14 INTEGER, col_15 INTEGER,
+ col_16 INTEGER, col_17 INTEGER, col_18 INTEGER, col_19 INTEGER, col_20 INTEGER,
+ col_21 INTEGER, col_22 INTEGER, col_23 INTEGER, col_24 INTEGER, col_25 INTEGER,
+ col_26 INTEGER, col_27 INTEGER, col_28 INTEGER, col_29 INTEGER, col_30 INTEGER,
+ col_31 INTEGER, col_32 INTEGER, col_33 INTEGER, col_34 INTEGER, col_35 INTEGER,
+ col_36 INTEGER, col_37 INTEGER, col_38 INTEGER, col_39 INTEGER, col_40 INTEGER,
+ col_41 INTEGER, col_42 INTEGER, col_43 INTEGER, col_44 INTEGER, col_45 INTEGER,
+ col_46 INTEGER, col_47 INTEGER, col_48 INTEGER, col_49 INTEGER, col_50 INTEGER,
+ col_51 INTEGER, col_52 INTEGER, col_53 INTEGER, col_54 INTEGER, col_55 INTEGER,
+ col_56 INTEGER, col_57 INTEGER, col_58 INTEGER, col_59 INTEGER, col_60 INTEGER,
+ col_61 INTEGER, col_62 INTEGER, col_63 INTEGER, col_64 INTEGER, col_65 INTEGER,
+ col_66 INTEGER, col_67 INTEGER, col_68 INTEGER, col_69 INTEGER, col_70 INTEGER,
+ col_71 INTEGER, col_72 INTEGER, col_73 INTEGER, col_74 INTEGER, col_75 INTEGER,
+ col_76 INTEGER, col_77 INTEGER, col_78 INTEGER, col_79 INTEGER, col_80 INTEGER,
+ col_81 INTEGER, col_82 INTEGER, col_83 INTEGER, col_84 INTEGER, col_85 INTEGER,
+ col_86 INTEGER, col_87 INTEGER, col_88 INTEGER, col_89 INTEGER, col_90 INTEGER,
+ col_91 INTEGER, col_92 INTEGER, col_93 INTEGER, col_94 INTEGER, col_95 INTEGER,
+ col_96 INTEGER, col_97 INTEGER, col_98 INTEGER, col_99 INTEGER, col_100 INTEGER
+ ) WITH (parallel_autovacuum_workers = $parallel_autovacuum_workers);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM pg_index i
+ JOIN pg_class c ON c.oid = i.indrelid
+ WHERE c.relname = 'test_autovac';",
+ stdout => \$psql_out
+);
+is($psql_out, $indexes_num + 1, "All indexes created successfully");
+
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac (
+ col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8, col_9, col_10,
+ col_11, col_12, col_13, col_14, col_15, col_16, col_17, col_18, col_19, col_20,
+ col_21, col_22, col_23, col_24, col_25, col_26, col_27, col_28, col_29, col_30,
+ col_31, col_32, col_33, col_34, col_35, col_36, col_37, col_38, col_39, col_40,
+ col_41, col_42, col_43, col_44, col_45, col_46, col_47, col_48, col_49, col_50,
+ col_51, col_52, col_53, col_54, col_55, col_56, col_57, col_58, col_59, col_60,
+ col_61, col_62, col_63, col_64, col_65, col_66, col_67, col_68, col_69, col_70,
+ col_71, col_72, col_73, col_74, col_75, col_76, col_77, col_78, col_79, col_80,
+ col_81, col_82, col_83, col_84, col_85, col_86, col_87, col_88, col_89, col_90,
+ col_91, col_92, col_93, col_94, col_95, col_96, col_97, col_98, col_99, col_100
+ ) VALUES (
+ i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7, i + 8, i + 9,
+ i + 10, i + 11, i + 12, i + 13, i + 14, i + 15, i + 16, i + 17, i + 18, i + 19,
+ i + 20, i + 21, i + 22, i + 23, i + 24, i + 25, i + 26, i + 27, i + 28, i + 29,
+ i + 30, i + 31, i + 32, i + 33, i + 34, i + 35, i + 36, i + 37, i + 38, i + 39,
+ i + 40, i + 41, i + 42, i + 43, i + 44, i + 45, i + 46, i + 47, i + 48, i + 49,
+ i + 50, i + 51, i + 52, i + 53, i + 54, i + 55, i + 56, i + 57, i + 58, i + 59,
+ i + 60, i + 61, i + 62, i + 63, i + 64, i + 65, i + 66, i + 67, i + 68, i + 69,
+ i + 70, i + 71, i + 72, i + 73, i + 74, i + 75, i + 76, i + 77, i + 78, i + 79,
+ i + 80, i + 81, i + 82, i + 83, i + 84, i + 85, i + 86, i + 87, i + 88, i + 89,
+ i + 90, i + 91, i + 92, i + 93, i + 94, i + 95, i + 96, i + 97, i + 98, i + 99
+ );
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM test_autovac;",
+ stdout => \$psql_out
+);
+is($psql_out, $initial_rows_num, "All data inserted into table successfully");
+
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Reduce autovacuum_work_mem, so the leader process will perform the parallel
+# index vacuum phase several times
+$node->append_conf('postgresql.conf', qq{
+ autovacuum_naptime = '1s'
+ autovacuum_vacuum_threshold = 1
+ autovacuum_analyze_threshold = 1
+ autovacuum_vacuum_scale_factor = 0.1
+ autovacuum_analyze_scale_factor = 0.1
+ autovacuum = on
+});
+
+$node->restart;
+
+# sleep(3600);
+
+ok(1, "There are no segfaults");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 9de0057bd1d..7f2ad810ca0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -1,5 +1,6 @@
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+subdir('autovacuum')
subdir('brin')
subdir('commit_ts')
subdir('delay_execution')
--
2.43.0
v5-0001-Parallel-index-autovacuum-with-bgworkers.patchtext/x-patch; charset=US-ASCII; name=v5-0001-Parallel-index-autovacuum-with-bgworkers.patchDownload
From 88e55d49895ebc287213a415c242b4733cdecba8 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Fri, 16 May 2025 11:58:40 +0700
Subject: [PATCH v5 1/2] Parallel index autovacuum with bgworkers
---
src/backend/access/common/reloptions.c | 12 ++
src/backend/access/heap/vacuumlazy.c | 6 +-
src/backend/commands/vacuumparallel.c | 93 ++++++++---
src/backend/postmaster/autovacuum.c | 144 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc_tables.c | 10 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 4 +
src/include/utils/guc_hooks.h | 2 +
src/include/utils/rel.h | 12 ++
11 files changed, 259 insertions(+), 27 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 50747c16396..e36d59f632b 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,16 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "parallel_autovacuum_workers",
+ "Maximum number of parallel autovacuum workers that can be taken from bgworkers pool for processing this table. "
+ "If value is 0 then parallel degree will computed based on number of indexes.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1872,6 +1882,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"parallel_autovacuum_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, parallel_autovacuum_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 09416450af9..b89b1563444 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -3493,6 +3493,10 @@ dead_items_alloc(LVRelState *vacrel, int nworkers)
autovacuum_work_mem != -1 ?
autovacuum_work_mem : maintenance_work_mem;
+ int elevel = AmAutoVacuumWorkerProcess() ||
+ vacrel->verbose ?
+ INFO : DEBUG2;
+
/*
* Initialize state for a parallel vacuum. As of now, only one worker can
* be used for an index, so we invoke parallelism only if there are at
@@ -3519,7 +3523,7 @@ dead_items_alloc(LVRelState *vacrel, int nworkers)
vacrel->pvs = parallel_vacuum_init(vacrel->rel, vacrel->indrels,
vacrel->nindexes, nworkers,
vac_work_mem,
- vacrel->verbose ? INFO : DEBUG2,
+ elevel,
vacrel->bstrategy);
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..bd314d23298 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,16 +1,16 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel [auto]vacuum execution.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
*
- * In a parallel vacuum, we perform both index bulk deletion and index cleanup
- * with parallel worker processes. Individual indexes are processed by one
- * vacuum process. ParallelVacuumState contains shared information as well as
- * the memory space for storing dead items allocated in the DSA area. We
- * launch parallel worker processes at the start of parallel index
+ * In a parallel [auto]vacuum, we perform both index bulk deletion and index
+ * cleanup with parallel worker processes. Individual indexes are processed by
+ * one [auto]vacuum process. ParallelVacuumState contains shared information
+ * as well as the memory space for storing dead items allocated in the DSA area.
+ * We launch parallel worker processes at the start of parallel index
* bulk-deletion and index cleanup and once all indexes are processed, the
* parallel worker processes exit. Each time we process indexes in parallel,
* the parallel context is re-initialized so that the same DSM can be used for
@@ -34,6 +34,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -157,7 +158,8 @@ typedef struct PVIndStats
} PVIndStats;
/*
- * Struct for maintaining a parallel vacuum state. typedef appears in vacuum.h.
+ * Struct for maintaining a parallel [auto]vacuum state. typedef appears in
+ * vacuum.h.
*/
struct ParallelVacuumState
{
@@ -371,10 +373,18 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->relid = RelationGetRelid(rel);
shared->elevel = elevel;
shared->queryid = pgstat_get_my_query_id();
- shared->maintenance_work_mem_worker =
- (nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+
+ if (AmAutoVacuumWorkerProcess())
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
+ autovacuum_work_mem;
+ else
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+ maintenance_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -435,6 +445,8 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
void
parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
{
+ int nlaunched_workers;
+
Assert(!IsParallelWorker());
/* Copy the updated statistics */
@@ -453,7 +465,13 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
TidStoreDestroy(pvs->dead_items);
+ nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */
DestroyParallelContext(pvs->pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(nlaunched_workers);
+
ExitParallelMode();
pfree(pvs->will_parallel_vacuum);
@@ -541,7 +559,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*
* nrequested is the number of parallel workers that user requested. If
* nrequested is 0, we compute the parallel degree based on nindexes, that is
- * the number of indexes that support parallel vacuum. This function also
+ * the number of indexes that support parallel [auto]vacuum. This function also
* sets will_parallel_vacuum to remember indexes that participate in parallel
* vacuum.
*/
@@ -558,7 +576,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster ||
+ (autovacuum_max_parallel_workers == 0 && AmAutoVacuumWorkerProcess()) ||
+ (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()))
return 0;
/*
@@ -597,15 +617,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = AmAutoVacuumWorkerProcess() ?
+ Min(parallel_workers, autovacuum_max_parallel_workers) :
+ Min(parallel_workers, max_parallel_maintenance_workers);
return parallel_workers;
}
/*
* Perform index vacuum or index cleanup with parallel workers. This function
- * must be used by the parallel vacuum leader process.
+ * must be used by the parallel [auto]vacuum leader process.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -666,13 +688,26 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
/* Reset the parallel index processing and progress counters */
pg_atomic_write_u32(&(pvs->shared->idx), 0);
+ /* Check how many parallel workers autovacuum can provide. */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = ParallelAutoVacuumReserveWorkers(nworkers);
+
/* Setup the shared cost-based vacuum delay and launch workers */
if (nworkers > 0)
{
/* Reinitialize parallel context to relaunch parallel workers */
if (num_index_scans > 0)
+ {
ReinitializeParallelDSM(pvs->pcxt);
+ /*
+ * Release all launched (i.e. reserved) parallel autovacuum
+ * workers.
+ */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(pvs->pcxt->nworkers_launched);
+ }
+
/*
* Set up shared cost balance and the number of active workers for
* vacuum delay. We need to do this before launching workers as
@@ -690,6 +725,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ ParallelAutoVacuumReleaseWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -706,16 +751,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
if (vacuum)
ereport(pvs->shared->elevel,
- (errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
- "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+ (errmsg(ngettext("launched %d parallel %svacuum worker for index vacuuming (planned: %d)",
+ "launched %d parallel %svacuum workers for index vacuuming (planned: %d)",
pvs->pcxt->nworkers_launched),
- pvs->pcxt->nworkers_launched, nworkers)));
+ pvs->pcxt->nworkers_launched, AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers)));
else
ereport(pvs->shared->elevel,
- (errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
- "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+ (errmsg(ngettext("launched %d parallel %svacuum worker for index cleanup (planned: %d)",
+ "launched %d parallel %svacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
- pvs->pcxt->nworkers_launched, nworkers)));
+ pvs->pcxt->nworkers_launched, AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers)));
}
/* Vacuum the indexes that can be processed by only leader process */
@@ -982,8 +1027,8 @@ parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
/*
* Perform work within a launched parallel process.
*
- * Since parallel vacuum workers perform only index vacuum or index cleanup,
- * we don't need to report progress information.
+ * Since parallel [auto]vacuum workers perform only index vacuum or index
+ * cleanup, we don't need to report progress information.
*/
void
parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 451fb90a610..60600b9ff52 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -285,6 +285,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_available_parallel_workers the number of available parallel autovacuum
+ * workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +301,7 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_available_parallel_workers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -354,6 +357,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void check_parallel_av_gucs(int prev_max_parallel_workers);
@@ -753,7 +757,9 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev;
+ autovacuum_max_parallel_workers_prev = autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -769,6 +775,14 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ check_parallel_av_gucs(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -2847,8 +2861,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->parallel_autovacuum_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3329,6 +3347,72 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, the leader worker
+ * must call this function. It returns the number of parallel workers that
+ * actually can be launched and reserves (if any) these workers in global
+ * autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if caller
+ * will occupy all available workers.
+ */
+int
+ParallelAutoVacuumReserveWorkers(int nworkers)
+{
+ int can_launch;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ if (AutoVacuumShmem->av_available_parallel_workers < nworkers)
+ {
+ /* Provide as many workers as we can. */
+ can_launch = AutoVacuumShmem->av_available_parallel_workers;
+ AutoVacuumShmem->av_available_parallel_workers = 0;
+ }
+ else
+ {
+ /* OK, we can provide all requested workers. */
+ can_launch = nworkers;
+ AutoVacuumShmem->av_available_parallel_workers -= nworkers;
+ }
+ LWLockRelease(AutovacuumLock);
+
+ return can_launch;
+}
+
+/*
+ * When parallel autovacuum workers die, the leader worker must call this
+ * function to refresh the global autovacuum state. Thus, other leaders will
+ * be able to use these workers.
+ *
+ * 'nworkers' - how many workers caller wants to release.
+ */
+void
+ParallelAutoVacuumReleaseWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ AutoVacuumShmem->av_available_parallel_workers += nworkers;
+
+ /*
+ * If autovacuum_max_parallel_workers variable was reduced during parallel
+ * autovacuum execution, we must cap available workers number by its new
+ * value.
+ */
+ if (AutoVacuumShmem->av_available_parallel_workers >
+ autovacuum_max_parallel_workers)
+ {
+ AutoVacuumShmem->av_available_parallel_workers =
+ autovacuum_max_parallel_workers;
+ }
+
+ LWLockRelease(AutovacuumLock);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3389,6 +3473,8 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_available_parallel_workers =
+ autovacuum_max_parallel_workers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3439,6 +3525,15 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+bool
+check_autovacuum_max_parallel_workers(int *newval, void **extra,
+ GucSource source)
+{
+ if (*newval >= max_worker_processes)
+ return false;
+ return true;
+}
+
/*
* Returns whether there is a free autovacuum worker slot available.
*/
@@ -3470,3 +3565,48 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that number of available parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+check_parallel_av_gucs(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ if (AutoVacuumShmem->av_available_parallel_workers >
+ autovacuum_max_parallel_workers)
+ {
+ Assert(prev_max_parallel_workers > autovacuum_max_parallel_workers);
+
+ /*
+ * Number of available workers must not exeed limit.
+ *
+ * Note, that if some parallel autovacuum workers are running at this
+ * moment, available workers number will not exeed limit after releasing
+ * them (see ParallelAutoVacuumReleaseWorkers).
+ */
+ AutoVacuumShmem->av_available_parallel_workers =
+ autovacuum_max_parallel_workers;
+ }
+ else if ((AutoVacuumShmem->av_available_parallel_workers <
+ autovacuum_max_parallel_workers) &&
+ (autovacuum_max_parallel_workers > prev_max_parallel_workers))
+ {
+ /*
+ * If user wants to increase number of parallel autovacuum workers, we
+ * must increase number of available workers in shmem.
+ */
+ AutoVacuumShmem->av_available_parallel_workers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+
+ /*
+ * Nothing to do when autovacuum_max_parallel_workers <
+ * prev_max_parallel_workers. Available workers number will be capped
+ * inside ParallelAutoVacuumReleaseWorkers.
+ */
+ }
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..977644978c1 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index f04bfedb2fd..be76263c431 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3604,6 +3604,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM,
+ gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."),
+ gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."),
+ },
+ &autovacuum_max_parallel_workers,
+ 0, 0, MAX_BACKENDS,
+ check_autovacuum_max_parallel_workers, NULL, NULL
+ },
+
{
{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
gettext_noop("Sets the maximum number of parallel processes per maintenance operation."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 341f88adc87..f2b6ba7755e 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -683,6 +683,7 @@
autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_parallel_workers
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..85926415657 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..b5763e6ac36 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,6 +64,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int ParallelAutoVacuumReserveWorkers(int nworkers);
+extern void ParallelAutoVacuumReleaseWorkers(int nworkers);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 799fa7ace68..5c66f37cd53 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -31,6 +31,8 @@ extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
+extern bool check_autovacuum_max_parallel_workers(int *newval, void **extra,
+ GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
GucSource source);
extern bool check_backtrace_functions(char **newval, void **extra,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..16091e6a773 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,8 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+ int parallel_autovacuum_workers; /* max number of parallel
+ autovacuum workers */
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
@@ -409,6 +411,16 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
+/*
+ * RelationGetParallelAutovacuumWorkers
+ * Returns the relation's parallel_autovacuum_workers reloption setting.
+ * Note multiple eval of argument!
+ */
+#define RelationGetParallelAutovacuumWorkers(relation, defaultpw) \
+ ((relation)->rd_options ? \
+ ((StdRdOptions *) (relation)->rd_options)->autovacuum.parallel_autovacuum_workers : \
+ (defaultpw))
+
/* ViewOptions->check_option values */
typedef enum ViewOptCheckOption
{
--
2.43.0
On Wed Jun 18, 2025 at 5:03 AM -03, Daniil Davydov wrote:
Thanks for the review! Please see the v5 patch:
1) GUC variable and field in autovacuum shmem are renamed
2) ParallelAutoVacuumReleaseWorkers call moved from parallel.c to
vacuumparallel.c
3) max_parallel_autovacuum_workers is now a PGC_SIGHUP parameter
4) Fixed a little bug (ParallelAutoVacuumReleaseWorkers in autovacuum.c:735)
Thanks for the new version!
The "autovacuum_max_parallel_workers" declared on guc_tables.c mention
that is capped by "max_worker_process":
+ {
+ {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM,
+ gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."),
+ gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."),
+ },
+ &autovacuum_max_parallel_workers,
+ 0, 0, MAX_BACKENDS,
+ check_autovacuum_max_parallel_workers, NULL, NULL
+ },
But postgresql.conf.sample says that it is limited by
max_parallel_workers:
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_parallel_workers
IIUC the code, it is capped by "max_worker_processes", but Masahiko has
mentioned on [1] that it should be capped by max_parallel_workers.
---
We are actually capping autovacuum_max_parallel_workers at
max_worker_processes - 1, so we can't have 10 max_worker_processes and 10
autovacuum_max_parallel_workers. Is that correct?
+bool
+check_autovacuum_max_parallel_workers(int *newval, void **extra,
+ GucSource source)
+{
+ if (*newval >= max_worker_processes)
+ return false;
+ return true;
+}
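(So, reading the hook above: with max_worker_processes = 10, the value 10
itself is rejected because of the ">=", and the highest setting the hook
accepts is 9.)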
---
Could unnecessarily taking AutovacuumLock when none of the ifs is true
cause some performance issue here? I don't think that this would be a
serious problem because this code will only be called if the
configuration file is changed during autovacuum execution, right? But I
could be wrong, so I'm just sharing my thoughts on this (still learning
about [auto]vacuum code).
+
+/*
+ * Make sure that number of available parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+check_parallel_av_gucs(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ if (AutoVacuumShmem->av_available_parallel_workers >
+ autovacuum_max_parallel_workers)
+ {
+ Assert(prev_max_parallel_workers > autovacuum_max_parallel_workers);
+
Typo on "exeed"
+ /*
+ * Number of available workers must not exeed limit.
+ *
+ * Note, that if some parallel autovacuum workers are running at this
+ * moment, available workers number will not exeed limit after releasing
+ * them (see ParallelAutoVacuumReleaseWorkers).
+ */
---
I'm not seeing any usage of this macro?
+/*
+ * RelationGetParallelAutovacuumWorkers
+ * Returns the relation's parallel_autovacuum_workers reloption setting.
+ * Note multiple eval of argument!
+ */
+#define RelationGetParallelAutovacuumWorkers(relation, defaultpw) \
+ ((relation)->rd_options ? \
+ ((StdRdOptions *) (relation)->rd_options)->autovacuum.parallel_autovacuum_workers : \
+ (defaultpw))
+
---
Also pgindent is needed on some files.
---
I've made some tests and I can confirm that it is working correctly as
far as I can see. I think that it would be good to start including the
documentation changes, what do you think?
[1]: /messages/by-id/CAD21AoAxTkpkLtJDgrH9dXg_h+yzOZpOZj3B-4FjW1Mr4qEdbQ@mail.gmail.com
--
Matheus Alcantara
Hi,
On Fri, Jul 4, 2025 at 9:21 PM Matheus Alcantara
<matheusssilv97@gmail.com> wrote:
The "autovacuum_max_parallel_workers" declared on guc_tables.c mention that is capped by "max_worker_process": + { + {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM, + gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."), + gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."), + }, + &autovacuum_max_parallel_workers, + 0, 0, MAX_BACKENDS, + check_autovacuum_max_parallel_workers, NULL, NULL + },IIUC the code, it cap by "max_worker_process", but Masahiko has mention
on [1] that it should be capped by max_parallel_workers.
Thanks for looking into it!
To be honest, I don't think that this parameter should be explicitly
capped at all.
Other parallel operations (for example parallel index build or VACUUM
PARALLEL) just request as many workers as they want without looking at
'max_parallel_workers'.
And they will not complain if not all requested workers were launched.
Thus, even if 'autovacuum_max_parallel_workers' is higher than
'max_parallel_workers' the worst that can happen is that not all
requested workers will be running (which is a common situation).
Users can handle it by looking for logs like "planned vs. launched"
and increasing 'max_parallel_workers' if needed.
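For reference, with this patch that log line would look something like
this (numbers made up):

    INFO:  launched 2 parallel autovacuum workers for index vacuuming (planned: 5)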
On the other hand, obviously it doesn't make sense to request more
workers than 'max_worker_processes' (moreover, this parameter cannot
be changed as easily as 'max_parallel_workers').
I will keep the 'max_worker_processes' limit, so autovacuum will not
waste time initializing a parallel context if there is no chance that
the request will succeed.
But it's worth remembering that actually the
'autovacuum_max_parallel_workers' parameter will always be implicitly
capped by 'max_parallel_workers'.
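Just to illustrate the interplay, with made-up settings:

    max_worker_processes = 8             # hard cap, requires restart
    max_parallel_workers = 4             # caps parallel workers actually running
    autovacuum_max_parallel_workers = 6  # accepted by the check hook (6 < 8)

Here a parallel autovacuum can reserve up to 6 workers, but at most 4 of
them will ever be launched; the surplus reservations are simply released
after the "planned vs. launched" report.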
What do you think about it?
But the postgresql.conf.sample say that it is limited by
max_parallel_workers:
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_parallel_workers
Good catch, I'll fix it.
---
We actually capping the autovacuum_max_parallel_workers by
max_worker_process-1, so we can't have 10 max_worker_process and 10
autovacuum_max_parallel_workers. Is that correct?
Yep. The explanation can be found just above in this email.
---
Could unnecessarily taking AutovacuumLock when none of the ifs is true
cause some performance issue here? I don't think that this would be a
serious problem because this code will only be called if the
configuration file is changed during autovacuum execution, right? But I
could be wrong, so I'm just sharing my thoughts on this (still learning
about [auto]vacuum code).

+
+/*
+ * Make sure that number of available parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+check_parallel_av_gucs(int prev_max_parallel_workers)
+{
+	LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+	if (AutoVacuumShmem->av_available_parallel_workers >
+		autovacuum_max_parallel_workers)
+	{
+		Assert(prev_max_parallel_workers > autovacuum_max_parallel_workers);
+
This function may be called by the a/v launcher when we already have some
a/v workers running.
A/v workers can change the
AutoVacuumShmem->av_available_parallel_workers value, so I think we
should acquire the appropriate lock before reading it.
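So even a pure read of this counter should look something like the
following (just a sketch; the real check_parallel_av_gucs takes the lock
LW_EXCLUSIVE since it may also update the value):

    uint32		nfree;

    /* read the shared counter under the lock */
    LWLockAcquire(AutovacuumLock, LW_SHARED);
    nfree = AutoVacuumShmem->av_available_parallel_workers;
    LWLockRelease(AutovacuumLock);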
Typo on "exeed"
+	/*
+	 * Number of available workers must not exeed limit.
+	 *
+	 * Note, that if some parallel autovacuum workers are running at this
+	 * moment, available workers number will not exeed limit after releasing
+	 * them (see ParallelAutoVacuumReleaseWorkers).
+	 */
Oops. I'll fix it.
---
I'm not seeing any usage of this macro?

+/*
+ * RelationGetParallelAutovacuumWorkers
+ *		Returns the relation's parallel_autovacuum_workers reloption setting.
+ *		Note multiple eval of argument!
+ */
+#define RelationGetParallelAutovacuumWorkers(relation, defaultpw) \
+	((relation)->rd_options ? \
+	 ((StdRdOptions *) (relation)->rd_options)->autovacuum.parallel_autovacuum_workers : \
+	 (defaultpw))
+
Yes, this is a relic of a past implementation. I'll delete this macro.
I've made some tests and I can confirm that it is working correctly as
far as I can see. I think that it would be good to start including the
documentation changes, what do you think?
It sounds tempting :)
But perhaps first we should agree on the limitation of the
'autovacuum_max_parallel_workers' parameter.
Please see the v6 patches:
1) Fixed typos in autovacuum.c and postgresql.conf.sample
2) Removed unused macro 'RelationGetParallelAutovacuumWorkers'
--
Best regards,
Daniil Davydov
Attachments:
v6-0001-Parallel-index-autovacuum-with-bgworkers.patchtext/x-patch; charset=US-ASCII; name=v6-0001-Parallel-index-autovacuum-with-bgworkers.patchDownload
From 20ef6a60d7eb4bbfa2d3e36ff36301abb26e4622 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Fri, 16 May 2025 11:58:40 +0700
Subject: [PATCH v6 1/2] Parallel index autovacuum with bgworkers
---
src/backend/access/common/reloptions.c | 12 ++
src/backend/access/heap/vacuumlazy.c | 6 +-
src/backend/commands/vacuumparallel.c | 93 ++++++++---
src/backend/postmaster/autovacuum.c | 144 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc_tables.c | 10 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 4 +
src/include/utils/guc_hooks.h | 2 +
src/include/utils/rel.h | 2 +
11 files changed, 249 insertions(+), 27 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 50747c16396..e36d59f632b 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,16 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "parallel_autovacuum_workers",
+ "Maximum number of parallel autovacuum workers that can be taken from bgworkers pool for processing this table. "
+ "If value is 0 then parallel degree will computed based on number of indexes.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1872,6 +1882,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"parallel_autovacuum_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, parallel_autovacuum_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 09416450af9..b89b1563444 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -3493,6 +3493,10 @@ dead_items_alloc(LVRelState *vacrel, int nworkers)
autovacuum_work_mem != -1 ?
autovacuum_work_mem : maintenance_work_mem;
+ int elevel = AmAutoVacuumWorkerProcess() ||
+ vacrel->verbose ?
+ INFO : DEBUG2;
+
/*
* Initialize state for a parallel vacuum. As of now, only one worker can
* be used for an index, so we invoke parallelism only if there are at
@@ -3519,7 +3523,7 @@ dead_items_alloc(LVRelState *vacrel, int nworkers)
vacrel->pvs = parallel_vacuum_init(vacrel->rel, vacrel->indrels,
vacrel->nindexes, nworkers,
vac_work_mem,
- vacrel->verbose ? INFO : DEBUG2,
+ elevel,
vacrel->bstrategy);
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..bd314d23298 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,16 +1,16 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel [auto]vacuum execution.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
*
- * In a parallel vacuum, we perform both index bulk deletion and index cleanup
- * with parallel worker processes. Individual indexes are processed by one
- * vacuum process. ParallelVacuumState contains shared information as well as
- * the memory space for storing dead items allocated in the DSA area. We
- * launch parallel worker processes at the start of parallel index
+ * In a parallel [auto]vacuum, we perform both index bulk deletion and index
+ * cleanup with parallel worker processes. Individual indexes are processed by
+ * one [auto]vacuum process. ParallelVacuumState contains shared information
+ * as well as the memory space for storing dead items allocated in the DSA area.
+ * We launch parallel worker processes at the start of parallel index
* bulk-deletion and index cleanup and once all indexes are processed, the
* parallel worker processes exit. Each time we process indexes in parallel,
* the parallel context is re-initialized so that the same DSM can be used for
@@ -34,6 +34,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -157,7 +158,8 @@ typedef struct PVIndStats
} PVIndStats;
/*
- * Struct for maintaining a parallel vacuum state. typedef appears in vacuum.h.
+ * Struct for maintaining a parallel [auto]vacuum state. typedef appears in
+ * vacuum.h.
*/
struct ParallelVacuumState
{
@@ -371,10 +373,18 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->relid = RelationGetRelid(rel);
shared->elevel = elevel;
shared->queryid = pgstat_get_my_query_id();
- shared->maintenance_work_mem_worker =
- (nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+
+ if (AmAutoVacuumWorkerProcess())
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
+ autovacuum_work_mem;
+ else
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+ maintenance_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -435,6 +445,8 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
void
parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
{
+ int nlaunched_workers;
+
Assert(!IsParallelWorker());
/* Copy the updated statistics */
@@ -453,7 +465,13 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
TidStoreDestroy(pvs->dead_items);
+ nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */
DestroyParallelContext(pvs->pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(nlaunched_workers);
+
ExitParallelMode();
pfree(pvs->will_parallel_vacuum);
@@ -541,7 +559,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*
* nrequested is the number of parallel workers that user requested. If
* nrequested is 0, we compute the parallel degree based on nindexes, that is
- * the number of indexes that support parallel vacuum. This function also
+ * the number of indexes that support parallel [auto]vacuum. This function also
* sets will_parallel_vacuum to remember indexes that participate in parallel
* vacuum.
*/
@@ -558,7 +576,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster ||
+ (autovacuum_max_parallel_workers == 0 && AmAutoVacuumWorkerProcess()) ||
+ (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()))
return 0;
/*
@@ -597,15 +617,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = AmAutoVacuumWorkerProcess() ?
+ Min(parallel_workers, autovacuum_max_parallel_workers) :
+ Min(parallel_workers, max_parallel_maintenance_workers);
return parallel_workers;
}
/*
* Perform index vacuum or index cleanup with parallel workers. This function
- * must be used by the parallel vacuum leader process.
+ * must be used by the parallel [auto]vacuum leader process.
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -666,13 +688,26 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
/* Reset the parallel index processing and progress counters */
pg_atomic_write_u32(&(pvs->shared->idx), 0);
+ /* Check how many parallel workers autovacuum can provide. */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = ParallelAutoVacuumReserveWorkers(nworkers);
+
/* Setup the shared cost-based vacuum delay and launch workers */
if (nworkers > 0)
{
/* Reinitialize parallel context to relaunch parallel workers */
if (num_index_scans > 0)
+ {
ReinitializeParallelDSM(pvs->pcxt);
+ /*
+ * Release all launched (i.e. reserved) parallel autovacuum
+ * workers.
+ */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(pvs->pcxt->nworkers_launched);
+ }
+
/*
* Set up shared cost balance and the number of active workers for
* vacuum delay. We need to do this before launching workers as
@@ -690,6 +725,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ ParallelAutoVacuumReleaseWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -706,16 +751,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
if (vacuum)
ereport(pvs->shared->elevel,
- (errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
- "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+ (errmsg(ngettext("launched %d parallel %svacuum worker for index vacuuming (planned: %d)",
+ "launched %d parallel %svacuum workers for index vacuuming (planned: %d)",
pvs->pcxt->nworkers_launched),
- pvs->pcxt->nworkers_launched, nworkers)));
+ pvs->pcxt->nworkers_launched, AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers)));
else
ereport(pvs->shared->elevel,
- (errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
- "launched %d parallel vacuum workers for index cleanup (planned: %d)",
+ (errmsg(ngettext("launched %d parallel %svacuum worker for index cleanup (planned: %d)",
+ "launched %d parallel %svacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
- pvs->pcxt->nworkers_launched, nworkers)));
+ pvs->pcxt->nworkers_launched, AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers)));
}
/* Vacuum the indexes that can be processed by only leader process */
@@ -982,8 +1027,8 @@ parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
/*
* Perform work within a launched parallel process.
*
- * Since parallel vacuum workers perform only index vacuum or index cleanup,
- * we don't need to report progress information.
+ * Since parallel [auto]vacuum workers perform only index vacuum or index
+ * cleanup, we don't need to report progress information.
*/
void
parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 451fb90a610..9e8b00ae0cb 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -285,6 +285,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_available_parallel_workers the number of available parallel autovacuum
+ * workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +301,7 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_available_parallel_workers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -354,6 +357,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void check_parallel_av_gucs(int prev_max_parallel_workers);
@@ -753,7 +757,9 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev;
+ autovacuum_max_parallel_workers_prev = autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -769,6 +775,14 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ check_parallel_av_gucs(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -2847,8 +2861,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->parallel_autovacuum_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3329,6 +3347,72 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, the leader worker
+ * must call this function. It returns the number of parallel workers that
+ * actually can be launched and reserves (if any) these workers in global
+ * autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if caller
+ * will occupy all available workers.
+ */
+int
+ParallelAutoVacuumReserveWorkers(int nworkers)
+{
+ int can_launch;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ if (AutoVacuumShmem->av_available_parallel_workers < nworkers)
+ {
+ /* Provide as many workers as we can. */
+ can_launch = AutoVacuumShmem->av_available_parallel_workers;
+ AutoVacuumShmem->av_available_parallel_workers = 0;
+ }
+ else
+ {
+ /* OK, we can provide all requested workers. */
+ can_launch = nworkers;
+ AutoVacuumShmem->av_available_parallel_workers -= nworkers;
+ }
+ LWLockRelease(AutovacuumLock);
+
+ return can_launch;
+}
+
+/*
+ * When parallel autovacuum workers die, the leader worker must call this
+ * function to refresh the global autovacuum state. Thus, other leaders will
+ * be able to use these workers.
+ *
+ * 'nworkers' - how many workers caller wants to release.
+ */
+void
+ParallelAutoVacuumReleaseWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ AutoVacuumShmem->av_available_parallel_workers += nworkers;
+
+ /*
+ * If autovacuum_max_parallel_workers variable was reduced during parallel
+ * autovacuum execution, we must cap available workers number by its new
+ * value.
+ */
+ if (AutoVacuumShmem->av_available_parallel_workers >
+ autovacuum_max_parallel_workers)
+ {
+ AutoVacuumShmem->av_available_parallel_workers =
+ autovacuum_max_parallel_workers;
+ }
+
+ LWLockRelease(AutovacuumLock);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3389,6 +3473,8 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_available_parallel_workers =
+ autovacuum_max_parallel_workers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3439,6 +3525,15 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+bool
+check_autovacuum_max_parallel_workers(int *newval, void **extra,
+ GucSource source)
+{
+ if (*newval >= max_worker_processes)
+ return false;
+ return true;
+}
+
/*
* Returns whether there is a free autovacuum worker slot available.
*/
@@ -3470,3 +3565,48 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that number of available parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+check_parallel_av_gucs(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ if (AutoVacuumShmem->av_available_parallel_workers >
+ autovacuum_max_parallel_workers)
+ {
+ Assert(prev_max_parallel_workers > autovacuum_max_parallel_workers);
+
+ /*
+ * Number of available workers must not exceed limit.
+ *
+ * Note, that if some parallel autovacuum workers are running at this
+ * moment, available workers number will not exceed limit after
+ * releasing them (see ParallelAutoVacuumReleaseWorkers).
+ */
+ AutoVacuumShmem->av_available_parallel_workers =
+ autovacuum_max_parallel_workers;
+ }
+ else if ((AutoVacuumShmem->av_available_parallel_workers <
+ autovacuum_max_parallel_workers) &&
+ (autovacuum_max_parallel_workers > prev_max_parallel_workers))
+ {
+ /*
+ * If user wants to increase number of parallel autovacuum workers, we
+ * must increase number of available workers in shmem.
+ */
+ AutoVacuumShmem->av_available_parallel_workers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+
+ /*
+ * Nothing to do when autovacuum_max_parallel_workers <
+ * prev_max_parallel_workers. Available workers number will be capped
+ * inside ParallelAutoVacuumReleaseWorkers.
+ */
+ }
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..977644978c1 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index f04bfedb2fd..be76263c431 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3604,6 +3604,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM,
+ gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."),
+ gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."),
+ },
+ &autovacuum_max_parallel_workers,
+ 0, 0, MAX_BACKENDS,
+ check_autovacuum_max_parallel_workers, NULL, NULL
+ },
+
{
{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
gettext_noop("Sets the maximum number of parallel processes per maintenance operation."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 341f88adc87..3fbcbf8ef4f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -683,6 +683,7 @@
autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..85926415657 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..b5763e6ac36 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,6 +64,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int ParallelAutoVacuumReserveWorkers(int nworkers);
+extern void ParallelAutoVacuumReleaseWorkers(int nworkers);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 799fa7ace68..5c66f37cd53 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -31,6 +31,8 @@ extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
+extern bool check_autovacuum_max_parallel_workers(int *newval, void **extra,
+ GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
GucSource source);
extern bool check_backtrace_functions(char **newval, void **extra,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..29c32f75780 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,8 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+ int parallel_autovacuum_workers; /* max number of parallel
+ autovacuum workers */
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
v6-0002-Sandbox-for-parallel-index-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v6-0002-Sandbox-for-parallel-index-autovacuum.patchDownload
From 6164c2cd633e9f3f95682e02d819c890519eef7c Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Fri, 16 May 2025 11:59:03 +0700
Subject: [PATCH v6 2/2] Sandbox for parallel index autovacuum
---
src/test/modules/Makefile | 1 +
src/test/modules/autovacuum/.gitignore | 1 +
src/test/modules/autovacuum/Makefile | 14 ++
src/test/modules/autovacuum/meson.build | 12 ++
.../autovacuum/t/001_autovac_parallel.pl | 131 ++++++++++++++++++
src/test/modules/meson.build | 1 +
6 files changed, 160 insertions(+)
create mode 100644 src/test/modules/autovacuum/.gitignore
create mode 100644 src/test/modules/autovacuum/Makefile
create mode 100644 src/test/modules/autovacuum/meson.build
create mode 100644 src/test/modules/autovacuum/t/001_autovac_parallel.pl
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index aa1d27bbed3..b7f3e342e82 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -5,6 +5,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
SUBDIRS = \
+ autovacuum \
brin \
commit_ts \
delay_execution \
diff --git a/src/test/modules/autovacuum/.gitignore b/src/test/modules/autovacuum/.gitignore
new file mode 100644
index 00000000000..0b54641bceb
--- /dev/null
+++ b/src/test/modules/autovacuum/.gitignore
@@ -0,0 +1 @@
+/tmp_check/
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/Makefile b/src/test/modules/autovacuum/Makefile
new file mode 100644
index 00000000000..90c00ff350b
--- /dev/null
+++ b/src/test/modules/autovacuum/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/autovacuum/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/meson.build b/src/test/modules/autovacuum/meson.build
new file mode 100644
index 00000000000..f91c1a14d2b
--- /dev/null
+++ b/src/test/modules/autovacuum/meson.build
@@ -0,0 +1,12 @@
+# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_autovac_parallel.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/autovacuum/t/001_autovac_parallel.pl b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
new file mode 100644
index 00000000000..ae892e5b4de
--- /dev/null
+++ b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
@@ -0,0 +1,131 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+$node->append_conf('postgresql.conf', qq{
+ autovacuum = off
+ max_wal_size = 4096
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 10
+ log_min_messages = info
+});
+$node->start;
+
+my $indexes_num = 80;
+my $initial_rows_num = 100_000;
+my $parallel_autovacuum_workers = 5;
+
+# Create big table and create specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER, col_5 INTEGER,
+ col_6 INTEGER, col_7 INTEGER, col_8 INTEGER, col_9 INTEGER, col_10 INTEGER,
+ col_11 INTEGER, col_12 INTEGER, col_13 INTEGER, col_14 INTEGER, col_15 INTEGER,
+ col_16 INTEGER, col_17 INTEGER, col_18 INTEGER, col_19 INTEGER, col_20 INTEGER,
+ col_21 INTEGER, col_22 INTEGER, col_23 INTEGER, col_24 INTEGER, col_25 INTEGER,
+ col_26 INTEGER, col_27 INTEGER, col_28 INTEGER, col_29 INTEGER, col_30 INTEGER,
+ col_31 INTEGER, col_32 INTEGER, col_33 INTEGER, col_34 INTEGER, col_35 INTEGER,
+ col_36 INTEGER, col_37 INTEGER, col_38 INTEGER, col_39 INTEGER, col_40 INTEGER,
+ col_41 INTEGER, col_42 INTEGER, col_43 INTEGER, col_44 INTEGER, col_45 INTEGER,
+ col_46 INTEGER, col_47 INTEGER, col_48 INTEGER, col_49 INTEGER, col_50 INTEGER,
+ col_51 INTEGER, col_52 INTEGER, col_53 INTEGER, col_54 INTEGER, col_55 INTEGER,
+ col_56 INTEGER, col_57 INTEGER, col_58 INTEGER, col_59 INTEGER, col_60 INTEGER,
+ col_61 INTEGER, col_62 INTEGER, col_63 INTEGER, col_64 INTEGER, col_65 INTEGER,
+ col_66 INTEGER, col_67 INTEGER, col_68 INTEGER, col_69 INTEGER, col_70 INTEGER,
+ col_71 INTEGER, col_72 INTEGER, col_73 INTEGER, col_74 INTEGER, col_75 INTEGER,
+ col_76 INTEGER, col_77 INTEGER, col_78 INTEGER, col_79 INTEGER, col_80 INTEGER,
+ col_81 INTEGER, col_82 INTEGER, col_83 INTEGER, col_84 INTEGER, col_85 INTEGER,
+ col_86 INTEGER, col_87 INTEGER, col_88 INTEGER, col_89 INTEGER, col_90 INTEGER,
+ col_91 INTEGER, col_92 INTEGER, col_93 INTEGER, col_94 INTEGER, col_95 INTEGER,
+ col_96 INTEGER, col_97 INTEGER, col_98 INTEGER, col_99 INTEGER, col_100 INTEGER
+ ) WITH (parallel_autovacuum_workers = $parallel_autovacuum_workers);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM pg_index i
+ JOIN pg_class c ON c.oid = i.indrelid
+ WHERE c.relname = 'test_autovac';",
+ stdout => \$psql_out
+);
+is($psql_out, $indexes_num + 1, "All indexes created successfully");
+
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac (
+ col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8, col_9, col_10,
+ col_11, col_12, col_13, col_14, col_15, col_16, col_17, col_18, col_19, col_20,
+ col_21, col_22, col_23, col_24, col_25, col_26, col_27, col_28, col_29, col_30,
+ col_31, col_32, col_33, col_34, col_35, col_36, col_37, col_38, col_39, col_40,
+ col_41, col_42, col_43, col_44, col_45, col_46, col_47, col_48, col_49, col_50,
+ col_51, col_52, col_53, col_54, col_55, col_56, col_57, col_58, col_59, col_60,
+ col_61, col_62, col_63, col_64, col_65, col_66, col_67, col_68, col_69, col_70,
+ col_71, col_72, col_73, col_74, col_75, col_76, col_77, col_78, col_79, col_80,
+ col_81, col_82, col_83, col_84, col_85, col_86, col_87, col_88, col_89, col_90,
+ col_91, col_92, col_93, col_94, col_95, col_96, col_97, col_98, col_99, col_100
+ ) VALUES (
+ i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7, i + 8, i + 9,
+ i + 10, i + 11, i + 12, i + 13, i + 14, i + 15, i + 16, i + 17, i + 18, i + 19,
+ i + 20, i + 21, i + 22, i + 23, i + 24, i + 25, i + 26, i + 27, i + 28, i + 29,
+ i + 30, i + 31, i + 32, i + 33, i + 34, i + 35, i + 36, i + 37, i + 38, i + 39,
+ i + 40, i + 41, i + 42, i + 43, i + 44, i + 45, i + 46, i + 47, i + 48, i + 49,
+ i + 50, i + 51, i + 52, i + 53, i + 54, i + 55, i + 56, i + 57, i + 58, i + 59,
+ i + 60, i + 61, i + 62, i + 63, i + 64, i + 65, i + 66, i + 67, i + 68, i + 69,
+ i + 70, i + 71, i + 72, i + 73, i + 74, i + 75, i + 76, i + 77, i + 78, i + 79,
+ i + 80, i + 81, i + 82, i + 83, i + 84, i + 85, i + 86, i + 87, i + 88, i + 89,
+ i + 90, i + 91, i + 92, i + 93, i + 94, i + 95, i + 96, i + 97, i + 98, i + 99
+ );
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM test_autovac;",
+ stdout => \$psql_out
+);
+is($psql_out, $initial_rows_num, "All data inserted into table successfully");
+
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Reduce autovacuum_work_mem, so the leader process will perform the parallel
+# index vacuum phase several times
+$node->append_conf('postgresql.conf', qq{
+ autovacuum_naptime = '1s'
+ autovacuum_vacuum_threshold = 1
+ autovacuum_analyze_threshold = 1
+ autovacuum_vacuum_scale_factor = 0.1
+ autovacuum_analyze_scale_factor = 0.1
+ autovacuum = on
+});
+
+$node->restart;
+
+# sleep(3600);
+
+ok(1, "There are no segfaults");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 9de0057bd1d..7f2ad810ca0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -1,5 +1,6 @@
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+subdir('autovacuum')
subdir('brin')
subdir('commit_ts')
subdir('delay_execution')
--
2.43.0
On Sun Jul 6, 2025 at 5:00 AM -03, Daniil Davydov wrote:
The "autovacuum_max_parallel_workers" declared on guc_tables.c mention that is capped by "max_worker_process": + { + {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM, + gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."), + gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."), + }, + &autovacuum_max_parallel_workers, + 0, 0, MAX_BACKENDS, + check_autovacuum_max_parallel_workers, NULL, NULL + },IIUC the code, it cap by "max_worker_process", but Masahiko has mention
on [1] that it should be capped by max_parallel_workers.To be honest, I don't think that this parameter should be explicitly
capped at all.
Other parallel operations (for example parallel index build or VACUUM
PARALLEL) just request as many workers as they want without looking at
'max_parallel_workers'.
And they will not complain if not all requested workers were launched.

Thus, even if 'autovacuum_max_parallel_workers' is higher than
'max_parallel_workers' the worst that can happen is that not all
requested workers will be running (which is a common situation).
Users can handle it by looking for logs like "planned vs. launched"
and increasing 'max_parallel_workers' if needed.

On the other hand, it obviously doesn't make sense to request more
workers than 'max_worker_processes' (moreover, this parameter cannot
be changed as easily as 'max_parallel_workers').

I will keep the 'max_worker_processes' limit, so autovacuum will not
waste time initializing a parallel context if there is no chance that
the request will succeed.
But it's worth remembering that actually the
'autovacuum_max_parallel_workers' parameter will always be implicitly
capped by 'max_parallel_workers'.

What do you think about it?
It makes sense to me. The main benefit that I see in capping the
autovacuum_max_parallel_workers parameter is that users will see an
"invalid value for parameter "autovacuum_max_parallel_workers"" error in
the logs instead of needing to search for "planned vs. launched", which can
be tricky if log_min_messages is not set to at least the info level (the
default warning level will not show this log message). If we decide not
to cap this in code, I think it would at least be good to mention this
in the documentation.
I've made some tests and I can confirm that it is working correctly as
far as I can see. I think it would be good to start including the
documentation changes, what do you think?

It sounds tempting :)
But perhaps first we should agree on the limitation of the
'autovacuum_max_parallel_workers' parameter.
Agree
Please see the v6 patches:
1) Fixed typos in autovacuum.c and postgresql.conf.sample
2) Removed unused macro 'RelationGetParallelAutovacuumWorkers'
Thanks!
--
Matheus Alcantara
Hi,
On Tue, Jul 8, 2025 at 10:20 PM Matheus Alcantara
<matheusssilv97@gmail.com> wrote:
On Sun Jul 6, 2025 at 5:00 AM -03, Daniil Davydov wrote:
I will keep the 'max_worker_processes' limit, so autovacuum will not
waste time initializing a parallel context if there is no chance that
the request will succeed.
But it's worth remembering that actually the
'autovacuum_max_parallel_workers' parameter will always be implicitly
capped by 'max_parallel_workers'.

What do you think about it?
It makes sense to me. The main benefit that I see in capping the
autovacuum_max_parallel_workers parameter is that users will see an
"invalid value for parameter "autovacuum_max_parallel_workers"" error in
the logs instead of needing to search for "planned vs. launched", which can
be tricky if log_min_messages is not set to at least the info level (the
default warning level will not show this log message).
I think I can refer to (for example) the 'max_parallel_workers_per_gather'
parameter, which allows setting values higher than 'max_parallel_workers'
without throwing an error or warning.
'autovacuum_max_parallel_workers' will behave the same way.
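For example, this is already accepted today without any complaint, even
though it exceeds the default max_parallel_workers of 8:

    max_parallel_workers_per_gather = 32    # no error or warning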
If we decide not to cap this in code, I think it would at least be good to
mention this in the documentation.

Sure, it is worth noting in the documentation.
--
Best regards,
Daniil Davydov
On Sun, Jul 6, 2025 at 5:00 PM Daniil Davydov <3danissimo@gmail.com> wrote:
Hi,
On Fri, Jul 4, 2025 at 9:21 PM Matheus Alcantara
<matheusssilv97@gmail.com> wrote:

The "autovacuum_max_parallel_workers" declared in guc_tables.c mentions that it is capped by "max_worker_processes":

+ {
+ {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM,
+ gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."),
+ gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."),
+ },
+ &autovacuum_max_parallel_workers,
+ 0, 0, MAX_BACKENDS,
+ check_autovacuum_max_parallel_workers, NULL, NULL
+ },

IIUC the code, it is capped by "max_worker_processes", but Masahiko has mentioned
in [1] that it should be capped by max_parallel_workers.

Thanks for looking into it!
To be honest, I don't think that this parameter should be explicitly
capped at all.
Other parallel operations (for example parallel index build or VACUUM
PARALLEL) just request as many workers as they want without looking at
'max_parallel_workers'.
And they will not complain if not all requested workers were launched.

Thus, even if 'autovacuum_max_parallel_workers' is higher than
'max_parallel_workers' the worst that can happen is that not all
requested workers will be running (which is a common situation).
Users can handle it by looking for logs like "planned vs. launched"
and increasing 'max_parallel_workers' if needed.

On the other hand, it obviously doesn't make sense to request more
workers than 'max_worker_processes' (moreover, this parameter cannot
be changed as easily as 'max_parallel_workers').

I will keep the 'max_worker_processes' limit, so autovacuum will not
waste time initializing a parallel context if there is no chance that
the request will succeed.
But it's worth remembering that actually the
'autovacuum_max_parallel_workers' parameter will always be implicitly
capped by 'max_parallel_workers'.

What do you think about it?

But the postgresql.conf.sample says that it is limited by
max_parallel_workers:

+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_parallel_workers

Good catch, I'll fix it.
---
We are actually capping autovacuum_max_parallel_workers by
max_worker_processes - 1, so we can't have 10 max_worker_processes and 10
autovacuum_max_parallel_workers. Is that correct?

Yep. The explanation can be found just above in this letter.
---
Could locking the AutovacuumLock unnecessarily, when none of the if's is
true, cause some performance issue here? I don't think that this would be a
serious problem because this code will only be called if the
configuration file is changed during the autovacuum execution, right? But
I could be wrong, so just sharing my thoughts on this (still learning
about [auto]vacuum code).

+
+/*
+ * Make sure that number of available parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+check_parallel_av_gucs(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ if (AutoVacuumShmem->av_available_parallel_workers >
+ autovacuum_max_parallel_workers)
+ {
+ Assert(prev_max_parallel_workers > autovacuum_max_parallel_workers);
+

This function may be called by the a/v launcher when we already have some
a/v workers running. A/v workers can change the
AutoVacuumShmem->av_available_parallel_workers value, so I think we
should acquire the appropriate lock before reading it.

Typo on "exeed":

+ /*
+ * Number of available workers must not exeed limit.
+ *
+ * Note, that if some parallel autovacuum workers are running at this
+ * moment, available workers number will not exeed limit after releasing
+ * them (see ParallelAutoVacuumReleaseWorkers).
+ */

Oops. I'll fix it.
---
I'm not seeing a usage of this macro?

+/*
+ * RelationGetParallelAutovacuumWorkers
+ * Returns the relation's parallel_autovacuum_workers reloption setting.
+ * Note multiple eval of argument!
+ */
+#define RelationGetParallelAutovacuumWorkers(relation, defaultpw) \
+ ((relation)->rd_options ? \
+ ((StdRdOptions *) (relation)->rd_options)->autovacuum.parallel_autovacuum_workers : \
+ (defaultpw))
+

Yes, this is a relic of a past implementation. I'll delete this macro.
I've made some tests and I can confirm that it is working correctly as
far as I can see. I think it would be good to start including the
documentation changes, what do you think?

It sounds tempting :)
But perhaps first we should agree on the limitation of the
'autovacuum_max_parallel_workers' parameter.

Please see the v6 patches:
1) Fixed typos in autovacuum.c and postgresql.conf.sample
2) Removed unused macro 'RelationGetParallelAutovacuumWorkers'
Thank you for updating the patch! Here are some review comments:
---
- shared->maintenance_work_mem_worker =
- (nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+
+ if (AmAutoVacuumWorkerProcess())
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
+ autovacuum_work_mem;
+ else
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+ maintenance_work_mem;
Since we have similar code in dead_items_alloc(), I think it's better
to follow it:
int vac_work_mem = AmAutoVacuumWorkerProcess() &&
autovacuum_work_mem != -1 ?
autovacuum_work_mem : maintenance_work_mem;
That is, we calculate vac_work_mem first and then calculate
shared->maintenance_work_mem_worker. I think it's more straightforward
as the formula of maintenance_work_mem_worker is the same whereas the
amount of memory used for vacuum and autovacuum varies.
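So the shared value would then be computed from it, roughly:

    shared->maintenance_work_mem_worker =
        (nindexes_mwm > 0) ?
        vac_work_mem / Min(parallel_workers, nindexes_mwm) :
        vac_work_mem;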
---
+ nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */
DestroyParallelContext(pvs->pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(nlaunched_workers);
+
Why don't we release workers before destroying the parallel context?
---
@@ -558,7 +576,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster ||
+ (autovacuum_max_parallel_workers == 0 && AmAutoVacuumWorkerProcess()) ||
+ (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()))
return 0;
/*
@@ -597,15 +617,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = AmAutoVacuumWorkerProcess() ?
+ Min(parallel_workers, autovacuum_max_parallel_workers) :
+ Min(parallel_workers, max_parallel_maintenance_workers);
return parallel_workers;
How about calculating the maximum number of workers once and using it
in both places above?
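I.e., compute it once at the top and use it for both the early return and
the cap, something like:

    int max_workers = AmAutoVacuumWorkerProcess() ?
        autovacuum_max_parallel_workers :
        max_parallel_maintenance_workers;

    if (!IsUnderPostmaster || max_workers == 0)
        return 0;
    ...
    parallel_workers = Min(parallel_workers, max_workers);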
---
+ /* Check how many workers can provide autovacuum. */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = ParallelAutoVacuumReserveWorkers(nworkers);
+
I think it's better to move this code to right after setting "nworkers
= Min(nworkers, pvs->pcxt->nworkers);" as it's a more related code.
The comment needs to be updated as it doesn't match what the function
actually does (i.e. reserving the workers).
---
/* Reinitialize parallel context to relaunch parallel workers */
if (num_index_scans > 0)
+ {
ReinitializeParallelDSM(pvs->pcxt);
+ /*
+ * Release all launched (i.e. reserved) parallel autovacuum
+ * workers.
+ */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(pvs->pcxt->nworkers_launched);
+ }
Why do we need to release all workers here? If there is a reason, we
should mention it as a comment.
---
@@ -706,16 +751,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan

if (vacuum)
ereport(pvs->shared->elevel,
- (errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
- "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+ (errmsg(ngettext("launched %d parallel %svacuum worker for index vacuuming (planned: %d)",
+ "launched %d parallel %svacuum workers for index vacuuming (planned: %d)",
pvs->pcxt->nworkers_launched),
- pvs->pcxt->nworkers_launched, nworkers)));
+ pvs->pcxt->nworkers_launched, AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers)));
The "%svacuum" part doesn't work in terms of translation. We need to
construct the whole sentence instead. But do we need this log message
change in the first place? IIUC autovacuums write logs only when the
execution time exceeds the log_autovacuum_min_duration (or its
reloption). The patch unconditionally sets LOG level for autovacuums
but I'm not sure it's consistent with other autovacuum logging
behavior:
+ int elevel = AmAutoVacuumWorkerProcess() ||
+ vacrel->verbose ?
+ INFO : DEBUG2;
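For the translation part, a sketch of one translatable arrangement (not
tested; it simply branches on the process type and keeps each message
whole):

    if (AmAutoVacuumWorkerProcess())
        ereport(pvs->shared->elevel,
                (errmsg(ngettext("launched %d parallel autovacuum worker for index vacuuming (planned: %d)",
                                 "launched %d parallel autovacuum workers for index vacuuming (planned: %d)",
                                 pvs->pcxt->nworkers_launched),
                        pvs->pcxt->nworkers_launched, nworkers)));
    else
        ereport(pvs->shared->elevel,
                (errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
                                 "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
                                 pvs->pcxt->nworkers_launched),
                        pvs->pcxt->nworkers_launched, nworkers)));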
---
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel [auto]vacuum execution.
The patch includes the change of "vacuum" -> "[auto]vacuum" in many
places. While I think we need to mention that vacuumparallel.c
supports autovacuums, I'm not sure we really need all of them. If we
accept this style, we would require all subsequent changes to
follow it, which could increase maintenance costs.
---
@@ -299,6 +301,7 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_available_parallel_workers;
Other field names seem to have consistent naming rules; 'av_' prefix
followed by name in camel case. So how about renaming it to
av_freeParallelWorkers or something along those lines?
---
+int
+ParallelAutoVacuumReserveWorkers(int nworkers)
+{
Other exposed functions have "AutoVacuum" prefix, so how about
renaming it to AutoVacuumReserveParallelWorkers() or something along
those lines?
---
+ if (AutoVacuumShmem->av_available_parallel_workers < nworkers)
+ {
+ /* Provide as many workers as we can. */
+ can_launch = AutoVacuumShmem->av_available_parallel_workers;
+ AutoVacuumShmem->av_available_parallel_workers = 0;
+ }
+ else
+ {
+ /* OK, we can provide all requested workers. */
+ can_launch = nworkers;
+ AutoVacuumShmem->av_available_parallel_workers -= nworkers;
+ }
Can we simplify this logic as follows?
can_launch = Min(AutoVacuumShmem->av_available_parallel_workers, nworkers);
AutoVacuumShmem->av_available_parallel_workers -= can_launch;
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Hi,
On Mon, Jul 14, 2025 at 2:10 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
---

- shared->maintenance_work_mem_worker =
- (nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+
+ if (AmAutoVacuumWorkerProcess())
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
+ autovacuum_work_mem;
+ else
+ shared->maintenance_work_mem_worker =
+ (nindexes_mwm > 0) ?
+ maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+ maintenance_work_mem;

Since we have similar code in dead_items_alloc(), I think it's better
to follow it:

int vac_work_mem = AmAutoVacuumWorkerProcess() &&
autovacuum_work_mem != -1 ?
autovacuum_work_mem : maintenance_work_mem;

That is, we calculate vac_work_mem first and then calculate
shared->maintenance_work_mem_worker. I think it's more straightforward
as the formula of maintenance_work_mem_worker is the same whereas the
amount of memory used for vacuum and autovacuum varies.
I was confused by the fact that initially maintenance_work_mem was used
for the calculation, not vac_work_mem. I agree that we should rather use
the already calculated vac_work_mem value.
---

+ nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */
DestroyParallelContext(pvs->pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(nlaunched_workers);
+

Why don't we release workers before destroying the parallel context?
Destroying the parallel context includes waiting for all workers to exit
(after which, other operations can use them).
If we first call ParallelAutoVacuumReleaseWorkers, some operation could
reasonably request all released workers. But that request can fail,
because there is no guarantee that the workers have managed to finish.
Actually, there's nothing wrong with that, but I think releasing workers
only after finishing work is a more logical approach.
---

@@ -558,7 +576,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster ||
+ (autovacuum_max_parallel_workers == 0 && AmAutoVacuumWorkerProcess()) ||
+ (max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()))
return 0;

/*
@@ -597,15 +617,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;

- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = AmAutoVacuumWorkerProcess() ?
+ Min(parallel_workers, autovacuum_max_parallel_workers) :
+ Min(parallel_workers, max_parallel_maintenance_workers);

return parallel_workers;

How about calculating the maximum number of workers once and using it
in both places above?
Agree. Good idea.
---

+ /* Check how many workers can provide autovacuum. */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = ParallelAutoVacuumReserveWorkers(nworkers);
+

I think it's better to move this code to right after setting "nworkers
= Min(nworkers, pvs->pcxt->nworkers);" as it's more closely related code.

The comment needs to be updated as it doesn't match what the function
actually does (i.e. reserving the workers).
You are right, I'll fix it.
---
/* Reinitialize parallel context to relaunch parallel workers */
if (num_index_scans > 0)
+ {
ReinitializeParallelDSM(pvs->pcxt);
+
+ /*
+ * Release all launched (i.e. reserved) parallel autovacuum
+ * workers.
+ */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(pvs->pcxt->nworkers_launched);
+ }

Why do we need to release all workers here? If there is a reason, we
should mention it as a comment.
Hm, I guess it was left over from previous patch versions. Actually
we don't need to release workers here, as we will try to launch them
again immediately. It is a bug, thank you for noticing it.
---

@@ -706,16 +751,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan

if (vacuum)
ereport(pvs->shared->elevel,
- (errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
- "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+ (errmsg(ngettext("launched %d parallel %svacuum worker for index vacuuming (planned: %d)",
+ "launched %d parallel %svacuum workers for index vacuuming (planned: %d)",
pvs->pcxt->nworkers_launched),
- pvs->pcxt->nworkers_launched, nworkers)));
+ pvs->pcxt->nworkers_launched, AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers)));

The "%svacuum" part doesn't work in terms of translation. We need to
construct the whole sentence instead.
But do we need this log message
change in the first place? IIUC autovacuums write logs only when the
execution time exceeds the log_autovacuum_min_duration (or its
reloption). The patch unconditionally sets LOG level for autovacuums
but I'm not sure it's consistent with other autovacuum logging
behavior:

+ int elevel = AmAutoVacuumWorkerProcess() ||
+ vacrel->verbose ?
+ INFO : DEBUG2;
This log level is used only "for messages about parallel workers launched".
I think that such logs relate more to the parallel workers module than to
autovacuum itself. Moreover, if we emit the "planned vs. launched" log each
time, it will simplify the task of selecting the optimal value of the
'autovacuum_max_parallel_workers' parameter. What do you think?
About "%svacuum" - I guess we need to clarify what exactly the workers
were launched for. I'll add an errhint to this log, but I don't know whether
such an approach is acceptable.
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel [auto]vacuum execution.

The patch includes the change of "vacuum" -> "[auto]vacuum" in many
places. While I think we need to mention that vacuumparallel.c
supports autovacuums I'm not sure we really need all of them. If we
accept this style, we would require for all subsequent changes to
follow it, which could increase maintenance costs.
Agree. I'll leave a comment saying that vacuumparallel also supports
parallel autovacuum. All other changes like "[auto]vacuum" will be deleted.
---

@@ -299,6 +301,7 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_available_parallel_workers;

Other field names seem to have consistent naming rules; 'av_' prefix
followed by a name in camel case. So how about renaming it to
av_freeParallelWorkers or something along those lines?

---

+int
+ParallelAutoVacuumReserveWorkers(int nworkers)
+{

Other exposed functions have the "AutoVacuum" prefix, so how about
renaming it to AutoVacuumReserveParallelWorkers() or something along
those lines?
Agreeing with both comments, I'll rename the structure field and functions.
---

+ if (AutoVacuumShmem->av_available_parallel_workers < nworkers)
+ {
+ /* Provide as many workers as we can. */
+ can_launch = AutoVacuumShmem->av_available_parallel_workers;
+ AutoVacuumShmem->av_available_parallel_workers = 0;
+ }
+ else
+ {
+ /* OK, we can provide all requested workers. */
+ can_launch = nworkers;
+ AutoVacuumShmem->av_available_parallel_workers -= nworkers;
+ }

Can we simplify this logic as follows?
can_launch = Min(AutoVacuumShmem->av_available_parallel_workers, nworkers);
AutoVacuumShmem->av_available_parallel_workers -= can_launch;
Sure, I'll simplify it.
---
Thank you very much for your comments! Please see the v7 patches:
1) Renamed a few functions and variables + got rid of comments like
"[auto]vacuum" in vacuumparallel.c
2) Simplified logic in the 'parallel_vacuum_init' and
'AutoVacuumReserveParallelWorkers' functions
3) Refactored and fixed a bug in the 'parallel_vacuum_process_all_indexes' function
4) Changed the "planned vs. launched" logging so it can be translated
5) Rebased onto the newest commit on the master branch
--
Best regards,
Daniil Davydov
Attachments:
v7-0002-Sandbox-for-parallel-index-autovacuum.patch (text/x-patch; charset=US-ASCII)
From 7af255b4d0a5e7927f6a1c212c4b2342d6b044a7 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Fri, 16 May 2025 11:59:03 +0700
Subject: [PATCH v7 2/2] Sandbox for parallel index autovacuum
---
src/test/modules/Makefile | 1 +
src/test/modules/autovacuum/.gitignore | 1 +
src/test/modules/autovacuum/Makefile | 14 ++
src/test/modules/autovacuum/meson.build | 12 ++
.../autovacuum/t/001_autovac_parallel.pl | 131 ++++++++++++++++++
src/test/modules/meson.build | 1 +
6 files changed, 160 insertions(+)
create mode 100644 src/test/modules/autovacuum/.gitignore
create mode 100644 src/test/modules/autovacuum/Makefile
create mode 100644 src/test/modules/autovacuum/meson.build
create mode 100644 src/test/modules/autovacuum/t/001_autovac_parallel.pl
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index aa1d27bbed3..b7f3e342e82 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -5,6 +5,7 @@ top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
SUBDIRS = \
+ autovacuum \
brin \
commit_ts \
delay_execution \
diff --git a/src/test/modules/autovacuum/.gitignore b/src/test/modules/autovacuum/.gitignore
new file mode 100644
index 00000000000..0b54641bceb
--- /dev/null
+++ b/src/test/modules/autovacuum/.gitignore
@@ -0,0 +1 @@
+/tmp_check/
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/Makefile b/src/test/modules/autovacuum/Makefile
new file mode 100644
index 00000000000..90c00ff350b
--- /dev/null
+++ b/src/test/modules/autovacuum/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/autovacuum/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/meson.build b/src/test/modules/autovacuum/meson.build
new file mode 100644
index 00000000000..f91c1a14d2b
--- /dev/null
+++ b/src/test/modules/autovacuum/meson.build
@@ -0,0 +1,12 @@
+# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+
+tests += {
+ 'name': 'autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_autovac_parallel.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/autovacuum/t/001_autovac_parallel.pl b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
new file mode 100644
index 00000000000..ae892e5b4de
--- /dev/null
+++ b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
@@ -0,0 +1,131 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+$node->append_conf('postgresql.conf', qq{
+ autovacuum = off
+ max_wal_size = 4096
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 10
+ log_min_messages = info
+});
+$node->start;
+
+my $indexes_num = 80;
+my $initial_rows_num = 100_000;
+my $parallel_autovacuum_workers = 5;
+
+# Create big table and create specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER, col_5 INTEGER,
+ col_6 INTEGER, col_7 INTEGER, col_8 INTEGER, col_9 INTEGER, col_10 INTEGER,
+ col_11 INTEGER, col_12 INTEGER, col_13 INTEGER, col_14 INTEGER, col_15 INTEGER,
+ col_16 INTEGER, col_17 INTEGER, col_18 INTEGER, col_19 INTEGER, col_20 INTEGER,
+ col_21 INTEGER, col_22 INTEGER, col_23 INTEGER, col_24 INTEGER, col_25 INTEGER,
+ col_26 INTEGER, col_27 INTEGER, col_28 INTEGER, col_29 INTEGER, col_30 INTEGER,
+ col_31 INTEGER, col_32 INTEGER, col_33 INTEGER, col_34 INTEGER, col_35 INTEGER,
+ col_36 INTEGER, col_37 INTEGER, col_38 INTEGER, col_39 INTEGER, col_40 INTEGER,
+ col_41 INTEGER, col_42 INTEGER, col_43 INTEGER, col_44 INTEGER, col_45 INTEGER,
+ col_46 INTEGER, col_47 INTEGER, col_48 INTEGER, col_49 INTEGER, col_50 INTEGER,
+ col_51 INTEGER, col_52 INTEGER, col_53 INTEGER, col_54 INTEGER, col_55 INTEGER,
+ col_56 INTEGER, col_57 INTEGER, col_58 INTEGER, col_59 INTEGER, col_60 INTEGER,
+ col_61 INTEGER, col_62 INTEGER, col_63 INTEGER, col_64 INTEGER, col_65 INTEGER,
+ col_66 INTEGER, col_67 INTEGER, col_68 INTEGER, col_69 INTEGER, col_70 INTEGER,
+ col_71 INTEGER, col_72 INTEGER, col_73 INTEGER, col_74 INTEGER, col_75 INTEGER,
+ col_76 INTEGER, col_77 INTEGER, col_78 INTEGER, col_79 INTEGER, col_80 INTEGER,
+ col_81 INTEGER, col_82 INTEGER, col_83 INTEGER, col_84 INTEGER, col_85 INTEGER,
+ col_86 INTEGER, col_87 INTEGER, col_88 INTEGER, col_89 INTEGER, col_90 INTEGER,
+ col_91 INTEGER, col_92 INTEGER, col_93 INTEGER, col_94 INTEGER, col_95 INTEGER,
+ col_96 INTEGER, col_97 INTEGER, col_98 INTEGER, col_99 INTEGER, col_100 INTEGER
+ ) WITH (parallel_autovacuum_workers = $parallel_autovacuum_workers);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM pg_index i
+ JOIN pg_class c ON c.oid = i.indrelid
+ WHERE c.relname = 'test_autovac';",
+ stdout => \$psql_out
+);
+is($psql_out, $indexes_num + 1, "All indexes created successfully");
+
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac (
+ col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8, col_9, col_10,
+ col_11, col_12, col_13, col_14, col_15, col_16, col_17, col_18, col_19, col_20,
+ col_21, col_22, col_23, col_24, col_25, col_26, col_27, col_28, col_29, col_30,
+ col_31, col_32, col_33, col_34, col_35, col_36, col_37, col_38, col_39, col_40,
+ col_41, col_42, col_43, col_44, col_45, col_46, col_47, col_48, col_49, col_50,
+ col_51, col_52, col_53, col_54, col_55, col_56, col_57, col_58, col_59, col_60,
+ col_61, col_62, col_63, col_64, col_65, col_66, col_67, col_68, col_69, col_70,
+ col_71, col_72, col_73, col_74, col_75, col_76, col_77, col_78, col_79, col_80,
+ col_81, col_82, col_83, col_84, col_85, col_86, col_87, col_88, col_89, col_90,
+ col_91, col_92, col_93, col_94, col_95, col_96, col_97, col_98, col_99, col_100
+ ) VALUES (
+ i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7, i + 8, i + 9,
+ i + 10, i + 11, i + 12, i + 13, i + 14, i + 15, i + 16, i + 17, i + 18, i + 19,
+ i + 20, i + 21, i + 22, i + 23, i + 24, i + 25, i + 26, i + 27, i + 28, i + 29,
+ i + 30, i + 31, i + 32, i + 33, i + 34, i + 35, i + 36, i + 37, i + 38, i + 39,
+ i + 40, i + 41, i + 42, i + 43, i + 44, i + 45, i + 46, i + 47, i + 48, i + 49,
+ i + 50, i + 51, i + 52, i + 53, i + 54, i + 55, i + 56, i + 57, i + 58, i + 59,
+ i + 60, i + 61, i + 62, i + 63, i + 64, i + 65, i + 66, i + 67, i + 68, i + 69,
+ i + 70, i + 71, i + 72, i + 73, i + 74, i + 75, i + 76, i + 77, i + 78, i + 79,
+ i + 80, i + 81, i + 82, i + 83, i + 84, i + 85, i + 86, i + 87, i + 88, i + 89,
+ i + 90, i + 91, i + 92, i + 93, i + 94, i + 95, i + 96, i + 97, i + 98, i + 99
+ );
+ END LOOP;
+ END \$\$;
+});
+
+$node->psql('postgres',
+ "SELECT COUNT(*) FROM test_autovac;",
+ stdout => \$psql_out
+);
+is($psql_out, $initial_rows_num, "All data inserted into table successfully");
+
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Reduce autovacuum_work_mem so that the leader process will perform the
+# parallel index vacuum phase several times
+$node->append_conf('postgresql.conf', qq{
+ autovacuum_naptime = '1s'
+ autovacuum_vacuum_threshold = 1
+ autovacuum_analyze_threshold = 1
+ autovacuum_vacuum_scale_factor = 0.1
+ autovacuum_analyze_scale_factor = 0.1
+ autovacuum = on
+});
+
+$node->restart;
+
+# sleep(3600);
+
+ok(1, "There are no segfaults");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 9de0057bd1d..7f2ad810ca0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -1,5 +1,6 @@
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+subdir('autovacuum')
subdir('brin')
subdir('commit_ts')
subdir('delay_execution')
--
2.43.0
v7-0001-Parallel-index-autovacuum-with-bgworkers.patch (text/x-patch; charset=US-ASCII)
From 55b76f15bbc3991b7457de6c1d6998d39b16292c Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Fri, 16 May 2025 11:58:40 +0700
Subject: [PATCH v7 1/2] Parallel index autovacuum with bgworkers
---
src/backend/access/common/reloptions.c | 12 ++
src/backend/access/heap/vacuumlazy.c | 6 +-
src/backend/commands/vacuumparallel.c | 57 ++++++--
src/backend/postmaster/autovacuum.c | 135 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc_tables.c | 10 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 4 +
src/include/utils/guc_hooks.h | 2 +
src/include/utils/rel.h | 2 +
11 files changed, 220 insertions(+), 11 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 50747c16396..e36d59f632b 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,16 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "parallel_autovacuum_workers",
+ "Maximum number of parallel autovacuum workers that can be taken from bgworkers pool for processing this table. "
+ "If value is 0 then parallel degree will computed based on number of indexes.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1872,6 +1882,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"parallel_autovacuum_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, parallel_autovacuum_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 14036c27e87..7e0ae0184aa 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -3477,6 +3477,10 @@ dead_items_alloc(LVRelState *vacrel, int nworkers)
autovacuum_work_mem != -1 ?
autovacuum_work_mem : maintenance_work_mem;
+ int elevel = AmAutoVacuumWorkerProcess() ||
+ vacrel->verbose ?
+ INFO : DEBUG2;
+
/*
* Initialize state for a parallel vacuum. As of now, only one worker can
* be used for an index, so we invoke parallelism only if there are at
@@ -3503,7 +3507,7 @@ dead_items_alloc(LVRelState *vacrel, int nworkers)
vacrel->pvs = parallel_vacuum_init(vacrel->rel, vacrel->indrels,
vacrel->nindexes, nworkers,
vac_work_mem,
- vacrel->verbose ? INFO : DEBUG2,
+ elevel,
vacrel->bstrategy);
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..6ec610e29e4 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" refers to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -371,10 +374,12 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->relid = RelationGetRelid(rel);
shared->elevel = elevel;
shared->queryid = pgstat_get_my_query_id();
+
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -435,6 +440,8 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
void
parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
{
+ int nlaunched_workers;
+
Assert(!IsParallelWorker());
/* Copy the updated statistics */
@@ -453,7 +460,13 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
TidStoreDestroy(pvs->dead_items);
+ nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */
DestroyParallelContext(pvs->pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ AutoVacuumReleaseParallelWorkers(nlaunched_workers);
+
ExitParallelMode();
pfree(pvs->will_parallel_vacuum);
@@ -553,12 +566,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_parallel_workers;
+
+ max_parallel_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_parallel_workers == 0)
return 0;
/*
@@ -597,8 +615,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_parallel_workers);
return parallel_workers;
}
@@ -646,6 +664,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Also reserve workers in the autovacuum global state. Note that we may be
+ * given fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = AutoVacuumReserveParallelWorkers(nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +715,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -709,13 +744,19 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
(errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
"launched %d parallel vacuum workers for index vacuuming (planned: %d)",
pvs->pcxt->nworkers_launched),
- pvs->pcxt->nworkers_launched, nworkers)));
+ pvs->pcxt->nworkers_launched, nworkers),
+ AmAutoVacuumWorkerProcess() ?
+ errhint("workers were launched for parallel autovacuum") :
+ errhint("workers were launched for parallel vacuum")));
else
ereport(pvs->shared->elevel,
(errmsg(ngettext("launched %d parallel vacuum worker for index cleanup (planned: %d)",
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
- pvs->pcxt->nworkers_launched, nworkers)));
+ pvs->pcxt->nworkers_launched, nworkers),
+ AmAutoVacuumWorkerProcess() ?
+ errhint("workers were launched for parallel autovacuum") :
+ errhint("workers were launched for parallel vacuum")));
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 9474095f271..98609ac8f8f 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -285,6 +285,7 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +300,7 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -354,6 +356,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void check_parallel_av_gucs(int prev_max_parallel_workers);
@@ -753,7 +756,9 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev;
+ autovacuum_max_parallel_workers_prev = autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -769,6 +774,14 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must make sure that the
+ * number of available parallel autovacuum workers in shmem is correct.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ check_parallel_av_gucs(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -2847,8 +2860,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->parallel_autovacuum_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3329,6 +3346,64 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, the leader
+ * worker must call this function. It returns the number of parallel workers
+ * that can actually be launched and reserves (if any) these workers in the
+ * global autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if the
+ * caller would occupy all available workers.
+ */
+int
+AutoVacuumReserveParallelWorkers(int nworkers)
+{
+ int can_launch;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ can_launch = Min(AutoVacuumShmem->av_freeParallelWorkers, nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= can_launch;
+
+ LWLockRelease(AutovacuumLock);
+ return can_launch;
+}
+
+/*
+ * When parallel autovacuum workers die, the leader worker must call this
+ * function in order to refresh the global autovacuum state, so that other
+ * leaders will be able to use these workers.
+ *
+ * 'nworkers' - how many workers the caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ AutoVacuumShmem->av_freeParallelWorkers += nworkers;
+
+ /*
+ * If the autovacuum_max_parallel_workers parameter was reduced during
+ * parallel autovacuum execution, we must cap the number of available
+ * workers at its new value.
+ */
+ if (AutoVacuumShmem->av_freeParallelWorkers >
+ autovacuum_max_parallel_workers)
+ {
+ AutoVacuumShmem->av_freeParallelWorkers =
+ autovacuum_max_parallel_workers;
+ }
+
+ LWLockRelease(AutovacuumLock);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3389,6 +3464,8 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_freeParallelWorkers =
+ autovacuum_max_parallel_workers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3439,6 +3516,15 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+bool
+check_autovacuum_max_parallel_workers(int *newval, void **extra,
+ GucSource source)
+{
+ if (*newval >= max_worker_processes)
+ return false;
+ return true;
+}
+
/*
* Returns whether there is a free autovacuum worker slot available.
*/
@@ -3470,3 +3556,48 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that the number of available parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+check_parallel_av_gucs(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ if (AutoVacuumShmem->av_freeParallelWorkers >
+ autovacuum_max_parallel_workers)
+ {
+ Assert(prev_max_parallel_workers > autovacuum_max_parallel_workers);
+
+ /*
+ * The number of available workers must not exceed the limit.
+ *
+ * Note that if some parallel autovacuum workers are running at this
+ * moment, the number of available workers will not exceed the limit
+ * after releasing them (see AutoVacuumReleaseParallelWorkers).
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ autovacuum_max_parallel_workers;
+ }
+ else if ((AutoVacuumShmem->av_freeParallelWorkers <
+ autovacuum_max_parallel_workers) &&
+ (autovacuum_max_parallel_workers > prev_max_parallel_workers))
+ {
+ /*
+ * If the user wants to increase the number of parallel autovacuum
+ * workers, we must increase the number of available workers in shmem.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+
+ /*
+ * Nothing to do when autovacuum_max_parallel_workers <
+ * prev_max_parallel_workers. The number of available workers will be
+ * capped inside AutoVacuumReleaseParallelWorkers.
+ */
+ }
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..977644978c1 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index d14b1678e7f..b6a192af8f8 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3604,6 +3604,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM,
+ gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."),
+ gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."),
+ },
+ &autovacuum_max_parallel_workers,
+ 0, 0, MAX_BACKENDS,
+ check_autovacuum_max_parallel_workers, NULL, NULL
+ },
+
{
{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
gettext_noop("Sets the maximum number of parallel processes per maintenance operation."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a9d8293474a..bbf5307000f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -683,6 +683,7 @@
autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..85926415657 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..863d206f2bd 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,6 +64,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int AutoVacuumReserveParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 82ac8646a8d..b45023a90b2 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -31,6 +31,8 @@ extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
+extern bool check_autovacuum_max_parallel_workers(int *newval, void **extra,
+ GucSource source);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
GucSource source);
extern bool check_backtrace_functions(char **newval, void **extra,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..29c32f75780 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,8 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+ int parallel_autovacuum_workers; /* max number of parallel
+ autovacuum workers */
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
On Mon, Jul 14, 2025 at 3:49 AM Daniil Davydov <3danissimo@gmail.com> wrote:
---

+ nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */
DestroyParallelContext(pvs->pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ ParallelAutoVacuumReleaseWorkers(nlaunched_workers);
+

Why don't we release workers before destroying the parallel context?
Destroying the parallel context includes waiting for all workers to exit
(after which, other operations can use them).
If we first call ParallelAutoVacuumReleaseWorkers, some operation could
reasonably request all released workers. But that request can fail,
because there is no guarantee that the workers have managed to finish.

Actually, there's nothing wrong with that, but I think releasing workers
only after finishing work is a more logical approach.
---

@@ -706,16 +751,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan

if (vacuum)
ereport(pvs->shared->elevel,
- (errmsg(ngettext("launched %d parallel vacuum worker for index vacuuming (planned: %d)",
- "launched %d parallel vacuum workers for index vacuuming (planned: %d)",
+ (errmsg(ngettext("launched %d parallel %svacuum worker for index vacuuming (planned: %d)",
+ "launched %d parallel %svacuum workers for index vacuuming (planned: %d)",
pvs->pcxt->nworkers_launched),
- pvs->pcxt->nworkers_launched, nworkers)));
+ pvs->pcxt->nworkers_launched, AmAutoVacuumWorkerProcess() ? "auto" : "", nworkers)));

The "%svacuum" part doesn't work in terms of translation. We need to
construct the whole sentence instead.
But do we need this log message
change in the first place? IIUC autovacuums write logs only when the
execution time exceeds the log_autovacuum_min_duration (or its
reloption). The patch unconditionally sets LOG level for autovacuums
but I'm not sure it's consistent with other autovacuum logging
behavior:

+ int elevel = AmAutoVacuumWorkerProcess() ||
+ vacrel->verbose ?
+ INFO : DEBUG2;

This log level is used only "for messages about parallel workers launched".
I think that such logs relate more to the parallel workers module than to
autovacuum itself. Moreover, if we emit the "planned vs. launched" log each
time, it will simplify the task of selecting the optimal value of the
'autovacuum_max_parallel_workers' parameter. What do you think?
INFO level is normally not sent to the server log. And regarding
autovacuums, they don't write any log mentioning that they started. If we
want to write planned vs. launched, I think it's better to gather these
statistics during execution and write them together with other existing
logs.
About "%svacuum" - I guess we need to clarify what exactly the workers
were launched for. I'll add an errhint to this log, but I don't know whether
such an approach is acceptable.
I'm not sure errhint is an appropriate place. If we write such
information together with other existing autovacuum logs as I
suggested above, I think we don't need to add such information to this
log message.
I've reviewed the v7 patch and here are some comments:
+ {
+ {
+ "parallel_autovacuum_workers",
+ "Maximum number of parallel autovacuum workers that can be taken from bgworkers pool for processing this table. "
+ "If value is 0 then parallel degree will computed based on number of indexes.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
Many autovacuum-related reloptions have the prefix "autovacuum". So
how about renaming it to autovacuum_parallel_worker (changing the
check_parallel_av_gucs() name accordingly, too)?
---
+bool
+check_autovacuum_max_parallel_workers(int *newval, void **extra,
+ GucSource source)
+{
+ if (*newval >= max_worker_processes)
+ return false;
+ return true;
+}
I think we don't need to strictly check the
autovacuum_max_parallel_workers value. Instead, we can accept any
integer value but internally cap it by max_worker_processes.
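For example (a rough sketch only; the helper name is hypothetical), the
check hook could go away and the consumers could do:

    /* hypothetical helper: accept any GUC setting, cap where it is used */
    static int
    effective_autovacuum_max_parallel_workers(void)
    {
        return Min(autovacuum_max_parallel_workers, max_worker_processes - 1);
    }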
---
+/*
+ * Make sure that number of available parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+check_parallel_av_gucs(int prev_max_parallel_workers)
+{
I think this function doesn't just check the value but adjusts the
number of available workers, so how about
adjust_free_parallel_workers() or something along these lines?
---
+ /*
+ * Number of available workers must not exceed limit.
+ *
+ * Note, that if some parallel autovacuum workers are running at this
+ * moment, available workers number will not exceed limit after
+ * releasing them (see ParallelAutoVacuumReleaseWorkers).
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ autovacuum_max_parallel_workers;
I think the comment refers to the following code in
AutoVacuumReleaseParallelWorkers():
+ /*
+ * If autovacuum_max_parallel_workers variable was reduced during parallel
+ * autovacuum execution, we must cap available workers number by its new
+ * value.
+ */
+ if (AutoVacuumShmem->av_freeParallelWorkers >
+ autovacuum_max_parallel_workers)
+ {
+ AutoVacuumShmem->av_freeParallelWorkers =
+ autovacuum_max_parallel_workers;
+ }
After the autovacuum launcher decreases av_freeParallelWorkers, it's
not guaranteed that the autovacuum worker has already reloaded the new
value from the config file when executing
AutoVacuumReleaseParallelWorkers(), which leads to skipping the code
above. For example, suppose that autovacuum_max_parallel_workers is 10
and 3 parallel workers are running under one autovacuum worker (i.e.,
av_freeParallelWorkers = 7 now). If the user changes
autovacuum_max_parallel_workers to 5, the autovacuum launcher adjusts
av_freeParallelWorkers to 5. However, if the worker doesn't reload the
config file and executes AutoVacuumReleaseParallelWorkers(), it
increases av_freeParallelWorkers to 8 and skips the adjusting logic.
I've not tested this scenario, so I might be missing something.
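To spell out that interleaving with the numbers above (assuming the
worker never consumes the pending config reload):

    launcher: autovacuum_max_parallel_workers = 10, av_freeParallelWorkers = 10
    worker:   reserves 3 workers                 -> av_freeParallelWorkers = 7
    user:     sets autovacuum_max_parallel_workers = 5
    launcher: adjusts the shmem counter          -> av_freeParallelWorkers = 5
    worker:   releases 3 with the stale limit 10 -> av_freeParallelWorkers = 8
              8 <= 10, so the capping branch is skipped and 8 > 5 persists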
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Hi,
On Fri, Jul 18, 2025 at 2:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Jul 14, 2025 at 3:49 AM Daniil Davydov <3danissimo@gmail.com> wrote:
This log level is used only "for messages about parallel workers launched".
I think that such logs relate more to the parallel workers module than
autovacuum itself. Moreover, if we emit a "planned vs. launched" log each
time, it will simplify the task of selecting the optimal value of the
'autovacuum_max_parallel_workers' parameter. What do you think?

INFO level is normally not sent to the server log. And regarding
autovacuum, it doesn't write any log message mentioning that it started. If we
want to write planned vs. launched, I think it's better to gather these
statistics during execution and write them together with other existing
logs.

About "%svacuum" - I guess we need to clarify what exactly the workers
were launched for. I'll add errhint to this log, but I don't know whether such
an approach is acceptable.

I'm not sure errhint is an appropriate place. If we write such
information together with other existing autovacuum logs as I
suggested above, I think we don't need to add such information to this
log message.
I thought about it for some time and came up with this idea:
1)
When gathering such statistics, we need to take into account that users
might not want autovacuum to log anything. Thus, we should collect statistics
at a "higher" level that knows about log_min_duration.
2)
By analogy with the rest of the statistics, we can accumulate only the
total number of planned and launched parallel workers. Alternatively, we
could build an array (one element for each index scan) of "planned vs.
launched", but that would make the code "dirty", and I'm not sure it
would be useful.
This may be a discussion point, so I will separate it into another .patch file.
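With this approach, the extra line in the existing autovacuum report
would look roughly like this (the numbers are invented for illustration):

    workers usage statistics for all index scans: launched in total = 4, planned in total = 6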
I've reviewed the v7 patch and here are some comments:

+ {
+     {
+         "parallel_autovacuum_workers",
+         "Maximum number of parallel autovacuum workers that can be taken from bgworkers pool for processing this table. "
+         "If value is 0 then parallel degree will be computed based on number of indexes.",
+         RELOPT_KIND_HEAP,
+         ShareUpdateExclusiveLock
+     },
+     -1, -1, 1024
+ },

Many autovacuum-related reloptions have the prefix "autovacuum". So
how about renaming it to autovacuum_parallel_workers (and changing the
check_parallel_av_gucs() name accordingly)?
I have no objections.
---

+bool
+check_autovacuum_max_parallel_workers(int *newval, void **extra,
+                                      GucSource source)
+{
+    if (*newval >= max_worker_processes)
+        return false;
+    return true;
+}

I think we don't need to strictly check the
autovacuum_max_parallel_workers value. Instead, we can accept any
integer value but internally cap by max_worker_processes.
I don't think that such a limitation is excessive, but I don't see similar
behavior in other "max_parallel_..." GUCs, so I think we can get
rid of it. I'll replace the "check hook" with an "assign hook", where
autovacuum_max_parallel_workers will be limited.
---

+/*
+ * Make sure that number of available parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+check_parallel_av_gucs(int prev_max_parallel_workers)
+{

I think this function doesn't just check the value but does adjust the
number of available workers, so how about
adjust_free_parallel_workers() or something along these lines?
I agree, it's better this way.
---

+    /*
+     * Number of available workers must not exceed limit.
+     *
+     * Note that if some parallel autovacuum workers are running at this
+     * moment, available workers number will not exceed limit after
+     * releasing them (see ParallelAutoVacuumReleaseWorkers).
+     */
+    AutoVacuumShmem->av_freeParallelWorkers =
+        autovacuum_max_parallel_workers;

I think the comment refers to the following code in
AutoVacuumReleaseParallelWorkers():

+    /*
+     * If autovacuum_max_parallel_workers variable was reduced during parallel
+     * autovacuum execution, we must cap available workers number by its new
+     * value.
+     */
+    if (AutoVacuumShmem->av_freeParallelWorkers >
+        autovacuum_max_parallel_workers)
+    {
+        AutoVacuumShmem->av_freeParallelWorkers =
+            autovacuum_max_parallel_workers;
+    }

After the autovacuum launcher decreases av_freeParallelWorkers, it's
not guaranteed that the autovacuum worker has already reloaded the new
value from the config file when executing
AutoVacuumReleaseParallelWorkers(), which leads to skipping the code
above. For example, suppose that autovacuum_max_parallel_workers is 10
and 3 parallel workers are running under one autovacuum worker (i.e.,
av_freeParallelWorkers = 7 now). If the user changes
autovacuum_max_parallel_workers to 5, the autovacuum launcher adjusts
av_freeParallelWorkers to 5. However, if the worker doesn't reload the
config file and executes AutoVacuumReleaseParallelWorkers(), it
increases av_freeParallelWorkers to 8 and skips the adjusting logic.
I've not tested this scenario, so I might be missing something.
Yes, this is a possible scenario. I'll rework the av_freeParallelWorkers
calculation. The main change is that the a/v worker now checks whether a
config reload is pending. Thus, it will have the relevant value of the
autovacuum_max_parallel_workers parameter.
Thank you very much for your comments! Please see v8 patches:
1) Rename table option.
2) Replace check_hook with assign_hook for autovacuum_max_parallel_workers.
3) Simplify and correct logic for handling
autovacuum_max_parallel_workers parameter change.
4) Rework logic with "planned vs. launched" statistics for autovacuum
(see second patch file).
5) Get rid of "sandbox" - I don't see the point in continuing to drag it along.
--
Best regards,
Daniil Davydov
Attachments:
v8-0001-Parallel-index-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v8-0001-Parallel-index-autovacuum.patchDownload
From 74329dfbaebff1878c443d70b45aa1b5f7f2ef74 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Sun, 20 Jul 2025 23:03:57 +0700
Subject: [PATCH v8 1/2] Parallel index autovacuum
---
src/backend/access/common/reloptions.c | 12 ++
src/backend/commands/vacuumparallel.c | 46 ++++++-
src/backend/postmaster/autovacuum.c | 120 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc_tables.c | 10 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 4 +
src/include/utils/guc_hooks.h | 1 +
src/include/utils/rel.h | 2 +
10 files changed, 190 insertions(+), 8 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 50747c16396..54abe7f21f5 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,16 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be taken from bgworkers pool for processing this table. "
+ "If value is 0 then parallel degree will computed based on number of indexes.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1872,6 +1882,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..38cd6f68105 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ *	  Support routines for parallel vacuum and autovacuum execution. In the
+ *	  comments below, the word "vacuum" refers to both vacuum and
+ *	  autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -435,6 +439,8 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
void
parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
{
+ int nlaunched_workers;
+
Assert(!IsParallelWorker());
/* Copy the updated statistics */
@@ -453,7 +459,13 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
TidStoreDestroy(pvs->dead_items);
+ nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */
DestroyParallelContext(pvs->pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ AutoVacuumReleaseParallelWorkers(nlaunched_workers);
+
ExitParallelMode();
pfree(pvs->will_parallel_vacuum);
@@ -553,12 +565,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_parallel_workers;
+
+ max_parallel_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_parallel_workers == 0)
return 0;
/*
@@ -597,8 +614,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_parallel_workers);
return parallel_workers;
}
@@ -646,6 +663,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Also reserve workers in autovacuum global state. Note that we may be
+ * given fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = AutoVacuumReserveParallelWorkers(nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +714,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 9474095f271..61a50c9eca8 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -285,6 +285,7 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +300,7 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -354,6 +356,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
@@ -753,6 +756,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -769,6 +774,14 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -2847,8 +2860,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3329,6 +3346,68 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, leader worker
+ * must call this function. It returns the number of parallel workers that
+ * actually can be launched and reserves (if any) these workers in global
+ * autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if caller
+ * will occupy all available workers.
+ */
+int
+AutoVacuumReserveParallelWorkers(int nworkers)
+{
+ int can_launch;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ can_launch = Min(AutoVacuumShmem->av_freeParallelWorkers, nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= nworkers;
+
+ LWLockRelease(AutovacuumLock);
+ return can_launch;
+}
+
+/*
+ * When parallel autovacuum worker die, leader worker must call this function
+ * in order to refresh global autovacuum state. Thus, other leaders will be
+ * able to use these workers.
+ *
+ * 'nworkers' - how many workers caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /* Refresh autovacuum_max_parallel_workers parameter */
+ CHECK_FOR_INTERRUPTS();
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If autovacuum_max_parallel_workers parameter was reduced during parallel
+ * autovacuum execution, we must cap available workers number by its new
+ * value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ autovacuum_max_parallel_workers);
+
+ LWLockRelease(AutovacuumLock);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3389,6 +3468,8 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_freeParallelWorkers =
+ autovacuum_max_parallel_workers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3439,6 +3520,12 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+void
+assign_autovacuum_max_parallel_workers(int newval, void *extra)
+{
+ autovacuum_max_parallel_workers = Min(newval, max_worker_processes);
+}
+
/*
* Returns whether there is a free autovacuum worker slot available.
*/
@@ -3470,3 +3557,32 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap the number of free workers by new parameter's value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+ * If user wants to increase number of parallel autovacuum workers, we
+ * must increase number of free workers.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..977644978c1 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index d14b1678e7f..4941ad976df 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3604,6 +3604,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM,
+ gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."),
+ gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."),
+ },
+ &autovacuum_max_parallel_workers,
+ 0, 0, MAX_BACKENDS,
+ NULL, assign_autovacuum_max_parallel_workers, NULL
+ },
+
{
{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
gettext_noop("Sets the maximum number of parallel processes per maintenance operation."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a9d8293474a..bbf5307000f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -683,6 +683,7 @@
autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..85926415657 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..863d206f2bd 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,6 +64,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int AutoVacuumReserveParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 82ac8646a8d..04833b4f147 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -31,6 +31,7 @@ extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
+extern void assign_autovacuum_max_parallel_workers(int newval, void *extra);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
GucSource source);
extern bool check_backtrace_functions(char **newval, void **extra,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..377000199d7 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,8 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+ int autovacuum_parallel_workers; /* max number of parallel
+ autovacuum workers */
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
v8-0002-Logging-for-parallel-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v8-0002-Logging-for-parallel-autovacuum.patchDownload
From 27b2c7d0dfb193aadd9d0199647e5909de3ac0aa Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Sun, 20 Jul 2025 23:26:13 +0700
Subject: [PATCH v8 2/2] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 26 ++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 20 ++++++++++++++------
src/include/commands/vacuum.h | 16 ++++++++++++++--
3 files changed, 52 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 14036c27e87..11dc2c48a7e 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -348,6 +348,11 @@ typedef struct LVRelState
/* Instrumentation counters */
int num_index_scans;
+ /*
+ * Number of planned and actually launched parallel workers for all index
+ * scans, or NULL
+ */
+ PVWorkersUsage *workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -688,6 +693,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc(sizeof(char *) * vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+ * Allocate space for workers usage statistics. Thus, we explicitly
+ * make clear that such statistics must be accumulated.
+ * For now, this is used only by autovacuum leader worker, because it
+ * must log it at the end of table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
}
/*
@@ -1012,6 +1027,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->relnamespace,
vacrel->relname,
vacrel->num_index_scans);
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+ _("workers usage statistics for all of index scans : launched in total = %d, planned in total = %d\n"),
+ vacrel->workers_usage->nlaunched,
+ vacrel->workers_usage->nplanned);
appendStringInfo(&buf, _("pages: %u removed, %u remain, %u scanned (%.2f%% of total), %u eagerly scanned\n"),
vacrel->removed_pages,
new_rel_pages,
@@ -2634,7 +2654,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3047,7 +3068,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 38cd6f68105..831cc64b529 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage *wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -510,7 +510,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -521,7 +521,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -529,7 +529,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -541,7 +542,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -626,7 +627,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage *wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -750,6 +751,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
pvs->pcxt->nworkers_launched, nworkers)));
+
+ /* Remember these values, if we were asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 14eeccbd718..64b23687506 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -295,6 +295,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores information about total number of launched and planned
+ * workers during parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -389,11 +399,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
--
2.43.0
Thanks for the patches!
I have only reviewed the v8-0001-Parallel-index-autovacuum.patch so far and
have a few comments from my initial pass.
1/ Please run pgindent.
2/ Documentation is missing. There may be more, but here are the places I
found that likely need updates for the new behavior, reloptions, GUC, etc.
Including docs in the patch early would help clarify expected behavior.
https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-BASICS
https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM
https://www.postgresql.org/docs/current/runtime-config-autovacuum.html
https://www.postgresql.org/docs/current/sql-createtable.html
https://www.postgresql.org/docs/current/sql-altertable.html
https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES
https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-PARALLEL-MAINTENANCE-WORKERS
One thing I am unclear on is the interaction between max_worker_processes,
max_parallel_workers, and max_parallel_maintenance_workers. For example, does
the following change mean that manual VACUUM PARALLEL is no longer capped by
max_parallel_maintenance_workers?
@@ -597,8 +614,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
        parallel_workers = (nrequested > 0) ?
                Min(nrequested, nindexes_parallel) : nindexes_parallel;
-       /* Cap by max_parallel_maintenance_workers */
-       parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+       /* Cap by GUC variable */
+       parallel_workers = Min(parallel_workers, max_parallel_workers);
3/ Shouldn't this be "max_parallel_workers" instead of "bgworkers pool"?
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers
that can be taken from bgworkers pool for processing this table. "
4/ The comment "When parallel autovacuum worker die" suggests an abnormal
exit. "Terminates" seems clearer, since this applies to both normal and
abnormal exits.
instead of:
+ * When parallel autovacuum worker die,
how about this:
* When parallel autovacuum worker terminates,
5/ Any reason AutoVacuumReleaseParallelWorkers cannot be called before
DestroyParallelContext?
+    nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */
DestroyParallelContext(pvs->pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ AutoVacuumReleaseParallelWorkers(nlaunched_workers);
6/ Also, would it be cleaner to move AmAutoVacuumWorkerProcess() inside
AutoVacuumReleaseParallelWorkers()?
if (!AmAutoVacuumWorkerProcess())
return;
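Something like this at the top of the function (just a sketch; the
existing assertion and release logic would stay as is):

    void
    AutoVacuumReleaseParallelWorkers(int nworkers)
    {
        /* No-op when not called from an autovacuum leader. */
        if (!AmAutoVacuumWorkerProcess())
            return;

        Assert(!IsParallelWorker());
        /* ... existing reload-and-release logic unchanged ... */
    }

Then vacuumparallel.c could call it unconditionally.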
7/ It looks like the psql tab completion for autovacuum_parallel_workers is
missing:
test=# alter table t set (autovacuum_
autovacuum_analyze_scale_factor
autovacuum_analyze_threshold
autovacuum_enabled
autovacuum_freeze_max_age
autovacuum_freeze_min_age
autovacuum_freeze_table_age
autovacuum_multixact_freeze_max_age
autovacuum_multixact_freeze_min_age
autovacuum_multixact_freeze_table_age
autovacuum_vacuum_cost_delay
autovacuum_vacuum_cost_limit
autovacuum_vacuum_insert_scale_factor
autovacuum_vacuum_insert_threshold
autovacuum_vacuum_max_threshold
autovacuum_vacuum_scale_factor
autovacuum_vacuum_threshold
--
Sami Imseih
Amazon Web Services (AWS)
Hi,
On Mon, Jul 21, 2025 at 11:40 PM Sami Imseih <samimseih@gmail.com> wrote:
I have only reviewed the v8-0001-Parallel-index-autovacuum.patch so far and
have a few comments from my initial pass.

1/ Please run pgindent.
OK, I'll do it.
2/ Documentation is missing. There may be more, but here are the places I
found that likely need updates for the new behavior, reloptions, GUC, etc.
Including docs in the patch early would help clarify expected behavior.

https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-BASICS
https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM
https://www.postgresql.org/docs/current/runtime-config-autovacuum.html
https://www.postgresql.org/docs/current/sql-createtable.html
https://www.postgresql.org/docs/current/sql-altertable.html
https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES
https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-PARALLEL-MAINTENANCE-WORKERS
Thanks for gathering it all together. I'll update the documentation so
that it reflects the changes in the autovacuum daemon, reloptions, and
GUC parameters. So far, I don't see what we can add to the vacuum-basics
and alter-table paragraphs.
I'll create a separate .patch file for the documentation changes.
One thing I am unclear on is the interaction between max_worker_processes,
max_parallel_workers, and max_parallel_maintenance_workers. For example, does
the following change mean that manual VACUUM PARALLEL is no longer capped by
max_parallel_maintenance_workers?

@@ -597,8 +614,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
        parallel_workers = (nrequested > 0) ?
                Min(nrequested, nindexes_parallel) : nindexes_parallel;
-       /* Cap by max_parallel_maintenance_workers */
-       parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+       /* Cap by GUC variable */
+       parallel_workers = Min(parallel_workers, max_parallel_workers);
Oh, it is my poor choice of a name for a local variable (I'll rename it).
This variable can take different values depending on the operation performed:
autovacuum_max_parallel_workers for parallel autovacuum and
max_parallel_maintenance_workers for maintenance VACUUM.
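In v9 the hunk reads:

    max_workers = AmAutoVacuumWorkerProcess() ?
        autovacuum_max_parallel_workers :
        max_parallel_maintenance_workers;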
3/ Shouldn't this be "max_parallel_workers" instead of "bgworkers pool"?

+    "autovacuum_parallel_workers",
+    "Maximum number of parallel autovacuum workers that can be taken from bgworkers pool for processing this table. "
I don't think that we should refer to max_parallel_workers here.
Actually, this reloption doesn't depend on max_parallel_workers at all.
I wrote about the bgworkers pool (both here and in the description of
the autovacuum_max_parallel_workers parameter) in order to clarify that
parallel autovacuum will use dynamic workers instead of launching
more a/v workers.
BTW, I don't really like that the comment on this option turns out to be
very large. I'll leave only a short description in reloptions.c and move
the clarification about the zero value into rel.h.
Mentions of the bgworkers pool will remain only in the
description of autovacuum_max_parallel_workers.
4/ The comment "When parallel autovacuum worker die" suggests an abnormal
exit. "Terminates" seems clearer, since this applies to both normal and
abnormal exits.

instead of:
+ * When parallel autovacuum worker die,

how about this:
* When parallel autovacuum worker terminates,
Sounds reasonable, I'll fix it.
5/ Any reason AutoVacuumReleaseParallelWorkers cannot be called before
DestroyParallelContext?

+    nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */
     DestroyParallelContext(pvs->pcxt);
+
+    /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+    if (AmAutoVacuumWorkerProcess())
+        AutoVacuumReleaseParallelWorkers(nlaunched_workers);

I wrote about it above [1], but I think I can duplicate my thoughts here:
"""
Destroying parallel context includes waiting for all workers to exit (after
which, other operations can use them).
If we first call ParallelAutoVacuumReleaseWorkers, some operation can
reasonably request all released workers. But this request can fail,
because there is no guarantee that workers managed to finish.
Actually, there's nothing wrong with that, but I think releasing workers
only after finishing work is a more logical approach.
"""
6/ Also, would it be cleaner to move AmAutoVacuumWorkerProcess() inside
AutoVacuumReleaseParallelWorkers()?

if (!AmAutoVacuumWorkerProcess())
    return;
It seems to me that the opposite is true. If there is no explicit
AmAutoVacuumWorkerProcess call at the call site, it might confuse somebody.
All doubts would disappear after viewing the AmAutoVacuumWorkerProcess code,
but IMO the code in vacuumparallel.c would become less intuitive.
7/ It looks like the psql tab completion for autovacuum_parallel_workers is
missing:

test=# alter table t set (autovacuum_
autovacuum_analyze_scale_factor
autovacuum_analyze_threshold
autovacuum_enabled
autovacuum_freeze_max_age
autovacuum_freeze_min_age
autovacuum_freeze_table_age
autovacuum_multixact_freeze_max_age
autovacuum_multixact_freeze_min_age
autovacuum_multixact_freeze_table_age
autovacuum_vacuum_cost_delay
autovacuum_vacuum_cost_limit
autovacuum_vacuum_insert_scale_factor
autovacuum_vacuum_insert_threshold
autovacuum_vacuum_max_threshold
autovacuum_vacuum_scale_factor
autovacuum_vacuum_threshold
Good catch, I'll fix it.
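Per the v9 diffstat this is a one-line addition to the storage parameter
list in src/bin/psql/tab-complete.in.c, presumably in alphabetical order
(context lines assumed):

     "autovacuum_multixact_freeze_table_age",
+    "autovacuum_parallel_workers",
     "autovacuum_vacuum_cost_delay",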
Thank you for the review! Please see v9 patches:
1) Run pgindent + rebase patches on newest commit in master.
2) Introduce changes for documentation.
3) Rename local variable in parallel_vacuum_compute_workers.
4) Shorten the description of autovacuum_parallel_workers in
reloptions.c (move clarifications for it into rel.h).
5) Reword "When parallel autovacuum worker die" comment.
6) Add tab completion for autovacuum_parallel_workers table option.
[1]: /messages/by-id/CAJDiXgi7KB7wSQ=Ux=ngdaCvJnJ5x-ehvTyiuZez+5uKHtV6iQ@mail.gmail.com
--
Best regards,
Daniil Davydov
Attachments:
v9-0002-Logging-for-parallel-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v9-0002-Logging-for-parallel-autovacuum.patchDownload
From 1c4c65cad27e2986962ef0d041bf4f332c58f668 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Tue, 22 Jul 2025 02:47:24 +0700
Subject: [PATCH v9 2/3] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 27 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 20 ++++++++++++++------
src/include/commands/vacuum.h | 16 ++++++++++++++--
3 files changed, 53 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 14036c27e87..f1a645e79a9 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -348,6 +348,12 @@ typedef struct LVRelState
/* Instrumentation counters */
int num_index_scans;
+
+ /*
+ * Number of planned and actually launched parallel workers for all index
+ * scans, or NULL
+ */
+ PVWorkersUsage *workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -688,6 +694,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc(sizeof(char *) * vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+ * Allocate space for workers usage statistics. Thus, we explicitly
+ * make clear that such statistics must be accumulated. For now, this
+     * is used only by autovacuum leader worker, because it must log it at
+     * the end of table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
}
/*
@@ -1012,6 +1028,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->relnamespace,
vacrel->relname,
vacrel->num_index_scans);
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+ _("workers usage statistics for all of index scans : launched in total = %d, planned in total = %d\n"),
+ vacrel->workers_usage->nlaunched,
+ vacrel->workers_usage->nplanned);
appendStringInfo(&buf, _("pages: %u removed, %u remain, %u scanned (%.2f%% of total), %u eagerly scanned\n"),
vacrel->removed_pages,
new_rel_pages,
@@ -2634,7 +2655,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3047,7 +3069,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index ffc140dabcf..51511bf2100 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage * wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -510,7 +510,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage * wusage)
{
Assert(!IsParallelWorker());
@@ -521,7 +521,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -529,7 +529,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage * wusage)
{
Assert(!IsParallelWorker());
@@ -541,7 +542,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -626,7 +627,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage * wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -750,6 +751,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
pvs->pcxt->nworkers_launched, nworkers)));
+
+ /* Remember these values, if we were asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 14eeccbd718..d05ef7461ea 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -295,6 +295,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores information about total number of launched and planned
+ * workers during parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -389,11 +399,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage * wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage * wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
--
2.43.0
v9-0003-Documentation-for-parallel-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v9-0003-Documentation-for-parallel-autovacuum.patchDownload
From de46c8232641e288a46e6af1799961c1b00a4655 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Tue, 22 Jul 2025 12:31:20 +0700
Subject: [PATCH v9 3/3] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 18 ++++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 50 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index c7acc0f182f..06b0aff6cb7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2835,6 +2835,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9187,6 +9188,23 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+        can be used for parallel index vacuuming at one time. The value is
+        capped by <xref linkend="guc-max-worker-processes"/>. The default is 0,
+ which means no parallel index vacuuming.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index e7a9f58c015..4e450ba9066 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -896,6 +896,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+    If an autovacuum worker process comes across a table with the
+    <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+    enabled, it will launch parallel workers in order to vacuum the indexes
+    of this table in parallel. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index dc000e913c1..288de6b0ffd 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+        The default value is -1, which means no parallel index vacuuming for
+        this table. If the value is 0, the parallel degree will be computed
+        based on the number of indexes.
+        Note that the computed number of workers may not actually be available
+        at run time. If this occurs, autovacuum will run with fewer workers
+        than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
v9-0001-Parallel-index-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v9-0001-Parallel-index-autovacuum.patchDownload
From f5e21f44faa8618bfd575099317ecf06213f62f0 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Sun, 20 Jul 2025 23:03:57 +0700
Subject: [PATCH v9 1/3] Parallel index autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 46 ++++++-
src/backend/postmaster/autovacuum.c | 121 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc_tables.c | 10 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 4 +
src/include/utils/guc_hooks.h | 1 +
src/include/utils/rel.h | 7 +
11 files changed, 196 insertions(+), 8 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 50747c16396..cc3ffc43a05 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1872,6 +1881,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..ffc140dabcf 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ *	  Support routines for parallel vacuum and autovacuum execution. In the
+ *	  comments below, the word "vacuum" refers to both vacuum and
+ *	  autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -435,6 +439,8 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
void
parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
{
+ int nlaunched_workers;
+
Assert(!IsParallelWorker());
/* Copy the updated statistics */
@@ -453,7 +459,13 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
TidStoreDestroy(pvs->dead_items);
+ nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */
DestroyParallelContext(pvs->pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ AutoVacuumReleaseParallelWorkers(nlaunched_workers);
+
ExitParallelMode();
pfree(pvs->will_parallel_vacuum);
@@ -553,12 +565,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +614,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -646,6 +663,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Also reserve workers in autovacuum global state. Note that we may be
+ * given fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = AutoVacuumReserveParallelWorkers(nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +714,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 9474095f271..76eb04029a3 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -285,6 +285,7 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +300,7 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -354,6 +356,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
@@ -753,6 +756,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -769,6 +774,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in
+ * shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -2847,8 +2861,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3329,6 +3347,68 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, the leader
+ * worker must call this function. It returns the number of parallel workers
+ * that can actually be launched and reserves these workers (if any) in the
+ * global autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if the
+ * caller would occupy all available workers.
+ */
+int
+AutoVacuumReserveParallelWorkers(int nworkers)
+{
+ int can_launch;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ can_launch = Min(AutoVacuumShmem->av_freeParallelWorkers, nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= can_launch;
+
+ LWLockRelease(AutovacuumLock);
+ return can_launch;
+}
+
+/*
+ * When a parallel autovacuum worker terminates, the leader worker must call
+ * this function in order to refresh the global autovacuum state. Thus,
+ * other leaders will be able to use these workers.
+ *
+ * 'nworkers' - how many workers caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /* Refresh autovacuum_max_parallel_workers parameter */
+ CHECK_FOR_INTERRUPTS();
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If autovacuum_max_parallel_workers parameter was reduced during
+ * parallel autovacuum execution, we must cap available workers number by
+ * its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ autovacuum_max_parallel_workers);
+
+ LWLockRelease(AutovacuumLock);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3389,6 +3469,8 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_freeParallelWorkers =
+ autovacuum_max_parallel_workers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3439,6 +3521,12 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
return true;
}
+void
+assign_autovacuum_max_parallel_workers(int newval, void *extra)
+{
+ autovacuum_max_parallel_workers = Min(newval, max_worker_processes);
+}
+
/*
* Returns whether there is a free autovacuum worker slot available.
*/
@@ -3470,3 +3558,32 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that the number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap the number of free workers by new parameter's value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+ * If the user wants to increase the number of parallel autovacuum
+ * workers, we must increase the number of free workers.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..fd00d6f89dc 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index d14b1678e7f..4941ad976df 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3604,6 +3604,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM,
+ gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."),
+ gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."),
+ },
+ &autovacuum_max_parallel_workers,
+ 0, 0, MAX_BACKENDS,
+ NULL, assign_autovacuum_max_parallel_workers, NULL
+ },
+
{
{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
gettext_noop("Sets the maximum number of parallel processes per maintenance operation."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a9d8293474a..bbf5307000f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -683,6 +683,7 @@
autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 37524364290..3b3d4438e65 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1399,6 +1399,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..85926415657 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..42d4a63d033 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,6 +64,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int AutoVacuumReserveParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 82ac8646a8d..04833b4f147 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -31,6 +31,7 @@ extern void assign_application_name(const char *newval, void *extra);
extern const char *show_archive_command(void);
extern bool check_autovacuum_work_mem(int *newval, void **extra,
GucSource source);
+extern void assign_autovacuum_max_parallel_workers(int newval, void *extra);
extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
GucSource source);
extern bool check_backtrace_functions(char **newval, void **extra,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..edd286808bf 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If the value is 0, the
+ * parallel degree will be computed based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
On Mon, Jul 21, 2025 at 11:45 PM Daniil Davydov <3danissimo@gmail.com> wrote:
Hi,
On Mon, Jul 21, 2025 at 11:40 PM Sami Imseih <samimseih@gmail.com> wrote:
I have only reviewed the v8-0001-Parallel-index-autovacuum.patch so far and
have a few comments from my initial pass.

1/ Please run pgindent.

OK, I'll do it.

2/ Documentation is missing. There may be more, but here are the places I
found that likely need updates for the new behavior, reloptions, GUC, etc.
Including docs in the patch early would help clarify expected behavior.

https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-BASICS
https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM
https://www.postgresql.org/docs/current/runtime-config-autovacuum.html
https://www.postgresql.org/docs/current/sql-createtable.html
https://www.postgresql.org/docs/current/sql-altertable.html
https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES
https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAX-PARALLEL-MAINTENANCE-WORKERS

Thanks for gathering it all together. I'll update the documentation so
it will reflect changes in the autovacuum daemon, reloptions and GUC
parameters. So far, I don't see what we can add to the vacuum-basics
and alter-table paragraphs. I'll create a separate .patch file for the
documentation changes.

One thing I am unclear on is the interaction between max_worker_processes,
max_parallel_workers, and max_parallel_maintenance_workers. For example, does
the following change mean that manual VACUUM PARALLEL is no longer capped by
max_parallel_maintenance_workers?

@@ -597,8 +614,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
 parallel_workers = (nrequested > 0) ?
 Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_parallel_workers);

Oh, it is my poor choice of a name for a local variable (I'll rename it).
This variable can get different values depending on the performed operation:
autovacuum_max_parallel_workers for parallel autovacuum and
max_parallel_maintenance_workers for maintenance VACUUM.

3/ Shouldn't this be "max_parallel_workers" instead of "bgworkers pool"?

+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be taken from bgworkers pool for processing this table. "

I don't think that we should refer to max_parallel_workers here.
Actually, this reloption doesn't depend on max_parallel_workers at all.
I wrote about the bgworkers pool (both here and in the description of the
autovacuum_max_parallel_workers parameter) in order to clarify that parallel
autovacuum will use dynamic workers instead of launching more a/v workers.

BTW, I don't really like that the comment on this option turns out to be
very large. I'll leave only a short description in reloptions.c and move
the clarification about the zero value into rel.h. Mentions of the
bgworkers pool will remain only in the description of
autovacuum_max_parallel_workers.

4/ The comment "When parallel autovacuum worker die" suggests an abnormal
exit. "Terminates" seems clearer, since this applies to both normal and
abnormal exits.

instead of:
+ * When parallel autovacuum worker die,
how about this:
 * When parallel autovacuum worker terminates,

Sounds reasonable, I'll fix it.

5/ Any reason AutoVacuumReleaseParallelWorkers cannot be called before
DestroyParallelContext?

+ nlaunched_workers = pvs->pcxt->nworkers_launched; /* remember this value */
 DestroyParallelContext(pvs->pcxt);
+
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ AutoVacuumReleaseParallelWorkers(nlaunched_workers);

I wrote about it above [1], but I think I can duplicate my thoughts here:
"""
Destroying the parallel context includes waiting for all workers to exit
(after which, other operations can use them).
If we first call ParallelAutoVacuumReleaseWorkers, some operation can
reasonably request all released workers. But this request can fail,
because there is no guarantee that the workers managed to finish.
Actually, there's nothing wrong with that, but I think releasing workers
only after finishing the work is a more logical approach.
"""

6/ Also, would it be cleaner to move AmAutoVacuumWorkerProcess() inside
AutoVacuumReleaseParallelWorkers()?

if (!AmAutoVacuumWorkerProcess())
    return;

It seems to me that the opposite is true. If there is no alternative to
calling AmAutoVacuumWorkerProcess, it might confuse somebody. All doubts
will disappear after viewing the AmAutoVacuumWorkerProcess code, but IMO
the code in vacuumparallel.c will become less intuitive.

7/ It looks like the psql tab completion for autovacuum_parallel_workers is
missing:

test=# alter table t set (autovacuum_
autovacuum_analyze_scale_factor
autovacuum_analyze_threshold
autovacuum_enabled
autovacuum_freeze_max_age
autovacuum_freeze_min_age
autovacuum_freeze_table_age
autovacuum_multixact_freeze_max_age
autovacuum_multixact_freeze_min_age
autovacuum_multixact_freeze_table_age
autovacuum_vacuum_cost_delay
autovacuum_vacuum_cost_limit
autovacuum_vacuum_insert_scale_factor
autovacuum_vacuum_insert_threshold
autovacuum_vacuum_max_threshold
autovacuum_vacuum_scale_factor
autovacuum_vacuum_threshold

Good catch, I'll fix it.

Thank you for the review! Please see the v9 patches:
1) Run pgindent + rebase patches on newest commit in master.
2) Introduce changes for documentation.
3) Rename local variable in parallel_vacuum_compute_workers.
4) Shorten the description of autovacuum_parallel_workers in
reloptions.c (move clarifications for it into rel.h).
5) Reword "When parallel autovacuum worker die" comment.
6) Add tab completion for autovacuum_parallel_workers table option.
Thank you for updating the patch. Here are some review comments.
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ AutoVacuumReleaseParallelWorkers(nlaunched_workers);
+
We release the reserved workers in parallel_vacuum_end(). However,
parallel_vacuum_end() is called only once at the end of vacuum. I
think we need to release the reserved workers after index vacuuming or
cleanup, otherwise we would end up holding the reserved workers until
the end of vacuum even if we invoke index vacuuming multiple times.
---
+void
+assign_autovacuum_max_parallel_workers(int newval, void *extra)
+{
+ autovacuum_max_parallel_workers = Min(newval, max_worker_processes);
+}
I don't think we need the assign hook for this GUC parameter. We can
internally cap the maximum value by max_worker_processes like other
GUC parameters such as max_parallel_maintenance_workers and
max_parallel_workers.
---

+ /* Refresh autovacuum_max_parallel_workers parameter */
+ CHECK_FOR_INTERRUPTS();
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If autovacuum_max_parallel_workers parameter was reduced during
+ * parallel autovacuum execution, we must cap available workers number by
+ * its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ autovacuum_max_parallel_workers);
+
+ LWLockRelease(AutovacuumLock);
I think another race condition could occur; suppose
autovacuum_max_parallel_workers is set to '5' and one autovacuum
worker reserved 5 workers, meaning that
AutoVacuumShmem->av_freeParallelWorkers is 0. Then, the user changes
autovacuum_max_parallel_workers to 3 and reloads the conf file right
after the autovacuum worker checks the interruption. The launcher
process calls adjust_free_parallel_workers(), but
av_freeParallelWorkers remains 0, and the autovacuum worker increments
it by 5 as its autovacuum_max_parallel_workers value is still 5.
I think that we can have the autovacuum_max_parallel_workers value on
shmem, and only the launcher process can modify its value if the GUC
is changed. Autovacuum workers simply increase or decrease the
av_freeParallelWorkers within the range of 0 and the
autovacuum_max_parallel_workers value on shmem. When changing
autovacuum_max_parallel_workers and av_freeParallelWorkers values on
shmem, the launcher process calculates the number of workers reserved
at that time and calculates the new av_freeParallelWorkers value by
subtracting the number of reserved workers from the new
autovacuum_max_parallel_workers.
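
To make that concrete, here is a minimal sketch of the launcher-side
adjustment such a scheme implies (assuming a shmem copy of the limit,
av_maxParallelWorkers, next to av_freeParallelWorkers; the function and
field names are illustrative, not the patch as posted):

static void
launcher_update_max_parallel_workers(int newval)
{
    uint32      nreserved;

    LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);

    /* Workers currently handed out under the old limit. */
    nreserved = AutoVacuumShmem->av_maxParallelWorkers -
        AutoVacuumShmem->av_freeParallelWorkers;

    AutoVacuumShmem->av_maxParallelWorkers = newval;

    /* Free slots are whatever the new limit leaves after reservations. */
    AutoVacuumShmem->av_freeParallelWorkers =
        ((uint32) newval > nreserved) ? newval - nreserved : 0;

    LWLockRelease(AutovacuumLock);
}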
---
+AutoVacuumReserveParallelWorkers(int nworkers)
+{
+ int can_launch;
How about renaming it to 'nreserved' or something? can_launch looks
like it's a boolean variable to indicate whether the process can
launch workers.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
While testing the patch, I found two other problems:
1. when an autovacuum worker that reserved workers fails with an error,
the reserved workers are not released. I think we need to ensure that
all reserved workers are surely released at the end of vacuum even
with an error.
2. when an autovacuum worker (not a parallel vacuum worker) that uses
parallel vacuum gets SIGHUP, it errors out with the error message
"parameter "max_stack_depth" cannot be set during a parallel
operation". Autovacuum checks the configuration file reload in
vacuum_delay_point(), and while reloading the configuration file, it
attempts to set max_stack_depth in
InitializeGUCOptionsFromEnvironment() (which is called by
ProcessConfigFileInternal()). However, it cannot change
max_stack_depth since the worker is in parallel mode but
max_stack_depth doesn't have the GUC_ALLOW_IN_PARALLEL flag. This doesn't
happen in regular backends that are using parallel queries because they
check the configuration file reload at the end of each SQL command.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Hi,
Thank you very much for your comments!
In this reply I'll answer both of your recent messages.
On Fri, Aug 8, 2025 at 6:38 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Thank you for updating the patch. Here are some review comments.
+ /* Release all launched (i.e. reserved) parallel autovacuum workers. */
+ if (AmAutoVacuumWorkerProcess())
+ AutoVacuumReleaseParallelWorkers(nlaunched_workers);
+

We release the reserved workers in parallel_vacuum_end(). However,
parallel_vacuum_end() is called only once at the end of vacuum. I
think we need to release the reserved workers after index vacuuming or
cleanup, otherwise we would end up holding the reserved workers until
the end of vacuum even if we invoke index vacuuming multiple times.
Yep, you are right. It was easy to miss because typically the autovacuum
takes only one cycle to process a table. Since both index vacuum and
index cleanup use the parallel_vacuum_process_all_indexes function,
I think that both releasing and reserving should be placed there.
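
As a sketch, the per-pass bracketing inside
parallel_vacuum_process_all_indexes would look roughly like this (function
names are from the patch; the surrounding setup is omitted):

    /* Reserve at the start of each bulkdel/cleanup pass. */
    if (AmAutoVacuumWorkerProcess() && nworkers > 0)
        nworkers = AutoVacuumReserveParallelWorkers(nworkers);

    LaunchParallelWorkers(pvs->pcxt);

    /* Give back what we reserved but could not launch. */
    if (AmAutoVacuumWorkerProcess() &&
        pvs->pcxt->nworkers_launched < nworkers)
        AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);

    WaitForParallelWorkersToFinish(pvs->pcxt);

    /* Release the rest once this pass is done. */
    if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
        AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);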
---

+void
+assign_autovacuum_max_parallel_workers(int newval, void *extra)
+{
+ autovacuum_max_parallel_workers = Min(newval, max_worker_processes);
+}

I don't think we need the assign hook for this GUC parameter. We can
internally cap the maximum value by max_worker_processes like other
GUC parameters such as max_parallel_maintenance_workers and
max_parallel_workers.
Ok, I get it - we don't want to give a configuration error for no serious
reason. Actually, we are already internally capping
autovacuum_max_parallel_workers by max_worker_processes (inside the
parallel_vacuum_compute_workers function). This is the same behavior that
max_parallel_maintenance_workers has.
I'll get rid of the assign hook and add one more cap inside the autovacuum
shmem initialization: since max_worker_processes is a PGC_POSTMASTER
parameter, av_freeParallelWorkers must not exceed its value.
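
Something along these lines in AutoVacuumShmemInit() should be enough (a
sketch of the capping, not the exact patch text):

    AutoVacuumShmem->av_freeParallelWorkers =
        Min(autovacuum_max_parallel_workers, max_worker_processes);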
---

+ /* Refresh autovacuum_max_parallel_workers parameter */
+ CHECK_FOR_INTERRUPTS();
+ if (ConfigReloadPending)
+ {
+ ConfigReloadPending = false;
+ ProcessConfigFile(PGC_SIGHUP);
+ }
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If autovacuum_max_parallel_workers parameter was reduced during
+ * parallel autovacuum execution, we must cap available workers number by
+ * its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ autovacuum_max_parallel_workers);
+
+ LWLockRelease(AutovacuumLock);

I think another race condition could occur; suppose
autovacuum_max_parallel_workers is set to '5' and one autovacuum
worker reserved 5 workers, meaning that
AutoVacuumShmem->av_freeParallelWorkers is 0. Then, the user changes
autovacuum_max_parallel_workers to 3 and reloads the conf file right
after the autovacuum worker checks the interruption. The launcher
process calls adjust_free_parallel_workers(), but
av_freeParallelWorkers remains 0, and the autovacuum worker increments
it by 5 as its autovacuum_max_parallel_workers value is still 5.
I think this problem could be solved by acquiring AutovacuumLock before
processing the config file, but I understand that this is a bad approach.
I think that we can have the autovacuum_max_parallel_workers value on
shmem, and only the launcher process can modify its value if the GUC
is changed. Autovacuum workers simply increase or decrease the
av_freeParallelWorkers within the range of 0 and the
autovacuum_max_parallel_workers value on shmem. When changing
autovacuum_max_parallel_workers and av_freeParallelWorkers values on
shmem, the launcher process calculates the number of workers reserved
at that time and calculates the new av_freeParallelWorkers value by
subtracting the number of reserved workers from the new
autovacuum_max_parallel_workers.
Good idea, I agree. Replacing the GUC parameter with the variable in shmem
leaves the current logic of free workers management unchanged. Essentially,
this is the same solution as I described above, but we hold the lock not
during config reloading, but during a simple value check. It makes much
more sense.
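
Under that scheme the release path needs neither a config reload nor a
lock held across one; roughly (a sketch, with av_maxParallelWorkers as the
shmem copy of the limit maintained by the launcher):

    LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);

    /* Clamp against the shmem copy of the limit, not the local GUC. */
    AutoVacuumShmem->av_freeParallelWorkers =
        Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
            AutoVacuumShmem->av_maxParallelWorkers);

    LWLockRelease(AutovacuumLock);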
---

+AutoVacuumReserveParallelWorkers(int nworkers)
+{
+ int can_launch;

How about renaming it to 'nreserved' or something? can_launch looks
like it's a boolean variable to indicate whether the process can
launch workers.
There are no objections.
On Fri, Aug 15, 2025 at 3:41 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
While testing the patch, I found two other problems:
1. when an autovacuum worker that reserved workers fails with an error,
the reserved workers are not released. I think we need to ensure that
all reserved workers are surely released at the end of vacuum even
with an error.
Agree. I'll add a try/catch block to the parallel_vacuum_process_all_indexes
(the only place where we are reserving workers).
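
Roughly, the shape would be as follows (this matches what the v10 patch
below does; AutoVacuumReleaseAllParallelWorkers gives back whatever is
still counted as reserved):

    PG_TRY();
    {
        parallel_vacuum_process_all_indexes_internal(pvs, num_index_scans,
                                                     vacuum);
    }
    PG_CATCH();
    {
        /* On error, release all reserved parallel workers, if any. */
        if (AmAutoVacuumWorkerProcess())
            AutoVacuumReleaseAllParallelWorkers();

        PG_RE_THROW();
    }
    PG_END_TRY();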
2. when an autovacuum worker (not a parallel vacuum worker) that uses
parallel vacuum gets SIGHUP, it errors out with the error message
"parameter "max_stack_depth" cannot be set during a parallel
operation". Autovacuum checks the configuration file reload in
vacuum_delay_point(), and while reloading the configuration file, it
attempts to set max_stack_depth in
InitializeGUCOptionsFromEnvironment() (which is called by
ProcessConfigFileInternal()). However, it cannot change
max_stack_depth since the worker is in parallel mode but
max_stack_depth doesn't have the GUC_ALLOW_IN_PARALLEL flag. This doesn't
happen in regular backends that are using parallel queries because they
check the configuration file reload at the end of each SQL command.
Hm, this is a really serious problem. I see only two ways to solve it
(neither of which is really good):

1)
Do not allow processing of the config file during parallel autovacuum
execution.

2)
Teach the autovacuum to enter parallel mode only during the index
vacuum/cleanup phase. I'm a bit wary about it, because the design says
that we should be in parallel mode during the whole parallel operation.
But actually, if we can make sure that all launched workers have exited,
I don't see any reason why we can't just exit parallel mode at the end of
parallel_vacuum_process_all_indexes.

What do you think about it? So far, I haven't made any changes related
to this problem.
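
For reference, option 2 would roughly scope the parallel-mode bracket to a
single index vacuum/cleanup pass, something like this (a sketch; today
EnterParallelMode()/ExitParallelMode() are called in
parallel_vacuum_init()/parallel_vacuum_end() instead):

    EnterParallelMode();
    LaunchParallelWorkers(pvs->pcxt);
    parallel_vacuum_process_safe_indexes(pvs);
    WaitForParallelWorkersToFinish(pvs->pcxt);
    ExitParallelMode();     /* safe once all launched workers have exited */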
Again, thank you for the review. Please see the v10 patches (only 0001
has been changed):
1) Reserve and release workers only inside parallel_vacuum_process_all_indexes.
2) Add a try/catch block to parallel_vacuum_process_all_indexes, so we can
release workers even after an error. This required adding a static variable
to account for the total number of reserved workers (av_nworkers_reserved).
3) Cap autovacuum_max_parallel_workers by max_worker_processes only inside
autovacuum code. The assign hook has been removed.
4) Use the shmem value for determining the maximum number of parallel
autovacuum workers (eliminating the race condition between launcher and
leader process).
--
Best regards,
Daniil Davydov
Attachments:
v10-0002-Logging-for-parallel-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v10-0002-Logging-for-parallel-autovacuum.patchDownload
From e991e071d4798e8c2ec576389f5a8592fe76282b Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Mon, 18 Aug 2025 15:14:25 +0700
Subject: [PATCH v10 2/3] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 27 ++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 28 ++++++++++++++++++---------
src/include/commands/vacuum.h | 16 +++++++++++++--
3 files changed, 58 insertions(+), 13 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 14036c27e87..f1a645e79a9 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -348,6 +348,12 @@ typedef struct LVRelState
/* Instrumentation counters */
int num_index_scans;
+
+ /*
+ * Number of planned and actually launched parallel workers for all index
+ * scans, or NULL
+ */
+ PVWorkersUsage *workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -688,6 +694,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc(sizeof(char *) * vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+ * Allocate space for workers usage statistics. Thus, we explicitly
+ * make clear that such statistics must be accumulated. For now, this
+ * is used only by the autovacuum leader worker, because it must log them at
+ * the end of table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
}
/*
@@ -1012,6 +1028,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->relnamespace,
vacrel->relname,
vacrel->num_index_scans);
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+ _("workers usage statistics for all of index scans : launched in total = %d, planned in total = %d\n"),
+ vacrel->workers_usage->nlaunched,
+ vacrel->workers_usage->nplanned);
appendStringInfo(&buf, _("pages: %u removed, %u remain, %u scanned (%.2f%% of total), %u eagerly scanned\n"),
vacrel->removed_pages,
new_rel_pages,
@@ -2634,7 +2655,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3047,7 +3069,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 4221e6084f5..02870ed1288 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,9 +227,10 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage * wusage);
static void parallel_vacuum_process_all_indexes_internal(ParallelVacuumState *pvs,
- int num_index_scans, bool vacuum);
+ int num_index_scans, bool vacuum,
+ PVWorkersUsage * wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -504,7 +505,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage * wusage)
{
Assert(!IsParallelWorker());
@@ -515,7 +516,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -523,7 +524,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage * wusage)
{
Assert(!IsParallelWorker());
@@ -535,7 +537,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -620,7 +622,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage * wusage)
{
/*
* Parallel autovacuum can reserve parallel workers. Use try/catch block
@@ -629,7 +631,7 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
PG_TRY();
{
parallel_vacuum_process_all_indexes_internal(pvs, num_index_scans,
- vacuum);
+ vacuum, wusage);
}
PG_CATCH();
{
@@ -644,7 +646,8 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
static void
parallel_vacuum_process_all_indexes_internal(ParallelVacuumState *pvs,
- int num_index_scans, bool vacuum)
+ int num_index_scans, bool vacuum,
+ PVWorkersUsage * wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -768,6 +771,13 @@ parallel_vacuum_process_all_indexes_internal(ParallelVacuumState *pvs,
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
pvs->pcxt->nworkers_launched, nworkers)));
+
+ /* Remember these values, if we were asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 14eeccbd718..d05ef7461ea 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -295,6 +295,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores information about total number of launched and planned
+ * workers during parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -389,11 +399,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage * wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage * wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
--
2.43.0
v10-0003-Documentation-for-parallel-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v10-0003-Documentation-for-parallel-autovacuum.patchDownload
From 62abb120d888a837e50bb55ba26ba740caad8f7a Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Tue, 22 Jul 2025 12:31:20 +0700
Subject: [PATCH v10 3/3] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 18 ++++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 50 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 20ccb2d6b54..b74053281de 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2835,6 +2835,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9189,6 +9190,23 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. It is capped by
+ <xref linkend="guc-max-worker-processes"/>. The default is 0,
+ which means no parallel index vacuuming.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index e7a9f58c015..4e450ba9066 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -896,6 +896,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table with the
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+ enabled, it will launch parallel workers in order to vacuum the indexes
+ of this table in parallel. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index dc000e913c1..288de6b0ffd 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If the value is 0, the parallel degree will be computed
+ based on the number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, the autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
v10-0001-Parallel-index-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v10-0001-Parallel-index-autovacuum.patchDownload
From a470d95603b437ef5aa45470ad7be61f03682493 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Sun, 20 Jul 2025 23:03:57 +0700
Subject: [PATCH v10 1/3] Parallel index autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 68 ++++++++-
src/backend/postmaster/autovacuum.c | 144 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc_tables.c | 10 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 5 +
src/include/utils/rel.h | 7 +
10 files changed, 241 insertions(+), 8 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 0af3fea68fa..1c98d43c6eb 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1872,6 +1881,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..4221e6084f5 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * future comments, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -225,6 +228,8 @@ static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
bool vacuum);
+static void parallel_vacuum_process_all_indexes_internal(ParallelVacuumState *pvs,
+ int num_index_scans, bool vacuum);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -373,8 +378,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -553,12 +559,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +608,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -610,6 +621,30 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
bool vacuum)
+{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Use try/catch block
+ * to ensure that all workers are released.
+ */
+ PG_TRY();
+ {
+ parallel_vacuum_process_all_indexes_internal(pvs, num_index_scans,
+ vacuum);
+ }
+ PG_CATCH();
+ {
+ /* Release all reserved parallel workers, if any. */
+ if (AmAutoVacuumWorkerProcess())
+ AutoVacuumReleaseAllParallelWorkers();
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
+}
+
+static void
+parallel_vacuum_process_all_indexes_internal(ParallelVacuumState *pvs,
+ int num_index_scans, bool vacuum)
{
int nworkers;
PVIndVacStatus new_status;
@@ -646,6 +681,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in autovacuum global state. Note that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = AutoVacuumReserveParallelWorkers(nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +732,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -738,6 +790,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Also release all previously reserved parallel autovacuum workers */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index ff96b36d710..78ceac67319 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,12 @@ int Log_autovacuum_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Variable to keep number of currently reserved parallel autovacuum workers.
+ * It is only relevant for parallel autovacuum leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -285,6 +291,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +307,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -364,6 +374,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
@@ -763,6 +774,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -779,6 +792,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+		 * If autovacuum_max_parallel_workers changed, we must keep the
+		 * number of available parallel autovacuum workers in shmem
+		 * correct.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -2871,8 +2893,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3353,6 +3379,85 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, the leader
+ * autovacuum process must call this function. It returns the number of
+ * parallel workers that can actually be launched and reserves these workers
+ * (if any) in the global autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if the
+ * caller would occupy all available workers.
+ */
+int
+AutoVacuumReserveParallelWorkers(int nworkers)
+{
+ int nreserved;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /*
+ * We can only reserve workers at the beginning of parallel index
+ * processing, so we must not have any reserved workers right now.
+ */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ nreserved = Min(AutoVacuumShmem->av_freeParallelWorkers, nworkers);
+	AutoVacuumShmem->av_freeParallelWorkers -= nreserved;
+
+ /* Remember how many workers we have reserved. */
+	av_nworkers_reserved += nreserved;
+
+ LWLockRelease(AutovacuumLock);
+ return nreserved;
+}
+
+/*
+ * The leader autovacuum process must call this function in order to update
+ * the global autovacuum state, so that other leaders will be able to use
+ * these parallel workers.
+ *
+ * 'nworkers' - how many workers the caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+	 * we must cap the number of available workers at its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Same as above, but release *all* parallel workers that were reserved by
+ * the current leader autovacuum process.
+ */
+void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3413,6 +3518,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3494,3 +3603,34 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that the number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+	 * Cap the number of free workers by the new parameter's value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+		 * If the user wants to increase the number of parallel autovacuum
+		 * workers, we must increase the number of free workers.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..fd00d6f89dc 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index d14b1678e7f..9ecb14227e5 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3604,6 +3604,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"autovacuum_max_parallel_workers", PGC_SIGHUP, VACUUM_AUTOVACUUM,
+ gettext_noop("Maximum number of parallel autovacuum workers, that can be taken from bgworkers pool."),
+ gettext_noop("This parameter is capped by \"max_worker_processes\" (not by \"autovacuum_max_workers\"!)."),
+ },
+ &autovacuum_max_parallel_workers,
+ 0, 0, MAX_BACKENDS,
+ NULL, NULL, NULL
+ },
+
{
{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
gettext_noop("Sets the maximum number of parallel processes per maintenance operation."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a9d8293474a..bbf5307000f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -683,6 +683,7 @@
autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 8b10f2313f3..290dd5cb8ec 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1402,6 +1402,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..85926415657 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..904c5ce37d8 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,6 +64,11 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int AutoVacuumReserveParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseAllParallelWorkers(void);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..edd286808bf 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If the value is 0, the
+ * parallel degree will be computed based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
On Mon, Aug 18, 2025 at 1:31 AM Daniil Davydov <3danissimo@gmail.com> wrote:
On Fri, Aug 15, 2025 at 3:41 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
2. when an autovacuum worker (not parallel vacuum worker) who uses
parallel vacuum gets SIGHUP, it errors out with the error message
"parameter "max_stack_depth" cannot be set during a parallel
operation". Autovacuum checks the configuration file reload in
vacuum_delay_point(), and while reloading the configuration file, it
attempts to set max_stack_depth in
InitializeGUCOptionsFromEnvironment() (which is called by
ProcessConfigFileInternal()). However, it cannot change
max_stack_depth since the worker is in parallel mode but
max_stack_depth doesn't have GUC_ALLOW_IN_PARALLEL flag. This doesn't
happen in regular backends who are using parallel queries because they
check the configuration file reload at the end of each SQL command.
Hm, this is a really serious problem. I see only two ways to solve it (both
are not really good):
1) Do not allow processing of the config file during parallel autovacuum
execution.
2) Teach the autovacuum to enter parallel mode only during the index
vacuum/cleanup phase. I'm a bit wary about it, because the design says that
we should be in parallel mode during the whole parallel operation. But
actually, if we can make sure that all launched workers have exited, I don't
see a reason why we can't just exit parallel mode at the end of
parallel_vacuum_process_all_indexes.
What do you think about it?
Hmm, given that we're trying to support parallel heap vacuum on
another thread[1] and we will probably support it in autovacuums, it
seems to me that these approaches won't work.
Another idea would be to allow autovacuum workers to process the
config file even in parallel mode. GUC changes in the leader worker
would not affect parallel vacuum workers, but that is fine by me. In the
context of autovacuum, only specific GUC parameters related to cost-based
delays need to also take effect in parallel vacuum workers.
Probably we need some changes to compute_parallel_delay() so that
parallel workers can compute the sleep time based on the new
vacuum_cost_limit and vacuum_cost_delay after the leader process
(i.e., autovacuum worker) reloads the config file.
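Roughly, the leader side could look like this (a sketch of the direction
only, not part of the posted patches; making ProcessConfigFile() safe to run
in parallel mode is exactly the part that needs the GUC changes):

/*
 * Sketch: in vacuum_delay_point(), let only the autovacuum leader reload
 * the config file; parallel vacuum workers keep their startup values until
 * the leader propagates the cost parameters that matter.
 */
if (ConfigReloadPending && AmAutoVacuumWorkerProcess() && !IsParallelWorker())
{
    ConfigReloadPending = false;
    ProcessConfigFile(PGC_SIGHUP);
    VacuumUpdateCosts();    /* refresh vacuum_cost_limit/vacuum_cost_delay */
}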
Again, thank you for the review. Please see the v10 patches (only 0001 has
been changed):
1) Reserve and release workers only inside parallel_vacuum_process_all_indexes.
2) Add a try/catch block to parallel_vacuum_process_all_indexes, so we can
release workers even after an error. This required adding a static variable
to account for the total number of reserved workers (av_nworkers_reserved).
3) Cap autovacuum_max_parallel_workers by max_worker_processes only inside
autovacuum code. The assign hook has been removed.
4) Use the shmem value for determining the maximum number of parallel
autovacuum workers (eliminating the race condition between the launcher and
leader processes).
Thank you for updating the patch! I'll review the new version patches.
Regards,
[1]: /messages/by-id/CAD21AoAEfCNv-GgaDheDJ+s-p_Lv1H24AiJeNoPGCmZNSwL1YA@mail.gmail.com
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Hi!
On Tue, Aug 19, 2025 at 12:04 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Thank you for updating the patch! I'll review the new version patches.
I've rebased this patchset to the current master. That required me to move
the new GUC definition to guc_parameters.dat. Also, I adjusted
typedefs.list and ran pgindent. Some notes about the patch follow.
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used
for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
Should we use MAX_PARALLEL_WORKER_LIMIT instead of hard-coded 1024 here?
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * future comments, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
Not sure about the usage of the word "future" here. It doesn't look clear
what it means. Could we use "below" or "within this file"?
I see parallel_vacuum_process_all_indexes() has a TRY/CATCH block. As I
heard, the overhead of setting/doing jumps is platform-dependent, and not
harmless on some platforms. Therefore, can we skip the TRY/CATCH block for
non-autovacuum vacuum? Possibly we could move it to AutoVacWorkerMain();
that would save us from repeatedly setting a jump in autovacuum workers too.
In general, I think this patchset badly lacks testing. I think it needs TAP
tests checking from the logs that autovacuum has been done in parallel. Also,
it would be good to set up some injection points and check that reserved
autovacuum parallel workers are released correctly in the case of errors.
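For instance, a single injection point right after the worker reservation
would already cover the error case (a sketch; this point name does not exist
anywhere yet, and the one-argument INJECTION_POINT() macro is assumed):

/* In parallel_vacuum_process_all_indexes(), after reserving workers. */
nworkers = AutoVacuumReserveParallelWorkers(nworkers);

/*
 * A TAP test could attach the injection_points module's "error" action
 * here, then verify that the reserved workers are returned to the pool.
 */
INJECTION_POINT("autovacuum-reserved-parallel-workers");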
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v11-0001-Parallel-index-autovacuum.patch (application/octet-stream)
From c40bfce2f812370315ca9ea735b9d3d31384d4d2 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Mon, 15 Sep 2025 21:12:01 +0300
Subject: [PATCH v11 1/3] Parallel index autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 68 ++++++++-
src/backend/postmaster/autovacuum.c | 144 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc_parameters.dat | 9 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 5 +
src/include/utils/rel.h | 7 +
10 files changed, 240 insertions(+), 8 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 0af3fea68fa..1c98d43c6eb 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1872,6 +1881,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..4221e6084f5 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * future comments, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -225,6 +228,8 @@ static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
bool vacuum);
+static void parallel_vacuum_process_all_indexes_internal(ParallelVacuumState *pvs,
+ int num_index_scans, bool vacuum);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -373,8 +378,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -553,12 +559,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +608,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -610,6 +621,30 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
bool vacuum)
+{
+ /*
+	 * Parallel autovacuum can reserve parallel workers. Use a try/catch
+	 * block to ensure that all workers are released.
+ */
+ PG_TRY();
+ {
+ parallel_vacuum_process_all_indexes_internal(pvs, num_index_scans,
+														 vacuum);
+ }
+ PG_CATCH();
+ {
+ /* Release all reserved parallel workers, if any. */
+ if (AmAutoVacuumWorkerProcess())
+ AutoVacuumReleaseAllParallelWorkers();
+
+ PG_RE_THROW();
+ }
+ PG_END_TRY();
+}
+
+static void
+parallel_vacuum_process_all_indexes_internal(ParallelVacuumState *pvs,
+ int num_index_scans, bool vacuum)
{
int nworkers;
PVIndVacStatus new_status;
@@ -646,6 +681,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+	 * Reserve workers in autovacuum global state. Note that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = AutoVacuumReserveParallelWorkers(nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +732,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -738,6 +790,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Also release all previously reserved parallel autovacuum workers */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index dce4c8c45b9..2bcd2ceb2a9 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -150,6 +150,12 @@ int Log_autovacuum_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Variable to keep the number of currently reserved parallel autovacuum
+ * workers. It is only relevant for the parallel autovacuum leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -284,6 +290,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -298,6 +306,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -363,6 +373,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
@@ -762,6 +773,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -778,6 +791,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+		 * If autovacuum_max_parallel_workers changed, we must keep the
+		 * number of available parallel autovacuum workers in shmem
+		 * correct.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -2870,8 +2892,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3352,6 +3378,85 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, the leader
+ * autovacuum process must call this function. It returns the number of
+ * parallel workers that can actually be launched and reserves these workers
+ * (if any) in the global autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if the
+ * caller would occupy all available workers.
+ */
+int
+AutoVacuumReserveParallelWorkers(int nworkers)
+{
+ int nreserved;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /*
+ * We can only reserve workers at the beginning of parallel index
+ * processing, so we must not have any reserved workers right now.
+ */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ nreserved = Min(AutoVacuumShmem->av_freeParallelWorkers, nworkers);
+	AutoVacuumShmem->av_freeParallelWorkers -= nreserved;
+
+ /* Remember how many workers we have reserved. */
+	av_nworkers_reserved += nreserved;
+
+ LWLockRelease(AutovacuumLock);
+ return nreserved;
+}
+
+/*
+ * The leader autovacuum process must call this function in order to update
+ * the global autovacuum state, so that other leaders will be able to use
+ * these parallel workers.
+ *
+ * 'nworkers' - how many workers the caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+	 * we must cap the number of available workers at its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Same as above, but release *all* parallel workers that were reserved by
+ * the current leader autovacuum process.
+ */
+void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3412,6 +3517,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3493,3 +3602,34 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that the number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+	 * Cap the number of free workers by the new parameter's value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+		 * If the user wants to increase the number of parallel autovacuum
+		 * workers, we must increase the number of free workers.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..fd00d6f89dc 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 6bc6be13d2a..1926218558a 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -2112,6 +2112,15 @@
max => 'MAX_BACKENDS',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel autovacuum workers that can be taken from the bgworkers pool.',
+ long_desc => 'This parameter is capped by "max_worker_processes" (not by "autovacuum_max_workers"!).',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '0',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'max_parallel_maintenance_workers', type => 'int', context => 'PGC_USERSET', group => 'RESOURCES_WORKER_PROCESSES',
short_desc => 'Sets the maximum number of parallel processes per maintenance operation.',
variable => 'max_parallel_maintenance_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index c36fcb9ab61..d277fef1735 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -684,6 +684,7 @@
autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 6b20a4404b2..0fb04e08c5d 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1402,6 +1402,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..85926415657 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e8135f41a1c..904c5ce37d8 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,6 +64,11 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int AutoVacuumReserveParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseAllParallelWorkers(void);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..edd286808bf 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If the value is 0, the
+ * parallel degree will be computed based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.39.5 (Apple Git-154)
v11-0003-Documentation-for-parallel-autovacuum.patch (application/octet-stream)
From 45c18534682dc4dff219518e2112b4861e3f6baf Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Tue, 22 Jul 2025 12:31:20 +0700
Subject: [PATCH v11 3/3] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 18 ++++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 50 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e9b420f3ddb..ffab6c6bea9 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2835,6 +2835,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9196,6 +9197,23 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. This parameter is
+ capped by <xref linkend="guc-max-worker-processes"/>. The default is 0,
+ which means no parallel index vacuuming.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index e7a9f58c015..4e450ba9066 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -896,6 +896,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table with the
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+ enabled, it will launch parallel workers in order to vacuum the indexes of
+ this table in parallel mode. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index dc000e913c1..288de6b0ffd 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If the value is 0, the parallel degree will be computed
+ based on the number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.39.5 (Apple Git-154)
v11-0002-Logging-for-parallel-autovacuum.patch (application/octet-stream)
From af2040cb5408f3876748f95dd8ee055358314caa Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Mon, 18 Aug 2025 15:14:25 +0700
Subject: [PATCH v11 2/3] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 27 ++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 28 ++++++++++++++++++---------
src/include/commands/vacuum.h | 16 +++++++++++++--
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 59 insertions(+), 13 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 981d9380a92..6fe84d8747a 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -347,6 +347,12 @@ typedef struct LVRelState
/* Instrumentation counters */
int num_index_scans;
+
+ /*
+	 * Numbers of planned and actually launched parallel workers across all
+	 * index scans, or NULL if not tracked
+ */
+ PVWorkersUsage *workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -687,6 +693,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc(sizeof(char *) * vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+		 * Allocate space for worker usage statistics, making it explicit
+		 * that such statistics must be accumulated. For now, this is used
+		 * only by the autovacuum leader worker, because it must log them at
+		 * the end of table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
}
/*
@@ -1011,6 +1027,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->relnamespace,
vacrel->relname,
vacrel->num_index_scans);
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+ _("workers usage statistics for all of index scans : launched in total = %d, planned in total = %d\n"),
+ vacrel->workers_usage->nlaunched,
+ vacrel->workers_usage->nplanned);
appendStringInfo(&buf, _("pages: %u removed, %u remain, %u scanned (%.2f%% of total), %u eagerly scanned\n"),
vacrel->removed_pages,
new_rel_pages,
@@ -2639,7 +2660,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3052,7 +3074,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 4221e6084f5..cada1722b76 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,9 +227,10 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage *wusage);
static void parallel_vacuum_process_all_indexes_internal(ParallelVacuumState *pvs,
- int num_index_scans, bool vacuum);
+ int num_index_scans, bool vacuum,
+ PVWorkersUsage *wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -504,7 +505,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -515,7 +516,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -523,7 +524,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -535,7 +537,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -620,7 +622,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage *wusage)
{
/*
* Parallel autovacuum can reserve parallel workers. Use try/catch block
@@ -629,7 +631,7 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
PG_TRY();
{
parallel_vacuum_process_all_indexes_internal(pvs, num_index_scans,
- vacuum);
+ vacuum, wusage);
}
PG_CATCH();
{
@@ -644,7 +646,8 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
static void
parallel_vacuum_process_all_indexes_internal(ParallelVacuumState *pvs,
- int num_index_scans, bool vacuum)
+ int num_index_scans, bool vacuum,
+ PVWorkersUsage *wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -768,6 +771,13 @@ parallel_vacuum_process_all_indexes_internal(ParallelVacuumState *pvs,
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
pvs->pcxt->nworkers_launched, nworkers)));
+
+ /* Remember these values, if we asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 14eeccbd718..0829a9658f2 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -295,6 +295,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores the total numbers of launched and planned parallel
+ * workers during parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -389,11 +399,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a13e8162890..6f9c418689c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2366,6 +2366,7 @@ PullFilterOps
PushFilter
PushFilterOps
PushFunction
+PVWorkersUsage
PyCFunction
PyMethodDef
PyModuleDef
--
2.39.5 (Apple Git-154)
On Mon, Sep 15, 2025 at 11:50 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
I've rebased this patchset to the current master. That required me to move the new GUC definition to guc_parameters.dat. Also, I adjusted typedefs.list and ran pgindent. Some notes about the patch follow.
Thank you for updating the patch!
I see parallel_vacuum_process_all_indexes() has a TRY/CATCH block. As I heard, the overhead of setting/doing jumps is platform-dependent, and not harmless on some platforms. Therefore, can we skip the TRY/CATCH block for non-autovacuum vacuum? Possibly we could move it to AutoVacWorkerMain(); that would save us from repeatedly setting a jump in autovacuum workers too.
I wonder if using the TRY/CATCH block is not enough to ensure that
autovacuum workers release the reserved parallel workers in FATAL
cases.
In general, I think this patchset badly lacks testing. It needs TAP tests checking from the logs that autovacuum has been done in parallel. Also, it would be good to set up some injection points and check that reserved autovacuum parallel workers are released correctly in the case of errors.
+1
IIUC the patch still has one problem in terms of reloading the
configuration parameters during parallel mode as I mentioned
before[1].
Regards,
[1]: /messages/by-id/CAD21AoBRRXbNJEvCjS-0XZgCEeRBzQPKmrSDjJ3wZ8TN28vaCQ@mail.gmail.com
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Hi,
On Tue, Sep 16, 2025 at 1:50 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
I've rebased this patchset to the current master. That required me to move
the new GUC definition to guc_parameters.dat. Also, I adjusted typedefs.list
and ran pgindent.
Thank you for looking into it!
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
Should we use MAX_PARALLEL_WORKER_LIMIT instead of hard-coded 1024 here?
I'm afraid that we will have to include an additional header file to do this.
As far as I know, we are trying not to do so. For now, I will leave it
hardcoded.
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * future comments, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
Not sure about the usage of the word "future" here. It doesn't look clear
what it means. Could we use "below" or "within this file"?
Agreed, fixed.
I see parallel_vacuum_process_all_indexes() has a TRY/CATCH block.
As I heard, the overhead of setting/doing jumps is platform-dependent, and
not harmless on some platforms. Therefore, can we skip the TRY/CATCH block
for non-autovacuum vacuum? Possibly we could move it to AutoVacWorkerMain();
that would save us from repeatedly setting a jump in autovacuum workers too.
Good idea. I found a try/catch block inside the do_autovacuum function,
which is obviously called only inside autovacuum. I decided to move the
ReleaseAllWorkers call there.
In general, I think this patchset badly lacks testing. It needs TAP tests
checking from the logs that autovacuum has been done in parallel. Also, it
would be good to set up some injection points and check that reserved
autovacuum parallel workers are released correctly in the case of errors.
Some time ago I tried to write a test, but it looked very ugly. Your idea
with injection points helped me to write much more reliable tests - see them
in the new (v12) pack of patches.
On Wed, Sep 17, 2025 at 1:31 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Sep 15, 2025 at 11:50 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
I see parallel_vacuum_process_all_indexes() has a TRY/CATCH block. As I
heard, the overhead of setting/doing jumps is platform-dependent, and not
harmless on some platforms. Therefore, can we skip the TRY/CATCH block for
non-autovacuum vacuum? Possibly we could move it to AutoVacWorkerMain();
that would save us from repeatedly setting a jump in autovacuum workers too.
I wonder if using the TRY/CATCH block is not enough to ensure that
autovacuum workers release the reserved parallel workers in FATAL cases.
That's true. I'll register a "before_shmem_exit" callback for autovacuum,
which will release workers if any are still reserved when the a/v worker
exits abnormally.
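Schematically, it could look like this (a sketch; the callback name is
illustrative):

/* Exit callback: runs even on FATAL exit, unlike PG_CATCH cleanup. */
static void
av_release_parallel_workers_at_exit(int code, Datum arg)
{
    /* Returns reserved workers (if any) to the shared pool. */
    AutoVacuumReleaseAllParallelWorkers();
}

/* In AutoVacWorkerMain(), during worker startup: */
before_shmem_exit(av_release_parallel_workers_at_exit, (Datum) 0);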
IIUC the patch still has one problem in terms of reloading the
configuration parameters during parallel mode as I mentioned
before[1].
Yep. I was happy to see that you think that config file processing is OK for
autovacuum :)
I'll allow it for the a/v leader. I've also thought about "compute_parallel_delay".
The simplest solution that I see is to move the cost-based delay parameters to
the shared state (PVShared) and create variables such as
VacuumSharedCostBalance, so we can use them inside vacuum_delay_point.
What do you think about this idea?
Other approaches, like telling parallel workers that they should
reload the config, look a bit too invasive IMO.
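To make the proposal concrete, something along these lines (field and
variable names are illustrative only, not a worked-out design):

/* hypothetical additions to PVShared, filled in by the leader on reload */
double		cost_delay;
int			cost_limit;
pg_atomic_uint32 cost_balance;	/* balance shared by leader and workers */

/* and, roughly, inside vacuum_delay_point() */
if (VacuumSharedCostBalance != NULL)
{
	uint32		balance;

	balance = pg_atomic_add_fetch_u32(VacuumSharedCostBalance,
									  VacuumCostBalance);
	if (balance > (uint32) VacuumCostLimit)
	{
		/* nap, then debit the shared balance by the amount slept off */
	}
}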
Thanks everybody for the review! Please see the v12 patches:
1) Implement tests for parallel autovacuum
2) Fix the error with unreleased workers - see the TRY/CATCH block in do_autovacuum
and the before_shmem_exit callback registration in AutoVacWorkerMain
3) Allow the a/v leader to process the config file (see guc.c)
--
Best regards,
Daniil Davydov
Attachments:
v12-0004-Documentation-for-parallel-autovacuum.patch (text/x-patch; charset=US-ASCII)
From 2afcd438a54849d3fe4e4f4afc230fb2c69c09db Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Tue, 28 Oct 2025 15:20:12 +0700
Subject: [PATCH v12 4/4] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 18 ++++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 50 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0a2a8b49fdb..d3ea02cbbe0 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2835,6 +2835,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9254,6 +9255,23 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. This value is capped by
+ <xref linkend="guc-max-worker-processes"/>. The default is 0,
+ which means no parallel index vacuuming.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index dc59c88319e..2db34cec0a9 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -897,6 +897,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table with the enabled
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter,
+ it will launch parallel workers in order to vacuum indexes of this table
+ in parallel mode. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index a157a244e4e..6eb58c95d9e 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If the value is 0, the parallel degree is computed based on
+ the number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
v12-0003-Tests-for-parallel-autovacuum.patch (text/x-patch; charset=US-ASCII)
From f923d8f302b5a2d307718a665129e8bef089211c Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Tue, 28 Oct 2025 15:19:17 +0700
Subject: [PATCH v12 3/4] Tests for parallel autovacuum
---
src/backend/commands/vacuumparallel.c | 8 +
src/backend/postmaster/autovacuum.c | 14 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 26 ++
src/test/modules/test_autovacuum/meson.build | 36 +++
.../modules/test_autovacuum/t/001_basic.pl | 165 ++++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 34 +++
.../modules/test_autovacuum/test_autovacuum.c | 255 ++++++++++++++++++
.../test_autovacuum/test_autovacuum.control | 3 +
9 files changed, 543 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_basic.pl
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 9a258238650..0cfdf79cb6c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -39,6 +39,7 @@
#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -752,6 +753,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
}
}
+ /*
+ * To be able to check that all reserved parallel workers are released
+ * even when the leader fails, allow injection points to trigger a failure
+ * at this point.
+ */
+ INJECTION_POINT("autovacuum-trigger-leader-failure", NULL);
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 9499d4f0c12..a6358200629 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3437,6 +3437,13 @@ AutoVacuumReserveParallelWorkers(int nworkers)
/* Remember how many workers we have reserved. */
av_nworkers_reserved += nreserved;
+ /*
+ * Injection point that lets tests observe the number of available
+ * parallel autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
return nreserved;
}
@@ -3467,6 +3474,13 @@ AutoVacuumReleaseParallelWorkers(int nworkers)
/* Don't have to remember these workers anymore. */
av_nworkers_reserved -= nworkers;
+ /*
+ * Injection point that lets tests observe the number of available
+ * parallel autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..4cf7344b2ac
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,26 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+ $(WIN32RES) \
+ test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..3441e5e49cf
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+ 'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+ test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_autovacuum',
+ '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+ test_autovacuum_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+ 'test_autovacuum.control',
+ 'test_autovacuum--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_basic.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_basic.pl b/src/test/modules/test_autovacuum/t/001_basic.pl
new file mode 100644
index 00000000000..22eaaa7da9d
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_basic.pl
@@ -0,0 +1,165 @@
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres so that it can launch parallel autovacuum workers, log
+# all the information we are interested in, and run autovacuum frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 10
+ log_min_messages = debug2
+ log_autovacuum_min_duration = 0
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table with specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ autovacuum_enabled = false);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Insert specified tuples num into the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac VALUES (i, i + 1, i + 2, i + 3);
+ END LOOP;
+ END \$\$;
+});
+
+# Now, create some dead tuples and refresh table statistics
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION test_autovacuum;
+ SELECT inj_set_free_workers_attach();
+ SELECT inj_leader_failure_attach();
+});
+
+# Test 1:
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check that it can.
+# Also check that all requested workers are:
+# 1) launched
+# 2) correctly released
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on the table has completed. At the same
+# time, check that the required number of parallel workers was launched.
+$node->wait_for_log(qr/workers usage statistics for all of index scans : / .
+ qr/launched in total = 2, planned in total = 2/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10, 'All parallel workers have been released by the leader');
+
+# Disable autovacuum on the table while preparing for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_2 = 0 WHERE (col_2 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 2:
+# We want parallel autovacuum workers to be released even if the leader gets
+# an error. First, simulate a situation where the leader exits due to an ERROR.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('ERROR');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/error, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after ERROR');
+
+# Disable autovacuum on the table while preparing for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_3 = 0 WHERE (col_3 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 3:
+# Same as Test 2, but simulate a situation where the leader exits due to FATAL.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('FATAL');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/fatal, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after FATAL');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT inj_set_free_workers_detach();
+ SELECT inj_leader_failure_detach();
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..017d5da85ea
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,34 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting or interfering with autovacuum state
+ */
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION trigger_leader_failure(failure_type text)
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+/*
+ * Injection point related functions
+ */
+CREATE FUNCTION inj_set_free_workers_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_set_free_workers_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..2c979c405bd
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,255 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ * Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "postmaster/autovacuum.h"
+#include "storage/shmem.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "utils/builtins.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+typedef enum AVLeaderFailureType
+{
+ FAIL_NONE,
+ FAIL_ERROR,
+ FAIL_FATAL,
+} AVLeaderFailureType;
+
+typedef struct InjPointState
+{
+ bool enabled_set_free_workers;
+ uint32 free_parallel_workers;
+
+ bool enabled_leader_failure;
+ AVLeaderFailureType ftype;
+} InjPointState;
+
+static InjPointState *inj_point_state;
+
+/* Shared memory init callbacks */
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
+static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+
+static void
+test_autovacuum_shmem_request(void)
+{
+ if (prev_shmem_request_hook)
+ prev_shmem_request_hook();
+
+ RequestAddinShmemSpace(sizeof(InjPointState));
+}
+
+static void
+test_autovacuum_shmem_startup(void)
+{
+ bool found;
+
+ if (prev_shmem_startup_hook)
+ prev_shmem_startup_hook();
+
+ /* Create or attach to the shared memory state */
+ LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+
+ inj_point_state = ShmemInitStruct("injection_points",
+ sizeof(InjPointState),
+ &found);
+
+ if (!found)
+ {
+ /* First time through, initialize */
+ inj_point_state->enabled_leader_failure = false;
+ inj_point_state->enabled_set_free_workers = false;
+ inj_point_state->ftype = FAIL_NONE;
+
+ /* Keep it in sync with AutoVacuumShmemInit */
+ inj_point_state->free_parallel_workers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+
+ InjectionPointAttach("autovacuum-set-free-parallel-workers-num",
+ "test_autovacuum",
+ "inj_set_free_workers",
+ NULL,
+ 0);
+
+ InjectionPointAttach("autovacuum-trigger-leader-failure",
+ "test_autovacuum",
+ "inj_trigger_leader_failure",
+ NULL,
+ 0);
+ }
+
+ LWLockRelease(AddinShmemInitLock);
+}
+
+void
+_PG_init(void)
+{
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ prev_shmem_request_hook = shmem_request_hook;
+ shmem_request_hook = test_autovacuum_shmem_request;
+ prev_shmem_startup_hook = shmem_startup_hook;
+ shmem_startup_hook = test_autovacuum_shmem_startup;
+}
+
+extern PGDLLEXPORT void inj_set_free_workers(const char *name,
+ const void *private_data,
+ void *arg);
+extern PGDLLEXPORT void inj_trigger_leader_failure(const char *name,
+ const void *private_data,
+ void *arg);
+
+/*
+ * Set number of currently available parallel a/v workers. This value may
+ * change after reserving or releasing such workers.
+ *
+ * Function called from parallel autovacuum leader.
+ */
+void
+inj_set_free_workers(const char *name, const void *private_data, void *arg)
+{
+ ereport(LOG,
+ errmsg("set parallel workers injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->enabled_set_free_workers)
+ {
+ Assert(arg != NULL);
+ inj_point_state->free_parallel_workers = *(uint32 *) arg;
+ }
+}
+
+/*
+ * Throw an ERROR or FATAL, if somebody requested it.
+ *
+ * Function called from parallel autovacuum leader.
+ */
+void
+inj_trigger_leader_failure(const char *name, const void *private_data,
+ void *arg)
+{
+ int elevel;
+ char *elevel_str;
+
+ ereport(LOG,
+ errmsg("trigger leader failure injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->ftype == FAIL_NONE ||
+ !inj_point_state->enabled_leader_failure)
+ {
+ return;
+ }
+
+ elevel = inj_point_state->ftype == FAIL_ERROR ? ERROR : FATAL;
+ elevel_str = elevel == ERROR ? "error" : "fatal";
+
+ ereport(elevel, errmsg("%s, triggered by injection point", elevel_str));
+}
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ uint32 nworkers;
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ nworkers = inj_point_state->free_parallel_workers;
+ LWLockRelease(AutovacuumLock);
+
+ PG_RETURN_UINT32(nworkers);
+}
+
+PG_FUNCTION_INFO_V1(trigger_leader_failure);
+Datum
+trigger_leader_failure(PG_FUNCTION_ARGS)
+{
+ const char *failure_type = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ if (strcmp(failure_type, "NONE") == 0)
+ inj_point_state->ftype = FAIL_NONE;
+ else if (strcmp(failure_type, "ERROR") == 0)
+ inj_point_state->ftype = FAIL_ERROR;
+ else if (strcmp(failure_type, "FATAL") == 0)
+ inj_point_state->ftype = FAIL_FATAL;
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid leader failure type : %s", failure_type)));
+
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_attach);
+Datum
+inj_set_free_workers_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = true;
+ inj_point_state->ftype = FAIL_NONE;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_detach);
+Datum
+inj_set_free_workers_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_attach);
+Datum
+inj_leader_failure_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = true;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_detach);
+Datum
+inj_leader_failure_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
--
2.43.0
v12-0002-Logging-for-parallel-autovacuum.patch (text/x-patch; charset=US-ASCII)
From 57ea4c318664f6e0b72040d14e7a7d9f82d2036c Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Mon, 18 Aug 2025 15:14:25 +0700
Subject: [PATCH v12 2/4] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 27 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 20 ++++++++++++++------
src/include/commands/vacuum.h | 16 ++++++++++++++--
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 54 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index d2b031fdd06..d364cde5fe5 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -347,6 +347,12 @@ typedef struct LVRelState
/* Instrumentation counters */
int num_index_scans;
+
+ /*
+ * Number of planned and actually launched parallel workers across all
+ * index scans, or NULL if these are not being tracked
+ */
+ PVWorkersUsage *workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -700,6 +706,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc(sizeof(char *) * vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+ * Allocate space for the workers usage statistics. By doing so, we
+ * explicitly request that such statistics be accumulated. For now, this
+ * is used only by the autovacuum leader worker, because it must log them
+ * at the end of table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
}
/*
@@ -1024,6 +1040,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->relnamespace,
vacrel->relname,
vacrel->num_index_scans);
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+ _("workers usage statistics for all of index scans : launched in total = %d, planned in total = %d\n"),
+ vacrel->workers_usage->nlaunched,
+ vacrel->workers_usage->nplanned);
appendStringInfo(&buf, _("pages: %u removed, %u remain, %u scanned (%.2f%% of total), %u eagerly scanned\n"),
vacrel->removed_pages,
new_rel_pages,
@@ -2653,7 +2674,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3085,7 +3107,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index acd53b85b1c..9a258238650 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage *wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -502,7 +502,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -513,7 +513,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -521,7 +521,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -533,7 +534,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -618,7 +619,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage *wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -742,6 +743,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
pvs->pcxt->nworkers_launched, nworkers)));
+
+ /* Remember these values, if we were asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1f3290c7fbf..90709ca3107 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores the total number of launched and planned workers
+ * during a parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +404,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 43fe3bcd593..830763eb2fa 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2372,6 +2372,7 @@ PullFilterOps
PushFilter
PushFilterOps
PushFunction
+PVWorkersUsage
PyCFunction
PyMethodDef
PyModuleDef
--
2.43.0
v12-0001-Parallel-index-autovacuum.patch (text/x-patch; charset=US-ASCII)
From 2217fc7b293c267ab497c84251dae31c0bfda7e9 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Tue, 28 Oct 2025 17:47:13 +0700
Subject: [PATCH v12] Parallel index autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 42 ++++-
src/backend/postmaster/autovacuum.c | 163 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 9 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 5 +
src/include/utils/rel.h | 7 +
11 files changed, 239 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 9e288dfecbf..3cc29d4454a 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1881,6 +1890,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..acd53b85b1c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -553,12 +557,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +606,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -646,6 +655,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in the autovacuum global state. Note that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = AutoVacuumReserveParallelWorkers(nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +706,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -738,6 +764,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Also release all previously reserved parallel autovacuum workers */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 5084af7dfb6..9499d4f0c12 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,12 @@ int Log_autoanalyze_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Variable to keep number of currently reserved parallel autovacuum workers.
+ * It is only relevant for parallel autovacuum leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -285,6 +291,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +307,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -364,6 +374,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
@@ -763,6 +774,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -779,6 +792,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must update the
+ * number of available parallel autovacuum workers in shmem
+ * accordingly.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1383,6 +1405,17 @@ avl_sigusr2_handler(SIGNAL_ARGS)
* AUTOVACUUM WORKER CODE
********************************************************************/
+/*
+ * If parallel autovacuum leader is finishing due to FATAL error, make sure
+ * that all reserved workers are released.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ if (code != 0)
+ AutoVacuumReleaseAllParallelWorkers();
+}
+
/*
* Main entry point for autovacuum worker processes.
*/
@@ -1429,6 +1462,8 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGFPE, FloatExceptionHandler);
pqsignal(SIGCHLD, SIG_DFL);
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
/*
* Create a per-backend PGPROC struct in shared memory. We must do this
* before we can use LWLocks or access any shared memory.
@@ -2480,6 +2515,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure that
+ * all reserved workers are released.
+ */
+ AutoVacuumReleaseAllParallelWorkers();
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -2877,8 +2918,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process the table's indexes in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3360,6 +3405,85 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, the leader
+ * autovacuum process must call this function. It returns the number of
+ * parallel workers that can actually be launched and reserves these workers
+ * (if any) in the global autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if the
+ * caller ends up occupying all available workers.
+ */
+int
+AutoVacuumReserveParallelWorkers(int nworkers)
+{
+ int nreserved;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /*
+ * We can only reserve workers at the beginning of parallel index
+ * processing, so we must not have any reserved workers right now.
+ */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ nreserved = Min(AutoVacuumShmem->av_freeParallelWorkers, nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= nreserved;
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved += nreserved;
+
+ LWLockRelease(AutovacuumLock);
+ return nreserved;
+}
+
+/*
+ * The leader autovacuum process must call this function in order to update
+ * the global autovacuum state, so other leaders will be able to use these
+ * parallel workers.
+ *
+ * 'nworkers' is the number of workers the caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap available workers number by its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Same as above, but release *all* parallel workers that were reserved by
+ * the current leader autovacuum process.
+ */
+void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3420,6 +3544,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3501,3 +3629,34 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that the number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap the number of free workers at the new parameter's value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+ * If the user increased the number of parallel autovacuum workers, we
+ * must increase the number of free workers accordingly.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..fd00d6f89dc 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index a82286cc98a..e7c5982da2a 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3387,9 +3387,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception, because only the
+ * cost-based delay parameters need to be propagated to parallel vacuum
+ * workers, and we will handle that elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index d6fc8333850..5fbda66b3d4 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -2129,6 +2129,15 @@
max => 'MAX_BACKENDS',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel autovacuum workers that can be taken from the bgworkers pool.',
+ long_desc => 'This parameter is capped by "max_worker_processes" (not by "autovacuum_max_workers"!).',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '0',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'max_parallel_maintenance_workers', type => 'int', context => 'PGC_USERSET', group => 'RESOURCES_WORKER_PROCESSES',
short_desc => 'Sets the maximum number of parallel processes per maintenance operation.',
variable => 'max_parallel_maintenance_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index f62b61967ef..b3e471ed33e 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -691,6 +691,7 @@
autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 36ea6a4d557..d89da606920 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1412,6 +1412,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..85926415657 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 023ac6d5fa8..f4b93b44531 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -65,6 +65,11 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int AutoVacuumReserveParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseAllParallelWorkers(void);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 80286076a11..e879fdcfc69 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If the value is 0, the
+ * parallel degree is computed based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
Hi,
On Tue, Oct 28, 2025 at 8:09 PM Daniil Davydov <3danissimo@gmail.com> wrote:
Thanks everybody for the review! Please see the v12 patches:
1) Implement tests for parallel autovacuum
I forgot to add the new directory to the Makefile and meson.build files.
This is fixed in the v13 patches (only 0003 has changed).
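For anyone applying the patches by hand, the v13-only change presumably
boils down to the usual registration lines, roughly:

	# src/test/modules/Makefile
	SUBDIRS += test_autovacuum

	# src/test/modules/meson.build
	subdir('test_autovacuum')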
--
Best regards,
Daniil Davydov
Attachments:
v13-0002-Logging-for-parallel-autovacuum.patch (text/x-patch; charset=US-ASCII)
From d1544aaad4206687afe730a43e818f16a4f67710 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Fri, 31 Oct 2025 14:42:46 +0700
Subject: [PATCH v13 2/4] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 27 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 20 ++++++++++++++------
src/include/commands/vacuum.h | 16 ++++++++++++++--
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 54 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index d2b031fdd06..d364cde5fe5 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -347,6 +347,12 @@ typedef struct LVRelState
/* Instrumentation counters */
int num_index_scans;
+
+ /*
+ * Number of planned and actually launched parallel workers across all
+ * index scans, or NULL if these are not being tracked
+ */
+ PVWorkersUsage *workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -700,6 +706,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc(sizeof(char *) * vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+ * Allocate space for the workers usage statistics. By doing so, we
+ * explicitly request that such statistics be accumulated. For now, this
+ * is used only by the autovacuum leader worker, because it must log them
+ * at the end of table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
}
/*
@@ -1024,6 +1040,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->relnamespace,
vacrel->relname,
vacrel->num_index_scans);
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+ _("workers usage statistics for all of index scans : launched in total = %d, planned in total = %d\n"),
+ vacrel->workers_usage->nlaunched,
+ vacrel->workers_usage->nplanned);
appendStringInfo(&buf, _("pages: %u removed, %u remain, %u scanned (%.2f%% of total), %u eagerly scanned\n"),
vacrel->removed_pages,
new_rel_pages,
@@ -2653,7 +2674,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3085,7 +3107,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index acd53b85b1c..9a258238650 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage *wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -502,7 +502,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -513,7 +513,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -521,7 +521,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -533,7 +534,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -618,7 +619,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage *wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -742,6 +743,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
pvs->pcxt->nworkers_launched, nworkers)));
+
+ /* Remember these values, if we were asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1f3290c7fbf..90709ca3107 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores the total number of launched and planned workers
+ * during a parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched; /* total number of workers launched */
+ int nplanned; /* total number of workers planned */
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +404,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 018b5919cf6..2c73faa30e7 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2372,6 +2372,7 @@ PullFilterOps
PushFilter
PushFilterOps
PushFunction
+PVWorkersUsage
PyCFunction
PyMethodDef
PyModuleDef
--
2.43.0
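For readers skimming the diff above: a minimal sketch of how a caller is expected
to use the new PVWorkersUsage plumbing. Only the struct and the two function
signatures come from the patch; the surrounding variable names (pvs,
old_live_tuples, reltuples, estimated_count) are stand-ins for the caller's
state, and the log format is made up for illustration:
```c
/* Accumulate launched/planned totals across every index-processing pass. */
PVWorkersUsage wusage = {0, 0};

/* Each bulk-delete pass adds its per-pass counts into wusage. */
parallel_vacuum_bulkdel_all_indexes(pvs, old_live_tuples,
                                    num_index_scans, &wusage);

/* The final cleanup pass contributes its counts as well. */
parallel_vacuum_cleanup_all_indexes(pvs, reltuples, num_index_scans,
                                    estimated_count, &wusage);

/* NULL is accepted when the caller does not care about the totals. */
elog(LOG, "parallel index vacuum: launched in total = %d, planned in total = %d",
     wusage.nlaunched, wusage.nplanned);
```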
v13-0004-Documentation-for-parallel-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v13-0004-Documentation-for-parallel-autovacuum.patchDownload
From 01ecdb2e6ebc57ddd7f343d135617a92ba0ebf73 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Fri, 31 Oct 2025 14:44:35 +0700
Subject: [PATCH v13 4/4] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 18 ++++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 50 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0a2a8b49fdb..d3ea02cbbe0 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2835,6 +2835,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9254,6 +9255,23 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. This value is
+ capped by <xref linkend="guc-max-worker-processes"/>. The default is 0,
+ which means no parallel index vacuuming.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index dc59c88319e..2db34cec0a9 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -897,6 +897,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process encounters a table with the
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+ enabled, it will launch parallel workers to vacuum the indexes of this
+ table in parallel. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index a157a244e4e..6eb58c95d9e 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If the value is 0, the parallel degree is computed based on
+ the number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
v13-0001-Parallel-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v13-0001-Parallel-autovacuum.patchDownload
From 72bfe3c48b7c445038cc8c83e3b9fd5ad72e27d2 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Fri, 31 Oct 2025 14:42:27 +0700
Subject: [PATCH v13 1/4] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 42 ++++-
src/backend/postmaster/autovacuum.c | 163 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 9 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 5 +
src/include/utils/rel.h | 7 +
11 files changed, 239 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 9e288dfecbf..3cc29d4454a 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1881,6 +1890,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..acd53b85b1c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -553,12 +557,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +606,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -646,6 +655,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in the autovacuum global state. Note that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = AutoVacuumReserveParallelWorkers(nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +706,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -738,6 +764,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Also release all previously reserved parallel autovacuum workers */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 5084af7dfb6..9499d4f0c12 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,12 @@ int Log_autoanalyze_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Variable that keeps the number of currently reserved parallel autovacuum
+ * workers. It is only relevant for the parallel autovacuum leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -285,6 +291,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +307,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -364,6 +374,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
@@ -763,6 +774,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -779,6 +792,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must keep the number of
+ * available parallel autovacuum workers in shmem consistent with the new
+ * value.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1383,6 +1405,17 @@ avl_sigusr2_handler(SIGNAL_ARGS)
* AUTOVACUUM WORKER CODE
********************************************************************/
+/*
+ * If a parallel autovacuum leader is exiting due to a FATAL error, make sure
+ * that all reserved workers are released.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ if (code != 0)
+ AutoVacuumReleaseAllParallelWorkers();
+}
+
/*
* Main entry point for autovacuum worker processes.
*/
@@ -1429,6 +1462,8 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGFPE, FloatExceptionHandler);
pqsignal(SIGCHLD, SIG_DFL);
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
/*
* Create a per-backend PGPROC struct in shared memory. We must do this
* before we can use LWLocks or access any shared memory.
@@ -2480,6 +2515,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure that
+ * all reserved workers are released.
+ */
+ AutoVacuumReleaseAllParallelWorkers();
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -2877,8 +2918,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process the table's indexes in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3360,6 +3405,85 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, the leader
+ * autovacuum process must call this function. It returns the number of
+ * parallel workers that can actually be launched and reserves these workers
+ * (if any) in the global autovacuum state.
+ *
+ * NOTE: We try to provide as many workers as requested, even if the caller
+ * then occupies all remaining available workers.
+ */
+int
+AutoVacuumReserveParallelWorkers(int nworkers)
+{
+ int nreserved;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /*
+ * We can only reserve workers at the beginning of parallel index
+ * processing, so we must not have any reserved workers right now.
+ */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ nreserved = Min(AutoVacuumShmem->av_freeParallelWorkers, nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= nreserved;
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved += nreserved;
+
+ LWLockRelease(AutovacuumLock);
+ return nreserved;
+}
+
+/*
+ * The leader autovacuum process must call this function in order to update
+ * the global autovacuum state, so that other leaders can use these parallel
+ * workers.
+ *
+ * 'nworkers' is the number of workers the caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap the number of available workers at its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Same as above, but release *all* parallel workers that were reserved by
+ * the current leader autovacuum process.
+ */
+void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3420,6 +3544,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3501,3 +3629,34 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that the number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap the number of free workers at the new parameter's value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+ * If the user increased the maximum number of parallel autovacuum
+ * workers, increase the number of free workers accordingly.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..fd00d6f89dc 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 679846da42c..d1d796a1b18 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3315,9 +3315,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception, because only the
+ * cost-based delay parameters need to be propagated to parallel vacuum
+ * workers, and we handle that elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index d6fc8333850..5fbda66b3d4 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -2129,6 +2129,15 @@
max => 'MAX_BACKENDS',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel autovacuum workers that can be taken from the bgworkers pool.',
+ long_desc => 'This parameter is capped by "max_worker_processes" (not by "autovacuum_max_workers").',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '0',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'max_parallel_maintenance_workers', type => 'int', context => 'PGC_USERSET', group => 'RESOURCES_WORKER_PROCESSES',
short_desc => 'Sets the maximum number of parallel processes per maintenance operation.',
variable => 'max_parallel_maintenance_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index f62b61967ef..b3e471ed33e 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -691,6 +691,7 @@
autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 36ea6a4d557..d89da606920 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1412,6 +1412,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..85926415657 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 023ac6d5fa8..f4b93b44531 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -65,6 +65,11 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int AutoVacuumReserveParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseAllParallelWorkers(void);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 80286076a11..e879fdcfc69 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If the value is 0, the
+ * parallel degree is computed from the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
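To make the life cycle in 0001 easier to review, here is the reservation
protocol the leader goes through, condensed into one place. This is a sketch
built from the patch's own functions; the error paths are covered separately
by the PG_CATCH block in do_autovacuum and the before_shmem_exit callback:
```c
/* Leader side, per parallel index-processing pass (sketch). */
int     nreserved = AutoVacuumReserveParallelWorkers(nworkers);

LaunchParallelWorkers(pvs->pcxt);

/* Give back reservations that did not turn into launched workers. */
if (pvs->pcxt->nworkers_launched < nreserved)
    AutoVacuumReleaseParallelWorkers(nreserved - pvs->pcxt->nworkers_launched);

/* ... index vacuuming runs, then the leader waits for the workers ... */

/* Finally, release the workers that actually ran. */
if (pvs->pcxt->nworkers_launched > 0)
    AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
```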
v13-0003-Tests-for-parallel-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v13-0003-Tests-for-parallel-autovacuum.patchDownload
From 534ed530a7fd11cad34f6c39678b9937971b80b0 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Fri, 31 Oct 2025 14:44:12 +0700
Subject: [PATCH v13 3/4] Tests for parallel autovacuum
---
src/backend/commands/vacuumparallel.c | 8 +
src/backend/postmaster/autovacuum.c | 14 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 26 ++
src/test/modules/test_autovacuum/meson.build | 36 +++
.../modules/test_autovacuum/t/001_basic.pl | 165 ++++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 34 +++
.../modules/test_autovacuum/test_autovacuum.c | 255 ++++++++++++++++++
.../test_autovacuum/test_autovacuum.control | 3 +
11 files changed, 545 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_basic.pl
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 9a258238650..0cfdf79cb6c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -39,6 +39,7 @@
#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -752,6 +753,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
}
}
+ /*
+ * To be able to test that all reserved parallel workers are released even
+ * if the leader fails, allow injection points to trigger a failure at this
+ * point.
+ */
+ INJECTION_POINT("autovacuum-trigger-leader-failure", NULL);
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 9499d4f0c12..a6358200629 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3437,6 +3437,13 @@ AutoVacuumReserveParallelWorkers(int nworkers)
/* Remember how many workers we have reserved. */
av_nworkers_reserved += nreserved;
+ /*
+ * Injection point to help exercise the number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
return nreserved;
}
@@ -3467,6 +3474,13 @@ AutoVacuumReleaseParallelWorkers(int nworkers)
/* Don't have to remember these workers anymore. */
av_nworkers_reserved -= nworkers;
+ /*
+ * Injection point to help exercise the number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 902a7954101..f09d0060248 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -15,6 +15,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 14fc761c4cf..ee7e855def0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -14,6 +14,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..4cf7344b2ac
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,26 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+ $(WIN32RES) \
+ test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..3441e5e49cf
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+ 'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+ test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_autovacuum',
+ '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+ test_autovacuum_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+ 'test_autovacuum.control',
+ 'test_autovacuum--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_basic.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_basic.pl b/src/test/modules/test_autovacuum/t/001_basic.pl
new file mode 100644
index 00000000000..22eaaa7da9d
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_basic.pl
@@ -0,0 +1,165 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres so that it can launch parallel autovacuum workers, log
+# all the information we are interested in, and run autovacuum frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 10
+ log_min_messages = debug2
+ log_autovacuum_min_duration = 0
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table with specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ autovacuum_enabled = false);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Insert the specified number of tuples into the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac VALUES (i, i + 1, i + 2, i + 3);
+ END LOOP;
+ END \$\$;
+});
+
+# Now, create some dead tuples and refresh table statistics
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION test_autovacuum;
+ SELECT inj_set_free_workers_attach();
+ SELECT inj_leader_failure_attach();
+});
+
+# Test 1:
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can.
+# Also check whether all requested workers are:
+# 1) launched
+# 2) correctly released
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on the table is completed. At the same time,
+# we check that the required number of parallel workers has been started.
+$node->wait_for_log(qr/workers usage statistics for all of index scans : / .
+ qr/launched in total = 2, planned in total = 2/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10, 'All parallel workers have been released by the leader');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_2 = 0 WHERE (col_2 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 2:
+# We want parallel autovacuum workers to be released even if the leader gets
+# an error. First, simulate the situation where the leader exits due to an ERROR.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('ERROR');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/error, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after ERROR');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_3 = 0 WHERE (col_3 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 3:
+# Same as Test 2, but simulate the situation where the leader exits due to FATAL.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('FATAL');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/fatal, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after FATAL');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT inj_set_free_workers_detach();
+ SELECT inj_leader_failure_detach();
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..017d5da85ea
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,34 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting or interfering with autovacuum state
+ */
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION trigger_leader_failure(failure_type text)
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+/*
+ * Injection point related functions
+ */
+CREATE FUNCTION inj_set_free_workers_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_set_free_workers_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..2c979c405bd
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,255 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ * Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "postmaster/autovacuum.h"
+#include "storage/shmem.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "utils/builtins.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+typedef enum AVLeaderFailureType
+{
+ FAIL_NONE,
+ FAIL_ERROR,
+ FAIL_FATAL,
+} AVLeaderFailureType;
+
+typedef struct InjPointState
+{
+ bool enabled_set_free_workers;
+ uint32 free_parallel_workers;
+
+ bool enabled_leader_failure;
+ AVLeaderFailureType ftype;
+} InjPointState;
+
+static InjPointState *inj_point_state;
+
+/* Shared memory init callbacks */
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
+static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+
+static void
+test_autovacuum_shmem_request(void)
+{
+ if (prev_shmem_request_hook)
+ prev_shmem_request_hook();
+
+ RequestAddinShmemSpace(sizeof(InjPointState));
+}
+
+static void
+test_autovacuum_shmem_startup(void)
+{
+ bool found;
+
+ if (prev_shmem_startup_hook)
+ prev_shmem_startup_hook();
+
+ /* Create or attach to the shared memory state */
+ LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+
+ inj_point_state = ShmemInitStruct("injection_points",
+ sizeof(InjPointState),
+ &found);
+
+ if (!found)
+ {
+ /* First time through, initialize */
+ inj_point_state->enabled_leader_failure = false;
+ inj_point_state->enabled_set_free_workers = false;
+ inj_point_state->ftype = FAIL_NONE;
+
+ /* Keep it in sync with AutoVacuumShmemInit */
+ inj_point_state->free_parallel_workers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+
+ InjectionPointAttach("autovacuum-set-free-parallel-workers-num",
+ "test_autovacuum",
+ "inj_set_free_workers",
+ NULL,
+ 0);
+
+ InjectionPointAttach("autovacuum-trigger-leader-failure",
+ "test_autovacuum",
+ "inj_trigger_leader_failure",
+ NULL,
+ 0);
+ }
+
+ LWLockRelease(AddinShmemInitLock);
+}
+
+void
+_PG_init(void)
+{
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ prev_shmem_request_hook = shmem_request_hook;
+ shmem_request_hook = test_autovacuum_shmem_request;
+ prev_shmem_startup_hook = shmem_startup_hook;
+ shmem_startup_hook = test_autovacuum_shmem_startup;
+}
+
+extern PGDLLEXPORT void inj_set_free_workers(const char *name,
+ const void *private_data,
+ void *arg);
+extern PGDLLEXPORT void inj_trigger_leader_failure(const char *name,
+ const void *private_data,
+ void *arg);
+
+/*
+ * Set the number of currently available parallel a/v workers. This value may
+ * change after reserving or releasing such workers.
+ *
+ * Function called from parallel autovacuum leader.
+ */
+void
+inj_set_free_workers(const char *name, const void *private_data, void *arg)
+{
+ ereport(LOG,
+ errmsg("set parallel workers injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->enabled_set_free_workers)
+ {
+ Assert(arg != NULL);
+ inj_point_state->free_parallel_workers = *(uint32 *) arg;
+ }
+}
+
+/*
+ * Throw an ERROR or FATAL if somebody requested it.
+ *
+ * Function called from parallel autovacuum leader.
+ */
+void
+inj_trigger_leader_failure(const char *name, const void *private_data,
+ void *arg)
+{
+ int elevel;
+ char *elevel_str;
+
+ ereport(LOG,
+ errmsg("trigger leader failure injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->ftype == FAIL_NONE ||
+ !inj_point_state->enabled_leader_failure)
+ {
+ return;
+ }
+
+ elevel = inj_point_state->ftype == FAIL_ERROR ? ERROR : FATAL;
+ elevel_str = elevel == ERROR ? "error" : "fatal";
+
+ ereport(elevel, errmsg("%s, triggered by injection point", elevel_str));
+}
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ uint32 nworkers;
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ nworkers = inj_point_state->free_parallel_workers;
+ LWLockRelease(AutovacuumLock);
+
+ PG_RETURN_UINT32(nworkers);
+}
+
+PG_FUNCTION_INFO_V1(trigger_leader_failure);
+Datum
+trigger_leader_failure(PG_FUNCTION_ARGS)
+{
+ const char *failure_type = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ if (strcmp(failure_type, "NONE") == 0)
+ inj_point_state->ftype = FAIL_NONE;
+ else if (strcmp(failure_type, "ERROR") == 0)
+ inj_point_state->ftype = FAIL_ERROR;
+ else if (strcmp(failure_type, "FATAL") == 0)
+ inj_point_state->ftype = FAIL_FATAL;
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid leader failure type : %s", failure_type)));
+
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_attach);
+Datum
+inj_set_free_workers_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = true;
+ inj_point_state->ftype = FAIL_NONE;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_detach);
+Datum
+inj_set_free_workers_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_attach);
+Datum
+inj_leader_failure_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = true;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_detach);
+Datum
+inj_leader_failure_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
--
2.43.0
On Tue, Oct 28, 2025 at 6:10 AM Daniil Davydov <3danissimo@gmail.com> wrote:
IIUC the patch still has one problem in terms of reloading the
configuration parameters during parallel mode as I mentioned
before[1].

Yep. I was happy to see that you think that config file processing is OK for
autovacuum :)
I'll allow it for a/v leader. I've also thought about "compute_parallel_delay".
The simplest solution that I see is to move cost-based delay parameters to
shared state (PVShared) and create some variables such as
VacuumSharedCostBalance, so we can use them inside vacuum_delay_point.
What do you think about this idea?
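To make that concrete, a minimal sketch of what sharing the reloadable cost
parameters through DSM could look like. The struct and field names below are
hypothetical, not from any posted patch:
```c
/* Hypothetical DSM-resident cost parameters; the leader republishes them
 * after each ProcessConfigFile(PGC_SIGHUP). */
typedef struct PVSharedCostParams
{
    pg_atomic_uint32 cost_limit;        /* vacuum_cost_limit */
    pg_atomic_uint32 cost_delay_usec;   /* vacuum_cost_delay, in microseconds */
} PVSharedCostParams;

/* Leader, after reloading the configuration file: */
pg_atomic_write_u32(&shared_cost->cost_limit, (uint32) VacuumCostLimit);
pg_atomic_write_u32(&shared_cost->cost_delay_usec,
                    (uint32) (VacuumCostDelay * 1000.0));

/* Worker, inside vacuum_delay_point(): pick up the latest values. */
int         cost_limit = (int) pg_atomic_read_u32(&shared_cost->cost_limit);
double      cost_delay = pg_atomic_read_u32(&shared_cost->cost_delay_usec) / 1000.0;
```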
I think that we need to somehow have parallel workers use the new
vacuum delay parameters (e.g., VacuumCostPageHit and
VacuumCostPageMiss) after the leader reloads the configuration file.
The leader shares the initial parameters with the parallel workers
(via DSM) before starting the workers but doesn't propagate the
updates during the parallel operations. And the worker doesn't reload
the configuration file.
Other approaches, like telling parallel workers that they should reload the
config, look a bit too invasive IMO.

Thanks everybody for the review! Please see the v12 patches:
1) Implement tests for parallel autovacuum
2) Fix the error with unreleased workers - see the PG_TRY/PG_CATCH block in do_autovacuum
and before_shmem_exit callback registration in AutoVacWorkerMain
3) Allow a/v leader to process config file (see guc.c)
Here are some review comments for 0001 patch:
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ if (code != 0)
+ AutoVacuumReleaseAllParallelWorkers();
+}
+
AutoVacuumReleaseAllParallelWorkers() calls
AutoVacuumReleaseParallelWorkers() only when av_nworkers_reserved > 0,
so I think we don't need the condition 'if (code != 0)' here.
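In other words, the callback could presumably be reduced to (sketch):
```c
static void
autovacuum_worker_before_shmem_exit(int code, Datum arg)
{
    /* A no-op when nothing is reserved, so no need to check the exit code. */
    AutoVacuumReleaseAllParallelWorkers();
}
```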
---
+extern void AutoVacuumReleaseAllParallelWorkers(void);
There is no caller of this function outside of autovacuum.c.
---
{ name => 'autovacuum_max_parallel_workers', type => 'int', context =>
'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Maximum number of parallel autovacuum workers, that
can be taken from bgworkers pool.',
long_desc => 'This parameter is capped by "max_worker_processes"
(not by "autovacuum_max_workers"!).',
variable => 'autovacuum_max_parallel_workers',
boot_val => '0',
min => '0',
max => 'MAX_BACKENDS',
},
Parallel vacuum in autovacuum can be used only when users set the
autovacuum_parallel_workers storage parameter. How about using the
default value 2 for autovacuum_max_parallel_workers GUC parameter?
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Hi,
I started to review this patch set again, and it needed rebasing, so I
went ahead and did that.
I also have some comments:
#1
In AutoVacuumReserveParallelWorkers()
I think here we should assert:
```
Assert(nworkers <= AutoVacuumShmem->av_freeParallelWorkers);
```
prior to:
```
+ AutoVacuumShmem->av_freeParallelWorkers -= nworkers;
```
We are capping nworkers earlier in parallel_vacuum_compute_workers()
```
/* Cap by GUC variable */
parallel_workers = Min(parallel_workers, max_workers);
```
so the assert will safeguard against someone making a faulty change
in parallel_vacuum_compute_workers()
#2
In
parallel_vacuum_process_all_indexes()
```
+ /*
+ * Reserve workers in autovacuum global state. Note, that we
may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = AutoVacuumReserveParallelWorkers(nworkers);
```
nworkers has a double meaning. The return value of
AutoVacuumReserveParallelWorkers
is nreserved. I think this should be
```
nreserved = AutoVacuumReserveParallelWorkers(nworkers);
```
and nreserved becomes the authoritative value for the number of parallel
workers after that point.
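Concretely, something like this (a sketch, not a tested change):
```c
int     nreserved = nworkers;

/*
 * Reserve workers in the autovacuum global state; we may be granted fewer
 * workers than requested.
 */
if (AmAutoVacuumWorkerProcess() && nworkers > 0)
    nreserved = AutoVacuumReserveParallelWorkers(nworkers);

/* From here on, nreserved is the authoritative number of parallel workers. */
```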
#3
I noticed in the logging:
```
2025-11-20 18:44:09.252 UTC [36787] LOG: automatic vacuum of table
"test.public.t": index scans: 0
workers usage statistics for all of index scans : launched in
total = 3, planned in total = 3
pages: 0 removed, 503306 remain, 14442 scanned (2.87% of
total), 0 eagerly scanned
tuples: 101622 removed, 7557074 remain, 0 are dead but not yet removable
removable cutoff: 1711, which was 1 XIDs old when operation ended
frozen: 4793 pages from table (0.95% of total) had 98303 tuples frozen
visibility map: 4822 pages set all-visible, 4745 pages set
all-frozen (0 were all-visible)
index scan bypassed: 8884 pages from table (1.77% of total)
have 195512 dead item identifiers
```
that even though the index scan was bypassed, we still launched parallel
workers. I didn't dig deep into this,
but that looks wrong. What do you think?
#4
instead of:
"workers usage statistics for all of index scans : launched in total =
0, planned in total = 0"
how about:
"parallel index scan : workers planned = 0, workers launched = 0"
also log this after the "index scan needed:" line; so it looks like
this. What do you think?
```
index scan needed: 13211 pages from table (2.63% of total) had
289482 dead item identifiers removed
parallel index scan : workers planned = 0, workers launched = 0
index "t_pkey": pages: 25234 in total, 0 newly deleted, 0 currently
deleted, 0 reusable
index "t_c1_idx": pages: 10219 in total, 0 newly deleted, 0
currently deleted, 0 reusable
```
--
Sami Imseih
Amazon Web Services (AWS)
Attachments:
v14-0004-Documentation-for-parallel-autovacuum.patchapplication/octet-stream; name=v14-0004-Documentation-for-parallel-autovacuum.patchDownload
From 1302f966053c30c89c9365b48bff793844053d28 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Fri, 31 Oct 2025 14:44:35 +0700
Subject: [PATCH v14 4/4] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 18 ++++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 50 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 023b3f03ba9..0f7096c2b5f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2841,6 +2841,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9264,6 +9265,23 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. This value is
+ capped by <xref linkend="guc-max-worker-processes"/>. The default is 0,
+ which means no parallel index vacuuming.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index f4f0433ef6f..02f306bbb8a 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -897,6 +897,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process encounters a table with the
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+ enabled, it will launch parallel workers to vacuum the indexes of this
+ table in parallel. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 6557c5cffd8..e95a6488c5e 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If the value is 0, the parallel degree is computed based on
+ the number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
v14-0003-Tests-for-parallel-autovacuum.patchapplication/octet-stream; name=v14-0003-Tests-for-parallel-autovacuum.patchDownload
From 3a7cacb4e7ded37dac5eb54e7614942e9684f690 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Fri, 31 Oct 2025 14:44:12 +0700
Subject: [PATCH v14 3/4] Tests for parallel autovacuum
---
src/backend/commands/vacuumparallel.c | 8 +
src/backend/postmaster/autovacuum.c | 14 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 26 ++
src/test/modules/test_autovacuum/meson.build | 36 +++
.../modules/test_autovacuum/t/001_basic.pl | 165 ++++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 34 +++
.../modules/test_autovacuum/test_autovacuum.c | 255 ++++++++++++++++++
.../test_autovacuum/test_autovacuum.control | 3 +
11 files changed, 545 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_basic.pl
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 9a258238650..0cfdf79cb6c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -39,6 +39,7 @@
#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -752,6 +753,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
}
}
+ /*
+ * To be able to test that all reserved parallel workers are released even
+ * if the leader fails, allow injection points to trigger a failure at this
+ * point.
+ */
+ INJECTION_POINT("autovacuum-trigger-leader-failure", NULL);
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 2b6ceedf987..3a8a617fc63 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3435,6 +3435,13 @@ AutoVacuumReserveParallelWorkers(int nworkers)
/* Remember how many workers we have reserved. */
av_nworkers_reserved += nreserved;
+ /*
+ * Injection point to help exercise the number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
return nreserved;
}
@@ -3465,6 +3472,13 @@ AutoVacuumReleaseParallelWorkers(int nworkers)
/* Don't have to remember these workers anymore. */
av_nworkers_reserved -= nworkers;
+ /*
+ * Injection point to help exercise the number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 902a7954101..f09d0060248 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -15,6 +15,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 14fc761c4cf..ee7e855def0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -14,6 +14,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..4cf7344b2ac
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,26 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+ $(WIN32RES) \
+ test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..3441e5e49cf
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+ 'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+ test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_autovacuum',
+ '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+ test_autovacuum_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+ 'test_autovacuum.control',
+ 'test_autovacuum--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_basic.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_basic.pl b/src/test/modules/test_autovacuum/t/001_basic.pl
new file mode 100644
index 00000000000..22eaaa7da9d
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_basic.pl
@@ -0,0 +1,165 @@
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres so that it can launch parallel autovacuum workers, logs
+# all the information we are interested in, and runs autovacuum frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 10
+ log_min_messages = debug2
+ log_autovacuum_min_duration = 0
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create a table with the specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ autovacuum_enabled = false);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Insert the specified number of tuples into the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac VALUES (i, i + 1, i + 2, i + 3);
+ END LOOP;
+ END \$\$;
+});
+
+# Now, create some dead tuples and refresh table statistics
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION test_autovacuum;
+ SELECT inj_set_free_workers_attach();
+ SELECT inj_leader_failure_attach();
+});
+
+# Test 1:
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check that it can.
+# Also check that all requested workers were:
+# 1) launched
+# 2) correctly released
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on the table has completed. At the same
+# time, check that the required number of parallel workers has been launched.
+$node->wait_for_log(qr/workers usage statistics for all of index scans : / .
+ qr/launched in total = 2, planned in total = 2/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10, 'All parallel workers have been released by the leader');
+
+# Disable autovacuum on the table while preparing for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_2 = 0 WHERE (col_2 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 2:
+# We want parallel autovacuum workers to be released even if the leader gets
+# an error. First, simulate the situation where the leader exits due to an ERROR.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('ERROR');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/error, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after ERROR');
+
+# Disable autovacuum on the table while preparing for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_3 = 0 WHERE (col_3 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 3:
+# Same as Test 2, but simulate the situation where the leader exits due to FATAL.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('FATAL');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/fatal, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after FATAL');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT inj_set_free_workers_detach();
+ SELECT inj_leader_failure_detach();
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..017d5da85ea
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,34 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting or interfering with the autovacuum state
+ */
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION trigger_leader_failure(failure_type text)
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+/*
+ * Injection point related functions
+ */
+CREATE FUNCTION inj_set_free_workers_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_set_free_workers_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..7948f4858ae
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,255 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ * Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "postmaster/autovacuum.h"
+#include "storage/shmem.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "utils/builtins.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+typedef enum AVLeaderFailureType
+{
+ FAIL_NONE,
+ FAIL_ERROR,
+ FAIL_FATAL,
+} AVLeaderFailureType;
+
+typedef struct InjPointState
+{
+ bool enabled_set_free_workers;
+ uint32 free_parallel_workers;
+
+ bool enabled_leader_failure;
+ AVLeaderFailureType ftype;
+} InjPointState;
+
+static InjPointState * inj_point_state;
+
+/* Shared memory init callbacks */
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
+static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+
+static void
+test_autovacuum_shmem_request(void)
+{
+ if (prev_shmem_request_hook)
+ prev_shmem_request_hook();
+
+ RequestAddinShmemSpace(sizeof(InjPointState));
+}
+
+static void
+test_autovacuum_shmem_startup(void)
+{
+ bool found;
+
+ if (prev_shmem_startup_hook)
+ prev_shmem_startup_hook();
+
+ /* Create or attach to the shared memory state */
+ LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+
+ inj_point_state = ShmemInitStruct("injection_points",
+ sizeof(InjPointState),
+ &found);
+
+ if (!found)
+ {
+ /* First time through, initialize */
+ inj_point_state->enabled_leader_failure = false;
+ inj_point_state->enabled_set_free_workers = false;
+ inj_point_state->ftype = FAIL_NONE;
+
+ /* Keep it in sync with AutoVacuumShmemInit */
+ inj_point_state->free_parallel_workers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+
+ InjectionPointAttach("autovacuum-set-free-parallel-workers-num",
+ "test_autovacuum",
+ "inj_set_free_workers",
+ NULL,
+ 0);
+
+ InjectionPointAttach("autovacuum-trigger-leader-failure",
+ "test_autovacuum",
+ "inj_trigger_leader_failure",
+ NULL,
+ 0);
+ }
+
+ LWLockRelease(AddinShmemInitLock);
+}
+
+void
+_PG_init(void)
+{
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ prev_shmem_request_hook = shmem_request_hook;
+ shmem_request_hook = test_autovacuum_shmem_request;
+ prev_shmem_startup_hook = shmem_startup_hook;
+ shmem_startup_hook = test_autovacuum_shmem_startup;
+}
+
+extern PGDLLEXPORT void inj_set_free_workers(const char *name,
+ const void *private_data,
+ void *arg);
+extern PGDLLEXPORT void inj_trigger_leader_failure(const char *name,
+ const void *private_data,
+ void *arg);
+
+/*
+ * Set the number of currently available parallel autovacuum workers. This
+ * value may change after reserving or releasing such workers.
+ *
+ * This function is called from the parallel autovacuum leader.
+ */
+void
+inj_set_free_workers(const char *name, const void *private_data, void *arg)
+{
+ ereport(LOG,
+ errmsg("set parallel workers injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->enabled_set_free_workers)
+ {
+ Assert(arg != NULL);
+ inj_point_state->free_parallel_workers = *(uint32 *) arg;
+ }
+}
+
+/*
+ * Throw an ERROR or FATAL if somebody requested it.
+ *
+ * This function is called from the parallel autovacuum leader.
+ */
+void
+inj_trigger_leader_failure(const char *name, const void *private_data,
+ void *arg)
+{
+ int elevel;
+ char *elevel_str;
+
+ ereport(LOG,
+ errmsg("trigger leader failure injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->ftype == FAIL_NONE ||
+ !inj_point_state->enabled_leader_failure)
+ {
+ return;
+ }
+
+ elevel = inj_point_state->ftype == FAIL_ERROR ? ERROR : FATAL;
+ elevel_str = elevel == ERROR ? "error" : "fatal";
+
+ ereport(elevel, errmsg("%s, triggered by injection point", elevel_str));
+}
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ uint32 nworkers;
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ nworkers = inj_point_state->free_parallel_workers;
+ LWLockRelease(AutovacuumLock);
+
+ PG_RETURN_UINT32(nworkers);
+}
+
+PG_FUNCTION_INFO_V1(trigger_leader_failure);
+Datum
+trigger_leader_failure(PG_FUNCTION_ARGS)
+{
+ const char *failure_type = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ if (strcmp(failure_type, "NONE") == 0)
+ inj_point_state->ftype = FAIL_NONE;
+ else if (strcmp(failure_type, "ERROR") == 0)
+ inj_point_state->ftype = FAIL_ERROR;
+ else if (strcmp(failure_type, "FATAL") == 0)
+ inj_point_state->ftype = FAIL_FATAL;
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid leader failure type : %s", failure_type)));
+
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_attach);
+Datum
+inj_set_free_workers_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = true;
+ inj_point_state->ftype = FAIL_NONE;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_detach);
+Datum
+inj_set_free_workers_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_attach);
+Datum
+inj_leader_failure_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = true;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_detach);
+Datum
+inj_leader_failure_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
--
2.43.0
v14-0002-Logging-for-parallel-autovacuum.patchapplication/octet-stream; name=v14-0002-Logging-for-parallel-autovacuum.patchDownload
From 45439d8d5e5da1ca9f10fddfd958943a2abae08c Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Fri, 31 Oct 2025 14:42:46 +0700
Subject: [PATCH v14 2/4] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 27 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 20 ++++++++++++++------
src/include/commands/vacuum.h | 16 ++++++++++++++--
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 54 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 7a6d6f42634..59438c18b10 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -347,6 +347,12 @@ typedef struct LVRelState
/* Instrumentation counters */
int num_index_scans;
+
+ /*
+ * Counters of planned and actually launched parallel workers across all
+ * index scans, or NULL if such statistics are not being collected
+ */
+ PVWorkersUsage *workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -700,6 +706,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc(sizeof(char *) * vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+ * Allocate space for workers usage statistics, thereby making explicit
+ * that such statistics must be accumulated. For now, this is used only by
+ * the autovacuum leader worker, because it must log them at the end of
+ * table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
}
/*
@@ -1024,6 +1040,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
vacrel->relnamespace,
vacrel->relname,
vacrel->num_index_scans);
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+ _("workers usage statistics for all of index scans : launched in total = %d, planned in total = %d\n"),
+ vacrel->workers_usage->nlaunched,
+ vacrel->workers_usage->nplanned);
appendStringInfo(&buf, _("pages: %u removed, %u remain, %u scanned (%.2f%% of total), %u eagerly scanned\n"),
vacrel->removed_pages,
new_rel_pages,
@@ -2655,7 +2676,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3087,7 +3109,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index acd53b85b1c..9a258238650 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage *wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -502,7 +502,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -513,7 +513,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -521,7 +521,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -533,7 +534,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -618,7 +619,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage *wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -742,6 +743,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
pvs->pcxt->nworkers_launched, nworkers)));
+
+ /* Remember these values, if we were asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1f3290c7fbf..90709ca3107 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores information about total number of launched and planned
+ * workers during parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +404,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index c751c25a04d..fbff437d104 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2378,6 +2378,7 @@ PullFilterOps
PushFilter
PushFilterOps
PushFunction
+PVWorkersUsage
PyCFunction
PyMethodDef
PyModuleDef
--
2.43.0
v14-0001-Parallel-autovacuum.patchapplication/octet-stream; name=v14-0001-Parallel-autovacuum.patchDownload
From 74d2c076f5dda0bb135107b1e09511a04137c125 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Fri, 31 Oct 2025 14:42:27 +0700
Subject: [PATCH v14 1/4] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 42 ++++-
src/backend/postmaster/autovacuum.c | 163 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 9 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 5 +
src/include/utils/rel.h | 7 +
11 files changed, 239 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 9e288dfecbf..3cc29d4454a 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1881,6 +1890,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..acd53b85b1c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -553,12 +557,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +606,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -646,6 +655,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in autovacuum global state. Note that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = AutoVacuumReserveParallelWorkers(nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +706,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -738,6 +764,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Also release all previously reserved parallel autovacuum workers */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 1c38488f2cb..2b6ceedf987 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,12 @@ int Log_autoanalyze_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Variable to keep the number of currently reserved parallel autovacuum
+ * workers. It is only relevant for the parallel autovacuum leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -285,6 +291,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +307,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -364,6 +374,7 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
@@ -763,6 +774,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -779,6 +792,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must keep the number
+ * of available parallel autovacuum workers in shmem correct.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1383,6 +1405,17 @@ avl_sigusr2_handler(SIGNAL_ARGS)
* AUTOVACUUM WORKER CODE
********************************************************************/
+/*
+ * If a parallel autovacuum leader is exiting due to a FATAL error, make sure
+ * that all reserved workers are released.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ if (code != 0)
+ AutoVacuumReleaseAllParallelWorkers();
+}
+
/*
* Main entry point for autovacuum worker processes.
*/
@@ -1429,6 +1462,8 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGFPE, FloatExceptionHandler);
pqsignal(SIGCHLD, SIG_DFL);
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
/*
* Create a per-backend PGPROC struct in shared memory. We must do this
* before we can use LWLocks or access any shared memory.
@@ -2480,6 +2515,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure
+ * that all reserved workers are released.
+ */
+ AutoVacuumReleaseAllParallelWorkers();
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -2880,8 +2921,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3358,6 +3403,85 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, the leader
+ * autovacuum process must call this function. It returns the number of
+ * parallel workers that can actually be launched and reserves these workers
+ * (if any) in the global autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if the
+ * caller ends up occupying all available workers.
+ */
+int
+AutoVacuumReserveParallelWorkers(int nworkers)
+{
+ int nreserved;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /*
+ * We can only reserve workers at the beginning of parallel index
+ * processing, so we must not have any reserved workers right now.
+ */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ nreserved = Min(AutoVacuumShmem->av_freeParallelWorkers, nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= nworkers;
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved += nworkers;
+
+ LWLockRelease(AutovacuumLock);
+ return nreserved;
+}
+
+/*
+ * The leader autovacuum process must call this function in order to update
+ * the global autovacuum state, so that other leaders will be able to use
+ * these parallel workers.
+ *
+ * 'nworkers' - how many workers the caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap the number of available workers at its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Same as above, but release *all* parallel workers that were reserved by
+ * the current leader autovacuum process.
+ */
+void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3418,6 +3542,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3499,3 +3627,34 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that the number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap the number of free workers at the new parameter's value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+ * If the user wants to increase the number of parallel autovacuum
+ * workers, we must increase the number of free workers accordingly.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..fd00d6f89dc 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index c6484aea087..2a037485d5e 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3326,9 +3326,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception, because only the
+ * cost-based delay parameters need to be propagated to parallel vacuum
+ * workers, and we will handle that elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 1128167c025..de9f1bd4808 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,15 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel autovacuum workers that can be taken from the bgworkers pool.',
+ long_desc => 'This parameter is capped by "max_worker_processes" (not by "autovacuum_max_workers"!).',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '0',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index dc9e2255f8a..559ef7b1771 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -691,6 +691,7 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 0 # disabled by default and limited by max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 51806597037..6170436b341 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1423,6 +1423,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 9a7d733ddef..605d0829b03 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 023ac6d5fa8..f4b93b44531 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -65,6 +65,11 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int AutoVacuumReserveParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseAllParallelWorkers(void);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 80286076a11..e879fdcfc69 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If the value is 0, the
+ * parallel degree is computed based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
Hi,
On Sat, Nov 1, 2025 at 3:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Oct 28, 2025 at 6:10 AM Daniil Davydov <3danissimo@gmail.com> wrote:
I'll allow it for a/v leader. I've also thought about "compute_parallel_delay".
The simplest solution that I see is to move cost-based delay parameters to
shared state (PVShared) and create some variables such as
VacuumSharedCostBalance, so we can use them inside vacuum_delay_point.
What do you think about this idea?

I think that we need to somehow have parallel workers use the new
vacuum delay parameters (e.g., VacuumCostPageHit and
VacuumCostPageMiss) after the leader reloads the configuration file.
The leader shares the initial parameters with the parallel workers
(via DSM) before starting the workers but doesn't propagate the
updates during the parallel operations. And the worker doesn't reload
the configuration file.
I'm still working on it.
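To make the idea concrete, here is a minimal sketch of that approach, assuming
the parameters live in PVShared and are re-read by workers at each
vacuum_delay_point(); the struct and function names below are hypothetical,
not part of the posted patches:

```c
/* Hypothetical layout for cost-based delay parameters in shared memory. */
typedef struct PVCostParams
{
	double		cost_delay;		/* VacuumCostDelay */
	int			cost_limit;		/* VacuumCostLimit */
	int			cost_page_hit;	/* VacuumCostPageHit */
	int			cost_page_miss; /* VacuumCostPageMiss */
} PVCostParams;

/* Leader: after ProcessConfigFile(), publish the reloaded values. */
static void
pv_publish_cost_params(PVCostParams *shared)
{
	shared->cost_delay = VacuumCostDelay;
	shared->cost_limit = VacuumCostLimit;
	shared->cost_page_hit = VacuumCostPageHit;
	shared->cost_page_miss = VacuumCostPageMiss;
}

/* Worker: inside vacuum_delay_point(), pick up the current values. */
static void
pv_consume_cost_params(const PVCostParams *shared)
{
	VacuumCostDelay = shared->cost_delay;
	VacuumCostLimit = shared->cost_limit;
	VacuumCostPageHit = shared->cost_page_hit;
	VacuumCostPageMiss = shared->cost_page_miss;
}
```

(A real implementation would also need some synchronization, e.g. a lock or a
generation counter, so workers don't read half-updated values.)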
Here are some review comments for the 0001 patch:
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ if (code != 0)
+ AutoVacuumReleaseAllParallelWorkers();
+}
+

AutoVacuumReleaseAllParallelWorkers() calls
AutoVacuumReleaseParallelWorkers() only when av_nworkers_reserved > 0,
so I think we don't need the condition 'if (code != 0)' here.
Yeah, I wrote it more like a hint for the reader - "we should call this
function only if the process is exiting due to an error". But actually it is
not a necessary condition.
---

+extern void AutoVacuumReleaseAllParallelWorkers(void);

There is no caller of this function outside of autovacuum.h.
I will fix it.
---
{ name => 'autovacuum_max_parallel_workers', type => 'int', context =>
'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Maximum number of parallel autovacuum workers that
can be taken from the bgworkers pool.',
long_desc => 'This parameter is capped by "max_worker_processes"
(not by "autovacuum_max_workers"!).',
variable => 'autovacuum_max_parallel_workers',
boot_val => '0',
min => '0',
max => 'MAX_BACKENDS',
},

Parallel vacuum in autovacuum can be used only when users set the
autovacuum_parallel_workers storage parameter. How about using a
default value of 2 for the autovacuum_max_parallel_workers GUC parameter?
Sounds reasonable, +1 for it.
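For reference, the guc_parameters.dat entry might then look like this (a
sketch only; just boot_val changes, assuming the rest of the entry stays as
in v14):

```
{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
  short_desc => 'Maximum number of parallel autovacuum workers that can be taken from the bgworkers pool.',
  long_desc => 'This parameter is capped by "max_worker_processes" (not by "autovacuum_max_workers"!).',
  variable => 'autovacuum_max_parallel_workers',
  boot_val => '2',
  min => '0',
  max => 'MAX_BACKENDS',
},
```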
On Fri, Nov 21, 2025 at 2:31 AM Sami Imseih <samimseih@gmail.com> wrote:
Hi,
I started to review this patch set again, and it needed rebasing, so I
went ahead and did that.
Thanks for the review and rebasing the patch!
I also have some comments:
#1
In AutoVacuumReserveParallelWorkers()
I think here we should assert:

```
Assert(nworkers <= AutoVacuumShmem->av_freeParallelWorkers);
```
prior to:
```
+ AutoVacuumShmem->av_freeParallelWorkers -= nworkers;
```

We are capping nworkers earlier in parallel_vacuum_compute_workers():
```
/* Cap by GUC variable */
parallel_workers = Min(parallel_workers, max_workers);
```

so the assert will safeguard against someone making a faulty change
in parallel_vacuum_compute_workers().
Hm, I guess it is just a bug. We should reduce av_freeParallelWorkers by the
computed 'nreserved' value (thus, we don't need any assertion). I'll fix it.
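For clarity, a minimal sketch of what the fixed reservation logic could look
like (decrementing by the capped value, so no assertion is needed):

```c
	LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);

	/* Provide as many workers as we can, but never more than are free. */
	nreserved = Min(AutoVacuumShmem->av_freeParallelWorkers, nworkers);

	/* Decrement by what was actually reserved, not by what was requested. */
	AutoVacuumShmem->av_freeParallelWorkers -= nreserved;

	/* Remember how many workers we have reserved. */
	av_nworkers_reserved += nreserved;

	LWLockRelease(AutovacuumLock);
	return nreserved;
```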
#2
In parallel_vacuum_process_all_indexes():

```
+ /*
+ * Reserve workers in autovacuum global state. Note that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = AutoVacuumReserveParallelWorkers(nworkers);
```

nworkers has a double meaning. The return value of
AutoVacuumReserveParallelWorkers
is nreserved. I think this should be:

```
nreserved = AutoVacuumReserveParallelWorkers(nworkers);
```

and nreserved becomes the authoritative value for the number of parallel
workers after that point.
Reserving parallel workers is specific to autovacuum. If we add an
'nreserved' variable, we would have to change all the conditions below in
order not to break maintenance parallel vacuum. I think it would be confusing:
***
if (nworkers > 0 || (AmAutoVacuumWorkerProcess() && nreserved > 0))
***
Moreover, 'nworkers' reflects how many workers will be involved in vacuuming,
and I think that capping it by 'nreserved' does not break this semantic.
#3
I noticed in the logging:

```
2025-11-20 18:44:09.252 UTC [36787] LOG: automatic vacuum of table
"test.public.t": index scans: 0
workers usage statistics for all of index scans : launched in
total = 3, planned in total = 3
pages: 0 removed, 503306 remain, 14442 scanned (2.87% of
total), 0 eagerly scanned
tuples: 101622 removed, 7557074 remain, 0 are dead but not yet removable
removable cutoff: 1711, which was 1 XIDs old when operation ended
frozen: 4793 pages from table (0.95% of total) had 98303 tuples frozen
visibility map: 4822 pages set all-visible, 4745 pages set
all-frozen (0 were all-visible)
index scan bypassed: 8884 pages from table (1.77% of total)
have 195512 dead item identifiers
```

that even though the index scan was bypassed, we still launched parallel
workers. I didn't dig deep into this, but that looks wrong. What do you think?
We can do both index vacuuming and index cleanup in parallel. I guess that
in your situation the index vacuum was bypassed, but cleanup was still called.
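Schematically, both phases funnel into the same parallel machinery (call
signatures as in the 0002 patch), so workers can be planned for index cleanup
even when bulk deletion was bypassed:

```c
/* Bulk deletion of index entries; may be bypassed when there are few
 * dead items. */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
									vacrel->num_index_scans,
									vacrel->workers_usage);

/* Index cleanup; still runs, and may launch parallel workers. */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
									vacrel->num_index_scans,
									estimated_count,
									vacrel->workers_usage);
```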
#4
instead of:"workers usage statistics for all of index scans : launched in total =
0, planned in total = 0"how about:
"parallel index scan : workers planned = 0, workers launched = 0"
also log this after the "index scan needed:" line; so it looks like
this. What do you think?

```
index scan needed: 13211 pages from table (2.63% of total) had
289482 dead item identifiers removed
parallel index scan : workers planned = 0, workers launched = 0
index "t_pkey": pages: 25234 in total, 0 newly deleted, 0 currently
deleted, 0 reusable
index "t_c1_idx": pages: 10219 in total, 0 newly deleted, 0
currently deleted, 0 reusable
```
Agreed, it looks better.
Thanks everybody for the comments!
Please see the v15 patches.
--
Best regards,
Daniil Davydov
Attachments:
v15-0004-Documentation-for-parallel-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v15-0004-Documentation-for-parallel-autovacuum.patchDownload
From a867a0ffb18549b493412d6bc079df6aef9b92a4 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Sun, 23 Nov 2025 02:32:44 +0700
Subject: [PATCH v15 4/4] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 18 ++++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 50 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 023b3f03ba9..0f7096c2b5f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2841,6 +2841,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9264,6 +9265,23 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. It is capped by
+ <xref linkend="guc-max-worker-processes"/>. The default is 0,
+ which means no parallel index vacuuming.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index f4f0433ef6f..02f306bbb8a 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -897,6 +897,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table for which the
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+ is enabled, it will launch parallel workers in order to vacuum the indexes
+ of this table in parallel. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 6557c5cffd8..e95a6488c5e 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If the value is 0, the parallel degree is computed based on
+ the number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
v15-0002-Logging-for-parallel-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v15-0002-Logging-for-parallel-autovacuum.patchDownload
From d10e3e0edd1f17ceabe8b12f780827ae0c9b686d Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Sun, 23 Nov 2025 01:07:47 +0700
Subject: [PATCH v15 2/4] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 27 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 20 ++++++++++++++------
src/include/commands/vacuum.h | 16 ++++++++++++++--
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 54 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 65bb0568a86..ea7a18d4d51 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -347,6 +347,12 @@ typedef struct LVRelState
/* Instrumentation counters */
int num_index_scans;
+
+ /*
+ * Counters of planned and actually launched parallel workers across all
+ * index scans, or NULL if such statistics are not being collected
+ */
+ PVWorkersUsage *workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -700,6 +706,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc(sizeof(char *) * vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+ * Allocate space for workers usage statistics, thereby making explicit
+ * that such statistics must be accumulated. For now, this is used only by
+ * the autovacuum leader worker, because it must log them at the end of
+ * table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
}
/*
@@ -1099,6 +1115,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+ _("parallel index vacuum/cleanup : workers planned = %d, workers launched = %d\n"),
+ vacrel->workers_usage->nplanned,
+ vacrel->workers_usage->nlaunched);
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2659,7 +2680,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3091,7 +3113,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index acd53b85b1c..9a258238650 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage *wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -502,7 +502,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -513,7 +513,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -521,7 +521,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -533,7 +534,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -618,7 +619,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage *wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -742,6 +743,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
pvs->pcxt->nworkers_launched, nworkers)));
+
+ /* Remember these values, if we were asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1f3290c7fbf..90709ca3107 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores the total number of planned and launched workers
+ * during a parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +404,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 27a4d131897..a838b0885c6 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2378,6 +2378,7 @@ PullFilterOps
PushFilter
PushFilterOps
PushFunction
+PVWorkersUsage
PyCFunction
PyMethodDef
PyModuleDef
--
2.43.0
v15-0003-Tests-for-parallel-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v15-0003-Tests-for-parallel-autovacuum.patchDownload
From 267641b1832f011a32b8f870dd1794d0a82f0a7f Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Sun, 23 Nov 2025 01:08:14 +0700
Subject: [PATCH v15 3/4] Tests for parallel autovacuum
---
src/backend/commands/vacuumparallel.c | 8 +
src/backend/postmaster/autovacuum.c | 14 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 26 ++
src/test/modules/test_autovacuum/meson.build | 36 +++
.../modules/test_autovacuum/t/001_basic.pl | 165 ++++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 34 +++
.../modules/test_autovacuum/test_autovacuum.c | 255 ++++++++++++++++++
.../test_autovacuum/test_autovacuum.control | 3 +
11 files changed, 545 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_basic.pl
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 9a258238650..0cfdf79cb6c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -39,6 +39,7 @@
#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -752,6 +753,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
}
}
+ /*
+ * To be able to check that all reserved parallel workers are released
+ * even when the leader fails, allow injection points to trigger a
+ * failure at this point.
+ */
+ INJECTION_POINT("autovacuum-trigger-leader-failure", NULL);
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index e6a4aa99eae..37c8d268903 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3436,6 +3436,13 @@ AutoVacuumReserveParallelWorkers(int nworkers)
/* Remember how many workers we have reserved. */
av_nworkers_reserved += nreserved;
+ /*
+ * Injection point to help exercise the number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
return nreserved;
}
@@ -3466,6 +3473,13 @@ AutoVacuumReleaseParallelWorkers(int nworkers)
/* Don't have to remember these workers anymore. */
av_nworkers_reserved -= nworkers;
+ /*
+ * Injection point to help exercise the number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 902a7954101..f09d0060248 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -15,6 +15,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 14fc761c4cf..ee7e855def0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -14,6 +14,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..4cf7344b2ac
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,26 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+ $(WIN32RES) \
+ test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..3441e5e49cf
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+ 'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+ test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_autovacuum',
+ '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+ test_autovacuum_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+ 'test_autovacuum.control',
+ 'test_autovacuum--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_basic.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_basic.pl b/src/test/modules/test_autovacuum/t/001_basic.pl
new file mode 100644
index 00000000000..1271768ebd2
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_basic.pl
@@ -0,0 +1,165 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres so that it can launch parallel autovacuum workers, logs
+# all information we are interested in, and runs autovacuum frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 10
+ log_min_messages = debug2
+ log_autovacuum_min_duration = 0
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create a table with the specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ autovacuum_enabled = false);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Insert the specified number of tuples into the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac VALUES (i, i + 1, i + 2, i + 3);
+ END LOOP;
+ END \$\$;
+});
+
+# Now, create some dead tuples and refresh table statistics
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION test_autovacuum;
+ SELECT inj_set_free_workers_attach();
+ SELECT inj_leader_failure_attach();
+});
+
+# Test 1:
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check that it can.
+# Also check that all requested workers are:
+# 1) launched
+# 2) correctly released
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on the table has completed. At the same
+# time, check that the required number of parallel workers has been launched.
+$node->wait_for_log(qr/parallel index vacuum\/cleanup : workers planned = 2, / .
+ qr/workers launched = 2/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10, 'All parallel workers have been released by the leader');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_2 = 0 WHERE (col_2 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 2:
+# We want parallel autovacuum workers to be released even if the leader gets
+# an error. First, simulate the situation where the leader exits due to an ERROR.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('ERROR');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/error, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after ERROR');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_3 = 0 WHERE (col_3 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 3:
+# Same as Test 2, but simulate the situation where the leader exits due to FATAL.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('FATAL');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/fatal, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after FATAL');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT inj_set_free_workers_detach();
+ SELECT inj_leader_failure_detach();
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..017d5da85ea
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,34 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting or interfering with autovacuum state
+ */
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION trigger_leader_failure(failure_type text)
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+/*
+ * Injection point related functions
+ */
+CREATE FUNCTION inj_set_free_workers_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_set_free_workers_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..7948f4858ae
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,255 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ * Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "postmaster/autovacuum.h"
+#include "storage/shmem.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "utils/builtins.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+typedef enum AVLeaderFailureType
+{
+ FAIL_NONE,
+ FAIL_ERROR,
+ FAIL_FATAL,
+} AVLeaderFailureType;
+
+typedef struct InjPointState
+{
+ bool enabled_set_free_workers;
+ uint32 free_parallel_workers;
+
+ bool enabled_leader_failure;
+ AVLeaderFailureType ftype;
+} InjPointState;
+
+static InjPointState * inj_point_state;
+
+/* Shared memory init callbacks */
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
+static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+
+static void
+test_autovacuum_shmem_request(void)
+{
+ if (prev_shmem_request_hook)
+ prev_shmem_request_hook();
+
+ RequestAddinShmemSpace(sizeof(InjPointState));
+}
+
+static void
+test_autovacuum_shmem_startup(void)
+{
+ bool found;
+
+ if (prev_shmem_startup_hook)
+ prev_shmem_startup_hook();
+
+ /* Create or attach to the shared memory state */
+ LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+
+ inj_point_state = ShmemInitStruct("injection_points",
+ sizeof(InjPointState),
+ &found);
+
+ if (!found)
+ {
+ /* First time through, initialize */
+ inj_point_state->enabled_leader_failure = false;
+ inj_point_state->enabled_set_free_workers = false;
+ inj_point_state->ftype = FAIL_NONE;
+
+ /* Keep it in sync with AutoVacuumShmemInit */
+ inj_point_state->free_parallel_workers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+
+ InjectionPointAttach("autovacuum-set-free-parallel-workers-num",
+ "test_autovacuum",
+ "inj_set_free_workers",
+ NULL,
+ 0);
+
+ InjectionPointAttach("autovacuum-trigger-leader-failure",
+ "test_autovacuum",
+ "inj_trigger_leader_failure",
+ NULL,
+ 0);
+ }
+
+ LWLockRelease(AddinShmemInitLock);
+}
+
+void
+_PG_init(void)
+{
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ prev_shmem_request_hook = shmem_request_hook;
+ shmem_request_hook = test_autovacuum_shmem_request;
+ prev_shmem_startup_hook = shmem_startup_hook;
+ shmem_startup_hook = test_autovacuum_shmem_startup;
+}
+
+extern PGDLLEXPORT void inj_set_free_workers(const char *name,
+ const void *private_data,
+ void *arg);
+extern PGDLLEXPORT void inj_trigger_leader_failure(const char *name,
+ const void *private_data,
+ void *arg);
+
+/*
+ * Set the number of currently available parallel autovacuum workers. This
+ * value may change after reserving or releasing such workers.
+ *
+ * This function is called from the parallel autovacuum leader.
+ */
+void
+inj_set_free_workers(const char *name, const void *private_data, void *arg)
+{
+ ereport(LOG,
+ errmsg("set parallel workers injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->enabled_set_free_workers)
+ {
+ Assert(arg != NULL);
+ inj_point_state->free_parallel_workers = *(uint32 *) arg;
+ }
+}
+
+/*
+ * Throw an ERROR or FATAL if somebody requested it.
+ *
+ * This function is called from the parallel autovacuum leader.
+ */
+void
+inj_trigger_leader_failure(const char *name, const void *private_data,
+ void *arg)
+{
+ int elevel;
+ char *elevel_str;
+
+ ereport(LOG,
+ errmsg("trigger leader failure injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->ftype == FAIL_NONE ||
+ !inj_point_state->enabled_leader_failure)
+ {
+ return;
+ }
+
+ elevel = inj_point_state->ftype == FAIL_ERROR ? ERROR : FATAL;
+ elevel_str = elevel == ERROR ? "error" : "fatal";
+
+ ereport(elevel, errmsg("%s, triggered by injection point", elevel_str));
+}
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ uint32 nworkers;
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ nworkers = inj_point_state->free_parallel_workers;
+ LWLockRelease(AutovacuumLock);
+
+ PG_RETURN_UINT32(nworkers);
+}
+
+PG_FUNCTION_INFO_V1(trigger_leader_failure);
+Datum
+trigger_leader_failure(PG_FUNCTION_ARGS)
+{
+ const char *failure_type = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ if (strcmp(failure_type, "NONE") == 0)
+ inj_point_state->ftype = FAIL_NONE;
+ else if (strcmp(failure_type, "ERROR") == 0)
+ inj_point_state->ftype = FAIL_ERROR;
+ else if (strcmp(failure_type, "FATAL") == 0)
+ inj_point_state->ftype = FAIL_FATAL;
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid leader failure type : %s", failure_type)));
+
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_attach);
+Datum
+inj_set_free_workers_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = true;
+ inj_point_state->ftype = FAIL_NONE;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_detach);
+Datum
+inj_set_free_workers_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_attach);
+Datum
+inj_leader_failure_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = true;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_detach);
+Datum
+inj_leader_failure_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
--
2.43.0
v15-0001-Parallel-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v15-0001-Parallel-autovacuum.patchDownload
From 6c6806211a364519150138be6aff9f749e708252 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Sun, 23 Nov 2025 01:03:24 +0700
Subject: [PATCH v15 1/4] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 42 ++++-
src/backend/postmaster/autovacuum.c | 164 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 9 +
src/backend/utils/misc/postgresql.conf.sample | 2 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 4 +
src/include/utils/rel.h | 7 +
11 files changed, 240 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 9e288dfecbf..3cc29d4454a 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1881,6 +1890,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..acd53b85b1c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -553,12 +557,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +606,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -646,6 +655,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in the autovacuum global state. Note that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ nworkers = AutoVacuumReserveParallelWorkers(nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +706,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -738,6 +764,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Also release all previously reserved parallel autovacuum workers */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 1c38488f2cb..e6a4aa99eae 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,12 @@ int Log_autoanalyze_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Variable to keep number of currently reserved parallel autovacuum workers.
+ * It is only relevant for parallel autovacuum leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -285,6 +291,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +307,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -364,6 +374,8 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
+static void AutoVacuumReleaseAllParallelWorkers(void);
@@ -763,6 +775,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -779,6 +793,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in
+ * shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1383,6 +1406,17 @@ avl_sigusr2_handler(SIGNAL_ARGS)
* AUTOVACUUM WORKER CODE
********************************************************************/
+/*
+ * Make sure that all reserved workers are released if the parallel autovacuum
+ * leader is exiting due to a FATAL error. Otherwise the function has no effect.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ if (code != 0)
+ AutoVacuumReleaseAllParallelWorkers();
+}
+
/*
* Main entry point for autovacuum worker processes.
*/
@@ -1429,6 +1463,8 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGFPE, FloatExceptionHandler);
pqsignal(SIGCHLD, SIG_DFL);
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
/*
* Create a per-backend PGPROC struct in shared memory. We must do this
* before we can use LWLocks or access any shared memory.
@@ -2480,6 +2516,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure
+ * that all reserved workers are released.
+ */
+ AutoVacuumReleaseAllParallelWorkers();
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -2880,8 +2922,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process indexes of table in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3358,6 +3404,85 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, the leader
+ * autovacuum process must call this function. It returns the number of
+ * parallel workers that can actually be launched and reserves these workers
+ * (if any) in the global autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if the
+ * caller would occupy all available workers.
+ */
+int
+AutoVacuumReserveParallelWorkers(int nworkers)
+{
+ int nreserved;
+
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /*
+ * We can only reserve workers at the beginning of parallel index
+ * processing, so we must not have any reserved workers right now.
+ */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ nreserved = Min(AutoVacuumShmem->av_freeParallelWorkers, nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= nreserved;
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved += nreserved;
+
+ LWLockRelease(AutovacuumLock);
+ return nreserved;
+}
+
+/*
+ * The leader autovacuum process must call this function in order to update
+ * the global autovacuum state, so that other leaders will be able to use
+ * these parallel workers.
+ *
+ * 'nworkers' is the number of workers the caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap the number of available workers at its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Same as above, but release *all* parallel workers that were reserved by
+ * the current leader autovacuum process.
+ */
+static void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3418,6 +3543,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3499,3 +3628,34 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that the number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap the number of free workers at the new parameter's value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+ * If the user wants to increase the number of parallel autovacuum
+ * workers, we must increase the number of free workers accordingly.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..fd00d6f89dc 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index c6484aea087..2a037485d5e 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3326,9 +3326,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception, because only
+ * cost-based delay settings need to be propagated to parallel vacuum
+ * workers, and we handle that elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 1128167c025..6c38275d30b 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,15 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel autovacuum workers that can be taken from the bgworkers pool.',
+ long_desc => 'This parameter is capped by "max_worker_processes" (not by "autovacuum_max_workers"!).',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '2',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index dc9e2255f8a..86c67b790b0 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -691,6 +691,8 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 2 # limited by
+ # max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 51806597037..6170436b341 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1423,6 +1423,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 9a7d733ddef..605d0829b03 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 023ac6d5fa8..23cb531c68c 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -65,6 +65,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern int AutoVacuumReserveParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 80286076a11..e879fdcfc69 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If the value is 0, the
+ * parallel degree will be computed based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
nworkers has a double meaning. The return value of
AutoVacuumReserveParallelWorkers is nreserved. I think this should be

```
nreserved = AutoVacuumReserveParallelWorkers(nworkers);
```

and nreserved becomes the authoritative value for the number of parallel
workers after that point.

I could not find this pattern being used in the code base. I think it
would be better, and more in line with what we generally do, to pass by
reference and update the value inside AutoVacuumReserveParallelWorkers:

```
AutoVacuumReserveParallelWorkers(&nworkers);
```

Maybe that's just my preference.
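To make the double meaning concrete, here is a sketch of the caller in
parallel_vacuum_process_all_indexes under each convention (names follow the
v15 patch above; this is only an illustration, not the patch itself):

```
/*
 * v15 shape: "nworkers" means "requested" before the call and "reserved"
 * after it, because the return value is assigned back to the same variable.
 */
if (AmAutoVacuumWorkerProcess() && nworkers > 0)
    nworkers = AutoVacuumReserveParallelWorkers(nworkers);

/*
 * Suggested shape: the function clamps the caller's variable in place, so
 * "nworkers" keeps a single authoritative meaning throughout.
 */
if (AmAutoVacuumWorkerProcess() && nworkers > 0)
    AutoVacuumReserveParallelWorkers(&nworkers);
```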
---

> { name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
>   short_desc => 'Maximum number of parallel autovacuum workers that can be taken from the bgworkers pool.',
>   long_desc => 'This parameter is capped by "max_worker_processes" (not by "autovacuum_max_workers"!).',
>   variable => 'autovacuum_max_parallel_workers',
>   boot_val => '0',
>   min => '0',
>   max => 'MAX_BACKENDS',
> },
>
> Parallel vacuum in autovacuum can be used only when users set the
> autovacuum_parallel_workers storage parameter. How about using the
> default value 2 for the autovacuum_max_parallel_workers GUC parameter?
Sounds reasonable, +1 for it.

v15-0004 has an incorrect default value for `autovacuum_max_parallel_workers`.
It should now be 2.

+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. Is capped by
+ <xref linkend="guc-max-worker-processes"/>. The default is 0,
+ which means no parallel index vacuuming.
--
Sami
Hi,

On Sun, Nov 23, 2025 at 5:51 AM Sami Imseih <samimseih@gmail.com> wrote:
> nworkers has a double meaning. The return value of
> AutoVacuumReserveParallelWorkers is nreserved. I think this should be
>
>     nreserved = AutoVacuumReserveParallelWorkers(nworkers);
>
> and nreserved becomes the authoritative value for the number of parallel
> workers after that point.
>
> I could not find this pattern being used in the code base. I think it
> would be better, and more in line with what we generally do, to pass by
> reference and update the value inside AutoVacuumReserveParallelWorkers:
>
>     AutoVacuumReserveParallelWorkers(&nworkers);

Maybe I just don't like functions with side effects, but this function
will have them anyway. I'll add the pass-by-reference logic as you
suggested.
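To be concrete, the revised function would look roughly like this (a sketch
matching the v16 hunk further down; locking and field names as in the patch):

```
void
AutoVacuumReserveParallelWorkers(int *nworkers)
{
    LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);

    /* Provide as many workers as we can. */
    *nworkers = Min(AutoVacuumShmem->av_freeParallelWorkers, *nworkers);
    AutoVacuumShmem->av_freeParallelWorkers -= *nworkers;

    /* Remember how many workers we have reserved. */
    av_nworkers_reserved += *nworkers;

    LWLockRelease(AutovacuumLock);
}
```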
---

> { name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
>   short_desc => 'Maximum number of parallel autovacuum workers that can be taken from the bgworkers pool.',
>   long_desc => 'This parameter is capped by "max_worker_processes" (not by "autovacuum_max_workers"!).',
>   variable => 'autovacuum_max_parallel_workers',
>   boot_val => '0',
>   min => '0',
>   max => 'MAX_BACKENDS',
> },
>
> Parallel vacuum in autovacuum can be used only when users set the
> autovacuum_parallel_workers storage parameter. How about using the
> default value 2 for the autovacuum_max_parallel_workers GUC parameter?
>
> Sounds reasonable, +1 for it.
> v15-0004 has an incorrect default value for `autovacuum_max_parallel_workers`.
> It should now be 2.
>
> + Sets the maximum number of parallel autovacuum workers that
> + can be used for parallel index vacuuming at one time. Is capped by
> + <xref linkend="guc-max-worker-processes"/>. The default is 0,
> + which means no parallel index vacuuming.

Thanks for noticing it! Fixed.
I am sending an updated set of patches.
--
Best regards,
Daniil Davydov
Attachments:
v16-0003-Tests-for-parallel-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v16-0003-Tests-for-parallel-autovacuum.patchDownload
From 9ecc7800596c79b5f1234e2b8453aac42321c1fd Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Sun, 23 Nov 2025 01:08:14 +0700
Subject: [PATCH v16 3/4] Tests for parallel autovacuum
---
src/backend/commands/vacuumparallel.c | 8 +
src/backend/postmaster/autovacuum.c | 14 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 26 ++
src/test/modules/test_autovacuum/meson.build | 36 +++
.../modules/test_autovacuum/t/001_basic.pl | 165 ++++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 34 +++
.../modules/test_autovacuum/test_autovacuum.c | 255 ++++++++++++++++++
.../test_autovacuum/test_autovacuum.control | 3 +
11 files changed, 545 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_basic.pl
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index ec7b7170be5..bf22fe2d00c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -39,6 +39,7 @@
#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -752,6 +753,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
}
}
+ /*
+ * To be able to check that all reserved parallel workers are released
+ * even when the leader fails, allow injection points to trigger a
+ * failure at this point.
+ */
+ INJECTION_POINT("autovacuum-trigger-leader-failure", NULL);
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index ceca03bcf34..ce15985cf5d 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3437,6 +3437,13 @@ AutoVacuumReserveParallelWorkers(int *nworkers)
/* Remember how many workers we have reserved. */
av_nworkers_reserved += *nworkers;
+ /*
+ * Injection point to help exercise the number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
@@ -3466,6 +3473,13 @@ AutoVacuumReleaseParallelWorkers(int nworkers)
/* Don't have to remember these workers anymore. */
av_nworkers_reserved -= nworkers;
+ /*
+ * Injection point to help exercise the number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 902a7954101..f09d0060248 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -15,6 +15,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 14fc761c4cf..ee7e855def0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -14,6 +14,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..4cf7344b2ac
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,26 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+ $(WIN32RES) \
+ test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..3441e5e49cf
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+ 'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+ test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_autovacuum',
+ '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+ test_autovacuum_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+ 'test_autovacuum.control',
+ 'test_autovacuum--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_basic.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_basic.pl b/src/test/modules/test_autovacuum/t/001_basic.pl
new file mode 100644
index 00000000000..1271768ebd2
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_basic.pl
@@ -0,0 +1,165 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres so that it can launch parallel autovacuum workers, logs
+# all information we are interested in, and runs autovacuum frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 10
+ log_min_messages = debug2
+ log_autovacuum_min_duration = 0
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create a table with the specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ autovacuum_enabled = false);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Insert the specified number of tuples into the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac VALUES (i, i + 1, i + 2, i + 3);
+ END LOOP;
+ END \$\$;
+});
+
+# Now, create some dead tuples and refresh table statistics
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION test_autovacuum;
+ SELECT inj_set_free_workers_attach();
+ SELECT inj_leader_failure_attach();
+});
+
+# Test 1:
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check that it can.
+# Also check that all requested workers are:
+# 1) launched
+# 2) correctly released
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on the table has completed. At the same
+# time, check that the required number of parallel workers has been launched.
+$node->wait_for_log(qr/parallel index vacuum\/cleanup : workers planned = 2, / .
+ qr/workers launched = 2/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10, 'All parallel workers have been released by the leader');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_2 = 0 WHERE (col_2 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 2:
+# We want parallel autovacuum workers to be released even if the leader gets
+# an error. First, simulate the situation where the leader exits due to an ERROR.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('ERROR');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/error, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after ERROR');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_3 = 0 WHERE (col_3 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 3:
+# Same as Test 2, but simulate the situation where the leader exits due to FATAL.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('FATAL');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/fatal, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+	'All parallel workers have been released by the leader after FATAL');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT inj_set_free_workers_detach();
+ SELECT inj_leader_failure_detach();
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..017d5da85ea
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,34 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting or interfering with autovacuum state
+ */
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION trigger_leader_failure(failure_type text)
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+/*
+ * Injection point related functions
+ */
+CREATE FUNCTION inj_set_free_workers_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_set_free_workers_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..7948f4858ae
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,255 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ * Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "postmaster/autovacuum.h"
+#include "storage/shmem.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "utils/builtins.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+typedef enum AVLeaderFailureType
+{
+	FAIL_NONE,
+	FAIL_ERROR,
+	FAIL_FATAL,
+} AVLeaderFailureType;
+
+typedef struct InjPointState
+{
+ bool enabled_set_free_workers;
+ uint32 free_parallel_workers;
+
+ bool enabled_leader_failure;
+	AVLeaderFailureType ftype;
+} InjPointState;
+
+static InjPointState * inj_point_state;
+
+/* Shared memory init callbacks */
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
+static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+
+static void
+test_autovacuum_shmem_request(void)
+{
+ if (prev_shmem_request_hook)
+ prev_shmem_request_hook();
+
+ RequestAddinShmemSpace(sizeof(InjPointState));
+}
+
+static void
+test_autovacuum_shmem_startup(void)
+{
+ bool found;
+
+ if (prev_shmem_startup_hook)
+ prev_shmem_startup_hook();
+
+ /* Create or attach to the shared memory state */
+ LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+
+ inj_point_state = ShmemInitStruct("injection_points",
+ sizeof(InjPointState),
+ &found);
+
+ if (!found)
+ {
+ /* First time through, initialize */
+ inj_point_state->enabled_leader_failure = false;
+ inj_point_state->enabled_set_free_workers = false;
+ inj_point_state->ftype = FAIL_NONE;
+
+ /* Keep it in sync with AutoVacuumShmemInit */
+ inj_point_state->free_parallel_workers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+
+ InjectionPointAttach("autovacuum-set-free-parallel-workers-num",
+ "test_autovacuum",
+ "inj_set_free_workers",
+ NULL,
+ 0);
+
+ InjectionPointAttach("autovacuum-trigger-leader-failure",
+ "test_autovacuum",
+ "inj_trigger_leader_failure",
+ NULL,
+ 0);
+ }
+
+ LWLockRelease(AddinShmemInitLock);
+}
+
+void
+_PG_init(void)
+{
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ prev_shmem_request_hook = shmem_request_hook;
+ shmem_request_hook = test_autovacuum_shmem_request;
+ prev_shmem_startup_hook = shmem_startup_hook;
+ shmem_startup_hook = test_autovacuum_shmem_startup;
+}
+
+extern PGDLLEXPORT void inj_set_free_workers(const char *name,
+ const void *private_data,
+ void *arg);
+extern PGDLLEXPORT void inj_trigger_leader_failure(const char *name,
+ const void *private_data,
+ void *arg);
+
+/*
+ * Set the number of currently available parallel a/v workers. This value
+ * may change after reserving or releasing such workers.
+ *
+ * This function is called from the parallel autovacuum leader.
+ */
+void
+inj_set_free_workers(const char *name, const void *private_data, void *arg)
+{
+ ereport(LOG,
+ errmsg("set parallel workers injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->enabled_set_free_workers)
+ {
+ Assert(arg != NULL);
+ inj_point_state->free_parallel_workers = *(uint32 *) arg;
+ }
+}
+
+/*
+ * Throw an ERROR or FATAL if somebody requested it.
+ *
+ * This function is called from the parallel autovacuum leader.
+ */
+void
+inj_trigger_leader_failure(const char *name, const void *private_data,
+ void *arg)
+{
+ int elevel;
+ char *elevel_str;
+
+ ereport(LOG,
+ errmsg("trigger leader failure injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->ftype == FAIL_NONE ||
+ !inj_point_state->enabled_leader_failure)
+ {
+ return;
+ }
+
+ elevel = inj_point_state->ftype == FAIL_ERROR ? ERROR : FATAL;
+ elevel_str = elevel == ERROR ? "error" : "fatal";
+
+ ereport(elevel, errmsg("%s, triggered by injection point", elevel_str));
+}
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ uint32 nworkers;
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ nworkers = inj_point_state->free_parallel_workers;
+ LWLockRelease(AutovacuumLock);
+
+ PG_RETURN_UINT32(nworkers);
+}
+
+PG_FUNCTION_INFO_V1(trigger_leader_failure);
+Datum
+trigger_leader_failure(PG_FUNCTION_ARGS)
+{
+ const char *failure_type = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ if (strcmp(failure_type, "NONE") == 0)
+ inj_point_state->ftype = FAIL_NONE;
+ else if (strcmp(failure_type, "ERROR") == 0)
+ inj_point_state->ftype = FAIL_ERROR;
+ else if (strcmp(failure_type, "FATAL") == 0)
+ inj_point_state->ftype = FAIL_FATAL;
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("invalid leader failure type: %s", failure_type)));
+
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_attach);
+Datum
+inj_set_free_workers_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = true;
+ inj_point_state->ftype = FAIL_NONE;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_detach);
+Datum
+inj_set_free_workers_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_attach);
+Datum
+inj_leader_failure_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = true;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_detach);
+Datum
+inj_leader_failure_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
--
2.43.0
v16-0004-Documentation-for-parallel-autovacuum.patch (text/x-patch)
From 5a7bda14d60af3dcd7072dd34dba637e19de7aba Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Sun, 23 Nov 2025 02:32:44 +0700
Subject: [PATCH v16 4/4] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 17 +++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 49 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 023b3f03ba9..85db09df897 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2841,6 +2841,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9264,6 +9265,22 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+        can be used for parallel index vacuuming at one time. The value is
+        capped by <xref linkend="guc-max-worker-processes"/>. The default is 2.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index f4f0433ef6f..02f306bbb8a 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -897,6 +897,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+    If an autovacuum worker process comes across a table with the
+    <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+    enabled, it will launch parallel workers to vacuum this table's indexes
+    in parallel. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 6557c5cffd8..e95a6488c5e 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+      Sets the maximum number of parallel autovacuum workers that can process
+      the indexes of this table.
+      The default value is -1, which means no parallel index vacuuming for
+      this table. If the value is 0, the parallel degree is computed based on
+      the number of indexes.
+      Note that the computed number of workers may not actually be available
+      at run time. If this occurs, autovacuum will run with fewer workers
+      than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
v16-0001-Parallel-autovacuum.patch (text/x-patch)
From 304f92f13dcbf90ddbfbdd95859c772fdfaadbb3 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Sun, 23 Nov 2025 01:03:24 +0700
Subject: [PATCH v16 1/4] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 42 ++++-
src/backend/postmaster/autovacuum.c | 164 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 9 +
src/backend/utils/misc/postgresql.conf.sample | 2 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 4 +
src/include/utils/rel.h | 7 +
11 files changed, 240 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 9e288dfecbf..3cc29d4454a 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1881,6 +1890,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..6e2c22be2ee 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -553,12 +557,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +606,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -646,6 +655,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+	 * Reserve workers in the autovacuum global state. Note that we may be
+	 * given fewer workers than we requested.
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ AutoVacuumReserveParallelWorkers(&nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +706,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -738,6 +764,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Also release all previously reserved parallel autovacuum workers */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 1c38488f2cb..ceca03bcf34 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,12 @@ int Log_autoanalyze_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Number of currently reserved parallel autovacuum workers. It is only
+ * relevant for the parallel autovacuum leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -285,6 +291,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +307,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -364,6 +374,8 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
+static void AutoVacuumReleaseAllParallelWorkers(void);
@@ -763,6 +775,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -779,6 +793,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+		 * If autovacuum_max_parallel_workers changed, we must keep the
+		 * number of available parallel autovacuum workers in shmem up to
+		 * date.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1383,6 +1406,17 @@ avl_sigusr2_handler(SIGNAL_ARGS)
* AUTOVACUUM WORKER CODE
********************************************************************/
+/*
+ * Make sure that all reserved workers are released if the parallel autovacuum
+ * leader is exiting due to a FATAL error. Otherwise the function has no effect.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ if (code != 0)
+ AutoVacuumReleaseAllParallelWorkers();
+}
+
/*
* Main entry point for autovacuum worker processes.
*/
@@ -1429,6 +1463,8 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGFPE, FloatExceptionHandler);
pqsignal(SIGCHLD, SIG_DFL);
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
/*
* Create a per-backend PGPROC struct in shared memory. We must do this
* before we can use LWLocks or access any shared memory.
@@ -2480,6 +2516,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure
+ * that all reserved workers are released.
+ */
+ AutoVacuumReleaseAllParallelWorkers();
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -2880,8 +2922,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+	/* Decide whether we need to process the table's indexes in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3358,6 +3404,85 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, the leader
+ * autovacuum process must call this function while computing the parallel
+ * degree.
+ *
+ * 'nworkers' is the desired number of parallel workers to reserve. The
+ * function sets 'nworkers' to the number of parallel workers that can
+ * actually be launched, and reserves these workers (if any) in the global
+ * autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if the
+ * caller would occupy all available workers.
+ */
+void
+AutoVacuumReserveParallelWorkers(int *nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /*
+ * We can only reserve workers at the beginning of parallel index
+ * processing, so we must not have any reserved workers right now.
+ */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+	*nworkers = Min(AutoVacuumShmem->av_freeParallelWorkers, *nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= *nworkers;
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved += *nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * The leader autovacuum process must call this function to update the global
+ * autovacuum state, so that other leaders will be able to use these parallel
+ * workers.
+ *
+ * 'nworkers' is the number of workers the caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+	 * we must cap the number of available workers at its new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Same as above, but release *all* parallel workers that were reserved by
+ * the current leader autovacuum process.
+ */
+static void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3418,6 +3543,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3499,3 +3628,34 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that the number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+	 * Cap the number of free workers at the new parameter's value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+		 * If the user increased the number of parallel autovacuum workers,
+		 * we must increase the number of free workers accordingly.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..fd00d6f89dc 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index c6484aea087..2a037485d5e 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3326,9 +3326,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+	 * Other changes might need to affect other workers, so forbid them. Note
+	 * that the parallel autovacuum leader is an exception, because only the
+	 * cost-based delay parameters need to be propagated to parallel vacuum
+	 * workers, and we will handle that elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 1128167c025..6c38275d30b 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,15 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+  short_desc => 'Maximum number of parallel autovacuum workers that can be taken from the bgworkers pool.',
+ long_desc => 'This parameter is capped by "max_worker_processes" (not by "autovacuum_max_workers"!).',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '2',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index dc9e2255f8a..86c67b790b0 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -691,6 +691,8 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 2	# limited by
+					# max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 51806597037..6170436b341 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1423,6 +1423,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 9a7d733ddef..605d0829b03 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 023ac6d5fa8..9d558c9c056 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -65,6 +65,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern void AutoVacuumReserveParallelWorkers(int *nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 80286076a11..e879fdcfc69 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+	 * Max number of parallel autovacuum workers. If the value is 0, the
+	 * parallel degree will be computed based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
v16-0002-Logging-for-parallel-autovacuum.patch (text/x-patch)
From 5cacc91030063a1514e74e9af864868014df1658 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Sun, 23 Nov 2025 01:07:47 +0700
Subject: [PATCH v16 2/4] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 27 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 20 ++++++++++++++------
src/include/commands/vacuum.h | 16 ++++++++++++++--
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 54 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 65bb0568a86..ea7a18d4d51 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -347,6 +347,12 @@ typedef struct LVRelState
/* Instrumentation counters */
int num_index_scans;
+
+ /*
+	 * Numbers of planned and actually launched parallel workers across all
+	 * index scans, or NULL if such statistics are not being collected
+ */
+ PVWorkersUsage *workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -700,6 +706,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc(sizeof(char *) * vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+		 * Allocate space for worker usage statistics, thereby making
+		 * explicit that such statistics must be accumulated. For now, this
+		 * is used only by the autovacuum leader worker, because it must log
+		 * them at the end of table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
}
/*
@@ -1099,6 +1115,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+						 _("parallel index vacuum/cleanup: workers planned = %d, workers launched = %d\n"),
+ vacrel->workers_usage->nplanned,
+ vacrel->workers_usage->nlaunched);
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2659,7 +2680,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3091,7 +3113,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 6e2c22be2ee..ec7b7170be5 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage *wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -502,7 +502,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -513,7 +513,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -521,7 +521,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -533,7 +534,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -618,7 +619,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage *wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -742,6 +743,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
pvs->pcxt->nworkers_launched, nworkers)));
+
+	/* Remember these values, if we were asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1f3290c7fbf..90709ca3107 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores the total numbers of planned and launched parallel
+ * workers during parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +404,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 27a4d131897..a838b0885c6 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2378,6 +2378,7 @@ PullFilterOps
PushFilter
PushFilterOps
PushFunction
+PVWorkersUsage
PyCFunction
PyMethodDef
PyModuleDef
--
2.43.0
On Sun, Nov 23, 2025 at 7:02 AM Daniil Davydov <3danissimo@gmail.com> wrote:
Hi,
On Sun, Nov 23, 2025 at 5:51 AM Sami Imseih <samimseih@gmail.com> wrote:
nworkers has a double meaning. The return value of
AutoVacuumReserveParallelWorkers is nreserved. I think this should be

```
nreserved = AutoVacuumReserveParallelWorkers(nworkers);
```

and nreserved becomes the authoritative value for the number of parallel
workers after that point.

I could not find this pattern being used in the code base. I think it
will be better, and more in line with what we generally do, to pass by
reference and update the value inside AutoVacuumReserveParallelWorkers:

```
AutoVacuumReserveParallelWorkers(&nworkers);
```

Maybe I just don't like functions with side effects, but this function
will have them anyway. I'll add the pass-by-reference logic as you
suggested.
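To make the resulting contract concrete, here is a minimal caller-side sketch (the surrounding launch logic is elided and the requested_degree variable is illustrative; only the two AutoVacuum* functions come from the patch):

```
int		nworkers = requested_degree;	/* desired parallel degree */

/*
 * Reserve slots in the autovacuum shared state.  On return, nworkers has
 * been lowered to the number of workers that could actually be reserved.
 */
AutoVacuumReserveParallelWorkers(&nworkers);

/* ... launch up to nworkers parallel workers ... */

/* Hand the reserved slots back so other leaders can use them. */
AutoVacuumReleaseParallelWorkers(nworkers);
```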
---

{ name => 'autovacuum_max_parallel_workers', type => 'int', context =>
'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
  short_desc => 'Maximum number of parallel autovacuum workers, that
can be taken from bgworkers pool.',
  long_desc => 'This parameter is capped by "max_worker_processes"
(not by "autovacuum_max_workers"!).',
  variable => 'autovacuum_max_parallel_workers',
  boot_val => '0',
  min => '0',
  max => 'MAX_BACKENDS',
},

Parallel vacuum in autovacuum can be used only when users set the
autovacuum_parallel_workers storage parameter. How about using the
default value 2 for autovacuum_max_parallel_workers GUC parameter?

Sounds reasonable, +1 for it.

v15-0004 has an incorrect default value for `autovacuum_max_parallel_workers`.
It should now be 2.

+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. Is capped by
+ <xref linkend="guc-max-worker-processes"/>. The default is 0,
+ which means no parallel index vacuuming.

Thanks for noticing it! Fixed.
I am sending an updated set of patches.
Thank you for updating the patch! I've reviewed the 0001 patch and
here are some comments:
---
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved += *nworkers;
I think we can simply assign *nworkers to av_nworkers_reserved instead
of incrementing it as we're sure that av_nworkers_reserved is 0 at the
beginning of this function.
---
+static void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}
We can put an assertion at the end of the function to verify that this
worker doesn't reserve any worker.
---
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ if (code != 0)
+ AutoVacuumReleaseAllParallelWorkers();
+}
I think it would be more future-proof if we call
AutoVacuumReleaseAllParallelWorkers() regardless of the code if there
is no strong reason why we check the code there.
---
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
/*
* Create a per-backend PGPROC struct in shared memory. We must do this
* before we can use LWLocks or access any shared memory.
*/
InitProcess();
I think it's better to register autovacuum_worker_before_shmem_exit()
after the process initialization, since the function could use LWLocks to
release the reserved workers. AutoVacuumReleaseAllParallelWorkers() does
nothing when av_nworkers_reserved == 0, so this is not strictly necessary,
but it would be more future-proof to register it after the basic process
initialization.
How about renaming autovacuum_worker_before_shmem_exit() to
autovacuum_worker_onexit()?
---
IIUC the patch needs to implement some logic to propagate the updates
of vacuum delay parameters to parallel vacuum workers. Are you still
working on it? Or shall I draft this part on top of the 0001 patch?
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Hi,
On Tue, Jan 6, 2026 at 1:51 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Thank you for updating the patch! I've reviewed the 0001 patch and
here are some comments:
Thank you for the review!
---
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved += *nworkers;

I think we can simply assign *nworkers to av_nworkers_reserved instead
of incrementing it as we're sure that av_nworkers_reserved is 0 at the
beginning of this function.
Agree, it will be more clear.
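In code form, the suggested simplification is just (a sketch against the quoted v16 function body):

```
/* Remember how many workers we have reserved. */
av_nworkers_reserved = *nworkers;	/* plain assignment; it is 0 on entry */
```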
---
+static void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+	/* Only leader worker can call this function. */
+	Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+	if (av_nworkers_reserved > 0)
+		AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}

We can put an assertion at the end of the function to verify that this
worker doesn't reserve any worker.
It's not a problem to add this assertion, but I have doubts: we have a
function that promises to release a given number of workers, but we would
still be checking whether that number of workers has been released.
I suggest another place for the assertion - see the comment below.
---
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+	if (code != 0)
+		AutoVacuumReleaseAllParallelWorkers();
+}

I think it would be more future-proof if we call
AutoVacuumReleaseAllParallelWorkers() regardless of the code if there
is no strong reason why we check the code there.
I think we can leave "code != 0" so as not to confuse the readers, but
add the assertion that at the end of the function all workers have been
released. Thus, we are telling that 1) in normal processing we must not
have reserved workers and 2) even after a FATAL error we are sure
that we don't have reserved workers.
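A sketch of the resulting callback (the body is the v16 one; the trailing assertion is the addition proposed above):

```
static void
autovacuum_worker_before_shmem_exit(int code, Datum arg)
{
	/* Only a non-zero exit code (e.g. after FATAL) can leave reservations. */
	if (code != 0)
		AutoVacuumReleaseAllParallelWorkers();

	/* Whatever the exit path, no reserved workers may remain by now. */
	Assert(av_nworkers_reserved == 0);
}
```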
---
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);

 /*
  * Create a per-backend PGPROC struct in shared memory. We must do this
  * before we can use LWLocks or access any shared memory.
  */
 InitProcess();

I think it's better to register autovacuum_worker_before_shmem_exit()
after the process initialization, since the function could use LWLocks to
release the reserved workers. AutoVacuumReleaseAllParallelWorkers() does
nothing when av_nworkers_reserved == 0, so this is not strictly necessary,
but it would be more future-proof to register it after the basic process
initialization.
My bad, I missed the comment above InitProcess. I agree with you.
Just in case: the callback registration will be invoked after BaseInit().
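In sketch form, and assuming the InitProcess()-then-BaseInit() ordering of AutoVacWorkerMain(), the registration would move to roughly here:

```
/* Create a per-backend PGPROC struct; LWLocks become usable after this. */
InitProcess();

/* Early initialization */
BaseInit();

/*
 * Only now is it safe to register the callback, since it may take an
 * LWLock when releasing reserved parallel workers.
 */
before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
```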
How about renaming autovacuum_worker_before_shmem_exit() to
autovacuum_worker_onexit()?
We also have "on_shmem_exit" callbacks. Maybe "onexit" naming can confuse
somebody?..
Since the function name does not cross line length boundary anywhere, I suggest
leaving the current naming.
---
IIUC the patch needs to implement some logic to propagate the updates
of vacuum delay parameters to parallel vacuum workers.
Yep.
Are you still working on it? Or shall I draft this part on top of the
0001 patch?
I thought about some "beautiful" approach, but for now I have
only one idea: force parallel a/v workers to get the values of these
parameters from shmem (which can obviously be modified by the
leader a/v process). I'll post this patch in the near future.
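For what it's worth, one possible shape of that idea is sketched below. All names here are hypothetical and appear in none of the posted patches; the only real symbols are the VacuumCostDelay/VacuumCostLimit globals and the spinlock primitives:

```
/* Hypothetical shared-memory copy of the leader's cost-delay settings. */
typedef struct PVDelayShared
{
	slock_t		mutex;			/* protects the fields below */
	double		cost_delay;		/* leader's VacuumCostDelay */
	int			cost_limit;		/* leader's VacuumCostLimit */
} PVDelayShared;

/* Leader: republish the settings after processing a config reload. */
static void
pv_publish_delay_params(PVDelayShared *shared)
{
	SpinLockAcquire(&shared->mutex);
	shared->cost_delay = VacuumCostDelay;
	shared->cost_limit = VacuumCostLimit;
	SpinLockRelease(&shared->mutex);
}

/* Parallel worker: re-read the settings before applying a cost-based delay. */
static void
pv_refresh_delay_params(PVDelayShared *shared)
{
	SpinLockAcquire(&shared->mutex);
	VacuumCostDelay = shared->cost_delay;
	VacuumCostLimit = shared->cost_limit;
	SpinLockRelease(&shared->mutex);
}
```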
Please see the v17 patches (only 0001 has been changed).
--
Best regards,
Daniil Davydov
Attachments:
v17-0002-Logging-for-parallel-autovacuum.patch (text/x-patch)
From 0aafa271ec90dbe494eea79fd484a4856023b3a8 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Sun, 23 Nov 2025 01:07:47 +0700
Subject: [PATCH v17 2/4] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 27 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 20 ++++++++++++++------
src/include/commands/vacuum.h | 16 ++++++++++++++--
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 54 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 2086a577199..35d2b07aa8a 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -349,6 +349,12 @@ typedef struct LVRelState
int num_index_scans;
int num_dead_items_resets;
Size total_dead_items_bytes;
+
+ /*
+	 * Numbers of planned and actually launched parallel workers across all
+	 * index scans, or NULL if such statistics are not being collected
+ */
+ PVWorkersUsage *workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -711,6 +717,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc_array(char *, vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+		 * Allocate space for worker usage statistics, thereby making
+		 * explicit that such statistics must be accumulated. For now, this
+		 * is used only by the autovacuum leader worker, because it must log
+		 * them at the end of table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
}
/*
@@ -1125,6 +1141,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+						 _("parallel index vacuum/cleanup: workers planned = %d, workers launched = %d\n"),
+ vacrel->workers_usage->nplanned,
+ vacrel->workers_usage->nlaunched);
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2700,7 +2721,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3133,7 +3155,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 6a3a00585f9..490f93959d1 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage *wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -502,7 +502,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -513,7 +513,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -521,7 +521,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -533,7 +534,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -618,7 +619,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage *wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -742,6 +743,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
pvs->pcxt->nworkers_launched, nworkers)));
+
+	/* Remember these values, if we were asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index e885a4b9c77..ec5d70aacdc 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores the total numbers of planned and launched parallel
+ * workers during parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +404,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index b9e671fcda8..6e35c6aa493 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2397,6 +2397,7 @@ PullFilterOps
PushFilter
PushFilterOps
PushFunction
+PVWorkersUsage
PyCFunction
PyMethodDef
PyModuleDef
--
2.43.0
v17-0004-Documentation-for-parallel-autovacuum.patch (text/x-patch)
From 6f615b06b1578b5c72b36074de20811999b52e4f Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Sun, 23 Nov 2025 02:32:44 +0700
Subject: [PATCH v17 4/4] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 17 +++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 49 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 601aa3afb8e..36fcc72f325 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2847,6 +2847,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9282,6 +9283,22 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. This value is
+ capped by <xref linkend="guc-max-worker-processes"/>. The default is 2.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 7c958b06273..c9f9163c551 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -926,6 +926,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table that has the
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+ enabled, it will launch parallel workers to vacuum this table's indexes
+ in parallel. Parallel workers are taken from the pool of processes
+ established by <xref linkend="guc-max-worker-processes"/>, limited by
+ <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 77c5a763d45..3592c9acff9 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If the value is 0, the parallel degree is computed based on
+ the number of indexes.
+ Note that the computed number of workers may not actually be available at
+ run time. If this occurs, autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
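For anyone who wants to try this out: the intended workflow is to set
autovacuum_max_parallel_workers in postgresql.conf (it is PGC_SIGHUP, so a
reload is enough) and then opt tables in explicitly, for example with
ALTER TABLE some_table SET (autovacuum_parallel_workers = 2); tables keep
the default of -1 (no parallel index vacuuming) otherwise.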
v17-0003-Tests-for-parallel-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v17-0003-Tests-for-parallel-autovacuum.patchDownload
From ca52efb09b02d21f6e35dabed2b9563d851151a0 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Sun, 23 Nov 2025 01:08:14 +0700
Subject: [PATCH v17 3/4] Tests for parallel autovacuum
---
src/backend/commands/vacuumparallel.c | 8 +
src/backend/postmaster/autovacuum.c | 14 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 26 ++
src/test/modules/test_autovacuum/meson.build | 36 +++
.../modules/test_autovacuum/t/001_basic.pl | 165 ++++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 34 +++
.../modules/test_autovacuum/test_autovacuum.c | 255 ++++++++++++++++++
.../test_autovacuum/test_autovacuum.control | 3 +
11 files changed, 545 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_basic.pl
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 490f93959d1..c2f0a37eef2 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -39,6 +39,7 @@
#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -752,6 +753,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
}
}
+ /*
+ * To be able to check that all reserved parallel workers are released
+ * even on failure, allow injection points to trigger a failure at this
+ * point.
+ */
+ INJECTION_POINT("autovacuum-trigger-leader-failure", NULL);
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index bc11970bfee..a27274bfb4d 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3439,6 +3439,13 @@ AutoVacuumReserveParallelWorkers(int *nworkers)
/* Remember how many workers we have reserved. */
av_nworkers_reserved = *nworkers;
+ /*
+ * Injection point to help test the number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
@@ -3468,6 +3475,13 @@ AutoVacuumReleaseParallelWorkers(int nworkers)
/* Don't have to remember these workers anymore. */
av_nworkers_reserved -= nworkers;
+ /*
+ * Injection point to help test the number of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 4c6d56d97d8..bfe365fa575 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 1b31c5b98d6..01a3e3ec044 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..4cf7344b2ac
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,26 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+ $(WIN32RES) \
+ test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..3441e5e49cf
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+ 'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+ test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_autovacuum',
+ '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+ test_autovacuum_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+ 'test_autovacuum.control',
+ 'test_autovacuum--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_basic.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_basic.pl b/src/test/modules/test_autovacuum/t/001_basic.pl
new file mode 100644
index 00000000000..1271768ebd2
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_basic.pl
@@ -0,0 +1,165 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres so that it can launch parallel autovacuum workers, logs
+# all the information we are interested in, and runs autovacuum frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 10
+ log_min_messages = debug2
+ log_autovacuum_min_duration = 0
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table with specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ autovacuum_enabled = false);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Insert specified tuples num into the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac VALUES (i, i + 1, i + 2, i + 3);
+ END LOOP;
+ END \$\$;
+});
+
+# Now, create some dead tuples and refresh table statistics
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION test_autovacuum;
+ SELECT inj_set_free_workers_attach();
+ SELECT inj_leader_failure_attach();
+});
+
+# Test 1:
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check that it can.
+# Also check that all requested workers are:
+# 1) launched
+# 2) correctly released
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on the table has completed. At the same
+# time, check that the required number of parallel workers has been launched.
+$node->wait_for_log(
+ qr/parallel index vacuum\/cleanup : workers planned = 2, workers launched = 2/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10, 'All parallel workers have been released by the leader');
+
+# Disable autovacuum on the table while preparing for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_2 = 0 WHERE (col_2 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 2:
+# We want parallel autovacuum workers to be released even if the leader gets
+# an error. First, simulate a situation where the leader exits due to an ERROR.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('ERROR');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/error, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after ERROR');
+
+# Disable autovacuum on the table while preparing for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_3 = 0 WHERE (col_3 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 3:
+# Same as Test 2, but simulate a situation where the leader exits due to FATAL.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('FATAL');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/fatal, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after FATAL');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT inj_set_free_workers_detach();
+ SELECT inj_leader_failure_detach();
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..017d5da85ea
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,34 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting or interfering with autovacuum state
+ */
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION trigger_leader_failure(failure_type text)
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+/*
+ * Injection point related functions
+ */
+CREATE FUNCTION inj_set_free_workers_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_set_free_workers_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..7948f4858ae
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,255 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ * Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "postmaster/autovacuum.h"
+#include "storage/shmem.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "utils/builtins.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+typedef enum AVLeaderFailureType
+{
+ FAIL_NONE,
+ FAIL_ERROR,
+ FAIL_FATAL,
+} AVLeaderFailureType;
+
+typedef struct InjPointState
+{
+ bool enabled_set_free_workers;
+ uint32 free_parallel_workers;
+
+ bool enabled_leader_failure;
+ AVLeaderFailureType ftype;
+} InjPointState;
+
+static InjPointState * inj_point_state;
+
+/* Shared memory init callbacks */
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
+static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+
+static void
+test_autovacuum_shmem_request(void)
+{
+ if (prev_shmem_request_hook)
+ prev_shmem_request_hook();
+
+ RequestAddinShmemSpace(sizeof(InjPointState));
+}
+
+static void
+test_autovacuum_shmem_startup(void)
+{
+ bool found;
+
+ if (prev_shmem_startup_hook)
+ prev_shmem_startup_hook();
+
+ /* Create or attach to the shared memory state */
+ LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+
+ inj_point_state = ShmemInitStruct("injection_points",
+ sizeof(InjPointState),
+ &found);
+
+ if (!found)
+ {
+ /* First time through, initialize */
+ inj_point_state->enabled_leader_failure = false;
+ inj_point_state->enabled_set_free_workers = false;
+ inj_point_state->ftype = FAIL_NONE;
+
+ /* Keep it in sync with AutoVacuumShmemInit */
+ inj_point_state->free_parallel_workers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+
+ InjectionPointAttach("autovacuum-set-free-parallel-workers-num",
+ "test_autovacuum",
+ "inj_set_free_workers",
+ NULL,
+ 0);
+
+ InjectionPointAttach("autovacuum-trigger-leader-failure",
+ "test_autovacuum",
+ "inj_trigger_leader_failure",
+ NULL,
+ 0);
+ }
+
+ LWLockRelease(AddinShmemInitLock);
+}
+
+void
+_PG_init(void)
+{
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ prev_shmem_request_hook = shmem_request_hook;
+ shmem_request_hook = test_autovacuum_shmem_request;
+ prev_shmem_startup_hook = shmem_startup_hook;
+ shmem_startup_hook = test_autovacuum_shmem_startup;
+}
+
+extern PGDLLEXPORT void inj_set_free_workers(const char *name,
+ const void *private_data,
+ void *arg);
+extern PGDLLEXPORT void inj_trigger_leader_failure(const char *name,
+ const void *private_data,
+ void *arg);
+
+/*
+ * Set the number of currently available parallel a/v workers. This value
+ * may change after reserving or releasing such workers.
+ *
+ * This function is called from the parallel autovacuum leader.
+ */
+void
+inj_set_free_workers(const char *name, const void *private_data, void *arg)
+{
+ ereport(LOG,
+ errmsg("set parallel workers injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->enabled_set_free_workers)
+ {
+ Assert(arg != NULL);
+ inj_point_state->free_parallel_workers = *(uint32 *) arg;
+ }
+}
+
+/*
+ * Throw an ERROR or FATAL if somebody requested it.
+ *
+ * This function is called from the parallel autovacuum leader.
+ */
+void
+inj_trigger_leader_failure(const char *name, const void *private_data,
+ void *arg)
+{
+ int elevel;
+ char *elevel_str;
+
+ ereport(LOG,
+ errmsg("trigger leader failure injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->ftype == FAIL_NONE ||
+ !inj_point_state->enabled_leader_failure)
+ {
+ return;
+ }
+
+ elevel = inj_point_state->ftype == FAIL_ERROR ? ERROR : FATAL;
+ elevel_str = elevel == ERROR ? "error" : "fatal";
+
+ ereport(elevel, errmsg("%s, triggered by injection point", elevel_str));
+}
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ uint32 nworkers;
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ nworkers = inj_point_state->free_parallel_workers;
+ LWLockRelease(AutovacuumLock);
+
+ PG_RETURN_UINT32(nworkers);
+}
+
+PG_FUNCTION_INFO_V1(trigger_leader_failure);
+Datum
+trigger_leader_failure(PG_FUNCTION_ARGS)
+{
+ const char *failure_type = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ if (strcmp(failure_type, "NONE") == 0)
+ inj_point_state->ftype = FAIL_NONE;
+ else if (strcmp(failure_type, "ERROR") == 0)
+ inj_point_state->ftype = FAIL_ERROR;
+ else if (strcmp(failure_type, "FATAL") == 0)
+ inj_point_state->ftype = FAIL_FATAL;
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid leader failure type : %s", failure_type)));
+
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_attach);
+Datum
+inj_set_free_workers_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = true;
+ inj_point_state->ftype = FAIL_NONE;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_detach);
+Datum
+inj_set_free_workers_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_attach);
+Datum
+inj_leader_failure_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = true;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_detach);
+Datum
+inj_leader_failure_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
--
2.43.0
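A note for anyone running the new TAP test: it relies on injection points,
so the build must be configured with them enabled (-Dinjection_points=true
under meson, --enable-injection-points under autoconf); otherwise the SQL
helpers above fail with "injection points not supported".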
v17-0001-Parallel-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v17-0001-Parallel-autovacuum.patchDownload
From a5f261dc7b4fe37aba8f24ef5241e2b1f2d85a36 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Sun, 23 Nov 2025 01:03:24 +0700
Subject: [PATCH v17 1/4] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 42 ++++-
src/backend/postmaster/autovacuum.c | 166 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 9 +
src/backend/utils/misc/postgresql.conf.sample | 2 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 4 +
src/include/utils/rel.h | 7 +
11 files changed, 242 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 0b83f98ed5f..692ac46733e 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1881,6 +1890,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index c3b3c9ea21a..6a3a00585f9 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -553,12 +557,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +606,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -646,6 +655,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in the autovacuum global state. Note that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ AutoVacuumReserveParallelWorkers(&nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +706,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -738,6 +764,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Also release all previously reserved parallel autovacuum workers */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 3e507d23cc9..bc11970bfee 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,12 @@ int Log_autoanalyze_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Variable that keeps the number of currently reserved parallel autovacuum
+ * workers. It is only relevant for the parallel autovacuum leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -285,6 +291,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +307,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -364,6 +374,8 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
+static void AutoVacuumReleaseAllParallelWorkers(void);
@@ -763,6 +775,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -779,6 +793,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in
+ * shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1383,6 +1406,19 @@ avl_sigusr2_handler(SIGNAL_ARGS)
* AUTOVACUUM WORKER CODE
********************************************************************/
+/*
+ * Make sure that all reserved workers are released, even if parallel
+ * autovacuum leader is finishing due to FATAL error.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ if (code != 0)
+ AutoVacuumReleaseAllParallelWorkers();
+
+ Assert(av_nworkers_reserved == 0);
+}
+
/*
* Main entry point for autovacuum worker processes.
*/
@@ -1438,6 +1474,8 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
/* Early initialization */
BaseInit();
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
/*
* If an exception is encountered, processing resumes here.
*
@@ -2480,6 +2518,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure
+ * that all reserved workers are released.
+ */
+ AutoVacuumReleaseAllParallelWorkers();
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -2880,8 +2924,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process the table's indexes in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3358,6 +3406,85 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, the leader
+ * autovacuum process must call this function while computing the parallel
+ * degree.
+ *
+ * 'nworkers' is the desired number of parallel workers to reserve. The
+ * function sets 'nworkers' to the number of workers that can actually be
+ * launched and reserves these workers (if any) in the global autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if the
+ * caller would occupy all available workers.
+ */
+void
+AutoVacuumReserveParallelWorkers(int *nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /*
+ * We can only reserve workers at the beginning of parallel index
+ * processing, so we must not have any reserved workers right now.
+ */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ *nworkers = Min(AutoVacuumShmem->av_freeParallelWorkers, *nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= *nworkers;
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved = *nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * The leader autovacuum process must call this function to update the global
+ * autovacuum state, so that other leaders will be able to use these parallel
+ * workers.
+ *
+ * 'nworkers' - how many workers the caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap the number of available workers at the new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Same as above, but release *all* parallel workers that were reserved by
+ * the current leader autovacuum process.
+ */
+static void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3418,6 +3545,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3499,3 +3630,34 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that the number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap the number of free workers at the new parameter's value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+ * If the user increased the number of parallel autovacuum workers, we
+ * must increase the number of free workers accordingly.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..24ddb276f0c 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ae9d5f3fb70..c8a99a67767 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3326,9 +3326,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception, because only
+ * cost-based delay parameters need to be propagated to parallel vacuum
+ * workers, and we handle that elsewhere if appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 7c60b125564..e933f5048f7 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,15 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel autovacuum workers that can be taken from the bgworkers pool.',
+ long_desc => 'This parameter is capped by "max_worker_processes" (not by "autovacuum_max_workers"!).',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '2',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index dc9e2255f8a..86c67b790b0 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -691,6 +691,8 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 2 # disabled by default and limited by
+ # max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 06edea98f06..2b8a4aab390 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1423,6 +1423,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index db559b39c4d..ad6e19f426c 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e43067d0260..4acadbc0610 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -65,6 +65,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern void AutoVacuumReserveParallelWorkers(int *nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index d03ab247788..c1d882659f9 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If the value is 0, the
+ * parallel degree is computed based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
Hi,
On Tue, Jan 6, 2026 at 3:44 AM Daniil Davydov <3danissimo@gmail.com> wrote:
> On Tue, Jan 6, 2026 at 1:51 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > Are you still working on it? Or shall I draft this part on top of the
> > 0001 patch?
>
> I thought about some "beautiful" approach, but for now I have only one
> idea - force parallel a/v workers to get values for these parameters
> from shmem (which obviously can be modified by the leader a/v process).
> I'll post this patch in the near future.
I am posting a draft version of the patch (see 0005) that allows the
parallel leader to propagate changes of cost-based parameters to its
parallel workers. It is a very rough fix, but it reflects my idea, which is
to have some shared state from which parallel workers can get values for
the parameters (and which only the leader worker can modify, obviously).
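Roughly, the shared state could look like the sketch below (the struct name
and the use of AutovacuumLock here are illustrative, not necessarily what
0005 does):

    typedef struct AutoVacuumCostShared
    {
        double      vacuum_cost_delay;  /* published by the leader */
        int         vacuum_cost_limit;  /* published by the leader */
    } AutoVacuumCostShared;

    /* leader, after reloading the config file: publish the new values */
    LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
    cost_shared->vacuum_cost_delay = vacuum_cost_delay;
    cost_shared->vacuum_cost_limit = vacuum_cost_limit;
    LWLockRelease(AutovacuumLock);

    /* parallel worker, at each cost-based delay point: refresh local copies */
    LWLockAcquire(AutovacuumLock, LW_SHARED);
    vacuum_cost_delay = cost_shared->vacuum_cost_delay;
    vacuum_cost_limit = cost_shared->vacuum_cost_limit;
    LWLockRelease(AutovacuumLock);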
I have also added a test that shows that this idea works - it ensures that
parallel workers update their parameters when those have been changed for
the leader worker.
Note that this is still work in progress - the logic currently covers only
the vacuum_cost_delay and vacuum_cost_limit parameters. I think that we
should agree on the idea first, and only then apply the logic to all
appropriate parameters.
What do you think?
--
Best regards,
Daniil Davydov
Attachments:
v18-0001-Parallel-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v18-0001-Parallel-autovacuum.patchDownload
From a5f261dc7b4fe37aba8f24ef5241e2b1f2d85a36 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Sun, 23 Nov 2025 01:03:24 +0700
Subject: [PATCH v18 1/5] Parallel autovacuum
---
src/backend/access/common/reloptions.c | 11 ++
src/backend/commands/vacuumparallel.c | 42 ++++-
src/backend/postmaster/autovacuum.c | 166 +++++++++++++++++-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/misc/guc.c | 8 +-
src/backend/utils/misc/guc_parameters.dat | 9 +
src/backend/utils/misc/postgresql.conf.sample | 2 +
src/bin/psql/tab-complete.in.c | 1 +
src/include/miscadmin.h | 1 +
src/include/postmaster/autovacuum.h | 4 +
src/include/utils/rel.h | 7 +
11 files changed, 242 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 0b83f98ed5f..692ac46733e 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
},
SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
},
+ {
+ {
+ "autovacuum_parallel_workers",
+ "Maximum number of parallel autovacuum workers that can be used for processing this table.",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ -1, -1, 1024
+ },
{
{
"autovacuum_vacuum_threshold",
@@ -1881,6 +1890,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
{"autovacuum_enabled", RELOPT_TYPE_BOOL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+ {"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index c3b3c9ea21a..6a3a00585f9 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
/*-------------------------------------------------------------------------
*
* vacuumparallel.c
- * Support routines for parallel vacuum execution.
+ * Support routines for parallel vacuum and autovacuum execution. In the
+ * comments below, the word "vacuum" will refer to both vacuum and
+ * autovacuum.
*
* This file contains routines that are intended to support setting up, using,
* and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
#include "executor/instrument.h"
#include "optimizer/paths.h"
#include "pgstat.h"
+#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
#include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
shared->queryid = pgstat_get_my_query_id();
shared->maintenance_work_mem_worker =
(nindexes_mwm > 0) ?
- maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
- maintenance_work_mem;
+ vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+ vac_work_mem;
+
shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
/* Prepare DSA space for dead items */
@@ -553,12 +557,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
int nindexes_parallel_bulkdel = 0;
int nindexes_parallel_cleanup = 0;
int parallel_workers;
+ int max_workers;
+
+ max_workers = AmAutoVacuumWorkerProcess() ?
+ autovacuum_max_parallel_workers :
+ max_parallel_maintenance_workers;
/*
* We don't allow performing parallel operation in standalone backend or
* when parallelism is disabled.
*/
- if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+ if (!IsUnderPostmaster || max_workers == 0)
return 0;
/*
@@ -597,8 +606,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
parallel_workers = (nrequested > 0) ?
Min(nrequested, nindexes_parallel) : nindexes_parallel;
- /* Cap by max_parallel_maintenance_workers */
- parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+ /* Cap by GUC variable */
+ parallel_workers = Min(parallel_workers, max_workers);
return parallel_workers;
}
@@ -646,6 +655,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
*/
nworkers = Min(nworkers, pvs->pcxt->nworkers);
+ /*
+ * Reserve workers in the autovacuum global state. Note that we may be given
+ * fewer workers than we requested.
+ */
+ if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+ AutoVacuumReserveParallelWorkers(&nworkers);
+
/*
* Set index vacuum status and mark whether parallel vacuum worker can
* process it.
@@ -690,6 +706,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
LaunchParallelWorkers(pvs->pcxt);
+ if (AmAutoVacuumWorkerProcess() &&
+ pvs->pcxt->nworkers_launched < nworkers)
+ {
+ /*
+ * Tell autovacuum that we could not launch all the previously
+ * reserved workers.
+ */
+ AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+ }
+
if (pvs->pcxt->nworkers_launched > 0)
{
/*
@@ -738,6 +764,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+ /* Also release all previously reserved parallel autovacuum workers */
+ if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+ AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
}
/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 3e507d23cc9..bc11970bfee 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,12 @@ int Log_autoanalyze_min_duration = 600000;
static double av_storage_param_cost_delay = -1;
static int av_storage_param_cost_limit = -1;
+/*
+ * Variable that keeps the number of currently reserved parallel autovacuum
+ * workers. It is only relevant for the parallel autovacuum leader process.
+ */
+static int av_nworkers_reserved = 0;
+
/* Flags set by signal handlers */
static volatile sig_atomic_t got_SIGUSR2 = false;
@@ -285,6 +291,8 @@ typedef struct AutoVacuumWorkItem
* av_workItems work item array
* av_nworkersForBalance the number of autovacuum workers to use when
* calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
*
* This struct is protected by AutovacuumLock, except for av_signal and parts
* of the worker list (see above).
@@ -299,6 +307,8 @@ typedef struct
WorkerInfo av_startingWorker;
AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
pg_atomic_uint32 av_nworkersForBalance;
+ uint32 av_freeParallelWorkers;
+ uint32 av_maxParallelWorkers;
} AutoVacuumShmemStruct;
static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -364,6 +374,8 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
static void avl_sigusr2_handler(SIGNAL_ARGS);
static bool av_worker_available(void);
static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
+static void AutoVacuumReleaseAllParallelWorkers(void);
@@ -763,6 +775,8 @@ ProcessAutoVacLauncherInterrupts(void)
if (ConfigReloadPending)
{
int autovacuum_max_workers_prev = autovacuum_max_workers;
+ int autovacuum_max_parallel_workers_prev =
+ autovacuum_max_parallel_workers;
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
@@ -779,6 +793,15 @@ ProcessAutoVacLauncherInterrupts(void)
if (autovacuum_max_workers_prev != autovacuum_max_workers)
check_av_worker_gucs();
+ /*
+ * If autovacuum_max_parallel_workers changed, we must take care of
+ * the correct value of available parallel autovacuum workers in
+ * shmem.
+ */
+ if (autovacuum_max_parallel_workers_prev !=
+ autovacuum_max_parallel_workers)
+ adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
/* rebuild the list in case the naptime changed */
rebuild_database_list(InvalidOid);
}
@@ -1383,6 +1406,19 @@ avl_sigusr2_handler(SIGNAL_ARGS)
* AUTOVACUUM WORKER CODE
********************************************************************/
+/*
+ * Make sure that all reserved workers are released, even if parallel
+ * autovacuum leader is finishing due to FATAL error.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+ if (code != 0)
+ AutoVacuumReleaseAllParallelWorkers();
+
+ Assert(av_nworkers_reserved == 0);
+}
+
/*
* Main entry point for autovacuum worker processes.
*/
@@ -1438,6 +1474,8 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
/* Early initialization */
BaseInit();
+ before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
/*
* If an exception is encountered, processing resumes here.
*
@@ -2480,6 +2518,12 @@ do_autovacuum(void)
}
PG_CATCH();
{
+ /*
+ * Parallel autovacuum can reserve parallel workers. Make sure
+ * that all reserved workers are released.
+ */
+ AutoVacuumReleaseAllParallelWorkers();
+
/*
* Abort the transaction, start a new one, and proceed with the
* next table in our list.
@@ -2880,8 +2924,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
*/
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
- /* As of now, we don't support parallel vacuum for autovacuum */
- tab->at_params.nworkers = -1;
+
+ /* Decide whether we need to process the table's indexes in parallel. */
+ tab->at_params.nworkers = avopts
+ ? avopts->autovacuum_parallel_workers
+ : -1;
+
tab->at_params.freeze_min_age = freeze_min_age;
tab->at_params.freeze_table_age = freeze_table_age;
tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3358,6 +3406,85 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
return result;
}
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, the leader
+ * autovacuum process must call this function while computing the parallel
+ * degree.
+ *
+ * 'nworkers' is the desired number of parallel workers to reserve. The
+ * function sets 'nworkers' to the number of workers that can actually be
+ * launched and reserves these workers (if any) in the global autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if the
+ * caller would occupy all available workers.
+ */
+void
+AutoVacuumReserveParallelWorkers(int *nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ /*
+ * We can only reserve workers at the beginning of parallel index
+ * processing, so we must not have any reserved workers right now.
+ */
+ Assert(av_nworkers_reserved == 0);
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /* Provide as many workers as we can. */
+ *nworkers = Min(AutoVacuumShmem->av_freeParallelWorkers, *nworkers);
+ AutoVacuumShmem->av_freeParallelWorkers -= *nworkers;
+
+ /* Remember how many workers we have reserved. */
+ av_nworkers_reserved = *nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * The leader autovacuum process must call this function to update the global
+ * autovacuum state, so that other leaders will be able to use these parallel
+ * workers.
+ *
+ * 'nworkers' - how many workers the caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * If the maximum number of parallel workers was reduced during execution,
+ * we must cap the number of available workers at the new value.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+ AutoVacuumShmem->av_maxParallelWorkers);
+
+ /* Don't have to remember these workers anymore. */
+ av_nworkers_reserved -= nworkers;
+
+ LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Same as above, but release *all* parallel workers that were reserved by
+ * the current leader autovacuum process.
+ */
+static void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+ /* Only leader worker can call this function. */
+ Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+ if (av_nworkers_reserved > 0)
+ AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}
+
/*
* autovac_init
* This is called at postmaster initialization.
@@ -3418,6 +3545,10 @@ AutoVacuumShmemInit(void)
Assert(!found);
AutoVacuumShmem->av_launcherpid = 0;
+ AutoVacuumShmem->av_maxParallelWorkers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+ AutoVacuumShmem->av_freeParallelWorkers =
+ AutoVacuumShmem->av_maxParallelWorkers;
dclist_init(&AutoVacuumShmem->av_freeWorkers);
dlist_init(&AutoVacuumShmem->av_runningWorkers);
AutoVacuumShmem->av_startingWorker = NULL;
@@ -3499,3 +3630,34 @@ check_av_worker_gucs(void)
errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
autovacuum_worker_slots)));
}
+
+/*
+ * Make sure that the number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+ /*
+ * Cap the number of free workers at the parameter's new value, if needed.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers =
+ Min(AutoVacuumShmem->av_freeParallelWorkers,
+ autovacuum_max_parallel_workers);
+
+ if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+ {
+ /*
+ * If the user wants to increase the number of parallel autovacuum
+ * workers, we must increase the number of free workers accordingly.
+ */
+ AutoVacuumShmem->av_freeParallelWorkers +=
+ (autovacuum_max_parallel_workers - prev_max_parallel_workers);
+ }
+
+ AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+ LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..24ddb276f0c 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int NBuffers = 16384;
int MaxConnections = 100;
int max_worker_processes = 8;
int max_parallel_workers = 8;
+int autovacuum_max_parallel_workers = 0;
int MaxBackends = 0;
/* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ae9d5f3fb70..c8a99a67767 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3326,9 +3326,13 @@ set_config_with_handle(const char *name, config_handle *handle,
*
* Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
*
- * Other changes might need to affect other workers, so forbid them.
+ * Other changes might need to affect other workers, so forbid them. Note
+ * that the parallel autovacuum leader is an exception: only cost-based
+ * delay parameters need to be propagated to its parallel vacuum workers,
+ * and we handle that elsewhere where appropriate.
*/
- if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+ if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+ action != GUC_ACTION_SAVE &&
(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
{
ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 7c60b125564..e933f5048f7 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,15 @@
max => '2000000000',
},
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+ short_desc => 'Maximum number of parallel autovacuum workers that can be taken from the bgworkers pool.',
+ long_desc => 'This parameter is capped by "max_worker_processes" (not by "autovacuum_max_workers").',
+ variable => 'autovacuum_max_parallel_workers',
+ boot_val => '2',
+ min => '0',
+ max => 'MAX_BACKENDS',
+},
+
{ name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index dc9e2255f8a..86c67b790b0 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -691,6 +691,8 @@
#autovacuum_worker_slots = 16 # autovacuum worker slots to allocate
# (change requires restart)
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 2 # max number of parallel autovacuum
+ # workers, capped by max_worker_processes
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
# vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 06edea98f06..2b8a4aab390 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1423,6 +1423,7 @@ static const char *const table_storage_parameters[] = {
"autovacuum_multixact_freeze_max_age",
"autovacuum_multixact_freeze_min_age",
"autovacuum_multixact_freeze_table_age",
+ "autovacuum_parallel_workers",
"autovacuum_vacuum_cost_delay",
"autovacuum_vacuum_cost_limit",
"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index db559b39c4d..ad6e19f426c 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
extern PGDLLIMPORT int MaxConnections;
extern PGDLLIMPORT int max_worker_processes;
extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
extern PGDLLIMPORT int commit_timestamp_buffers;
extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index e43067d0260..4acadbc0610 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -65,6 +65,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
Oid relationId, BlockNumber blkno);
+/* parallel autovacuum stuff */
+extern void AutoVacuumReserveParallelWorkers(int *nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+
/* shared memory stuff */
extern Size AutoVacuumShmemSize(void);
extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index d03ab247788..c1d882659f9 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
typedef struct AutoVacOpts
{
bool enabled;
+
+ /*
+ * Max number of parallel autovacuum workers. If the value is 0, the
+ * parallel degree will be computed based on the number of indexes.
+ */
+ int autovacuum_parallel_workers;
+
int vacuum_threshold;
int vacuum_max_threshold;
int vacuum_ins_threshold;
--
2.43.0
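
To make the intended protocol of the two new functions above concrete: the
leader reserves workers before settling on the parallel degree and must
release them on every exit path (the later test patches exercise the ERROR
and FATAL paths). A minimal caller-side sketch, assuming a leader that has
already computed a desired degree 'nrequested'; everything except the two
new functions is illustrative only, not the actual call site:

	int			nworkers = nrequested;	/* desired parallel degree */

	if (AmAutoVacuumWorkerProcess())
		AutoVacuumReserveParallelWorkers(&nworkers);	/* may lower nworkers */

	PG_TRY();
	{
		/* launch up to nworkers background workers, process the indexes */
	}
	PG_FINALLY();
	{
		/* release on every exit path, including ERROR */
		if (AmAutoVacuumWorkerProcess())
			AutoVacuumReleaseParallelWorkers(nworkers);
	}
	PG_END_TRY();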
v18-0004-Documentation-for-parallel-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v18-0004-Documentation-for-parallel-autovacuum.patchDownload
From bbcc4b92941325248254b074a2d1c94f244b6a6c Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Sun, 23 Nov 2025 02:32:44 +0700
Subject: [PATCH v18 4/5] Documentation for parallel autovacuum
---
doc/src/sgml/config.sgml | 17 +++++++++++++++++
doc/src/sgml/maintenance.sgml | 12 ++++++++++++
doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
3 files changed, 49 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 601aa3afb8e..36fcc72f325 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2847,6 +2847,7 @@ include_dir 'conf.d'
<para>
When changing this value, consider also adjusting
<xref linkend="guc-max-parallel-workers"/>,
+ <xref linkend="guc-autovacuum-max-parallel-workers"/>,
<xref linkend="guc-max-parallel-maintenance-workers"/>, and
<xref linkend="guc-max-parallel-workers-per-gather"/>.
</para>
@@ -9282,6 +9283,22 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+ <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+ <secondary>configuration parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that
+ can be used for parallel index vacuuming at one time. This parameter
+ is capped by <xref linkend="guc-max-worker-processes"/>. The default is 2.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 7c958b06273..c9f9163c551 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -926,6 +926,18 @@ HINT: Execute a database-wide VACUUM in that database.
autovacuum workers' activity.
</para>
+ <para>
+ If an autovacuum worker process comes across a table whose
+ <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+ is enabled, it will launch parallel workers in order to vacuum the
+ indexes of this table in parallel. Parallel workers are taken from the
+ pool of processes established by <xref linkend="guc-max-worker-processes"/>,
+ limited by <xref linkend="guc-max-parallel-workers"/>.
+ The total number of parallel autovacuum workers that can be active at one
+ time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+ configuration parameter.
+ </para>
+
<para>
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 77c5a763d45..3592c9acff9 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+ <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the maximum number of parallel autovacuum workers that can process
+ the indexes of this table.
+ The default value is -1, which means no parallel index vacuuming for
+ this table. If the value is 0, the parallel degree will be computed
+ based on the number of indexes.
+ Note that the computed number of workers may not actually be available
+ at run time. If this occurs, autovacuum will run with fewer workers
+ than expected.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
<term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
<indexterm>
--
2.43.0
v18-0002-Logging-for-parallel-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v18-0002-Logging-for-parallel-autovacuum.patchDownload
From 0aafa271ec90dbe494eea79fd484a4856023b3a8 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Sun, 23 Nov 2025 01:07:47 +0700
Subject: [PATCH v18 2/5] Logging for parallel autovacuum
---
src/backend/access/heap/vacuumlazy.c | 27 +++++++++++++++++++++++++--
src/backend/commands/vacuumparallel.c | 20 ++++++++++++++------
src/include/commands/vacuum.h | 16 ++++++++++++++--
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 54 insertions(+), 10 deletions(-)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 2086a577199..35d2b07aa8a 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -349,6 +349,12 @@ typedef struct LVRelState
int num_index_scans;
int num_dead_items_resets;
Size total_dead_items_bytes;
+
+ /*
+ * Accumulated numbers of planned and actually launched parallel workers
+ * across all index scans, or NULL if not tracked
+ */
+ PVWorkersUsage *workers_usage;
/* Counters that follow are only for scanned_pages */
int64 tuples_deleted; /* # deleted from table */
int64 tuples_frozen; /* # newly frozen */
@@ -711,6 +717,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
indnames = palloc_array(char *, vacrel->nindexes);
for (int i = 0; i < vacrel->nindexes; i++)
indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+ /*
+ * Allocate space for worker usage statistics; a non-NULL pointer makes
+ * explicit that such statistics must be accumulated. For now, this is
+ * used only by the autovacuum leader, because it must log the totals at
+ * the end of table processing.
+ */
+ vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+ (PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+ NULL;
}
/*
@@ -1125,6 +1141,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
orig_rel_pages == 0 ? 100.0 :
100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
vacrel->lpdead_items);
+ if (vacrel->workers_usage)
+ appendStringInfo(&buf,
+ _("parallel index vacuum/cleanup : workers planned = %d, workers launched = %d\n"),
+ vacrel->workers_usage->nplanned,
+ vacrel->workers_usage->nlaunched);
for (int i = 0; i < vacrel->nindexes; i++)
{
IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2700,7 +2721,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
{
/* Outsource everything to parallel variant */
parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
- vacrel->num_index_scans);
+ vacrel->num_index_scans,
+ vacrel->workers_usage);
/*
* Do a postcheck to consider applying wraparound failsafe now. Note
@@ -3133,7 +3155,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
/* Outsource everything to parallel variant */
parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
vacrel->num_index_scans,
- estimated_count);
+ estimated_count,
+ vacrel->workers_usage);
}
/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 6a3a00585f9..490f93959d1 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
static int parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
bool *will_parallel_vacuum);
static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum);
+ bool vacuum, PVWorkersUsage *wusage);
static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -502,7 +502,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
*/
void
parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans)
+ int num_index_scans, PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -513,7 +513,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = true;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
}
/*
@@ -521,7 +521,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
*/
void
parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
- int num_index_scans, bool estimated_count)
+ int num_index_scans, bool estimated_count,
+ PVWorkersUsage *wusage)
{
Assert(!IsParallelWorker());
@@ -533,7 +534,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
pvs->shared->reltuples = num_table_tuples;
pvs->shared->estimated_count = estimated_count;
- parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+ parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
/*
@@ -618,7 +619,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
*/
static void
parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
- bool vacuum)
+ bool vacuum, PVWorkersUsage *wusage)
{
int nworkers;
PVIndVacStatus new_status;
@@ -742,6 +743,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
"launched %d parallel vacuum workers for index cleanup (planned: %d)",
pvs->pcxt->nworkers_launched),
pvs->pcxt->nworkers_launched, nworkers)));
+
+ /* Remember these values, if we were asked to. */
+ if (wusage != NULL)
+ {
+ wusage->nlaunched += pvs->pcxt->nworkers_launched;
+ wusage->nplanned += nworkers;
+ }
}
/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index e885a4b9c77..ec5d70aacdc 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,16 @@ typedef struct VacDeadItemsInfo
int64 num_items; /* current # of entries */
} VacDeadItemsInfo;
+/*
+ * PVWorkersUsage stores the total numbers of planned and launched workers
+ * during parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+ int nlaunched;
+ int nplanned;
+} PVWorkersUsage;
+
/* GUC parameters */
extern PGDLLIMPORT int default_statistics_target; /* PGDLLIMPORT for PostGIS */
extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +404,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
- int num_index_scans);
+ int num_index_scans,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
long num_table_tuples,
int num_index_scans,
- bool estimated_count);
+ bool estimated_count,
+ PVWorkersUsage *wusage);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index b9e671fcda8..6e35c6aa493 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2397,6 +2397,7 @@ PullFilterOps
PushFilter
PushFilterOps
PushFunction
+PVWorkersUsage
PyCFunction
PyMethodDef
PyModuleDef
--
2.43.0
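
With this patch applied, the per-table report emitted under
log_autovacuum_min_duration gains one extra line. An illustrative,
hand-composed excerpt (not captured from a real run), assuming two workers
were both planned and launched:

	LOG:  automatic vacuum of table "postgres.public.test_autovac": index scans: 1
	      ...
	      parallel index vacuum/cleanup: workers planned = 2, workers launched = 2
	      ...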
v18-0003-Tests-for-parallel-autovacuum.patchtext/x-patch; charset=US-ASCII; name=v18-0003-Tests-for-parallel-autovacuum.patchDownload
From 29fb650ac54e2f3bbc8f920292662906345e29ac Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Sun, 23 Nov 2025 01:08:14 +0700
Subject: [PATCH v18 3/5] Tests for parallel autovacuum
---
src/backend/commands/vacuumparallel.c | 8 +
src/backend/postmaster/autovacuum.c | 14 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_autovacuum/.gitignore | 2 +
src/test/modules/test_autovacuum/Makefile | 26 ++
src/test/modules/test_autovacuum/meson.build | 36 +++
.../modules/test_autovacuum/t/001_basic.pl | 170 ++++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 34 +++
.../modules/test_autovacuum/test_autovacuum.c | 255 ++++++++++++++++++
.../test_autovacuum/test_autovacuum.control | 3 +
11 files changed, 550 insertions(+)
create mode 100644 src/test/modules/test_autovacuum/.gitignore
create mode 100644 src/test/modules/test_autovacuum/Makefile
create mode 100644 src/test/modules/test_autovacuum/meson.build
create mode 100644 src/test/modules/test_autovacuum/t/001_basic.pl
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 490f93959d1..c2f0a37eef2 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -39,6 +39,7 @@
#include "postmaster/autovacuum.h"
#include "storage/bufmgr.h"
#include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -752,6 +753,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
}
}
+ /*
+ * To be able to verify that all reserved parallel workers are released
+ * even on failure, allow injection points to trigger a failure at this
+ * point.
+ */
+ INJECTION_POINT("autovacuum-trigger-leader-failure", NULL);
+
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index bc11970bfee..a27274bfb4d 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3439,6 +3439,13 @@ AutoVacuumReserveParallelWorkers(int *nworkers)
/* Remember how many workers we have reserved. */
av_nworkers_reserved = *nworkers;
+ /*
+ * Injection point to help exercise the accounting of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
@@ -3468,6 +3475,13 @@ AutoVacuumReleaseParallelWorkers(int nworkers)
/* Don't have to remember these workers anymore. */
av_nworkers_reserved -= nworkers;
+ /*
+ * Injection point to help exercise the accounting of available parallel
+ * autovacuum workers.
+ */
+ INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+ &AutoVacuumShmem->av_freeParallelWorkers);
+
LWLockRelease(AutovacuumLock);
}
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 4c6d56d97d8..bfe365fa575 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -16,6 +16,7 @@ SUBDIRS = \
plsample \
spgist_name_ops \
test_aio \
+ test_autovacuum \
test_binaryheap \
test_bitmapset \
test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 1b31c5b98d6..01a3e3ec044 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('plsample')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
+subdir('test_autovacuum')
subdir('test_binaryheap')
subdir('test_bitmapset')
subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..4cf7344b2ac
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,26 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+ $(WIN32RES) \
+ test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..3441e5e49cf
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+ 'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+ test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'test_autovacuum',
+ '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+ test_autovacuum_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+ 'test_autovacuum.control',
+ 'test_autovacuum--1.0.sql',
+)
+
+tests += {
+ 'name': 'test_autovacuum',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'env': {
+ 'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+ },
+ 'tests': [
+ 't/001_basic.pl',
+ ],
+ },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_basic.pl b/src/test/modules/test_autovacuum/t/001_basic.pl
new file mode 100644
index 00000000000..8bf153d132c
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_basic.pl
@@ -0,0 +1,170 @@
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres so that it can launch parallel autovacuum workers, logs
+# all the information we are interested in, and runs autovacuum frequently
+$node->append_conf('postgresql.conf', qq{
+ max_worker_processes = 20
+ max_parallel_workers = 20
+ max_parallel_maintenance_workers = 20
+ autovacuum_max_parallel_workers = 10
+ log_min_messages = debug2
+ log_autovacuum_min_duration = 0
+ autovacuum_naptime = '1s'
+ min_parallel_index_scan_size = 0
+ shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table with specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+ CREATE TABLE test_autovac (
+ id SERIAL PRIMARY KEY,
+ col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER
+ ) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+ autovacuum_enabled = false);
+
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$indexes_num LOOP
+ EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+ END LOOP;
+ END \$\$;
+});
+
+# Insert the specified number of tuples into the table
+$node->safe_psql('postgres', qq{
+ DO \$\$
+ DECLARE
+ i INTEGER;
+ BEGIN
+ FOR i IN 1..$initial_rows_num LOOP
+ INSERT INTO test_autovac VALUES (i, i + 1, i + 2, i + 3);
+ END LOOP;
+ END \$\$;
+});
+
+# Now, create some dead tuples and refresh table statistics
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+ CREATE EXTENSION test_autovacuum;
+ SELECT inj_set_free_workers_attach();
+ SELECT inj_leader_failure_attach();
+});
+
+# Test 1:
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check that it can.
+# Also check that all requested workers are:
+# 1) launched
+# 2) correctly released
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on the table is complete. At the same
+# time, check that the required number of parallel workers has been launched.
+$node->wait_for_log(qr/parallel index vacuum\/cleanup: workers planned = 2, / .
+ qr/workers launched = 2/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10, 'All parallel workers have been released by the leader');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_2 = 0 WHERE (col_2 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 2:
+# We want parallel autovacuum workers to be released even if the leader hits
+# an error. First, simulate the situation where the leader exits due to an
+# ERROR.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('ERROR');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/error, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after ERROR');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_3 = 0 WHERE (col_3 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+# Test 3:
+# Same as Test 2, but simulate the situation where the leader exits due to a
+# FATAL error.
+
+$node->safe_psql('postgres', qq(
+ SELECT trigger_leader_failure('FATAL');
+));
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/fatal, triggered by injection point/);
+
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_free_workers();",
+ stdout => \$psql_out,
+);
+is($psql_out, 10,
+ 'All parallel workers have been released by the leader after FATAL');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT inj_set_free_workers_detach();
+ SELECT inj_leader_failure_detach();
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..017d5da85ea
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,34 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting or interfering with autovacuum state
+ */
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION trigger_leader_failure(failure_type text)
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+/*
+ * Injection point related functions
+ */
+CREATE FUNCTION inj_set_free_workers_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_set_free_workers_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..7948f4858ae
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,255 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ * Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "postmaster/autovacuum.h"
+#include "storage/shmem.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "utils/builtins.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+typedef enum AVLeaderFailureType
+{
+ FAIL_NONE,
+ FAIL_ERROR,
+ FAIL_FATAL,
+} AVLeaderFailureType;
+
+typedef struct InjPointState
+{
+ bool enabled_set_free_workers;
+ uint32 free_parallel_workers;
+
+ bool enabled_leader_failure;
+ AVLeaderFailureType ftype;
+} InjPointState;
+
+static InjPointState * inj_point_state;
+
+/* Shared memory init callbacks */
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
+static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+
+static void
+test_autovacuum_shmem_request(void)
+{
+ if (prev_shmem_request_hook)
+ prev_shmem_request_hook();
+
+ RequestAddinShmemSpace(sizeof(InjPointState));
+}
+
+static void
+test_autovacuum_shmem_startup(void)
+{
+ bool found;
+
+ if (prev_shmem_startup_hook)
+ prev_shmem_startup_hook();
+
+ /* Create or attach to the shared memory state */
+ LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+
+ inj_point_state = ShmemInitStruct("injection_points",
+ sizeof(InjPointState),
+ &found);
+
+ if (!found)
+ {
+ /* First time through, initialize */
+ inj_point_state->enabled_leader_failure = false;
+ inj_point_state->enabled_set_free_workers = false;
+ inj_point_state->ftype = FAIL_NONE;
+
+ /* Keep it in sync with AutoVacuumShmemInit */
+ inj_point_state->free_parallel_workers =
+ Min(autovacuum_max_parallel_workers, max_worker_processes);
+
+ InjectionPointAttach("autovacuum-set-free-parallel-workers-num",
+ "test_autovacuum",
+ "inj_set_free_workers",
+ NULL,
+ 0);
+
+ InjectionPointAttach("autovacuum-trigger-leader-failure",
+ "test_autovacuum",
+ "inj_trigger_leader_failure",
+ NULL,
+ 0);
+ }
+
+ LWLockRelease(AddinShmemInitLock);
+}
+
+void
+_PG_init(void)
+{
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ prev_shmem_request_hook = shmem_request_hook;
+ shmem_request_hook = test_autovacuum_shmem_request;
+ prev_shmem_startup_hook = shmem_startup_hook;
+ shmem_startup_hook = test_autovacuum_shmem_startup;
+}
+
+extern PGDLLEXPORT void inj_set_free_workers(const char *name,
+ const void *private_data,
+ void *arg);
+extern PGDLLEXPORT void inj_trigger_leader_failure(const char *name,
+ const void *private_data,
+ void *arg);
+
+/*
+ * Record the number of currently available parallel a/v workers. This
+ * value may change after such workers are reserved or released.
+ *
+ * This callback is called from the parallel autovacuum leader.
+ */
+void
+inj_set_free_workers(const char *name, const void *private_data, void *arg)
+{
+ ereport(LOG,
+ errmsg("set parallel workers injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->enabled_set_free_workers)
+ {
+ Assert(arg != NULL);
+ inj_point_state->free_parallel_workers = *(uint32 *) arg;
+ }
+}
+
+/*
+ * Throw an ERROR or FATAL if somebody requested it.
+ *
+ * This callback is called from the parallel autovacuum leader.
+ */
+void
+inj_trigger_leader_failure(const char *name, const void *private_data,
+ void *arg)
+{
+ int elevel;
+ char *elevel_str;
+
+ ereport(LOG,
+ errmsg("trigger leader failure injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->ftype == FAIL_NONE ||
+ !inj_point_state->enabled_leader_failure)
+ {
+ return;
+ }
+
+ elevel = inj_point_state->ftype == FAIL_ERROR ? ERROR : FATAL;
+ elevel_str = elevel == ERROR ? "error" : "fatal";
+
+ ereport(elevel, errmsg("%s, triggered by injection point", elevel_str));
+}
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+ uint32 nworkers;
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+ nworkers = inj_point_state->free_parallel_workers;
+ LWLockRelease(AutovacuumLock);
+
+ PG_RETURN_UINT32(nworkers);
+}
+
+PG_FUNCTION_INFO_V1(trigger_leader_failure);
+Datum
+trigger_leader_failure(PG_FUNCTION_ARGS)
+{
+ const char *failure_type = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ if (strcmp(failure_type, "NONE") == 0)
+ inj_point_state->ftype = FAIL_NONE;
+ else if (strcmp(failure_type, "ERROR") == 0)
+ inj_point_state->ftype = FAIL_ERROR;
+ else if (strcmp(failure_type, "FATAL") == 0)
+ inj_point_state->ftype = FAIL_FATAL;
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid leader failure type : %s", failure_type)));
+
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_attach);
+Datum
+inj_set_free_workers_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = true;
+ inj_point_state->ftype = FAIL_NONE;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_detach);
+Datum
+inj_set_free_workers_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_set_free_workers = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_attach);
+Datum
+inj_leader_failure_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = true;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_detach);
+Datum
+inj_leader_failure_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_leader_failure = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
--
2.43.0
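
A note on running the module: the test skips itself unless the build has
injection points enabled. Assuming an in-tree autoconf build configured with
--enable-injection-points and --enable-tap-tests, running
`make -C src/test/modules/test_autovacuum check` should exercise the TAP
test; the meson build wires it up through the tests entry added above.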
v18-0005-Cost-based-parameters-propagation-for-parallel-a.patchtext/x-patch; charset=US-ASCII; name=v18-0005-Cost-based-parameters-propagation-for-parallel-a.patchDownload
From 14abdef918a73e465900f758204de19982fc4224 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davydov@postgrespro.ru>
Date: Wed, 7 Jan 2026 16:03:20 +0700
Subject: [PATCH v18 5/5] Cost-based parameters propagation for parallel
autovacuum
---
src/backend/commands/vacuum.c | 26 +++-
src/backend/commands/vacuumparallel.c | 130 ++++++++++++++++++
src/include/commands/vacuum.h | 2 +
src/test/modules/test_autovacuum/Makefile | 2 +
.../modules/test_autovacuum/t/001_basic.pl | 83 +++++++++++
.../test_autovacuum/test_autovacuum--1.0.sql | 12 ++
.../modules/test_autovacuum/test_autovacuum.c | 75 ++++++++++
7 files changed, 328 insertions(+), 2 deletions(-)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index aa4fbec143f..4c40a36523a 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2430,8 +2430,24 @@ vacuum_delay_point(bool is_analyze)
/* Always check for interrupts */
CHECK_FOR_INTERRUPTS();
- if (InterruptPending ||
- (!VacuumCostActive && !ConfigReloadPending))
+ if (InterruptPending)
+ return;
+
+ if (!AmAutoVacuumWorkerProcess())
+ {
+ /*
+ * If we are a parallel autovacuum worker, check whether the cost-based
+ * parameters have changed in the leader. If so, vacuum_cost_delay and
+ * vacuum_cost_limit will be set to the values the leader is operating
+ * with.
+ *
+ * Do this before checking VacuumCostActive, because its value might
+ * change after consuming the leader's parameters.
+ */
+ parallel_vacuum_fix_cost_based_params();
+ }
+
+ if (!VacuumCostActive && !ConfigReloadPending)
return;
/*
@@ -2445,6 +2461,12 @@ vacuum_delay_point(bool is_analyze)
ConfigReloadPending = false;
ProcessConfigFile(PGC_SIGHUP);
VacuumUpdateCosts();
+
+ /*
+ * If we are the parallel autovacuum leader and some of the cost-based
+ * parameters have changed, let the parallel workers know.
+ */
+ parallel_vacuum_propagate_cost_based_params();
}
/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index c2f0a37eef2..06ecffeec42 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -54,6 +54,22 @@
#define PARALLEL_VACUUM_KEY_WAL_USAGE 4
#define PARALLEL_VACUUM_KEY_INDEX_STATS 5
+/*
+ * Only the autovacuum leader can reload the config file. We use this
+ * structure in parallel autovacuum to keep the workers' parameters in sync
+ * with the leader's parameters.
+ */
+typedef struct PVSharedCostParams
+{
+ slock_t spinlock; /* protects all fields below */
+
+ /* Copies of corresponding parameters from autovacuum leader process */
+ double cost_delay;
+ int cost_limit;
+} PVSharedCostParams;
+
+static PVSharedCostParams *pv_shared_cost_params = NULL;
+
/*
* Shared information among parallel workers. So this is allocated in the DSM
* segment.
@@ -123,6 +139,18 @@ typedef struct PVShared
/* Statistics of shared dead items */
VacDeadItemsInfo dead_items_info;
+
+ /*
+ * If 'true', we are running parallel autovacuum. Otherwise, we are
+ * running a parallel maintenance VACUUM.
+ */
+ bool am_parallel_autovacuum;
+
+ /*
+ * Struct for syncing cost-based parameters between the parallel
+ * autovacuum workers and the leader.
+ */
+ PVSharedCostParams cost_params;
} PVShared;
/* Status used during parallel index vacuum or cleanup */
@@ -396,6 +424,17 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
pg_atomic_init_u32(&(shared->active_nworkers), 0);
pg_atomic_init_u32(&(shared->idx), 0);
+ shared->am_parallel_autovacuum = AmAutoVacuumWorkerProcess();
+
+ if (shared->am_parallel_autovacuum)
+ {
+ shared->cost_params.cost_delay = vacuum_cost_delay;
+ shared->cost_params.cost_limit = vacuum_cost_limit;
+ SpinLockInit(&shared->cost_params.spinlock);
+
+ pv_shared_cost_params = &(shared->cost_params);
+ }
+
shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_SHARED, shared);
pvs->shared = shared;
@@ -538,6 +577,53 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
}
+/*
+ * Function to be called from a parallel autovacuum worker in order to sync
+ * the cost-based delay parameters with the leader.
+ */
+bool
+parallel_vacuum_fix_cost_based_params(void)
+{
+ /* Check whether we are running parallel autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return false;
+
+ Assert(IsParallelWorker() && !AmAutoVacuumWorkerProcess());
+
+ SpinLockAcquire(&pv_shared_cost_params->spinlock);
+
+ vacuum_cost_delay = pv_shared_cost_params->cost_delay;
+ vacuum_cost_limit = pv_shared_cost_params->cost_limit;
+
+ SpinLockRelease(&pv_shared_cost_params->spinlock);
+
+ if (vacuum_cost_delay > 0 && !VacuumFailsafeActive)
+ VacuumCostActive = true;
+
+ return true;
+}
+
+/*
+ * Function to be called from the parallel autovacuum leader in order to
+ * propagate the cost-based parameters to its parallel workers.
+ */
+void
+parallel_vacuum_propagate_cost_based_params(void)
+{
+ /* Check whether we are running parallel autovacuum */
+ if (pv_shared_cost_params == NULL)
+ return;
+
+ Assert(AmAutoVacuumWorkerProcess());
+
+ SpinLockAcquire(&pv_shared_cost_params->spinlock);
+
+ pv_shared_cost_params->cost_delay = vacuum_cost_delay;
+ pv_shared_cost_params->cost_limit = vacuum_cost_limit;
+
+ SpinLockRelease(&pv_shared_cost_params->spinlock);
+}
+
/*
* Compute the number of parallel worker processes to request. Both index
* vacuum and index cleanup can be executed with parallel workers.
@@ -763,12 +849,26 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
/* Vacuum the indexes that can be processed by only leader process */
parallel_vacuum_process_unsafe_indexes(pvs);
+ /*
+ * To be able to verify that the parallel autovacuum leader can propagate
+ * cost-based parameters to parallel workers, wait here until the
+ * configuration is changed...
+ */
+ INJECTION_POINT("av-leader-before-reload-conf", NULL);
+
/*
* Join as a parallel worker. The leader vacuums alone processes all
* parallel-safe indexes in the case where no workers are launched.
*/
parallel_vacuum_process_safe_indexes(pvs);
+ /*
+ * ...and then wait until the leader is guaranteed to have propagated the
+ * new parameter values to the workers. That is, the tests expect that we
+ * have called vacuum_delay_point while processing a parallel-safe index.
+ */
+ INJECTION_POINT("av-leader-after-reload-conf", NULL);
+
/*
* Next, accumulate buffer and WAL usage. (This must wait for the workers
* to finish, or we might get incomplete data.)
@@ -1104,6 +1204,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
VacuumSharedCostBalance = &(shared->cost_balance);
VacuumActiveNWorkers = &(shared->active_nworkers);
+ if (shared->am_parallel_autovacuum)
+ pv_shared_cost_params = &(shared->cost_params);
+
/* Set parallel vacuum state */
pvs.indrels = indrels;
pvs.nindexes = nindexes;
@@ -1131,6 +1234,33 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
/* Prepare to track buffer usage during parallel execution */
InstrStartParallelQuery();
+#ifdef USE_INJECTION_POINTS
+ if (shared->am_parallel_autovacuum)
+ {
+ Assert(VacuumActiveNWorkers != NULL);
+
+ /*
+ * To be able to verify that the parallel autovacuum leader can propagate
+ * cost-based parameters to parallel workers, wait here until the
+ * configuration is changed and the leader has updated the shared state.
+ */
+ INJECTION_POINT("av-worker-before-reload-conf", NULL);
+
+ /* Simulate config reload during normal processing */
+ pg_atomic_add_fetch_u32(VacuumActiveNWorkers, 1);
+ vacuum_delay_point(false);
+ pg_atomic_sub_fetch_u32(VacuumActiveNWorkers, 1);
+
+ /*
+ * Wait until the worker is guaranteed to have consumed the new parameter
+ * values from the leader and saved them in the injection point state.
+ */
+ INJECTION_POINT("autovacuum-set-cost-based-parameter",
+ &vacuum_cost_delay);
+ INJECTION_POINT("av-worker-after-reload-conf", NULL);
+ }
+#endif
+
/* Process indexes to perform vacuum/cleanup */
parallel_vacuum_process_safe_indexes(&pvs);
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index ec5d70aacdc..73125439bed 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -411,6 +411,8 @@ extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
int num_index_scans,
bool estimated_count,
PVWorkersUsage *wusage);
+extern bool parallel_vacuum_fix_cost_based_params(void);
+extern void parallel_vacuum_propagate_cost_based_params(void);
extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
/* in commands/analyze.c */
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
index 4cf7344b2ac..32254c53a5d 100644
--- a/src/test/modules/test_autovacuum/Makefile
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -12,6 +12,8 @@ DATA = test_autovacuum--1.0.sql
TAP_TESTS = 1
+EXTRA_INSTALL = src/test/modules/injection_points
+
export enable_injection_points
ifdef USE_PGXS
diff --git a/src/test/modules/test_autovacuum/t/001_basic.pl b/src/test/modules/test_autovacuum/t/001_basic.pl
index 8bf153d132c..eec0f41b6a6 100644
--- a/src/test/modules/test_autovacuum/t/001_basic.pl
+++ b/src/test/modules/test_autovacuum/t/001_basic.pl
@@ -28,6 +28,11 @@ $node->append_conf('postgresql.conf', qq{
});
$node->start;
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
my $indexes_num = 4;
my $initial_rows_num = 10_000;
my $autovacuum_parallel_workers = 2;
@@ -73,6 +78,9 @@ $node->safe_psql('postgres', qq{
CREATE EXTENSION test_autovacuum;
SELECT inj_set_free_workers_attach();
SELECT inj_leader_failure_attach();
+ SELECT inj_check_av_param_attach();
+
+ CREATE EXTENSION injection_points;
});
# Test 1 :
@@ -166,5 +174,80 @@ $node->safe_psql('postgres', qq{
SELECT inj_leader_failure_detach();
});
+# Test 4:
+# Check whether parallel autovacuum leader can propagate cost-based parameters
+# to parallel workers.
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+ UPDATE test_autovac SET col_3 = 0 WHERE (col_4 % 3) = 0;
+ ANALYZE test_autovac;
+});
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_attach('av-leader-before-reload-conf', 'wait');
+ SELECT injection_points_attach('av-leader-after-reload-conf', 'wait');
+ SELECT injection_points_attach('av-worker-before-reload-conf', 'wait');
+ SELECT injection_points_attach('av-worker-after-reload-conf', 'wait');
+});
+
+$node->safe_psql('postgres', qq{
+ ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the leader worker gets to the point before vacuum_delay_point,
+# then change a cost-based config parameter.
+
+$node->wait_for_event('autovacuum worker', 'av-leader-before-reload-conf');
+$node->psql('postgres', qq{
+ ALTER SYSTEM SET autovacuum_vacuum_cost_delay = 10;
+ SELECT pg_reload_conf();
+});
+$node->psql('postgres', qq{
+ SELECT injection_points_wakeup('av-leader-before-reload-conf');
+});
+
+# Wait until the leader worker propagates the new parameter's value to the
+# other workers, then let them call vacuum_delay_point
+
+$node->wait_for_event('autovacuum worker', 'av-leader-after-reload-conf');
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('av-leader-after-reload-conf');
+ SELECT injection_points_wakeup('av-worker-before-reload-conf');
+});
+
+# Check whether the parallel worker has consumed the new parameter's value
+# from the leader.
+# Actually, this can happen before the worker gets to the injection point,
+# but we want to make everything as deterministic as possible.
+
+$node->wait_for_event('parallel worker', 'av-worker-after-reload-conf');
+$node->psql('postgres',
+ "SELECT get_parallel_autovacuum_worker_param_value('vacuum_cost_delay');",
+ stdout => \$psql_out,
+);
+is($psql_out, 10.0, 'Leader successfully propagated parameter value');
+
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_wakeup('av-worker-after-reload-conf');
+});
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+ SELECT injection_points_detach('av-leader-before-reload-conf');
+ SELECT injection_points_detach('av-leader-after-reload-conf');
+ SELECT injection_points_detach('av-worker-before-reload-conf');
+ SELECT injection_points_detach('av-worker-after-reload-conf');
+ SELECT inj_check_av_param_detach();
+
+ DROP EXTENSION test_autovacuum;
+ DROP EXTENSION injection_points;
+});
+
$node->stop;
done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
index 017d5da85ea..cb0407952d7 100644
--- a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -14,6 +14,10 @@ CREATE FUNCTION trigger_leader_failure(failure_type text)
RETURNS VOID STRICT
AS 'MODULE_PATHNAME' LANGUAGE C;
+CREATE FUNCTION get_parallel_autovacuum_worker_param_value(param_name text)
+RETURNS FLOAT8 STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
/*
* Injection point related functions
*/
@@ -32,3 +36,11 @@ AS 'MODULE_PATHNAME' LANGUAGE C;
CREATE FUNCTION inj_leader_failure_detach()
RETURNS VOID STRICT
AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_check_av_param_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_check_av_param_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
index 7948f4858ae..e96cfda7ae9 100644
--- a/src/test/modules/test_autovacuum/test_autovacuum.c
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -38,6 +38,9 @@ typedef struct InjPointState
bool enabled_leader_failure;
AVLeaderFaulureType ftype;
+
+ bool enabled_check_av_param;
+ double vacuum_cost_delay;
} InjPointState;
static InjPointState * inj_point_state;
@@ -92,6 +95,12 @@ test_autovacuum_shmem_startup(void)
"inj_trigger_leader_failure",
NULL,
0);
+
+ InjectionPointAttach("autovacuum-set-cost-based-parameter",
+ "test_autovacuum",
+ "inj_set_av_parameter",
+ NULL,
+ 0);
}
LWLockRelease(AddinShmemInitLock);
@@ -109,6 +118,9 @@ _PG_init(void)
shmem_startup_hook = test_autovacuum_shmem_startup;
}
+extern PGDLLEXPORT void inj_set_av_parameter(const char *name,
+ const void *private_data,
+ void *arg);
extern PGDLLEXPORT void inj_set_free_workers(const char *name,
const void *private_data,
void *arg);
@@ -205,6 +217,45 @@ trigger_leader_failure(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+/*
+ * Record the current setting of the "vacuum_cost_delay" parameter.
+ *
+ * This callback is called from a parallel autovacuum worker.
+ */
+void
+inj_set_av_parameter(const char *name, const void *private_data, void *arg)
+{
+ ereport(LOG,
+ errmsg("set autovacuum parameter injection point called"),
+ errhidestmt(true), errhidecontext(true));
+
+ if (inj_point_state->enabled_check_av_param)
+ {
+ Assert(arg != NULL);
+ inj_point_state->vacuum_cost_delay = *(double *) arg;
+ }
+}
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_worker_param_value);
+Datum
+get_parallel_autovacuum_worker_param_value(PG_FUNCTION_ARGS)
+{
+ const char *param_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
+ double value = 0.0;
+
+#ifndef USE_INJECTION_POINTS
+ elog(ERROR, "injection points not supported");
+#endif
+
+ if (strcmp(param_name, "vacuum_cost_delay") == 0)
+ value = inj_point_state->vacuum_cost_delay;
+ else
+ elog(ERROR,
+ "cannot retrieve parameter %s from injection point", param_name);
+
+ PG_RETURN_FLOAT8((float8) value);
+}
+
PG_FUNCTION_INFO_V1(inj_set_free_workers_attach);
Datum
inj_set_free_workers_attach(PG_FUNCTION_ARGS)
@@ -253,3 +304,27 @@ inj_leader_failure_detach(PG_FUNCTION_ARGS)
#endif
PG_RETURN_VOID();
}
+
+PG_FUNCTION_INFO_V1(inj_check_av_param_attach);
+Datum
+inj_check_av_param_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_check_av_param = true;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_check_av_param_detach);
+Datum
+inj_check_av_param_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+ inj_point_state->enabled_check_av_param = false;
+#else
+ elog(ERROR, "injection points not supported");
+#endif
+ PG_RETURN_VOID();
+}
--
2.43.0
Hi,
I noticed one thing: autovacuum_max_parallel_workers is initialized to 0 in globals.c,
but its GUC default (boot_val) is '2' in guc_parameters.dat. While GUC overrides it on startup,
this mismatch may cause confusion. Perhaps we should modify this to match the approach for max_parallel_workers.
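For illustration, a minimal sketch of that alignment in globals.c, assuming
the boot_val of 2 is kept (the neighboring lines are the ones already touched
by the 0001 hunk above):

	int			max_worker_processes = 8;
	int			max_parallel_workers = 8;
	int			autovacuum_max_parallel_workers = 2;	/* keep in sync with boot_val */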
--
Regards,
Man Zeng
www.openhalo.org
Hi,
On Wed, Jan 7, 2026 at 8:51 PM zengman <zengman@halodbtech.com> wrote:
I noticed one thing: autovacuum_max_parallel_workers is initialized to 0 in globals.c,
but its GUC default (boot_val) is '2' in guc_parameters.dat. While GUC overrides it on startup,
this mismatch may cause confusion. Perhaps we should modify this to match the approach for max_parallel_workers.
Good catch, thank you!
I'll fix it in the next version of the patch.
--
Best regards,
Daniil Davydov